Text categorization with LLMs

Categorize text into one of the provided classes using an LLM.

```yaml
type: "io.kestra.plugin.ai.completion.Classification"
```

Perform sentiment analysis of product reviews

```yaml
id: text_categorization
namespace: company.ai

tasks:
  - id: categorize
    type: io.kestra.plugin.ai.completion.Classification
    prompt: "Categorize the sentiment of: I love this product!"
    classes:
      - positive
      - negative
      - neutral
    provider:
      type: io.kestra.plugin.ai.provider.GoogleGemini
      apiKey: "{{ kv('GEMINI_API_KEY') }}"
      modelName: gemini-2.5-flash
```

Properties
SubType string

Classification Options

The list of possible classification categories

Text prompt

The input text to classify

Language Model Provider

Default {}

Chat configuration

Classification Result

The classified category of the input text

Possible Values
STOP, LENGTH, TOOL_EXECUTION, CONTENT_FILTER, OTHER

Finish reason

Token usage
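
The classification result listed above can be consumed by a later task in the same flow. The sketch below extends the sentiment example with a log task. It assumes the classified category is exposed under the output key `classification`; that key name does not appear in this reference and should be checked against the task's outputs before use. The flow id `text_categorization_with_log` and task id `log_result` are illustrative.

```yaml
id: text_categorization_with_log
namespace: company.ai

tasks:
  - id: categorize
    type: io.kestra.plugin.ai.completion.Classification
    prompt: "Categorize the sentiment of: I love this product!"
    classes:
      - positive
      - negative
      - neutral
    provider:
      type: io.kestra.plugin.ai.provider.GoogleGemini
      apiKey: "{{ kv('GEMINI_API_KEY') }}"
      modelName: gemini-2.5-flash

  # Assumption: the classified category is available as the `classification` output.
  - id: log_result
    type: io.kestra.plugin.core.log.Log
    message: "Sentiment: {{ outputs.categorize.classification }}"
```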

API endpoint

The Azure OpenAI endpoint in the format: https://{resource}.openai.azure.com/

Model name

API Key

Client ID

Client secret

API version

Tenant ID

Endpoint URL

Project location

Model name

Project ID

API Key

Model name

API Key

Model name

API base URL

Model endpoint

Model name

API Key

Model name

API base URL

Log LLM requests

If true, prompts and configuration sent to the LLM will be logged at INFO level.

Log LLM responses

If true, raw responses from the LLM will be logged at INFO level.
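
As a rough illustration of the two logging switches, the fragment below turns both on while debugging a flow. The key names `configuration`, `logRequests`, and `logResponses` are assumptions inferred from the property titles above, not names confirmed by this reference. Prompts and responses may contain sensitive text, so this is probably best left off outside of troubleshooting.

```yaml
# Fragment of a Classification task definition (key names are assumptions).
configuration:
  logRequests: true   # log prompts and configuration sent to the LLM at INFO level
  logResponses: true  # log raw LLM responses at INFO level
```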

Response format

Defines the expected output format. Default is plain text. Some providers allow requesting JSON or schema-constrained outputs, but support varies and may be incompatible with tool use. When using a JSON schema, the output will be returned under the key jsonOutput.

Seed

Optional random seed for reproducibility. Provide a positive integer (e.g., 42, 1234). Using the same seed with identical settings produces repeatable outputs.

Temperature

Controls randomness in generation. Typical range is 0.0–1.0. Lower values (e.g., 0.2) make outputs more focused and deterministic, while higher values (e.g., 0.7–1.0) increase creativity and variability.

Top-K

Limits sampling to the top K most likely tokens at each step. Typical values are between 20 and 100. Smaller values reduce randomness; larger values allow more diverse outputs.

Top-P (nucleus sampling)

Samples from the smallest set of tokens whose cumulative probability reaches at least topP. Typical values are 0.8–0.95. Lower values make the output more focused; higher values increase diversity.
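
For a classification task, stable outputs usually matter more than variety, so one plausible setup combines a fixed seed with conservative sampling values. The fragment below is a sketch only: `topP` is the name used in this reference, while `configuration`, `seed`, `temperature`, and `topK` are assumed key names, and the values are examples taken from the typical ranges described above. Not every provider honors every parameter.

```yaml
# Fragment of a Classification task definition (key names other than topP are assumptions).
configuration:
  seed: 42          # fixed seed for repeatable runs
  temperature: 0.2  # low randomness, more deterministic labels
  topK: 40          # consider only the 40 most likely tokens per step
  topP: 0.9         # nucleus sampling threshold
```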

API Key

Model name

Default https://api.deepseek.com/v1

API base URL

JSON Schema (used when type = JSON)

Provide a JSON Schema describing the expected structure of the response. In Kestra flows, define the schema in YAML (it is still a JSON Schema object). Example (YAML):

```yaml
responseFormat:
  type: JSON
  jsonSchema:
    type: object
    required: ["category", "priority"]
    properties:
      category:
        type: string
        enum: ["ACCOUNT", "BILLING", "TECHNICAL", "GENERAL"]
      priority:
        type: string
        enum: ["LOW", "MEDIUM", "HIGH"]
```

Note: Provider support for strict schema enforcement varies. If unsupported, guide the model about the expected output structure via the prompt and validate downstream.

Schema description (optional)

Natural-language description of the schema to help the model produce the right fields. Example: "Classify a customer ticket into category and priority."

Default TEXT
Possible Values
TEXT, JSON

Response format type

Specifies how the LLM should return output. Allowed values:

  • TEXT (default): free-form natural language.
  • JSON: structured output validated against a JSON Schema.

AWS Access Key ID

Model name

AWS Secret Access Key

Default COHERE
Possible Values
COHERE, TITAN

Amazon Bedrock Embedding Model Type

API Key

Model name