LLM Actions in Agent Studio give you built-in access to large language model (LLM) capabilities directly within your Compound Actions and Conversational Processes. These actions support tasks such as summarization, reasoning, classification, data extraction, and content generation, so you can build workflows without custom integrations.
This documentation focuses on two key LLM Actions: generate_text_action and generate_structured_value_action. These actions are designed to help you process unstructured data, generate insights, and structure outputs efficiently.
The generate_text_action invokes an LLM to produce free-form text output based on user-provided input. This action is ideal for tasks requiring natural language generation, such as summarizing documents, generating responses, or performing step-by-step reasoning.
Use this action when you need unstructured text results, like drafting emails, explaining concepts, or brainstorming ideas.
Here are practical examples demonstrating various LLM abilities. Each includes a sample request schema for integration into a Compound Action.
Summarize a lengthy article or user query into a concise overview.
Generate creative or instructional content, such as drafting a user email.
Guide the LLM through logical reasoning for problem-solving.
Extract text from an uploaded image, such as a receipt. The image parameter maps to a File object collected via a File Slot.
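As a rough illustration of the summarization case, the sketch below assembles the kind of input mapping such a request might carry. The field names `action_name`, `input_args`, `system_prompt`, and `user_input` are assumptions made for this example and are not confirmed by this page; check the action's reference for the exact parameter names before using them in a Compound Action.

```python
import json


def build_generate_text_request(system_prompt: str, user_input: str) -> dict:
    """Assemble a hypothetical input mapping for generate_text_action.

    All key names below are illustrative assumptions, not a confirmed schema.
    """
    if not user_input.strip():
        raise ValueError("user_input must not be empty")
    return {
        "action_name": "generate_text_action",
        "input_args": {
            "system_prompt": system_prompt,  # instructions for the LLM
            "user_input": user_input,        # the text to operate on
        },
    }


summarize_request = build_generate_text_request(
    system_prompt="Summarize the following article in three sentences.",
    user_input="<article text goes here>",
)
print(json.dumps(summarize_request, indent=2))
```

The same helper could cover the content-generation and reasoning cases by changing only the system prompt, for example "Draft a short, friendly email to the user" or "Reason step by step before giving the final answer".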
The generate_structured_value_action calls an LLM to extract or generate data in a predefined structured format (JSON schema). This is particularly useful for classification, entity extraction, or transforming unstructured input into queryable data.
Apply this action for tasks where output consistency is critical, such as tagging content, extracting key-value pairs, or categorizing user inputs.
additionalProperties: false must always be set on every object in the JSON Schema.
additionalProperties controls whether an object may contain keys that are not defined in the JSON Schema. Setting it to false rejects any undeclared key.
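Because the rule applies to nested objects as well as the top-level one, it can be easy to miss. The following is a minimal sketch of a pre-flight check that walks a schema and reports every object that does not set additionalProperties to false. The helper name `objects_missing_strict_flag` and the sample schema are hypothetical, and the walk only follows `properties` and `items`, not every JSON Schema keyword.

```python
def objects_missing_strict_flag(schema: dict, path: str = "$") -> list:
    """Return paths of object schemas that do not set additionalProperties to False."""
    missing = []
    if schema.get("type") == "object":
        if schema.get("additionalProperties") is not False:
            missing.append(path)
        # Recurse into each declared property.
        for name, sub_schema in schema.get("properties", {}).items():
            missing.extend(objects_missing_strict_flag(sub_schema, f"{path}.{name}"))
    elif schema.get("type") == "array" and isinstance(schema.get("items"), dict):
        # Arrays may hold objects, so check the item schema too.
        missing.extend(objects_missing_strict_flag(schema["items"], f"{path}[]"))
    return missing


# Hypothetical schema: the nested "address" object forgets the flag.
schema = {
    "type": "object",
    "additionalProperties": False,
    "properties": {
        "vendor": {"type": "string"},
        "address": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
        },
    },
    "required": ["vendor", "address"],
}

print(objects_missing_strict_flag(schema))  # ['$.address']
```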
The following examples illustrate extraction, classification, and more. Each includes a request schema for easy implementation.
Classify a research abstract into predefined topics.
Extract named entities like names, dates, and locations from text.
Classify text sentiment with confidence scores.
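For the sentiment case, an output schema might look like the sketch below, written as a Python dictionary so it can be checked locally. The property names `sentiment` and `confidence`, the label set, and the 0 to 1 confidence range are assumptions chosen for illustration; this page does not state whether the action enforces numeric bounds such as `minimum` and `maximum`, so the hand-rolled `conforms` check is included as a defensive step rather than as a description of what the action does.

```python
SENTIMENT_SCHEMA = {
    "type": "object",
    "additionalProperties": False,  # required on every object
    "properties": {
        "sentiment": {"type": "string", "enum": ["positive", "neutral", "negative"]},
        "confidence": {"type": "number", "minimum": 0, "maximum": 1},
    },
    "required": ["sentiment", "confidence"],
}


def conforms(result: dict, schema: dict = SENTIMENT_SCHEMA) -> bool:
    """Check a result against the schema above (not a general JSON Schema validator)."""
    props = schema["properties"]
    # Every key is required and no extra keys are allowed.
    if set(result) != set(schema["required"]):
        return False
    if result["sentiment"] not in props["sentiment"]["enum"]:
        return False
    confidence = result["confidence"]
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(confidence, bool) or not isinstance(confidence, (int, float)):
        return False
    return props["confidence"]["minimum"] <= confidence <= props["confidence"]["maximum"]


print(conforms({"sentiment": "positive", "confidence": 0.92}))
print(conforms({"sentiment": "mixed", "confidence": 0.5}))
```

The topic-classification and entity-extraction cases follow the same pattern: an `enum` constrains the label set, and an array of objects (each with additionalProperties set to false) can hold extracted names, dates, and locations.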
Image input is only supported via OpenAI Direct connections. Azure OpenAI connections do not support image input at this time. Supported image formats: PNG, JPEG/JPG, WEBP, and non-animated GIF.
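A workflow could screen uploads against the supported formats before calling the action. The sketch below is one possible check based only on the file extension; the helper name and the extension set are illustrative, and an extension check cannot tell whether a GIF is animated or whether the file contents match the extension.

```python
from pathlib import Path

# Extensions corresponding to the supported formats listed above.
SUPPORTED_IMAGE_EXTENSIONS = {".png", ".jpeg", ".jpg", ".webp", ".gif"}


def has_supported_image_extension(filename: str) -> bool:
    """Return True if the file name ends in a supported image extension (case-insensitive)."""
    return Path(filename).suffix.lower() in SUPPORTED_IMAGE_EXTENSIONS


print(has_supported_image_extension("receipt.PNG"))
print(has_supported_image_extension("receipt.tiff"))
```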