Agent orchestration application - Alibaba Cloud Model Studio

You can create an agent orchestration application to build a group of autonomous agents that collaborate and plan automatically. This topic describes how to create an agent orchestration application and use its advanced features.

Why use agent orchestration applications

Agent orchestration applications offer several advantages over single-agent applications:

Collaboration and flexibility: Multiple agents can collaborate and share information with each other to complete complex tasks more effectively than a single agent. This capability is particularly beneficial in scenarios that demand multi-party cooperation and parallel processing.
Extensibility and robustness: You can adjust the number and roles of agents in an agent orchestration application, enhancing its flexibility and robustness.
Task decomposition and parallel processing: Agent orchestration applications can break down complex tasks into subtasks, which are then processed concurrently by different agents, improving efficiency.
Automatic planning: Agent orchestration applications can autonomously plan task execution flows and schedule sub-agents based on task requirements.
Complete agent functionality: On top of all these capabilities, agent orchestration applications maintain all features of a single agent, including RAG and plug-ins.

Getting started

The following section uses a schedule management assistant as an example to get you started with agent orchestration applications.

Scenario description: Schedule management assistant

Demand

Users need an intelligent assistant to manage and organize their schedules. The objective is to design a system that processes schedule information and organizes it into a structured calendar.

Solution

The task is divided between two independent agents within one agent orchestration application, with each agent handling specific subtasks.

Roles and tasks

Role: Information collector
- Task: Collect schedule information from the users.
- Description: Receives user-input schedules, such as meetings, tasks, and appointments, and forwards this information to the data organizer.
- Sample input: "There is a team meeting at 10 AM tomorrow."
- Sample output: {"date": "tomorrow", "time": "10 AM", "event": "team meeting"}
Role: Data organizer
- Task: Organize the collected schedule information into a calendar.
- Description: Processes structured data from the information collector, sorting the data by date and time to produce a user-friendly calendar.
- Sample input: {"date": "tomorrow", "time": "10 AM", "event": "team meeting"}
- Sample output: Incorporates the information into the user's calendar: {"2024-08-19": [{"time": "10 AM", "event": "team meeting"}]}

Sample

User input:
"There is a team meeting at 10 AM tomorrow."
Information collector processing:
Converts the input into structured data: {"date": "tomorrow", "time": "10 AM", "event": "team meeting"}

Data organizer processing:

Adds the structured data to the calendar, resulting in:

{
  "user calendar": {
    "2024-08-19": [
      {
        "time": "10 AM",
        "event": "team meeting"
      }
    ]
  }
}
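The hand-off between the two roles can be sketched locally in Python. This is only an illustration of the data flow, not how the agent group runs internally: the helper names resolve_date and add_to_calendar are hypothetical, the reference date 2024-08-18 is assumed so that "tomorrow" resolves to the 2024-08-19 shown above, and times are assumed to look like "10 AM".

```python
from datetime import date, datetime, timedelta


def resolve_date(label: str, today: date) -> str:
    """Turn a relative label such as "tomorrow" into an ISO date string."""
    offsets = {"today": 0, "tomorrow": 1}
    key = label.strip().lower()
    if key in offsets:
        return (today + timedelta(days=offsets[key])).isoformat()
    # Assumption: anything else is already an ISO date (YYYY-MM-DD).
    return date.fromisoformat(key).isoformat()


def add_to_calendar(calendar: dict, item: dict, today: date) -> dict:
    """Insert one collector output into the organizer's calendar structure."""
    day = resolve_date(item["date"], today)
    entries = calendar.setdefault("user calendar", {}).setdefault(day, [])
    entries.append({"time": item["time"], "event": item["event"]})
    # Assumption: times use the "10 AM" style from the sample, so "%I %p" parses them.
    entries.sort(key=lambda entry: datetime.strptime(entry["time"], "%I %p"))
    return calendar


collected = {"date": "tomorrow", "time": "10 AM", "event": "team meeting"}
print(add_to_calendar({}, collected, today=date(2024, 8, 18)))
```

In the real application both steps are performed by the LLM-backed agents; a sketch like this is mainly useful for checking that the structure you ask for in the prompts is consistent from one agent to the next.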

Prompt design

Design separate prompts for the two agents. Sample prompts:

Information collector

You are an intelligent assistant responsible for collecting users' schedules. Users will input their schedule information in natural language. Your task is to parse these inputs and extract the date, time, and event. The output format should be a structured JSON object.
Sample input:
"There is a team meeting at 10 AM tomorrow."
Sample output:
{
  "date": "tomorrow",
  "time": "10 AM",
  "event": "team meeting"
}
When users input their schedules, please return the parsed data in the above format.

Data organizer

You are an intelligent assistant responsible for organizing users' schedules. You will receive structured data passed from another agent and add this data to the user's calendar. Please ensure the data is sorted by date and time and output the updated calendar.

Sample input:
{
  "date": "tomorrow",
  "time": "10 AM",
  "event": "team meeting"
}

Sample output:
{
  "user calendar": {
    "2024-08-19": [
      {
        "time": "10 AM",
        "event": "team meeting"
      }
    ]
  }
}

Please add the received data to the user's calendar and return the updated result.
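Because the Information collector prompt asks for a fixed JSON shape, it can help to check a reply against that shape while you tune the prompts. The following is a minimal sketch under that assumption; the function name parse_collector_output is hypothetical and is not part of Model Studio.

```python
import json

REQUIRED_KEYS = ("date", "time", "event")


def parse_collector_output(text: str) -> dict:
    """Parse the collector's reply and check it has the three expected fields."""
    # Models sometimes wrap JSON in extra prose, so keep only the outermost {...} span.
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in collector output")
    data = json.loads(text[start:end + 1])
    missing = [key for key in REQUIRED_KEYS if key not in data]
    if missing:
        raise ValueError(f"collector output is missing: {', '.join(missing)}")
    return {key: str(data[key]) for key in REQUIRED_KEYS}


reply = '{"date": "tomorrow", "time": "10 AM", "event": "team meeting"}'
print(parse_collector_output(reply))
```

If a reply fails this check, tightening the sample input and output in the prompt is usually the first thing to try.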

Create application

Go to My Applications in the Model Studio console and choose Create Application > Agent Orchestration Application > Create Agent Orchestration Application.

Drag an Agent Group node onto the canvas.
Remove the two default parameters, city and date, from the Start node.
This example needs no input parameters other than the user's query.


Specify a group name and select a model.
Group Name: Schedule Management
Select Model: Qwen-Plus

Configure the sub-agents within the agent group.
- Information collector
  Agent Name: Information Collector
  Description: Receives user-input schedules, such as meetings, tasks, and appointments, and forwards this information to the data organizer.
  Model Configuration: Qwen-Plus
  Prompt: See the Information collector sample in Prompt design.
- Data organizer
  Agent Name: Data Organizer
  Description: Processes structured data from the information collector, sorting the data by date and time to produce a user-friendly calendar.
  Model Configuration: Qwen-Plus
  Prompt: See the Data organizer sample in Prompt design.

Connect the nodes and configure the input and output parameters as shown in the following screenshots.

Test, publish, and API call

Click Test in the upper-right corner of the canvas to test the application.

Click Publish in the upper-right corner of the canvas to publish the application.
Choose Sharing Method > API Call to view the API sample of the application. You can use the sample to call the application you just published.
For more information about the sharing methods, see Application sharing.

Python

from http import HTTPStatus

import dashscope
from dashscope import Application

# Set the base URL before making any call.
dashscope.base_http_api_url = 'https://dashscope-intl.aliyuncs.com/api/v1'


def call_agent_app():
    response = Application.call(app_id='YOUR_APP_ID',
                                prompt='Weekly meeting at 10 a.m. tomorrow',
                                api_key='YOUR_API_KEY',
                                # If you define any parameters in the Start node, pass them here
                                # biz_params = {
                                #    "city": "Singapore",
                                #    "date": "Tomorrow"
                                # }
                                )

    if response.status_code != HTTPStatus.OK:
        print('request_id=%s, code=%s, message=%s\n' % (response.request_id, response.status_code, response.message))
    else:
        print('request_id=%s\n output=%s\n usage=%s\n' % (response.request_id, response.output, response.usage))


if __name__ == '__main__':
    call_agent_app()

Java

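import com.alibaba.dashscope.app.*;
import com.alibaba.dashscope.exception.ApiException;
import com.alibaba.dashscope.exception.InputRequiredException;
import com.alibaba.dashscope.exception.NoApiKeyException;
import com.alibaba.dashscope.utils.Constants;
// Needed only if you uncomment the bizParams lines below:
// import com.alibaba.dashscope.utils.JsonUtils;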


public class Main {
    static {
        Constants.baseHttpApiUrl = "https://dashscope-intl.aliyuncs.com/api/v1";
    }

    public static void callAgentApp()
            throws ApiException, NoApiKeyException, InputRequiredException {
        // If you define any parameters in the Start node, pass them here
        // String bizParams = "{\"city\":\"Singapore\",\"date\":\"Tomorrow\"}";
        ApplicationParam param = ApplicationParam.builder()
                .apiKey("YOUR_API_KEY")
                .appId("YOUR_APP_ID")
                .prompt("Weekly meeting at 10 a.m. tomorrow")
                // .bizParams(JsonUtils.parse(bizParams))
                .build();

        Application application = new Application();
        ApplicationResult result = application.call(param);

        System.out.printf("requestId: %s, text: %s, finishReason: %s\n",
                result.getRequestId(), result.getOutput().getText(), result.getOutput().getFinishReason());
    }

    public static void main(String[] args) {
        try {
            callAgentApp();
        } catch (ApiException | NoApiKeyException | InputRequiredException e) {
            System.out.printf("Exception: %s", e.getMessage());
        }
        System.exit(0);
    }  
}

curl

# If you define any parameters in the Start node, also pass them in the request body
# as a "biz_params" object, for example: "biz_params": {"city": "Singapore", "date": "Tomorrow"}
# Comments cannot appear inside the JSON body itself.
curl --location 'https://dashscope-intl.aliyuncs.com/api/v1/apps/YOUR_APP_ID/completion' \
--header 'Authorization: Bearer YOUR_API_KEY' \
--header 'Content-Type: application/json' \
--data '{
    "input": {
        "prompt": "Weekly meeting at 10 a.m. tomorrow"
    },
    "parameters": {},
    "debug": {}
}' --verbose

Advanced features

Node description

The following section describes the nodes, covering their capabilities, suitable scenarios, and how to configure them.

Start node

Accepts user input and allows for multiple variables.
Input: query is the default variable for user input.
Custom variables can be defined to capture user input.
Output: The input variables.
Other parameters: None.

End node

Returns the final result in text or JSON format.
Input: Content returned to the user, which can mix variables and text.
Output: The input variables.
Other parameters: None.

Agent Application

An already created agent application from My Applications.
Usage: Import existing agents.
Input: Inputs for the agent.
Output: Content generated by the agent.
Other parameters: Can only be defined in the source agent application.

Create Agent

Creates a new agent that is available only within the canvas.
Usage: Create an agent that is only used in the current application.
Input: Inputs for the agent.
Output: Content generated by the agent.
Other parameters:
- Agent Name: The name of the agent.
- Model Configuration: The LLM of the agent.
- Prompt: The role and task of the agent in natural language.
- Knowledge Retrieval Augmentation: The RAG function. You can configure a knowledge base for the agent.
- Plug-in: The agent can use official or custom plug-ins.

Agent Group

Creates a group of multiple agents that collaborate to complete tasks.
Suggested usage: Ideal for tasks requiring intelligent planning. If you need to complete a complex project without a predefined process, this node is recommended.
Input: Inputs for the agent group.
Output:
- agResult: The content generated by the agent group.
- agProcess: The inference process.
Other parameters:
- Group Name: The name of the agent group.
- Select Model: The LLM for the group.
- Agent: The parameters of each sub-agent are the same as those of the Create Agent node.

Decision Classification

Intelligently matches user input to subsequent actions, useful for classifying user intent or task scenarios.
Suggested usage: Ideal for tasks requiring smart decision-making. If you want to leverage an LLM for intelligent process determination, this node is recommended.
Input: Inputs for the decision-making model.
Output:
- thought: The inference process.
- subject: The matching category.
Other parameters:
- Category Configuration: Configure categories and enter descriptions. The model matches subsequent links based on the descriptions.
- Other Categories: If the input matches none of the configured categories, this link is used.
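The routing behavior of this node can be pictured as a lookup on the subject output with a fallback. The sketch below only mimics that idea in plain Python; the category names add_event and small_talk and the route function are hypothetical, and on the canvas the routing is done by connecting links rather than by writing code.

```python
def route(decision: dict, handlers: dict, fallback):
    """Pick the handler whose category matches the node's `subject` output."""
    handler = handlers.get(decision.get("subject"), fallback)
    return handler(decision)


# Hypothetical categories, standing in for entries under Category Configuration.
handlers = {
    "add_event": lambda decision: "forward to the Schedule Management agent group",
    "small_talk": lambda decision: "forward to a chat agent",
}


def other_categories(decision: dict) -> str:
    """Stand-in for the Other Categories link."""
    return "forward to the Other Categories link"


decision = {"thought": "The user mentions a meeting time.", "subject": "add_event"}
print(route(decision, handlers, other_categories))
```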

Text Conversion

Uses a template to convert or process text content.
Suggested usage: Ideal for straightforward orchestration of content generated by agent nodes.
Input: Text requiring format conversion. Supports variable insertion and mixed typesetting.
Output: Formatted output content.
Other parameters: None.
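Template-based conversion with variable insertion works much like Python's string.Template. The sketch below is an analogy only: the node's actual placeholder syntax is not described here, and the variable names calendar and notes are assumptions standing in for outputs such as agResult and agProcess.

```python
from string import Template

# Hypothetical template; $calendar and $notes stand in for variables inserted in the node.
REPLY_TEMPLATE = Template(
    "Your calendar has been updated:\n$calendar\n\nHow the agents got there:\n$notes"
)


def render_reply(ag_result: str, ag_process: str) -> str:
    """Fill the template with the agent group's result and inference process."""
    return REPLY_TEMPLATE.substitute(calendar=ag_result, notes=ag_process)


print(render_reply('{"2024-08-19": [{"time": "10 AM", "event": "team meeting"}]}',
                   "collector, then organizer"))
```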

Script Conversion

Uses scripts to convert or process text content.
Suggested usage: Ideal for orchestrating content generated by agent nodes in JSON format.
Input: Text requiring format conversion.
Output: JSON content that conforms to the configured JSON Schema.
Other parameters:
- Code: Conversion code in Python or JavaScript.
- JSON Schema Generator: Generates a JSON Schema based on the target JSON structure.
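As an illustration of the kind of conversion code this node might hold, the sketch below flattens the data organizer's calendar into a list of events and pairs it with a matching JSON Schema. The entry-point name convert, the way the node passes text into the script, and the schema itself are all assumptions; check the code template shown in the node before adapting it.

```python
import json

# Hypothetical target schema, of the kind the JSON Schema Generator might produce.
EVENT_LIST_SCHEMA = {
    "type": "object",
    "properties": {
        "count": {"type": "integer"},
        "events": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "date": {"type": "string"},
                    "time": {"type": "string"},
                    "event": {"type": "string"},
                },
                "required": ["date", "time", "event"],
            },
        },
    },
    "required": ["count", "events"],
}


def convert(agent_text: str) -> dict:
    """Flatten the organizer's calendar JSON into a list of events."""
    calendar = json.loads(agent_text).get("user calendar", {})
    events = [
        {"date": day, "time": entry["time"], "event": entry["event"]}
        for day in sorted(calendar)
        for entry in calendar[day]
    ]
    return {"count": len(events), "events": events}


sample = '{"user calendar": {"2024-08-19": [{"time": "10 AM", "event": "team meeting"}]}}'
print(json.dumps(convert(sample), indent=2))
```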

Conditional Judgement

Performs conditional checks on parameters within the node, and then generates output responses through Text Conversion nodes in different branches.
Suggested usage: Ideal for scenarios where output is generated after conditional checks.
Input: Parameters that require conditional checks.
Output: Responses generated through Text Conversion nodes in different branches.
Other parameters: None.
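Unlike Decision Classification, which relies on an LLM, this node applies explicit rules. A rough Python equivalent of "check the parameters, then emit a branch-specific text" could look like the following; the conditions, the reply wording, and the function name choose_reply are all hypothetical.

```python
def choose_reply(params: dict) -> str:
    """Return the text of the branch whose condition the parameters satisfy."""
    if not params.get("event"):
        # Branch 1: the event name is missing.
        return "What is the event called?"
    if not params.get("date") or not params.get("time"):
        # Branch 2: the event is known but its date or time is missing.
        return f"When does '{params['event']}' take place?"
    # Default branch: all fields are present.
    return f"Added '{params['event']}' on {params['date']} at {params['time']}."


print(choose_reply({"date": "tomorrow", "time": "10 AM", "event": "team meeting"}))
```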

What to do next

For more information about agent applications, see Agent application.

For more information about workflow applications, see Workflow application.