
Alibaba Cloud Model Studio: Text generation

Last Updated: Dec 16, 2024

Text generation is an AI technology that uses deep learning algorithms to create logical and coherent text content based on given prompts.

The prompt for text generation can be simple keywords, a one-sentence summary, or more complex instructions and contextual information. Text generation large language models (LLMs) learn language patterns by analyzing large amounts of existing data and can be used in the following scenarios:

  • Content creation: Generate news reports, product descriptions, short video scripts, and more.

  • Customer service: Work as chatbots to provide 24-hour customer support and answer frequently asked questions.

  • Text translation: Quickly and accurately translate texts from one language to another.

  • Summary generation: Generate summaries for long articles, reports, and customer emails.

  • Legal document drafting: Generate contract templates and the basic framework of legal opinions.

Example of text generation: extracting key information from a customer email

Prompt

Please extract the following information from this customer feedback email: 1. Customer information 2. Software version 3. Error code 4. Problem description 5. Operating system 6. Other relevant details
---------------
Email content:
Subject: [Help Request] Smart Assistant Startup Issue
Dear Customer Service and Technical Support Team,
I am Tom from xxx Technology, and my user ID is 12345. I am using the Smart Assistant software version V3.1.8, and I am reporting a technical issue that severely affects work progress. I hope you can assist in resolving it as soon as possible.
Since yesterday afternoon, I have encountered a severe obstacle when starting the Smart Assistant. The software startup process is stuck on the initialization screen, with an error code of ERR-2007, and an error message "Database connection failed," causing all functional modules to be unusable. This situation has persisted until now, severely affecting our office efficiency.
I am currently using a server operating system of Windows 10 version 1909, optimized for 64-bit architecture. After encountering the issue, I have taken several preliminary troubleshooting measures, including completely shutting down and restarting the system to clear any potential temporary software conflicts or system hang states, as well as thoroughly uninstalling and reinstalling the relevant software to eliminate possible software corruption or configuration errors. However, despite these routine solutions, the issue persists without any improvement.
To facilitate a more in-depth diagnosis of the issue, I have provided detailed error screenshots and system log files as attachments. These materials should accurately demonstrate the software state, exception details, and any related error codes at the time of the error, providing key information for quickly locating and resolving the current technical obstacle.
Looking forward to your response!

Output

1. Customer information: Tom from xxx Technology (ID12345)
2. Software version: Smart Assistant V3.1.8
3. Error code: ERR-2007
4. Problem description: The Smart Assistant is stuck on the initialization screen during startup, with an error message "Database connection failed," causing all functional modules to be unusable. The issue has persisted since yesterday afternoon.
5. Operating system: Win10 1909 64-bit
6. Other relevant details: Attempted solutions include restarting the system and reinstalling the software, but the issue remains unresolved. Attachments include error screenshots and log files.

Text generation models

Alibaba Cloud Model Studio supports Qwen commercial models and open source models. For a detailed model list, see the text generation category of Model overview.

Commercial Qwen models

| Model | Context (tokens) | Maximum output (tokens) | Input price (per 1,000 tokens) | Output price (per 1,000 tokens) |
| --- | --- | --- | --- | --- |
| Qwen-Max | 8,000 | 2,000 | $0.0100 | $0.0300 |
| Qwen-Plus | 131,072 | 8,192 | $0.0030 | $0.0090 |
| Qwen-Turbo | 8,000 | 2,000 | $0.0004 | $0.0012 |
| Qwen-VL | 8,000 - 32,000 (including images) | 2,000 | Time-limited free trial | Time-limited free trial |

  • Qwen-Max provides the best inference performance among Qwen models, especially for complex tasks.

  • Qwen-Plus provides a balanced combination of performance, speed, and cost.

  • Qwen-Turbo provides fast speed and low cost, suitable for simple tasks.

  • Qwen-VL is a Large Vision Language Model that supports images with over a million pixels and any aspect ratio as input.
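Unit prices can be combined with the `usage` field that the API returns to estimate the cost of a call. The helper below is a minimal sketch using the Qwen-Plus list prices from the table above; the constant names are made up for illustration.

```python
# Estimate the cost of a single call from token usage.
# Prices are per 1,000 tokens; the values below are the
# Qwen-Plus list prices from the table above.
INPUT_PRICE_PER_1K = 0.0030   # USD per 1,000 input tokens
OUTPUT_PRICE_PER_1K = 0.0090  # USD per 1,000 output tokens

def estimate_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Return the estimated USD cost of one request."""
    return (prompt_tokens / 1000) * INPUT_PRICE_PER_1K + \
           (completion_tokens / 1000) * OUTPUT_PRICE_PER_1K

# Usage figures taken from a sample response later in this topic:
# 22 prompt tokens and 17 completion tokens.
print(f"${estimate_cost(22, 17):.6f}")  # → $0.000219
```

In a real application, read `prompt_tokens` and `completion_tokens` from the `usage` object of each response instead of hard-coding them.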

Open-source Qwen models

| Model | Context (tokens) | Maximum output (tokens) | Parameter scale |
| --- | --- | --- | --- |
| Qwen2 | 65,536 - 131,072 | 6,144 | 7B - 72B |
| Qwen1.5 | 8,000 | 2,000 | 7B - 110B |

How to choose

  • If you are not sure which model is most suitable, we recommend that you try Qwen-Max, the most powerful LLM from Alibaba Cloud, which is well suited to complex business scenarios.

  • For simple task scenarios, you can try Qwen-Turbo, which is more cost-effective and responds faster.

  • Qwen-Plus is relatively balanced in terms of performance, speed, and cost, between Max and Turbo.

All three models are compatible with OpenAI interfaces. For more information, see Call Qwen through OpenAI interfaces.

  • You can also evaluate the models on your specific tasks before making a decision.

    The Playground lets you quickly and intuitively compare model performance: select multiple text generation models and compare their capabilities side by side with the same input.

How to use

Alibaba Cloud Model Studio supports the following connection methods: OpenAI SDK, DashScope SDK, and HTTP.

For a complete parameter list of the OpenAI SDK, see OpenAI. For a complete parameter list of the DashScope SDK, see DashScope.

Message types

When you interact with an LLM through the API, the input and output are called messages. Each message has one of the following roles: system, user, or assistant.

  • System message (also known as system prompt): Tells the model the role to play or the behavior to adopt. The default value is "You are a helpful assistant." You can also place such instructions in the user message, but placing them in the system message is more effective.

  • User message: The text you input to the model.

  • Assistant message: The response from the model. You can also preset an assistant message as an example that guides the model's subsequent responses.
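The three roles can appear together in one request. The snippet below is a minimal sketch of a messages array that presets a user/assistant pair as a one-shot formatting example, so the model is nudged to answer the final user message in the same style. The question content is chosen only for illustration.

```python
# A messages array that uses all three roles. The preset
# user/assistant pair acts as a one-shot formatting example.
messages = [
    {"role": "system", "content": "You are a helpful assistant. Answer with a single word."},
    # Example exchange, including a preset assistant message:
    {"role": "user", "content": "What is the capital of France?"},
    {"role": "assistant", "content": "Paris"},
    # The actual question the model should answer:
    {"role": "user", "content": "What is the capital of Japan?"},
]
```

Pass this array as the `messages` parameter of any of the calls shown in the following sections.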

Single-round conversation

OpenAI compatible

You can use the OpenAI SDK or OpenAI-compatible HTTP method to call Qwen models and experience the single-round conversation feature.

For a complete parameter list, see OpenAI.

Python

Sample code

import os
from openai import OpenAI

try:
    client = OpenAI(
        # If the environment variable is not configured, replace the following line with: api_key="sk-xxx",
        api_key=os.getenv("DASHSCOPE_API_KEY"),
        base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",
    )

    completion = client.chat.completions.create(
        model="qwen-plus",
        messages=[
            {'role': 'system', 'content': 'You are a helpful assistant.'},
            {'role': 'user', 'content': 'Who are you?'}
            ]
    )
    print(completion.choices[0].message.content)
except Exception as e:
    print(f"Error message: {e}")
    print("For more information, see: https://www.alibabacloud.com/help/en/model-studio/developer-reference/error-code")

Sample response

I am a large language model developed by Alibaba Cloud, called Qwen.

cURL

Sample code

curl -X POST https://dashscope-intl.aliyuncs.com/compatible-mode/v1/chat/completions \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "qwen-plus", 
    "messages": [
        {
            "role": "system",
            "content": "You are a helpful assistant."
        },
        {
            "role": "user", 
            "content": "Who are you?"
        }
    ]
}'

Sample response

{
    "choices": [
        {
            "message": {
                "role": "assistant",
                "content": "I am a large language model developed by Alibaba Cloud, called Qwen."
            },
            "finish_reason": "stop",
            "index": 0,
            "logprobs": null
        }
    ],
    "object": "chat.completion",
    "usage": {
        "prompt_tokens": 22,
        "completion_tokens": 17,
        "total_tokens": 39
    },
    "created": 1726127645,
    "system_fingerprint": null,
    "model": "qwen-plus",
    "id": "chatcmpl-81951b98-28b8-9659-ab07-cd30d25600e7"
}

Node.js

Sample code

import OpenAI from "openai";

const openai = new OpenAI(
    {
        // If the environment variable is not configured, replace the following line with: apiKey: "sk-xxx",
        apiKey: process.env.DASHSCOPE_API_KEY,
        baseURL: "https://dashscope-intl.aliyuncs.com/compatible-mode/v1"
    }
);
const completion = await openai.chat.completions.create({
    model: "qwen-plus",  //Model list: https://www.alibabacloud.com/help/en/model-studio/getting-started/models
    messages: [
        { role: "system", content: "You are a helpful assistant." },
        { role: "user", content: "Who are you?" }
    ],
});
console.log(JSON.stringify(completion))

Sample response

{
    "choices": [
        {
            "message": {
                "role": "assistant",
                "content": "I am a large language model developed by Alibaba Cloud, called Qwen."
            },
            "finish_reason": "stop",
            "index": 0,
            "logprobs": null
        }
    ],
    "object": "chat.completion",
    "usage": {
        "prompt_tokens": 22,
        "completion_tokens": 17,
        "total_tokens": 39
    },
    "created": 1728455191,
    "system_fingerprint": null,
    "model": "qwen-plus",
    "id": "chatcmpl-3a8c00cc-9c9f-9aba-b2d9-dc431e27d1b5"
}

DashScope

You can use the DashScope SDK or HTTP method to call Qwen models and experience the single-round conversation feature.

For a complete parameter list, see DashScope.

Python

Sample code

import os
from dashscope import Generation
import dashscope
dashscope.base_http_api_url = 'https://dashscope-intl.aliyuncs.com/api/v1'

messages = [
    {'role': 'system', 'content': 'You are a helpful assistant.'},
    {'role': 'user', 'content': 'Who are you?'}
    ]
response = Generation.call(
    # If the environment variable is not configured, replace the following line with: api_key = "sk-xxx",
    api_key=os.getenv("DASHSCOPE_API_KEY"), 
    model="qwen-plus",
    messages=messages,
    result_format="message"
)

if response.status_code == 200:
    print(response.output.choices[0].message.content)
else:
    print(f"HTTP return code: {response.status_code}")
    print(f"Error code: {response.code}")
    print(f"Error message: {response.message}")
    print("For more information, see: https://www.alibabacloud.com/help/en/model-studio/developer-reference/error-code")

Sample response

I am Qwen, an AI assistant developed by Alibaba Cloud. I am designed to answer various questions, provide information, and engage in conversations with users. How can I assist you?

Java

Sample code

import java.util.Arrays;
import java.lang.System;
import com.alibaba.dashscope.aigc.generation.Generation;
import com.alibaba.dashscope.aigc.generation.GenerationParam;
import com.alibaba.dashscope.aigc.generation.GenerationResult;
import com.alibaba.dashscope.common.Message;
import com.alibaba.dashscope.common.Role;
import com.alibaba.dashscope.exception.ApiException;
import com.alibaba.dashscope.exception.InputRequiredException;
import com.alibaba.dashscope.exception.NoApiKeyException;
import com.alibaba.dashscope.protocol.Protocol;

public class Main {
    public static GenerationResult callWithMessage() throws ApiException, NoApiKeyException, InputRequiredException {
        Generation gen = new Generation(Protocol.HTTP.getValue(), "https://dashscope-intl.aliyuncs.com/api/v1");
        Message systemMsg = Message.builder()
                .role(Role.SYSTEM.getValue())
                .content("You are a helpful assistant.")
                .build();
        Message userMsg = Message.builder()
                .role(Role.USER.getValue())
                .content("Who are you?")
                .build();
        GenerationParam param = GenerationParam.builder()
                // If the environment variable is not configured, replace the following line with: .apiKey("sk-xxx")
                .apiKey(System.getenv("DASHSCOPE_API_KEY"))
                .model("qwen-plus")
                .messages(Arrays.asList(systemMsg, userMsg))
                .resultFormat(GenerationParam.ResultFormat.MESSAGE)
                .build();
        return gen.call(param);
    }
    public static void main(String[] args) {
        try {
            GenerationResult result = callWithMessage();
            System.out.println(result.getOutput().getChoices().get(0).getMessage().getContent());
        } catch (ApiException | NoApiKeyException | InputRequiredException e) {
            System.err.println("Error message: "+e.getMessage());
            System.out.println("For more information, see: https://www.alibabacloud.com/help/en/model-studio/developer-reference/error-code");
        }
        System.exit(0);
    }
}

Sample response

I am a large language model developed by Alibaba Cloud, called Qwen.

cURL

Sample code

curl -X POST https://dashscope-intl.aliyuncs.com/api/v1/services/aigc/text-generation/generation \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "qwen-plus",
    "input":{
        "messages":[      
            {
                "role": "system",
                "content": "You are a helpful assistant."
            },
            {
                "role": "user",
                "content": "Who are you?"
            }
        ]
    },
    "parameters": {
        "result_format": "message"
    }
}'

Sample response

{
    "output": {
        "choices": [
            {
                "finish_reason": "stop",
                "message": {
                    "role": "assistant",
                    "content": "I am a large language model developed by Alibaba Cloud, called Qwen."
                }
            }
        ]
    },
    "usage": {
        "total_tokens": 38,
        "output_tokens": 16,
        "input_tokens": 22
    },
    "request_id": "09dceb20-ae2e-999b-85f9-c5ab266198c0"
}

Multi-round conversation

In multi-round conversations, the LLM can reference the conversation history, making the interaction closer to everyday communication. To implement multi-round conversation, maintain an array that stores the conversation history. For every new round, append the latest messages to the array and send the whole array to the model again so that it can reference the history. In the following samples, the conversation history is added to the messages array.
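The history bookkeeping itself is independent of any SDK: after every round, append the user message and then the model's reply, and resend the full array. A minimal sketch of that update rule, with a stubbed model call standing in for the real SDK, looks like this:

```python
def chat_round(messages, user_input, call_model):
    """Append the user turn, call the model with the full history,
    append the reply, and return it. `call_model` stands in for a
    real SDK call such as client.chat.completions.create."""
    messages.append({"role": "user", "content": user_input})
    reply = call_model(messages)
    messages.append({"role": "assistant", "content": reply})
    return reply

history = [{"role": "system", "content": "You are a helpful assistant."}]
# Stubbed model for illustration only; replace with a real API call.
echo = lambda msgs: f"You said: {msgs[-1]['content']}"
chat_round(history, "Hello", echo)
chat_round(history, "What skills do you have?", echo)
print(len(history))  # → 5  (1 system + 2 user + 2 assistant messages)
```

Because the entire `history` array is sent on every round, its token count grows with the conversation and counts toward both the context limit and input billing.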

OpenAI compatible

You can use the OpenAI SDK or OpenAI-compatible HTTP method to call Qwen models and experience the multi-round conversation feature.

For a complete parameter list, see OpenAI.

Python

import os
from openai import OpenAI


def get_response(messages):
    client = OpenAI(
        # If the environment variable is not configured, replace the following line with: api_key="sk-xxx",
        api_key=os.getenv("DASHSCOPE_API_KEY"),
        base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",
    )
    completion = client.chat.completions.create(model="qwen-plus", messages=messages)
    return completion


messages = [
    {
        "role": "system",
        "content": """You are a clerk at the Bailian Mobile Store, responsible for recommending phones to users. Phones have two parameters: screen size (including 6.1 inches, 6.5 inches, 6.7 inches), resolution (including 2K, 4K).
        You can only ask the user one parameter at a time. If the user does not provide complete information, you need to ask them to provide the missing parameter. Once the parameters are collected, you should say: I have understood your purchase intention, please wait a moment.""",
    }
]
assistant_output = "Welcome to the Bailian Mobile Store, what size phone do you need to buy?"
print(f"Model output: {assistant_output}\n")
while "I have understood your purchase intention" not in assistant_output:
    user_input = input("Please enter: ")
    # Add user question information to the messages list
    messages.append({"role": "user", "content": user_input})
    assistant_output = get_response(messages).choices[0].message.content
    # Add the large model's reply information to the messages list
    messages.append({"role": "assistant", "content": assistant_output})
    print(f"Model output: {assistant_output}")
    print("\n")

cURL

curl -X POST https://dashscope-intl.aliyuncs.com/compatible-mode/v1/chat/completions \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "qwen-plus",
    "messages":[      
        {
            "role": "system",
            "content": "You are a helpful assistant."
        },
        {
            "role": "user",
            "content": "Hello"
        },
        {
            "role": "assistant",
            "content": "Hello, I am Qwen."
        },
        {
            "role": "user",
            "content": "What skills do you have?"
        }
    ]
}'

Node.js

import OpenAI from "openai";
import { createInterface } from 'readline/promises';

// Define constants
const BASE_URL = "https://dashscope-intl.aliyuncs.com/compatible-mode/v1";
const openai = new OpenAI({
    apiKey: process.env.DASHSCOPE_API_KEY,
    baseURL: BASE_URL
});

async function getResponse(messages) {
    try {
        const completion = await openai.chat.completions.create({
            model: "qwen-plus",
            messages: messages,
        });
        return completion.choices[0].message.content;
    } catch (error) {
        console.error("Error fetching response:", error);
        throw error;  // Re-throw the exception for upper-level handling
    }
}

// Initialize messages
const messages = [
    {
        "role": "system",
        "content": `You are a clerk at the Bailian Mobile Store, responsible for recommending phones to users. Phones have two parameters: screen size (including 6.1 inches, 6.5 inches, 6.7 inches), resolution (including 2K, 4K).
        You can only ask the user one parameter at a time. If the user does not provide complete information, you need to ask them to provide the missing parameter. Once the parameters are collected, you should say: I have understood your purchase intention, please wait a moment.`,
    }
];

let assistant_output = "Welcome to the Bailian Mobile Store, what size phone do you need to buy?";
console.log(assistant_output);


const readline = createInterface({
    input: process.stdin,
    output: process.stdout
});

(async () => {
    while (!assistant_output.includes("I have understood your purchase intention")) {
        const user_input = await readline.question("Please enter: ");
        messages.push({ role: "user", content: user_input});
        try {
            const response = await getResponse(messages);
            assistant_output = response;
            messages.push({ role: "assistant", content: assistant_output });
            console.log(assistant_output);
            console.log("\n");
        } catch (error) {
            console.error("Error fetching response:", error);
        }
    }
    readline.close();
})();

DashScope

You can use the DashScope SDK or HTTP method to call Qwen models and experience the multi-round conversation feature.

For a complete parameter list, see DashScope.

Python

import os
from dashscope import Generation
import dashscope
dashscope.base_http_api_url = 'https://dashscope-intl.aliyuncs.com/api/v1'

def get_response(messages):
    response = Generation.call(
        # If the environment variable is not configured, replace the following line with: api_key="sk-xxx",
        api_key=os.getenv("DASHSCOPE_API_KEY"),
        model="qwen-plus",
        messages=messages,
        result_format="message",
    )
    return response


messages = [
    {
        "role": "system",
        "content": """You are a clerk at the Bailian Mobile Store, responsible for recommending phones to users. Phones have two parameters: screen size (including 6.1 inches, 6.5 inches, 6.7 inches), resolution (including 2K, 4K).
        You can only ask the user one parameter at a time. If the user does not provide complete information, you need to ask them to provide the missing parameter. Once the parameters are collected, you should say: I have understood your purchase intention, please wait a moment.""",
    }
]

assistant_output = "Welcome to the Bailian Mobile Store, what size phone do you need to buy?"
print(f"Model output: {assistant_output}\n")
while "I have understood your purchase intention" not in assistant_output:
    user_input = input("Please enter: ")
    # Add user question information to the messages list
    messages.append({"role": "user", "content": user_input})
    assistant_output = get_response(messages).output.choices[0].message.content
    # Add the large model's reply information to the messages list
    messages.append({"role": "assistant", "content": assistant_output})
    print(f"Model output: {assistant_output}")
    print("\n")

Java

import java.util.ArrayList;
import java.util.List;
import com.alibaba.dashscope.aigc.generation.Generation;
import com.alibaba.dashscope.aigc.generation.GenerationParam;
import com.alibaba.dashscope.aigc.generation.GenerationResult;
import com.alibaba.dashscope.common.Message;
import com.alibaba.dashscope.common.Role;
import com.alibaba.dashscope.exception.ApiException;
import com.alibaba.dashscope.exception.InputRequiredException;
import com.alibaba.dashscope.exception.NoApiKeyException;
import java.util.Scanner;
import com.alibaba.dashscope.protocol.Protocol;

public class Main {
    public static GenerationParam createGenerationParam(List<Message> messages) {
        return GenerationParam.builder()
                // If the environment variable is not configured, replace the following line with: .apiKey("sk-xxx")
                .apiKey(System.getenv("DASHSCOPE_API_KEY"))
                .model("qwen-plus")
                .messages(messages)
                .resultFormat(GenerationParam.ResultFormat.MESSAGE)
                .build();
    }
    public static GenerationResult callGenerationWithMessages(GenerationParam param) throws ApiException, NoApiKeyException, InputRequiredException {
        Generation gen = new Generation(Protocol.HTTP.getValue(), "https://dashscope-intl.aliyuncs.com/api/v1");
        return gen.call(param);
    }
    public static void main(String[] args) {
        try {
            List<Message> messages = new ArrayList<>();
            messages.add(createMessage(Role.SYSTEM, "You are a helpful assistant."));
            for (int i = 0; i < 3;i++) {
                Scanner scanner = new Scanner(System.in);
                System.out.print("Please enter: ");
                String userInput = scanner.nextLine();
                if ("exit".equalsIgnoreCase(userInput)) {
                    break;
                }
                messages.add(createMessage(Role.USER, userInput));
                GenerationParam param = createGenerationParam(messages);
                GenerationResult result = callGenerationWithMessages(param);
                System.out.println("Model output: "+result.getOutput().getChoices().get(0).getMessage().getContent());
                messages.add(result.getOutput().getChoices().get(0).getMessage());
            }
        } catch (ApiException | NoApiKeyException | InputRequiredException e) {
            e.printStackTrace();
        }
        System.exit(0);
    }
    private static Message createMessage(Role role, String content) {
        return Message.builder().role(role.getValue()).content(content).build();
    }
}

cURL

curl -X POST https://dashscope-intl.aliyuncs.com/api/v1/services/aigc/text-generation/generation \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "qwen-plus",
    "input":{
        "messages":[      
            {
                "role": "system",
                "content": "You are a helpful assistant."
            },
            {
                "role": "user",
                "content": "Hello"
            },
            {
                "role": "assistant",
                "content": "Hello, I am Qwen."
            },
            {
                "role": "user",
                "content": "What skills do you have?"
            }
        ]
    }
}'

Streaming output

In streaming output mode, the LLM generates and returns intermediate results in real time instead of a single final response. This reduces the time you wait before seeing output.

OpenAI compatible

You can use the OpenAI SDK or OpenAI-compatible HTTP method to call Qwen models and experience the streaming output feature.

For a complete parameter list, see OpenAI.

Python

import os
from openai import OpenAI

client = OpenAI(
    # If the environment variable is not configured, replace the following line with: api_key="sk-xxx",
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",
)
completion = client.chat.completions.create(
    model="qwen-plus",
    messages=[
        {'role': 'system', 'content': 'You are a helpful assistant.'},
        {'role': 'user', 'content': 'Who are you?'}
        ],
    stream=True
    )
full_content = ""
print("Streaming output content is:")
for chunk in completion:
    # Skip chunks without content, such as a trailing usage-only chunk
    if chunk.choices and chunk.choices[0].delta.content:
        full_content += chunk.choices[0].delta.content
        print(chunk.choices[0].delta.content)
print(f"Full content is: {full_content}")

Sample response

Streaming output content is:

I am a
large
language model
from Alibaba Cloud
. I am
called
Qwen

Full content is: 
I am a large language model from Alibaba Cloud. I am called Qwen.

cURL

curl -X POST https://dashscope-intl.aliyuncs.com/compatible-mode/v1/chat/completions \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "qwen-plus",
    "messages": [
        {
            "role": "system",
            "content": "You are a helpful assistant."
        },
        {
            "role": "user", 
            "content": "Who are you?"
        }
    ],
    "stream":true,
    "stream_options":{
        "include_usage":true
    }
}'

Sample response

data: {"choices":[{"delta":{"content":"","role":"assistant"},"index":0,"logprobs":null,"finish_reason":null}],"object":"chat.completion.chunk","usage":null,"created":1726132850,"system_fingerprint":null,"model":"qwen-max","id":"chatcmpl-428b414f-fdd4-94c6-b179-8f576ad653a8"}

data: {"choices":[{"finish_reason":null,"delta":{"content":"I am"},"index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1726132850,"system_fingerprint":null,"model":"qwen-max","id":"chatcmpl-428b414f-fdd4-94c6-b179-8f576ad653a8"}

data: {"choices":[{"delta":{"content":"a"},"finish_reason":null,"index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1726132850,"system_fingerprint":null,"model":"qwen-max","id":"chatcmpl-428b414f-fdd4-94c6-b179-8f576ad653a8"}

data: {"choices":[{"delta":{"content":"large language"},"finish_reason":null,"index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1726132850,"system_fingerprint":null,"model":"qwen-max","id":"chatcmpl-428b414f-fdd4-94c6-b179-8f576ad653a8"}

data: {"choices":[{"delta":{"content":"model from"},"finish_reason":null,"index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1726132850,"system_fingerprint":null,"model":"qwen-max","id":"chatcmpl-428b414f-fdd4-94c6-b179-8f576ad653a8"}

data: {"choices":[{"delta":{"content":"Alibaba Cloud"},"finish_reason":null,"index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1726132850,"system_fingerprint":null,"model":"qwen-max","id":"chatcmpl-428b414f-fdd4-94c6-b179-8f576ad653a8"}

data: {"choices":[{"delta":{"content":", called Qwen."},"finish_reason":null,"index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1726132850,"system_fingerprint":null,"model":"qwen-max","id":"chatcmpl-428b414f-fdd4-94c6-b179-8f576ad653a8"}

data: {"choices":[{"finish_reason":"stop","delta":{"content":""},"index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1726132850,"system_fingerprint":null,"model":"qwen-max","id":"chatcmpl-428b414f-fdd4-94c6-b179-8f576ad653a8"}

data: {"choices":[],"object":"chat.completion.chunk","usage":{"prompt_tokens":22,"completion_tokens":17,"total_tokens":39},"created":1726132850,"system_fingerprint":null,"model":"qwen-max","id":"chatcmpl-428b414f-fdd4-94c6-b179-8f576ad653a8"}

data: [DONE]
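When you call the streaming HTTP endpoint directly (without an SDK), the response arrives as Server-Sent Events: each `data:` line carries one JSON chunk, and the stream ends with `data: [DONE]`. The helper below is a hypothetical sketch of extracting the text delta from one such line; the function name is made up for illustration.

```python
import json

def extract_delta(sse_line: str):
    """Return the content delta from one SSE 'data:' line, or None for
    the [DONE] marker and chunks without content (for example, the
    final usage-only chunk)."""
    payload = sse_line.removeprefix("data: ").strip()
    if payload == "[DONE]":
        return None
    chunk = json.loads(payload)
    if not chunk["choices"]:
        return None
    return chunk["choices"][0]["delta"].get("content")

# A line shaped like the sample response above:
line = 'data: {"choices":[{"delta":{"content":"I am"},"finish_reason":null,"index":0}],"object":"chat.completion.chunk"}'
print(extract_delta(line))  # → I am
```

Concatenating the non-None deltas in order reconstructs the full response text.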

Node.js

Sample code

import OpenAI from "openai";

const openai = new OpenAI(
    {
        // If the environment variable is not configured, replace the following line with: apiKey: "sk-xxx",
        apiKey: process.env.DASHSCOPE_API_KEY,
        baseURL: "https://dashscope-intl.aliyuncs.com/compatible-mode/v1"
    }
);

const completion = await openai.chat.completions.create({
    model: "qwen-plus",
    messages: [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Who are you?"}
    ],
    stream: true,
});

let fullContent = "";
console.log("Streaming output content is:")
for await (const chunk of completion) {
    // Skip chunks without content, such as a trailing usage-only chunk
    if (chunk.choices?.length && chunk.choices[0].delta.content) {
        fullContent = fullContent + chunk.choices[0].delta.content;
        console.log(chunk.choices[0].delta.content);
    }
}
console.log("\nFull content is:")
console.log(fullContent);

Sample response

Streaming output content is:

I am a
large
language model
from Alibaba Cloud
. I am
called
Qwen

Full content is: 
I am a large language model from Alibaba Cloud. I am called Qwen.

DashScope

You can use the DashScope SDK or HTTP method to call Qwen models and experience the streaming output feature.

For a complete parameter list, see DashScope.

Python

import os
from dashscope import Generation
import dashscope
dashscope.base_http_api_url = 'https://dashscope-intl.aliyuncs.com/api/v1'

messages = [
    {'role': 'system', 'content': 'You are a helpful assistant.'},
    {'role': 'user', 'content': 'Who are you?'}]
responses = Generation.call(
    # If the environment variable is not configured, replace the following line with: api_key="sk-xxx",
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    model="qwen-plus",
    messages=messages,
    result_format='message',
    stream=True,
    # Incremental stream output
    incremental_output=True
    )
full_content = ""
print("Streaming output content is:")
for response in responses:
    full_content += response.output.choices[0].message.content
    print(response.output.choices[0].message.content)
print(f"Full content is: {full_content}")

Sample response

Streaming output content is:

I am a
large
language model
from Alibaba Cloud
. I am
called
Qwen

Full content is: 
I am a large language model from Alibaba Cloud. I am called Qwen.

Java

import java.util.Arrays;
import java.lang.System;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import com.alibaba.dashscope.aigc.generation.Generation;
import com.alibaba.dashscope.aigc.generation.GenerationParam;
import com.alibaba.dashscope.aigc.generation.GenerationResult;
import com.alibaba.dashscope.common.Message;
import com.alibaba.dashscope.common.Role;
import com.alibaba.dashscope.exception.ApiException;
import com.alibaba.dashscope.exception.InputRequiredException;
import com.alibaba.dashscope.exception.NoApiKeyException;
import io.reactivex.Flowable;
import com.alibaba.dashscope.protocol.Protocol;

public class Main {
    private static final Logger logger = LoggerFactory.getLogger(Main.class);
    private static StringBuilder fullContent = new StringBuilder();
    private static void handleGenerationResult(GenerationResult message) {
        String content = message.getOutput().getChoices().get(0).getMessage().getContent();
        fullContent.append(content);
        System.out.println(content);
    }
    public static void streamCallWithMessage(Generation gen, Message userMsg)
            throws NoApiKeyException, ApiException, InputRequiredException {
        GenerationParam param = buildGenerationParam(userMsg);
        System.out.println("Streaming output content is:");
        Flowable<GenerationResult> result = gen.streamCall(param);
        result.blockingForEach(message -> handleGenerationResult(message));
        System.out.println("Full content is: " + fullContent.toString());
    }
    private static GenerationParam buildGenerationParam(Message userMsg) {
        return GenerationParam.builder()
                // If the environment variable is not configured, replace the following line with: .apiKey("sk-xxx")
                .apiKey(System.getenv("DASHSCOPE_API_KEY"))
                .model("qwen-plus")
                .messages(Arrays.asList(userMsg))
                .resultFormat(GenerationParam.ResultFormat.MESSAGE)
                .incrementalOutput(true)
                .build();
    }
    public static void main(String[] args) {
        try {
            Generation gen = new Generation(Protocol.HTTP.getValue(), "https://dashscope-intl.aliyuncs.com/api/v1");
            Message userMsg = Message.builder().role(Role.USER.getValue()).content("Who are you?").build();
            streamCallWithMessage(gen, userMsg);
        } catch (ApiException | NoApiKeyException | InputRequiredException  e) {
            logger.error("An exception occurred: {}", e.getMessage());
        }
        System.exit(0);
    }
}

Sample response

Streaming output content is:

I am a
large
language model
from Alibaba Cloud
. I am
called
Qwen

Full content is: 
I am a large language model from Alibaba Cloud. I am called Qwen.

cURL

curl -X POST https://dashscope-intl.aliyuncs.com/api/v1/services/aigc/text-generation/generation \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-H "X-DashScope-SSE: enable" \
-d '{
    "model": "qwen-plus",
    "input":{
        "messages":[      
            {
                "role": "system",
                "content": "You are a helpful assistant."
            },
            {
                "role": "user",
                "content": "Who are you?"
            }
        ]
    },
    "parameters": {
        "result_format": "message",
        "incremental_output":true
    }
}'

Sample response

id:1
event:result
:HTTP_STATUS/200
data:{"output":{"choices":[{"message":{"content":"I am","role":"assistant"},"finish_reason":"null"}]},"usage":{"total_tokens":23,"input_tokens":22,"output_tokens":1},"request_id":"xxx"}
id:2
event:result
:HTTP_STATUS/200
data:{"output":{"choices":[{"message":{"content":"Qwen","role":"assistant"},"finish_reason":"null"}]},"usage":{"total_tokens":24,"input_tokens":22,"output_tokens":2},"request_id":"xxx"}
id:3
event:result
:HTTP_STATUS/200
data:{"output":{"choices":[{"message":{"content":", an","role":"assistant"},"finish_reason":"null"}]},"usage":{"total_tokens":25,"input_tokens":22,"output_tokens":3},"request_id":"xxx"}
id:4
event:result
:HTTP_STATUS/200
data:{"output":{"choices":[{"message":{"content":"AI","role":"assistant"},"finish_reason":"null"}]},"usage":{"total_tokens":30,"input_tokens":22,"output_tokens":8},"request_id":"xxx"}
id:5
event:result
:HTTP_STATUS/200
data:{"output":{"choices":[{"message":{"content":"assistant developed by Alibaba","role":"assistant"},"finish_reason":"null"}]},"usage":{"total_tokens":38,"input_tokens":22,"output_tokens":16},"request_id":"xxx"}
id:6
event:result
:HTTP_STATUS/200
data:{"output":{"choices":[{"message":{"content":"Cloud. I am designed to answer various questions, provide information","role":"assistant"},"finish_reason":"null"}]},"usage":{"total_tokens":46,"input_tokens":22,"output_tokens":24},"request_id":"xxx"}
id:7
event:result
:HTTP_STATUS/200
data:{"output":{"choices":[{"message":{"content":"and engage in conversations with users. How can I","role":"assistant"},"finish_reason":"null"}]},"usage":{"total_tokens":54,"input_tokens":22,"output_tokens":32},"request_id":"xxx"}
id:8
event:result
:HTTP_STATUS/200
data:{"output":{"choices":[{"message":{"content":"assist you?","role":"assistant"},"finish_reason":"stop"}]},"usage":{"total_tokens":58,"input_tokens":22,"output_tokens":36},"request_id":"xxx"}

Function calling

LLMs may not perform well when dealing with real-time information, private domain knowledge, or mathematical calculations. You can use the function calling feature to allow the model to use external tools. When calling the model, pass the name, description, input parameters, and other information of each tool through the tools parameter. The following flow chart shows how function calling works.

(Figure: function calling workflow)

Function calling requires the LLM to parse the parameters effectively. We recommend that you use Qwen-Plus.

Note

The streaming output mode does not support function calling.
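Before reading the full samples, it helps to see the message sequence that one function calling round produces. The following is a simplified sketch (the content values are illustrative, not real model output):

```python
# The conversation history after one complete function calling round, in order:
messages = [
    # 1. The user's question
    {"role": "user", "content": "What is the weather like in Singapore?"},
    # 2. The model returns tool_calls instead of a final answer
    {"role": "assistant", "content": "", "tool_calls": [{
        "type": "function",
        "function": {"name": "get_current_weather",
                     "arguments": '{"location": "Singapore"}'},
    }]},
    # 3. Your code runs the tool and appends its result with role "tool"
    {"role": "tool", "name": "get_current_weather",
     "content": "It's raining in Singapore."},
    # 4. The model is called again with this history and produces the final answer
    {"role": "assistant", "content": "It is currently raining in Singapore."},
]
```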

OpenAI compatible

You can use the OpenAI SDK or OpenAI-compatible HTTP method to call Qwen models and experience the function calling feature.

Python

Sample code

from openai import OpenAI
from datetime import datetime
import json
import os

client = OpenAI(
    # If the environment variable is not configured, replace the following line with: api_key="sk-xxx",
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",  # Replace with the base_url of the DashScope SDK
)

# Define a tool list. The model selects a tool based on the name and description of the tool
tools = [
    # Tool 1: obtain the current time
    {
        "type": "function",
        "function": {
            "name": "get_current_time",
            "description": "This tool can help you query the current time.",
            # No request parameter is needed. The parameters parameter is left empty
            "parameters": {}
        }
    },  
    # Tool 2: obtain the weather of a specific city
    {
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "This tool can help you query the weather of a city.",
            "parameters": {  
                "type": "object",
                "properties": {
                    # The parameters parameter is set to location, which specifies the location whose weather you want to query
                    "location": {
                        "type": "string",
                        "description": "A city, county, or district, such as Beijing, Hangzhou, or Yuhang."
                    }
                },
                "required": [
                    "location"
                ]
            }
        }
    }
]

# Simulate the weather query tool. Sample response: "It's raining in Beijing."
def get_current_weather(location):
    return f"It's raining in {location}."

# Simulate the time query tool. Sample response: "Current time: 2024-04-15 17:15:18."
def get_current_time():
    # Obtain the current date and time
    current_datetime = datetime.now()
    # Format the current date and time
    formatted_time = current_datetime.strftime('%Y-%m-%d %H:%M:%S')
    # Return the formatted current date and time
    return f"Current time: {formatted_time}."

# Encapsulate the response function of the model
def get_response(messages):
    completion = client.chat.completions.create(
        model="qwen-plus",
        messages=messages,
        tools=tools
        )
    return completion.model_dump()

def call_with_messages():
    print('\n')
    messages = [
            {
                "content": input('Enter: '),  # Sample questions: "What time is it now?" "What time will it be in an hour?" "What is the weather like in Beijing?"
                "role": "user"
            }
    ]
    print("-"*60)
    # Call the model in the first round
    i = 1
    first_response = get_response(messages)
    assistant_output = first_response['choices'][0]['message']
    print(f"\nOutput of the model in round {i}:{first_response}\n")
    if assistant_output['content'] is None:
        assistant_output['content'] = ""
    messages.append(assistant_output)
    # If the model does not need to call a tool, print the response directly
    if assistant_output['tool_calls'] is None:  # If the model determines that no tool is needed, the response is printed directly and no second round takes place
        print(f"No tool call is needed. I can answer directly: {assistant_output['content']}")
        return
    # If the model needs to call tools, keep calling it until it determines that no tool is needed
    while assistant_output['tool_calls'] is not None:
        # If the model determines that the weather query tool is needed, run the weather query tool
        if assistant_output['tool_calls'][0]['function']['name'] == 'get_current_weather':
            tool_info = {"name": "get_current_weather", "role":"tool"}
            # Location is provided
            location = json.loads(assistant_output['tool_calls'][0]['function']['arguments'])['location']
            tool_info['content'] = get_current_weather(location)
        # If the model determines that the time query tool is needed, run the time query tool
        elif assistant_output['tool_calls'][0]['function']['name'] == 'get_current_time':
            tool_info = {"name": "get_current_time", "role":"tool"}
            tool_info['content'] = get_current_time()
        print(f"Tool output: {tool_info['content']}\n")
        print("-"*60)
        messages.append(tool_info)
        assistant_output = get_response(messages)['choices'][0]['message']
        if  assistant_output['content'] is None:
            assistant_output['content'] = ""
        messages.append(assistant_output)
        i += 1
        print(f"Output of the model in round {i}:{assistant_output}\n")
    print(f"Final answer:{assistant_output['content']}")

if __name__ == '__main__':
    call_with_messages()

Response

When you input Singapore weather, function calling is initiated and the tool_calls parameter is returned. When you input Hello, the model does not call tools and the tool_calls parameter is not returned. Sample response:

Input: Singapore weather

{
    'id': 'chatcmpl-e2f045fd-2604-9cdb-bb61-37c805ecd15a',
    'choices': [
        {
            'finish_reason': 'tool_calls',
            'index': 0,
            'logprobs': None,
            'message': {
                'content': '',
                'role': 'assistant',
                'function_call': None,
                'tool_calls': [
                    {
                        'id': 'call_7a33ebc99d5342969f4868',
                        'function': {
                            'arguments': '{"location": "Singapore"}',
                            'name': 'get_current_weather'
                        },
                        'type': 'function',
                        'index': 0
                    }
                ]
            }
        }
    ],
    'created': 1726049697,
    'model': 'qwen-max',
    'object': 'chat.completion',
    'service_tier': None,
    'system_fingerprint': None,
    'usage': {
        'completion_tokens': 18,
        'prompt_tokens': 217,
        'total_tokens': 235
    }
}

Input: Hello

{
    'id': 'chatcmpl-5d890637-9211-9bda-b184-961acf3be38d',
    'choices': [
        {
            'finish_reason': 'stop',
            'index': 0,
            'logprobs': None,
            'message': {
                'content': 'Hello! How can I help you?',
                'role': 'assistant',
                'function_call': None,
                'tool_calls': None
            }
        }
    ],
    'created': 1726049765,
    'model': 'qwen-max',
    'object': 'chat.completion',
    'service_tier': None,
    'system_fingerprint': None,
    'usage': {
        'completion_tokens': 7,
        'prompt_tokens': 216,
        'total_tokens': 223
    }
}

HTTP

Sample code

import requests
import os
from datetime import datetime
import json

# Define a tool list. The model selects a tool based on the name and description of the tool
tools = [
    # Tool 1: obtain the current time
    {
        "type": "function",
        "function": {
            "name": "get_current_time",
            "description": "This tool can help you query the current time.",
            "parameters": {}  # No request parameter is needed. The parameters parameter is left empty
        }
    },  
    # Tool 2: obtain the weather of a specific city
    {
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "This tool can help you query the weather of a city.",
            "parameters": {  # The parameters parameter is set to location, which specifies the location whose weather you want to query
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "A city, county, or district, such as Beijing, Hangzhou, or Yuhang."
                },
                "required": [
                    "location"
                ]
            }
        }
        }
    }
]

# Simulate the weather query tool. Sample response: "It's raining in Beijing."
def get_current_weather(location):
    return f"It's raining in {location}."

# Simulate the time query tool. Sample response: "Current time: 2024-04-15 17:15:18."
def get_current_time():
    # Obtain the current date and time
    current_datetime = datetime.now()
    # Format the current date and time
    formatted_time = current_datetime.strftime('%Y-%m-%d %H:%M:%S')
    # Return the formatted current date and time
    return f"Current time: {formatted_time}."

def get_response(messages):
    # If the environment variable is not configured, replace the following line with: api_key="sk-xxx",
    api_key = os.getenv("DASHSCOPE_API_KEY")
    url = 'https://dashscope-intl.aliyuncs.com/compatible-mode/v1/chat/completions'
    headers = {'Content-Type': 'application/json',
            'Authorization':f'Bearer {api_key}'}
    body = {
        'model': 'qwen-plus',
        "messages": messages,
        "tools":tools
    }

    response = requests.post(url, headers=headers, json=body)
    return response.json()


def call_with_messages():
    messages = [
            {
                "content": input('Enter: '),  # Sample questions: "What time is it now?" "What time will it be in an hour?" "What is the weather like in Beijing?"
                "role": "user"
            }
    ]
    
    # Call the model in the first round
    first_response = get_response(messages)
    print(f"\nFirst round result: {first_response}")
    assistant_output = first_response['choices'][0]['message']
    if assistant_output['content'] is None:
        assistant_output['content'] = ""
    messages.append(assistant_output)
    if 'tool_calls' not in assistant_output:  # If the model determines that the tools are not needed, the response is directly printed. The model will not be called in the second round
        print(f"Final answer: {assistant_output['content']}")
        return
    # If the model selects the get_current_weather tool
    elif assistant_output['tool_calls'][0]['function']['name'] == 'get_current_weather':
        tool_info = {"name": "get_current_weather", "role":"tool"}
        location = json.loads(assistant_output['tool_calls'][0]['function']['arguments'])['location']
        tool_info['content'] = get_current_weather(location)
    # If the model selects the get_current_time tool
    elif assistant_output['tool_calls'][0]['function']['name'] == 'get_current_time':
        tool_info = {"name": "get_current_time", "role":"tool"}
        tool_info['content'] = get_current_time()
    print(f"Tool output: {tool_info['content']}")
    messages.append(tool_info)

    # Call the model in the second round to summarize the tool output
    second_response = get_response(messages)
    print(f"Second round result: {second_response}")
    print(f"Final answer: {second_response['choices'][0]['message']['content']}")

if __name__ == '__main__':
    call_with_messages()

Response

When you input Singapore weather, function calling is initiated and the tool_calls parameter is returned. When you input Hello, the model does not call tools and the tool_calls parameter is not returned. Sample response:

Input: Singapore weather

{
    'choices': [
        {
            'message': {
                'content': '',
                'role': 'assistant',
                'tool_calls': [
                    {
                        'function': {
                            'name': 'get_current_weather',
                            'arguments': '{"location": "Singapore"}'
                        },
                        'index': 0,
                        'id': 'call_416cd81b8e7641edb654c4',
                        'type': 'function'
                    }
                ]
            },
            'finish_reason': 'tool_calls',
            'index': 0,
            'logprobs': None
        }
    ],
    'object': 'chat.completion',
    'usage': {
        'prompt_tokens': 217,
        'completion_tokens': 18,
        'total_tokens': 235
    },
    'created': 1726050222,
    'system_fingerprint': None,
    'model': 'qwen-max',
    'id': 'chatcmpl-61e30855-ee69-93ab-98d5-4194c51a9980'
}

Input: Hello

{
    'choices': [
        {
            'message': {
                'content': 'Hello! How can I help you?',
                'role': 'assistant'
            },
            'finish_reason': 'stop',
            'index': 0,
            'logprobs': None
        }
    ],
    'object': 'chat.completion',
    'usage': {
        'prompt_tokens': 216,
        'completion_tokens': 7,
        'total_tokens': 223
    },
    'created': 1726050238,
    'system_fingerprint': None,
    'model': 'qwen-max',
    'id': 'chatcmpl-2f2f86d1-bc4e-9494-baca-aac5b0555091'
}

Node.js

Sample code

import OpenAI from "openai";
import { format } from 'date-fns';
import readline from 'readline';

function getCurrentWeather(location) {
    return `It's raining in ${location}.`;
}
function getCurrentTime() {
    // Obtain the current date and time
    const currentDatetime = new Date();
    // Format the current date and time
    const formattedTime = format(currentDatetime, 'yyyy-MM-dd HH:mm:ss');
    // Return the formatted current date and time
    return `Current time: ${formattedTime}.`;
}
const openai = new OpenAI(
    {
        // If the environment variable is not configured, replace the following line with: apiKey: "sk-xxx",
        apiKey: process.env.DASHSCOPE_API_KEY,
        baseURL: "https://dashscope-intl.aliyuncs.com/compatible-mode/v1"
    }
);
const tools = [
// Tool 1: obtain the current time
{
    "type": "function",
    "function": {
        "name": "getCurrentTime",
        "description": "This tool can help you query the current time.",
        // No request parameter is needed. The parameters parameter is left empty
        "parameters": {}  
    }
},  
// Tool 2: obtain the weather of a specific city
{
    "type": "function",
    "function": {
        "name": "getCurrentWeather",
        "description": "This tool can help you query the weather of a city.",
        "parameters": {  
            "type": "object",
            "properties": {
                // The parameters parameter is set to location, which specifies the location whose weather you want to query
                "location": {
                    "type": "string",
                    "description": "A city, county, or district, such as Beijing, Hangzhou, or Yuhang."
                }
            },
            "required": ["location"]
        }
    }
}
];
async function getResponse(messages) {
    const response = await openai.chat.completions.create({
        model: "qwen-plus",
        messages: messages,
        tools: tools,
    });
    return response;
}
const rl = readline.createInterface({
    input: process.stdin,
    output: process.stdout
});
rl.question("user: ", async (question) => {
    const messages = [{"role": "user","content": question}];
    let i = 1;
    const firstResponse = await getResponse(messages);
    let assistantOutput = firstResponse.choices[0].message;    
    console.log(`Output of the model in round ${i}:${JSON.stringify(assistantOutput)}`);
    if (Object.is(assistantOutput.content,null)){
        assistantOutput.content = "";
    }
    messages.push(assistantOutput);
    if (! ("tool_calls" in assistantOutput)) {
        console.log(`Without the need to call the tools, I can answer directly:${assistantOutput.content}`);
        rl.close();
    } else{
        while ("tool_calls" in assistantOutput) {
            let toolInfo = {};
            if (assistantOutput.tool_calls[0].function.name == "getCurrentWeather" ) {
                toolInfo = {"role": "tool"};
                let location = JSON.parse(assistantOutput.tool_calls[0].function.arguments)["location"];
                toolInfo["content"] = getCurrentWeather(location);
            } else if (assistantOutput.tool_calls[0].function.name == "getCurrentTime" ) {
                toolInfo = {"role":"tool"};
                toolInfo["content"] = getCurrentTime();
            }
            console.log(`Tool output: ${JSON.stringify(toolInfo)}`);
            console.log("=".repeat(100));
            messages.push(toolInfo);
            assistantOutput = (await getResponse(messages)).choices[0].message;
            if (Object.is(assistantOutput.content,null)){
                assistantOutput.content = "";
            }
            messages.push(assistantOutput);
            i += 1;
            console.log(`Output of the model in round ${i}:${JSON.stringify(assistantOutput)}`);
        }
        console.log("=".repeat(100));
        console.log(`Final output of the model: ${JSON.stringify(assistantOutput.content)}`);
        rl.close();
    }
});

Response

When you input What is the weather like in Singapore, New York City, London, and Paris?, the output is:

Output of the model in round 1:{"content":"","role":"assistant","tool_calls":[{"function":{"name":"getCurrentWeather","arguments":"{\"location\": \"Singapore\"}"},"index":0,"id":"call_d2aff21240b24c7291db6d","type":"function"}]}
Tool output:{"role":"tool","content":"It is rainy today in Singapore."}
====================================================================================================
Output of the model in round 2:{"content":"","role":"assistant","tool_calls":[{"function":{"name":"getCurrentWeather","arguments":"{\"location\": \"New York City\"}"},"index":0,"id":"call_bdcfa937e69b4eae997b5e","type":"function"}]}
Tool output:{"role":"tool","content":"It is rainy today in New York City."}
====================================================================================================
Output of the model in round 3:{"content":"","role":"assistant","tool_calls":[{"function":{"name":"getCurrentWeather","arguments":"{\"location\": \"London\"}"},"index":0,"id":"call_bbf22d017e8e439e811974","type":"function"}]}
Tool output:{"role":"tool","content":"It is rainy today in London."}
====================================================================================================
Output of the model in round 4:{"content":"","role":"assistant","tool_calls":[{"function":{"name":"getCurrentWeather","arguments":"{\"location\": \"Paris\"}"},"index":0,"id":"call_f4f8e149af01492fb60162","type":"function"}]}
Tool output:{"role":"tool","content":"It is rainy today in Paris."}
====================================================================================================
Output of the model in round 5:{"content":"It is rainy today in all four cities. Don't forget your umbrella!","role":"assistant"}
====================================================================================================
Final output of the model:"It is rainy today in all four cities. Don't forget your umbrella!"

DashScope

You can use the DashScope SDK or HTTP method to call Qwen models and experience the function calling feature.

Python

Sample code

import os
from dashscope import Generation
from datetime import datetime
import random
import json
import dashscope
dashscope.base_http_api_url = 'https://dashscope-intl.aliyuncs.com/api/v1'

# Define a tool list. The model selects a tool based on the name and description of the tool
tools = [
    # Tool 1: obtain the current time
    {
        "type": "function",
        "function": {
            "name": "get_current_time",
            "description": "This tool can help you query the current time.",
            "parameters": {}  # No request parameter is needed. The parameters parameter is left empty
        }
    },
    # Tool 2: obtain the weather of a specific city
    {
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "This tool can help you query the weather of a city.",
            "parameters": {
                # The parameters parameter is set to location, which specifies the location whose weather you want to query
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "A city, county, or district, such as Beijing, Hangzhou, or Yuhang."
                    }
                },
                "required": [
                    "location"
                ]
            }
        }
    }
]


# Simulate the weather query tool. Sample response: "It's raining in Beijing."
def get_current_weather(location):
    return f"It's raining in {location}."


# Simulate the time query tool. Sample response: "Current time: 2024-04-15 17:15:18."
def get_current_time():
    # Obtain the current date and time
    current_datetime = datetime.now()
    # Format the current date and time
    formatted_time = current_datetime.strftime('%Y-%m-%d %H:%M:%S')
    # Return the formatted current date and time
    return f"Current time: {formatted_time}."


# Encapsulate the response function of the model
def get_response(messages):
    response = Generation.call(
        # If the environment variable is not configured, replace the following line with: api_key="sk-xxx",
        api_key=os.getenv("DASHSCOPE_API_KEY"),
        model='qwen-plus',
        messages=messages,
        tools=tools,
        seed=random.randint(1, 10000),  # Set the random number seed. If not set, the default is 1234
        result_format='message'  # Set the output format to message
    )
    return response


def call_with_messages():
    print('\n')
    messages = [
            {
                "content": input('Enter: '),  # Sample questions: "What time is it now?" "What time will it be in an hour?" "What is the weather like in Beijing?"
                "role": "user"
            }
    ]
   
    # Call the model in the first round
    first_response = get_response(messages)
    assistant_output = first_response.output.choices[0].message
    print(f"\nOutput of the model in the first round:{first_response}\n")
    messages.append(assistant_output)
    if 'tool_calls' not in assistant_output:  # If the model determines that the tools are not needed, the response is directly printed. The model will not be called in the second round
        print(f"Final answer: {assistant_output.content}")
        return
    # If the model selects the get_current_weather tool
    elif assistant_output.tool_calls[0]['function']['name'] == 'get_current_weather':
        tool_info = {"name": "get_current_weather", "role":"tool"}
        location = json.loads(assistant_output.tool_calls[0]['function']['arguments'])['location']
        tool_info['content'] = get_current_weather(location)
    # If the model selects the get_current_time tool
    elif assistant_output.tool_calls[0]['function']['name'] == 'get_current_time':
        tool_info = {"name": "get_current_time", "role":"tool"}
        tool_info['content'] = get_current_time()
    print(f"Tool output: {tool_info['content']}\n")
    messages.append(tool_info)

    # Call the model in the second round to summarize the tool output
    second_response = get_response(messages)
    print(f"Output of the model in the second round:{second_response}\n")
    print(f"Final answer: {second_response.output.choices[0].message['content']}")

if __name__ == '__main__':
    call_with_messages()

Response

When you input Singapore weather, function calling is initiated and the tool_calls parameter is returned. When you input Hello, the model does not call tools and the tool_calls parameter is not returned. Sample response:

Input: Singapore weather

{
  "status_code": 200,
  "request_id": "33cf0a53-ea38-9f47-8fce-b93b55d86573",
  "code": "",
  "message": "",
  "output": {
    "text": null,
    "finish_reason": null,
    "choices": [
      {
        "finish_reason": "tool_calls",
        "message": {
          "role": "assistant",
          "content": "",
          "tool_calls": [
            {
              "function": {
                "name": "get_current_weather",
                "arguments": "{\"location\": \"Singapore\"}"
              },
              "index": 0,
              "id": "call_9f62f52f3a834a8194f634",
              "type": "function"
            }
          ]
        }
      }
    ]
  },
  "usage": {
    "input_tokens": 217,
    "output_tokens": 18,
    "total_tokens": 235
  }
}

Input: Hello

{
  "status_code": 200,
  "request_id": "4818ce03-e7c9-96de-a7bc-781649d98465",
  "code": "",
  "message": "",
  "output": {
    "text": null,
    "finish_reason": null,
    "choices": [
      {
        "finish_reason": "stop",
        "message": {
          "role": "assistant",
          "content": "Hello! How can I help you?"
        }
      }
    ]
  },
  "usage": {
    "input_tokens": 216,
    "output_tokens": 7,
    "total_tokens": 223
  }
}

Java

Sample code

// Copyright (c) Alibaba, Inc. and its affiliates.
// version >= 2.12.0
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import com.alibaba.dashscope.aigc.conversation.ConversationParam.ResultFormat;
import com.alibaba.dashscope.aigc.generation.Generation;
import com.alibaba.dashscope.aigc.generation.GenerationOutput.Choice;
import com.alibaba.dashscope.aigc.generation.GenerationParam;
import com.alibaba.dashscope.aigc.generation.GenerationResult;
import com.alibaba.dashscope.common.Message;
import com.alibaba.dashscope.common.Role;
import com.alibaba.dashscope.exception.ApiException;
import com.alibaba.dashscope.exception.InputRequiredException;
import com.alibaba.dashscope.exception.NoApiKeyException;
import com.alibaba.dashscope.tools.FunctionDefinition;
import com.alibaba.dashscope.tools.ToolCallBase;
import com.alibaba.dashscope.tools.ToolCallFunction;
import com.alibaba.dashscope.tools.ToolFunction;
import com.alibaba.dashscope.utils.JsonUtils;
import com.fasterxml.jackson.databind.node.ObjectNode;
import com.github.victools.jsonschema.generator.Option;
import com.github.victools.jsonschema.generator.OptionPreset;
import com.github.victools.jsonschema.generator.SchemaGenerator;
import com.github.victools.jsonschema.generator.SchemaGeneratorConfig;
import com.github.victools.jsonschema.generator.SchemaGeneratorConfigBuilder;
import com.github.victools.jsonschema.generator.SchemaVersion;

import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Scanner;
import com.alibaba.dashscope.protocol.Protocol;

public class Main {

    public static class GetWeatherTool {
        private String location;

        public GetWeatherTool(String location) {
            this.location = location;
        }

        public String call() {
            return location+" is raining";
        }
    }

    public static class GetTimeTool {

        public GetTimeTool() {
        }

        public String call() {
            LocalDateTime now = LocalDateTime.now();
            DateTimeFormatter formatter = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");
            String currentTime = "Current time: " + now.format(formatter) + ".";
            return currentTime;
        }
    }

    public static void SelectTool()
            throws NoApiKeyException, ApiException, InputRequiredException {

        SchemaGeneratorConfigBuilder configBuilder =
                new SchemaGeneratorConfigBuilder(SchemaVersion.DRAFT_2020_12, OptionPreset.PLAIN_JSON);
        SchemaGeneratorConfig config = configBuilder.with(Option.EXTRA_OPEN_API_FORMAT_VALUES)
                .without(Option.FLATTENED_ENUMS_FROM_TOSTRING).build();
        SchemaGenerator generator = new SchemaGenerator(config);


        ObjectNode jsonSchemaWeather = generator.generateSchema(GetWeatherTool.class);
        ObjectNode jsonSchemaTime = generator.generateSchema(GetTimeTool.class);


        FunctionDefinition fdWeather = FunctionDefinition.builder().name("get_current_weather").description("Obtain the weather of a specified region")
                .parameters(JsonUtils.parseString(jsonSchemaWeather.toString()).getAsJsonObject()).build();

        FunctionDefinition fdTime = FunctionDefinition.builder().name("get_current_time").description("Obtain the current time")
                .parameters(JsonUtils.parseString(jsonSchemaTime.toString()).getAsJsonObject()).build();

        Message systemMsg = Message.builder().role(Role.SYSTEM.getValue())
                .content("You are a helpful assistant. When asked a question, use tools wherever possible.")
                .build();

        Scanner scanner = new Scanner(System.in);
        System.out.print("\nEnter: ");
        String userInput = scanner.nextLine();
        Message userMsg =
                Message.builder().role(Role.USER.getValue()).content(userInput).build();

        List<Message> messages = new ArrayList<>();
        messages.addAll(Arrays.asList(systemMsg, userMsg));

        GenerationParam param = GenerationParam.builder()
                .model("qwen-plus")
                // If the environment variable is not configured, replace the following line with: .apiKey("sk-xxx")
                .apiKey(System.getenv("DASHSCOPE_API_KEY"))
                .messages(messages).resultFormat(ResultFormat.MESSAGE)
                .tools(Arrays.asList(ToolFunction.builder().function(fdWeather).build(), ToolFunction.builder().function(fdTime).build())).build();
        // Call the model in the first round
        Generation gen = new Generation(Protocol.HTTP.getValue(), "https://dashscope-intl.aliyuncs.com/api/v1");
        GenerationResult result = gen.call(param);

        System.out.println("\nOutput of the model in the first round:"+JsonUtils.toJson(result));

        for (Choice choice : result.getOutput().getChoices()) {
            messages.add(choice.getMessage());
            // If the model requests tool calls
            if (choice.getMessage().getToolCalls() != null) {
                for (ToolCallBase toolCall : choice.getMessage().getToolCalls()) {
                    if (toolCall.getType().equals("function")) {
                        // Obtain the tool function name and input parameter
                        String functionName = ((ToolCallFunction) toolCall).getFunction().getName();
                        String functionArgument = ((ToolCallFunction) toolCall).getFunction().getArguments();
                        // If the model determines that the weather query tool is needed
                        if (functionName.equals("get_current_weather")) {
                            GetWeatherTool getWeatherFunction =
                                    JsonUtils.fromJson(functionArgument, GetWeatherTool.class);
                            String weather = getWeatherFunction.call();
                            Message toolResultMessage = Message.builder().role("tool")
                                    .content(weather).toolCallId(toolCall.getId()).build();
                            messages.add(toolResultMessage);
                            System.out.println("\nTool output:"+weather);
                        }
                        // If the model determines that the time query tool is needed
                        else if (functionName.equals("get_current_time")) {
                            GetTimeTool getTimeFunction =
                                    JsonUtils.fromJson(functionArgument, GetTimeTool.class);
                            String time = getTimeFunction.call();
                            Message toolResultMessage = Message.builder().role("tool")
                                    .content(time).toolCallId(toolCall.getId()).build();
                            messages.add(toolResultMessage);
                            System.out.println("\nTool output:"+time);
                        }
                    }
                }
            }
            // If no tools are needed, directly output the response of the model
            else {
                System.out.println("\nFinal answer:"+choice.getMessage().getContent());
                return;
            }
        }
        // Call the model in the second round, including the tool output
        param.setMessages(messages);
        result = gen.call(param);
        System.out.println("\nOutput of the model in the second round:"+JsonUtils.toJson(result));
        System.out.println("\nFinal answer:"+result.getOutput().getChoices().get(0).getMessage().getContent());
    }


    public static void main(String[] args) {
        try {
            SelectTool();
        } catch (ApiException | NoApiKeyException | InputRequiredException e) {
            System.out.println(String.format("Exception %s", e.getMessage()));
        }
        System.exit(0);
    }
}

Response

When you input Singapore weather, function calling is initiated and the tool_calls parameter is returned. When you input Hello, the model does not call tools and the tool_calls parameter is not returned. Sample response:

Input: Singapore weather

{
    "requestId": "e2faa5cf-1707-973b-b216-36aa4ef52afc",
    "usage": {
        "input_tokens": 254,
        "output_tokens": 19,
        "total_tokens": 273
    },
    "output": {
        "choices": [
            {
                "finish_reason": "tool_calls",
                "message": {
                    "role": "assistant",
                    "content": "",
                    "tool_calls": [
                        {
                            "type": "function",
                            "id": "",
                            "function": {
                                "name": "get_current_weather",
                                "arguments": "{\"location\": \"Singapore\"}"
                            }
                        }
                    ]
                }
            }
        ]
    }
}

Input: Hello

{
    "requestId": "f6ca3828-3b5f-99bf-8bae-90b4aa88923f",
    "usage": {
        "input_tokens": 253,
        "output_tokens": 7,
        "total_tokens": 260
    },
    "output": {
        "choices": [
            {
                "finish_reason": "stop",
                "message": {
                    "role": "assistant",
                    "content": "Hello! How can I help you?"
                }
            }
        ]
    }
}

HTTP

Sample code

Python

import requests
import os
from datetime import datetime
import json
import dashscope
dashscope.base_http_api_url = 'https://dashscope-intl.aliyuncs.com/api/v1'
# Define a tool list. The model selects a tool based on the name and description of the tool
tools = [
    # Tool 1: obtain the current time
    {
        "type": "function",
        "function": {
            "name": "get_current_time",
            "description": "This tool can help you query the current time.",
            "parameters": {}  # No request parameter is needed. The parameters parameter is left empty
        }
    },  
    # Tool 2: obtain the weather of a specific city
    {
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "This tool can help you query the weather of a city.",
            "parameters": {  # The parameters parameter is set to location, which specifies the location whose weather you want to query
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "A city, county, or district, such as Beijing, Hangzhou, or Yuhang."
                    }
                },
                "required": [
                    "location"
                ]
            }
        }
    }
]

# Simulate the weather query tool. Sample response: "It's raining in Beijing."
def get_current_weather(location):
    return f"It's raining in {location}. "

# Simulate the time query tool. Sample response: "Current time: 2024-04-15 17:15:18."
def get_current_time():
    # Obtain the current date and time
    current_datetime = datetime.now()
    # Format the current date and time
    formatted_time = current_datetime.strftime('%Y-%m-%d %H:%M:%S')
    # Return the formatted current date and time
    return f"Current time: {formatted_time}."

def get_response(messages):
    api_key = os.getenv("DASHSCOPE_API_KEY")
    url = 'https://dashscope-intl.aliyuncs.com/api/v1/services/aigc/text-generation/generation'
    headers = {'Content-Type': 'application/json',
            'Authorization':f'Bearer {api_key}'}
    body = {
        'model': 'qwen-plus',
        "input": {
            "messages": messages
        },
        "parameters": {
            "result_format": "message",
            "tools": tools
        }
    }

    response = requests.post(url, headers=headers, json=body)
    return response.json()

def call_with_messages():
    messages = [
            {
                "content": input('Enter: '),  # Sample questions: "What time is it now?" "What time will it be in an hour?" "What is the weather like in Beijing?"
                "role": "user"
            }
    ]
    
    # Call the model in the first round
    first_response = get_response(messages)
    print(f"\nFirst round result: {first_response}")
    assistant_output = first_response['output']['choices'][0]['message']
    messages.append(assistant_output)
    if 'tool_calls' not in assistant_output:  # If the model determines that the tools are not needed, the response is directly printed. The model will not be called in the second round
        print(f"Final answer: {assistant_output['content']}")
        return
    # If the model selects the get_current_weather tool
    elif assistant_output['tool_calls'][0]['function']['name'] == 'get_current_weather':
        tool_info = {"name": "get_current_weather", "role":"tool"}
        location = json.loads(assistant_output['tool_calls'][0]['function']['arguments'])['location']
        tool_info['content'] = get_current_weather(location)
    # If the model selects the get_current_time tool
    elif assistant_output['tool_calls'][0]['function']['name'] == 'get_current_time':
        tool_info = {"name": "get_current_time", "role":"tool"}
        tool_info['content'] = get_current_time()
    print(f"Tool output: {tool_info['content']}")
    messages.append(tool_info)

    # Call the model in the second round to summarize the tool output
    second_response = get_response(messages)
    print(f"Second round result: {second_response}")
    print(f"Final answer: {second_response['output']['choices'][0]['message']['content']}")

if __name__ == '__main__':
    call_with_messages()
Java

import java.io.BufferedReader;
import java.io.DataOutputStream;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import org.json.JSONArray;
import org.json.JSONObject;

public class Main {
    public static void main(String[] args) throws Exception {
        // User input question
        Scanner scanner = new Scanner(System.in);
        System.out.println("Enter: ");
        String UserInput = scanner.nextLine();
        // Initialize messages
        JSONArray messages = new JSONArray();
        // Define system info
        JSONObject systemMessage = new JSONObject();
        systemMessage.put("role","system");
        systemMessage.put("content","You are a helpful assistant.");
        // Construct user_message based on user input
        JSONObject userMessage = new JSONObject();
        userMessage.put("role","user");
        userMessage.put("content",UserInput);
        // Add system_message and user_message to messages in order
        messages.put(systemMessage);
        messages.put(userMessage);
        // Call the model in the first round and print the result
        JSONObject responseJson = getResponse(messages);
        System.out.println("First round result:"+responseJson);
        // Obtain assistant_message
        JSONObject assistantMessage = responseJson.getJSONObject("output").getJSONArray("choices").getJSONObject(0).getJSONObject("message");
        // Initialize tool_message
        JSONObject toolMessage = new JSONObject();

        // If assistant_message does not have the tool_calls parameter, directly print the response in assistant_message and return
        if (! assistantMessage.has("tool_calls")){
            System.out.println("Final answer:"+assistantMessage.get("content"));
            return;
        }
        // If assistant_message has the tool_calls parameter, the model determines that the tools are needed
        else {
            // Add assistant_message to messages
            messages.put(assistantMessage);
            // If the model determines that the get_current_weather function is needed
            if (assistantMessage.getJSONArray("tool_calls").getJSONObject(0).getJSONObject("function").getString("name").equals("get_current_weather")) {
                // Obtain the arguments information and extract the location parameter
                JSONObject argumentsJson = new JSONObject(assistantMessage.getJSONArray("tool_calls").getJSONObject(0).getJSONObject("function").getString("arguments"));
                String location = argumentsJson.getString("location");
                // Run the tool function to obtain the tool output and print it
                String toolOutput = getCurrentWeather(location);
                System.out.println("Tool output:"+toolOutput);
                // Construct tool_message
                toolMessage.put("name","get_current_weather");
                toolMessage.put("role","tool");
                toolMessage.put("content",toolOutput);
            }
            // If the model determines that the get_current_time function is needed
            if (assistantMessage.getJSONArray("tool_calls").getJSONObject(0).getJSONObject("function").getString("name").equals("get_current_time")) {
                // Run the tool function to obtain the tool output and print it
                String toolOutput = getCurrentTime();
                System.out.println("Tool output:"+toolOutput);
                // Construct tool_message
                toolMessage.put("name","get_current_time");
                toolMessage.put("role","tool");
                toolMessage.put("content",toolOutput);
            }
        }
        // Add tool_message to messages
        messages.put(toolMessage);
        // Call the model in the second round and print the result
        JSONObject secondResponse = getResponse(messages);
        System.out.println("Second round result:"+secondResponse);
        System.out.println("Final answer:"+secondResponse.getJSONObject("output").getJSONArray("choices").getJSONObject(0).getJSONObject("message").getString("content"));
    }
    // Define the function to obtain the weather
    public static String getCurrentWeather(String location) {
        return location+" is raining.";
    }
    // Define the function to obtain the current time
    public static String getCurrentTime() {
        LocalDateTime now = LocalDateTime.now();
        DateTimeFormatter formatter = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");
        String currentTime = "Current time: " + now.format(formatter) + ".";
        return currentTime;
    }
    // Encapsulate the response function of the model. Input: messages, Output: JSON-formatted HTTP response
    public static JSONObject getResponse(JSONArray messages) throws Exception{
        // Initialize the toolkit
        JSONArray tools = new JSONArray();
        // Define tool 1: obtain the current time
        String jsonStringTime = "{\"type\": \"function\", \"function\": {\"name\": \"get_current_time\", \"description\": \"This tool can help you query the current time.\", \"parameters\": {}}}";
        JSONObject getCurrentTimeJson = new JSONObject(jsonStringTime);
        // Define tool 2: obtain the weather of a specific region
        String jsonString_weather = "{\"type\": \"function\", \"function\": {\"name\": \"get_current_weather\", \"description\": \"This tool can help you query the weather of a city.\", \"parameters\": {\"type\": \"object\", \"properties\": {\"location\": {\"type\": \"string\", \"description\": \"A city, county, or district, such as Beijing, Hangzhou, or Yuhang.\"}}, \"required\": [\"location\"]}}}";
        JSONObject getCurrentWeatherJson = new JSONObject(jsonString_weather);
        // Add the two tools to the toolkit
        tools.put(getCurrentTimeJson);
        tools.put(getCurrentWeatherJson);
        String toolsString = tools.toString();
        // API call URL
        String urlStr = "https://dashscope-intl.aliyuncs.com/api/v1/services/aigc/text-generation/generation";
        // Obtain DASHSCOPE_API_KEY through environment variables
        String apiKey = System.getenv("DASHSCOPE_API_KEY");

        URL url = new URL(urlStr);
        HttpURLConnection connection = (HttpURLConnection) url.openConnection();
        connection.setRequestMethod("POST");
        // Define request header information
        connection.setRequestProperty("Content-Type", "application/json");
        connection.setRequestProperty("Authorization", "Bearer " + apiKey);
        connection.setDoOutput(true);
        // Define request body information
        String jsonInputString = String.format("{\"model\": \"qwen-max\", \"input\": {\"messages\":%s}, \"parameters\": {\"result_format\": \"message\",\"tools\":%s}}",messages.toString(),toolsString);

        // Obtain the HTTP response
        try (DataOutputStream wr = new DataOutputStream(connection.getOutputStream())) {
            wr.write(jsonInputString.getBytes(StandardCharsets.UTF_8));
            wr.flush();
        }
        StringBuilder response = new StringBuilder();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(connection.getInputStream()))) {
            String inputLine;
            while ((inputLine = in.readLine()) != null) {
                response.append(inputLine);
            }
        }
        connection.disconnect();
        // Return the JSON-formatted response
        return new JSONObject(response.toString());
    }
}

Response

When you input Singapore weather, function calling is initiated and the tool_calls parameter is returned. When you input Hello, the model does not call tools and the tool_calls parameter is not returned. Sample response:

Input: Singapore weather

{
    'output': {
        'choices': [
            {
                'finish_reason': 'tool_calls',
                'message': {
                    'role': 'assistant',
                    'tool_calls': [
                        {
                            'function': {
                                'name': 'get_current_weather',
                                'arguments': '{"location": "Singapore"}'
                            },
                            'index': 0,
                            'id': 'call_240d6341de4c484384849d',
                            'type': 'function'
                        }
                    ],
                    'content': ''
                }
            }
        ]
    },
    'usage': {
        'total_tokens': 235,
        'output_tokens': 18,
        'input_tokens': 217
    },
    'request_id': '235ed6a4-b6c0-9df0-aa0f-3c6dce89f3bd'
}

Input: Hello

{
    'output': {
        'choices': [
            {
                'finish_reason': 'stop',
                'message': {
                    'role': 'assistant',
                    'content': 'Hello! How can I help you?'
                }
            }
        ]
    },
    'usage': {
        'total_tokens': 223,
        'output_tokens': 7,
        'input_tokens': 216
    },
    'request_id': '42c42853-3caf-9815-96e8-9c950f4c26a0'
}

Asynchronous calling

You can use the Asyncio interface to implement concurrency and improve the efficiency of the program. Sample code:

OpenAI SDK

Sample code

import os
import asyncio
from openai import AsyncOpenAI
import platform

# Create an asynchronous client instance
client = AsyncOpenAI(
    # If the environment variable is not configured, replace the following line with the API Key: api_key="sk-xxx",
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1"
)

# Define a list of asynchronous tasks
async def task(question):
    print(f"Sending question: {question}")
    response = await client.chat.completions.create(
        messages=[
            {"role": "user", "content": question}
        ],
        model="qwen-plus",
    )
    print(f"Received answer: {response.choices[0].message.content}")

# Main asynchronous function
async def main():
    questions = ["Who are you?", "What can you do?", "What is the weather like?"]
    tasks = [task(q) for q in questions]
    await asyncio.gather(*tasks)

if __name__ == '__main__':
    # Set the event loop policy
    if platform.system() == 'Windows':
        asyncio.set_event_loop_policy(asyncio.WindowsSelectorEventLoopPolicy())
    # Run the main coroutine
    asyncio.run(main(), debug=False)
    

DashScope SDK

Sample code

Your DashScope Python SDK version must be 1.19.0 or later.

import asyncio
import platform
from dashscope.aigc.generation import AioGeneration
import os
import dashscope
dashscope.base_http_api_url = 'https://dashscope-intl.aliyuncs.com/api/v1'

# Define a list of asynchronous tasks
async def task(question):
    print(f"Sending question: {question}")
    response = await AioGeneration.call(
        # If the environment variable is not configured, replace the following line with the API Key: api_key="sk-xxx",
        api_key=os.getenv("DASHSCOPE_API_KEY"),
        model="qwen-plus",
        prompt=question
        )
    print(f"Received answer: {response.output.text}")

# Main asynchronous function
async def main():
    questions = ["Who are you?", "What can you do?", "What is the weather like?"]
    tasks = [task(q) for q in questions]
    await asyncio.gather(*tasks)

if __name__ == '__main__':
    # Set the event loop policy
    if platform.system() == 'Windows':
        asyncio.set_event_loop_policy(asyncio.WindowsSelectorEventLoopPolicy())
    # Run the main coroutine
    asyncio.run(main(), debug=False)

Advanced parameters

temperature and top_p

These parameters are used to control the diversity of text generated by the model. The higher the temperature or top_p, the more diverse the generated text. The lower the values, the more deterministic the text.

  • Diverse texts are suitable for scenarios such as creative writing (novels or advertisements), brainstorming, and chat applications.

  • Deterministic texts are suitable for scenarios with clear answers (such as problem analysis, multiple-choice questions, factual queries) or requiring precise wording (such as technical documents, legal texts, news reports, academic papers).

How they work

temperature

  • The higher the temperature parameter, the flatter the probability distribution of tokens (the chance of high-probability tokens decreases, and the chance of low-probability tokens increases), making the model more random in selecting the next token.

  • The lower the temperature parameter, the steeper the probability distribution of tokens (the chance of high-probability tokens increases, and the chance of low-probability tokens decreases), making the model more inclined to choose a few high-probability tokens.
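
To make this concrete, here is a minimal sketch of temperature scaling over a hypothetical three-token distribution (the logit values are illustrative, not taken from any model):

```python
import math

def softmax_with_temperature(logits, temperature):
    """Divide the logits by the temperature, then apply softmax."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Illustrative logits for three candidate tokens
logits = [2.0, 1.0, 0.5]

low_t = softmax_with_temperature(logits, 0.5)   # steeper distribution
high_t = softmax_with_temperature(logits, 2.0)  # flatter distribution

print([round(p, 3) for p in low_t])
print([round(p, 3) for p in high_t])
```

Raising the temperature from 0.5 to 2.0 in this example moves the top token's probability from about 0.84 to about 0.48, so the model's choice among the three tokens becomes much less predictable.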

top_p

Top-p sampling refers to sampling from the set of tokens with the highest probabilities (the core set). This sampling method sorts all possible tokens by probability from high to low, then accumulates probabilities starting from the highest-probability token until the cumulative probability reaches a threshold (for example, if top_p is set to 0.8, the threshold is 80%). Finally, the model randomly selects one token from these high-probability tokens for output.

  • The higher the top_p parameter, the more tokens are considered, resulting in more diverse generated text.

  • The lower the top_p parameter, the fewer tokens are considered, resulting in more focused and deterministic generated text.
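
The cutoff described above can be sketched as follows, assuming an illustrative four-token distribution (real sampling runs over the model's full vocabulary):

```python
import random

def top_p_filter(token_probs, top_p):
    """Keep the smallest set of highest-probability tokens whose
    cumulative probability reaches top_p, then renormalize."""
    ranked = sorted(token_probs.items(), key=lambda kv: kv[1], reverse=True)
    kept = []
    cumulative = 0.0
    for token, prob in ranked:
        kept.append((token, prob))
        cumulative += prob
        if cumulative >= top_p:
            break
    total = sum(p for _, p in kept)
    return {token: p / total for token, p in kept}

# Illustrative probabilities for four candidate tokens
probs = {"the": 0.5, "a": 0.3, "an": 0.15, "this": 0.05}
core = top_p_filter(probs, 0.8)
# Sample the next token from the renormalized core set
token = random.choices(list(core), weights=list(core.values()))[0]
print(core, token)
```

With top_p set to 0.8, only "the" and "a" survive the cutoff (0.5 + 0.3 reaches the threshold), and their probabilities are renormalized to 0.625 and 0.375 before sampling.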

API reference

For a complete parameter list for the OpenAI interface, see OpenAI. For a complete parameter list for the DashScope SDK, see DashScope.

Learn more

Prompt engineering

A prompt is the textual input given to an LLM that describes the problem to be solved or the task to be completed. The prompt is the foundation for the LLM to comprehend user requirements and generate relevant and precise responses. The process of designing and optimizing prompts to improve LLM responses is called prompt engineering.

If you are interested in prompt engineering, see Best practices for prompt engineering to learn how to build effective prompts to enhance model performance.

You can also go to the Prompt Engineering page of the Model Studio console to quickly use templates for text generation.

Plug-in calling

You can integrate plug-ins with LLMs to enhance their capabilities for complex tasks, such as obtaining the latest information, avoiding hallucinations, and performing precise calculations.

For example, when the model can call the calculator plug-in, it can obtain the correct results of complex calculations.

  • Sample input

    12313 x 13232 = ?

  • Without plug-in

    The application cannot accurately solve the problem and may return incorrect answers. In this case, the correct answer is 162,925,616.


  • With plug-in

    The application now has robust calculation capability and delivers the correct answer.


  • Alibaba Cloud Model Studio provides the following official plug-ins: calculator, Python code interpreter, and image generation. You can also create custom plug-ins based on your business requirements. For more information about the plug-ins, see Plug-in overview.

Multimodal capabilities

Multimodal capability refers to the ability of a model to process and integrate various types of data modalities (such as text, images, audio, and video) for understanding, processing, and generating information. This capability enables the model to understand and generate content more comprehensively, enhance contextual understanding, and improve model performance.

Alibaba Cloud Model Studio supports:

  • Qwen-VL (Text + Image -> Text): A Qwen model with image understanding capabilities that can perform tasks such as OCR, visual reasoning, and text understanding. It supports resolutions of over a million pixels and images with any aspect ratio.
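
When calling a vision model such as Qwen-VL through the OpenAI-compatible endpoint, the image and the question travel as separate parts of the user message content. The following is a minimal sketch of how such a message list can be assembled; the image URL and question are placeholder assumptions, not values from this document:

```python
def build_vision_messages(image_url, question):
    """Build an OpenAI-compatible multimodal message: the user content
    is a list containing an image part and a text part."""
    return [
        {
            "role": "user",
            "content": [
                {"type": "image_url", "image_url": {"url": image_url}},
                {"type": "text", "text": question},
            ],
        }
    ]

messages = build_vision_messages(
    "https://example.com/sample.png",  # placeholder image URL
    "What text appears in this image?",
)
print(messages)
```

The resulting `messages` list can then be passed to `client.chat.completions.create(...)` together with a vision-capable model name, in the same way as the text-only examples above.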

FAQ

  1. What do the suffixes of text generation models, such as -chat and -instruct, mean? Will they affect the performance of my models?

    These suffixes indicate that the model has undergone fine-tuning and reinforcement learning, with specialization in specific scenarios. You can choose the appropriate model based on your business scenario.

    • -chat indicates that the model is designed specifically for handling human-computer interaction, excelling at understanding context and generating coherent and contextually relevant responses. It is suitable for dialog tasks, such as chatbots, virtual assistants, or customer support scenarios, and is adept at providing natural, fluent, and conversationally appropriate responses.

    • -instruct indicates that the model can understand and execute complex natural language instructions and has strong tool calling capabilities. It is suitable for executing specific instructions, such as answering questions, generating text, and translating.