The Qwen model in Alibaba Cloud Model Studio supports the OpenAI-compatible Responses API, an evolution of the Chat Completions API that provides native agent features more concisely.
Advantages over the OpenAI Chat Completions API:
Built-in tools: Includes web search, code interpreter, and web extractor. Use these tools together for best performance against complex tasks. See Call built-in tools.
More flexible input: Supports passing a string directly as model input. Compatible with message arrays in the Chat format.
Simplified context management: Pass the
previous_response_idfrom the previous response instead of manually building the complete message history array.
Prerequisites
Get an API key and export the API key as an environment variable. To use the SDK, install the SDK.
Availability
Currently, only qwen3-max-2026-01-23 is supported.
Endpoints
Currently, only the Singapore region is supported.
Singapore
base_url for SDK: https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1.
HTTP endpoint: POST https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1/responses.
China (Beijing)
The base_url for SDK: https://dashscope.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1.
HTTP endpoint: POST https://dashscope.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1/responses.
Code examples
Basic
Send a message and receive response.
Python
import os
from openai import OpenAI
client = OpenAI(
# If environment variable is not set, replace with: api_key="sk-xxx"
api_key=os.getenv("DASHSCOPE_API_KEY"),
base_url="https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1",
)
response = client.responses.create(
model="qwen3-max-2026-01-23",
input="Hello, please introduce yourself in one sentence."
)
# Get model response
# print(response.model_dump_json())
print(response.output_text)Node.js
import OpenAI from "openai";
const openai = new OpenAI({
// If environment variable is not set, replace with: apiKey: "sk-xxx"
apiKey: process.env.DASHSCOPE_API_KEY,
baseURL: "https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1"
});
async function main() {
const response = await openai.responses.create({
model: "qwen3-max-2026-01-23",
input: "Hello, please introduce yourself in one sentence."
});
// Get model response
console.log(response.output_text);
}
main();curl
curl -X POST https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1/responses \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "qwen3-max-2026-01-23",
"input": "Hello, please introduce yourself in one sentence."
}'Response example
The following shows the complete API response.
{
"id": "b8104cd2-ee57-903d-aae0-93d99254axxx",
"created_at": 1769084048.0,
"model": "qwen3-max-2026-01-23",
"object": "response",
"status": "completed",
"output": [
{
"id": "msg_1eb85c78-a627-4c7e-aac6-22235c173xxx",
"type": "message",
"role": "assistant",
"status": "completed",
"content": [
{
"type": "output_text",
"text": "Hello! I am Qwen, a large-scale language model developed by Tongyi Lab. I can answer questions, create text, write code, and express opinions. I am committed to providing you with accurate, useful, and friendly help.",
"annotations": []
}
]
}
],
"usage": {
"input_tokens": 39,
"output_tokens": 46,
"total_tokens": 85,
"input_tokens_details": {
"cached_tokens": 0
},
"output_tokens_details": {
"reasoning_tokens": 0
}
}
}Multi-turn conversation
Use the previous_response_id parameter to automatically link context without manually building a message history. Currently, each id is valid for 7 days.
Python
import os
from openai import OpenAI
client = OpenAI(
api_key=os.getenv("DASHSCOPE_API_KEY"),
base_url="https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1",
)
# First turn
response1 = client.responses.create(
model="qwen3-max-2026-01-23",
input="My name is John, please remember it."
)
print(f"First response: {response1.output_text}")
# Second turn - use previous_response_id to link context
# The response id expires in 7 days
response2 = client.responses.create(
model="qwen3-max-2026-01-23",
input="Do you remember my name?",
previous_response_id=response1.id
)
print(f"Second response: {response2.output_text}")Node.js
import OpenAI from "openai";
const openai = new OpenAI({
apiKey: process.env.DASHSCOPE_API_KEY,
baseURL: "https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1"
});
async function main() {
// First turn
const response1 = await openai.responses.create({
model: "qwen3-max-2026-01-23",
input: "My name is John, please remember it."
});
console.log(`First response: ${response1.output_text}`);
// Second turn - use previous_response_id to link context
// The response id expires in 7 days
const response2 = await openai.responses.create({
model: "qwen3-max-2026-01-23",
input: "Do you remember my name?",
previous_response_id: response1.id
});
console.log(`Second response: ${response2.output_text}`);
}
main();curl
# First turn
curl -X POST https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1/responses \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "qwen3-max-2026-01-23",
"input": "My name is John, please remember it."
}'
# Second turn - use the id from first response as previous_response_id
curl -X POST https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1/responses \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "qwen3-max-2026-01-23",
"input": "Do you remember my name?",
"previous_response_id": "response_id_from_first_round"
}'Response example for the second turn
{
"id": "4730f70e-6aa3-9315-b4d1-c43c8e509xxx",
"created_at": 1769173209.0,
"model": "qwen3-max-2026-01-23",
"object": "response",
"status": "completed",
"output": [
{
"id": "msg_869508e7-590f-46c0-bd8d-e3b5e970exxx",
"type": "message",
"role": "assistant",
"status": "completed",
"content": [
{
"type": "output_text",
"text": "Yes, John! I remember your name. How can I assist you today?",
"annotations": []
}
]
}
],
"usage": {
"input_tokens": 78,
"output_tokens": 16,
"total_tokens": 94,
"input_tokens_details": {
"cached_tokens": 0
},
"output_tokens_details": {
"reasoning_tokens": 0
}
}
}Note: The input_tokens for the second turn of conversation is 78. This value includes the context from the first turn, and the model successfully remembered the name "John".
Streaming output
Receive model-generated content in real-time, suitable for long text generation scenarios.
Python
import os
from openai import OpenAI
client = OpenAI(
api_key=os.getenv("DASHSCOPE_API_KEY"),
base_url="https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1",
)
stream = client.responses.create(
model="qwen3-max-2026-01-23",
input="Please briefly introduce artificial intelligence.",
stream=True
)
print("Receiving stream output:")
for event in stream:
# print(event.model_dump_json()) # Uncomment to see raw event response
if event.type == 'response.output_text.delta':
print(event.delta, end='', flush=True)
elif event.type == 'response.completed':
print("\nStream completed")
print(f"Total tokens: {event.response.usage.total_tokens}")Node.js
import OpenAI from "openai";
const openai = new OpenAI({
apiKey: process.env.DASHSCOPE_API_KEY,
baseURL: "https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1"
});
async function main() {
const stream = await openai.responses.create({
model: "qwen3-max-2026-01-23",
input: "Please briefly introduce artificial intelligence.",
stream: true
});
console.log("Receiving stream output:");
for await (const event of stream) {
// console.log(JSON.stringify(event)); // Uncomment to see raw event response
if (event.type === 'response.output_text.delta') {
process.stdout.write(event.delta);
} else if (event.type === 'response.completed') {
console.log("\nStream completed");
console.log(`Total tokens: ${event.response.usage.total_tokens}`);
}
}
}
main();curl
curl -X POST https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1/responses \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "qwen3-max-2026-01-23",
"input": "Please briefly introduce artificial intelligence.",
"stream": true
}'Response example
{"response":{"id":"47a71e7d-868c-4204-9693-ef8ff9058xxx","created_at":1769417481.0,"error":null,"incomplete_details":null,"instructions":null,"metadata":null,"model":"","object":"response","output":[],"parallel_tool_calls":false,"temperature":null,"tool_choice":"auto","tools":[],"top_p":null,"background":null,"completed_at":null,"conversation":null,"max_output_tokens":null,"max_tool_calls":null,"previous_response_id":null,"prompt":null,"prompt_cache_key":null,"prompt_cache_retention":null,"reasoning":null,"safety_identifier":null,"service_tier":null,"status":"queued","text":null,"top_logprobs":null,"truncation":null,"usage":null,"user":null},"sequence_number":0,"type":"response.created"}
{"response":{"id":"47a71e7d-868c-4204-9693-ef8ff9058xxx","created_at":1769417481.0,"error":null,"incomplete_details":null,"instructions":null,"metadata":null,"model":"","object":"response","output":[],"parallel_tool_calls":false,"temperature":null,"tool_choice":"auto","tools":[],"top_p":null,"background":null,"completed_at":null,"conversation":null,"max_output_tokens":null,"max_tool_calls":null,"previous_response_id":null,"prompt":null,"prompt_cache_key":null,"prompt_cache_retention":null,"reasoning":null,"safety_identifier":null,"service_tier":null,"status":"in_progress","text":null,"top_logprobs":null,"truncation":null,"usage":null,"user":null},"sequence_number":1,"type":"response.in_progress"}
{"item":{"id":"msg_16db29d6-c1d3-47d7-9177-0fba81964xxx","content":[],"role":"assistant","status":"in_progress","type":"message"},"output_index":0,"sequence_number":2,"type":"response.output_item.added"}
{"content_index":0,"item_id":"msg_16db29d6-c1d3-47d7-9177-0fba81964xxx","output_index":0,"part":{"annotations":[],"text":"","type":"output_text","logprobs":null},"sequence_number":3,"type":"response.content_part.added"}
{"content_index":0,"delta":"Artificial intelligence","item_id":"msg_16db29d6-c1d3-47d7-9177-0fba81964xxx","logprobs":[],"output_index":0,"sequence_number":4,"type":"response.output_text.delta"}
{"content_index":0,"delta":" (Art","item_id":"msg_16db29d6-c1d3-47d7-9177-0fba81964xxx","logprobs":[],"output_index":0,"sequence_number":5,"type":"response.output_text.delta"}
{"content_index":0,"delta":"ificial Intelligence,","item_id":"msg_16db29d6-c1d3-47d7-9177-0fba81964xxx","logprobs":[],"output_index":0,"sequence_number":6,"type":"response.output_text.delta"}
{"content_index":0,"delta":" or AI for short)","item_id":"msg_16db29d6-c1d3-47d7-9177-0fba81964xxx","logprobs":[],"output_index":0,"sequence_number":7,"type":"response.output_text.delta"}
... (intermediate events omitted) ...
{"content_index":0,"delta":" fields, and is profoundly changing our","item_id":"msg_16db29d6-c1d3-47d7-9177-0fba81964xxx","logprobs":[],"output_index":0,"sequence_number":38,"type":"response.output_text.delta"}
{"content_index":0,"delta":" lives and work","item_id":"msg_16db29d6-c1d3-47d7-9177-0fba81964xxx","logprobs":[],"output_index":0,"sequence_number":39,"type":"response.output_text.delta"}
{"content_index":0,"delta":".","item_id":"msg_16db29d6-c1d3-47d7-9177-0fba81964xxx","logprobs":[],"output_index":0,"sequence_number":40,"type":"response.output_text.delta"}
{"content_index":0,"item_id":"msg_16db29d6-c1d3-47d7-9177-0fba81964xxx","logprobs":[],"output_index":0,"sequence_number":41,"text":"Artificial intelligence (AI) is the technology and science of computer systems that simulate human intelligent behavior. xxxx","type":"response.output_text.done"}
{"content_index":0,"item_id":"msg_16db29d6-c1d3-47d7-9177-0fba81964xxx","output_index":0,"part":{"annotations":[],"text":"Artificial intelligence (AI) is the technology and science of computer systems that simulate human intelligent behavior. xxx","type":"output_text","logprobs":null},"sequence_number":42,"type":"response.content_part.done"}
{"item":{"id":"msg_16db29d6-c1d3-47d7-9177-0fba81964xxx","content":[{"annotations":[],"text":"Artificial intelligence (AI) is the technology and science of computer systems that simulate human intelligent behavior. It aims to enable machines to perform tasks that typically require human intelligence, such as:\n\n- **Learning** (such as training models with data)\n- **Reasoning** (such as logical judgment and problem-solving)\n- **Perception** (such as recognizing images, speech, or text)\n- **Understanding language** (such as natural language processing)\n- **Decision-making** (such as making optimal choices in complex environments)\n\nArtificial intelligence can be divided into **weak AI** (focused on specific tasks, such as voice assistants and recommendation systems) and **strong AI** (possessing general intelligence similar to humans, which has not yet been achieved).\n\nCurrently, AI is widely used in many fields, such as healthcare, finance, transportation, education, and entertainment, and is profoundly changing the way we live and work.","type":"output_text","logprobs":null}],"role":"assistant","status":"completed","type":"message"},"output_index":0,"sequence_number":43,"type":"response.output_item.done"}
{"response":{"id":"47a71e7d-868c-4204-9693-ef8ff9058xxx","created_at":1769417481.0,"error":null,"incomplete_details":null,"instructions":null,"metadata":null,"model":"qwen3-max-2026-01-23","object":"response","output":[{"id":"msg_16db29d6-c1d3-47d7-9177-0fba81964xxx","content":[{"annotations":[],"text":"Artificial intelligence (AI) is xxxxxx","type":"output_text","logprobs":null}],"role":"assistant","status":"completed","type":"message"}],"parallel_tool_calls":false,"temperature":null,"tool_choice":"auto","tools":[],"top_p":null,"background":null,"completed_at":null,"conversation":null,"max_output_tokens":null,"max_tool_calls":null,"previous_response_id":null,"prompt":null,"prompt_cache_key":null,"prompt_cache_retention":null,"reasoning":null,"safety_identifier":null,"service_tier":null,"status":"completed","text":null,"top_logprobs":null,"truncation":null,"usage":{"input_tokens":37,"input_tokens_details":{"cached_tokens":0},"output_tokens":166,"output_tokens_details":{"reasoning_tokens":0},"total_tokens":203},"user":null},"sequence_number":44,"type":"response.completed"}Call built-in tools
Enable all built-in tools at the same time for best results against complex tasks. The web extractor and code interpreter tools are currently free for a limited time. For more information, see Web search, Web extractor, and Code interpreter.
Python
import os
from openai import OpenAI
client = OpenAI(
api_key=os.getenv("DASHSCOPE_API_KEY"),
base_url="https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1",
)
response = client.responses.create(
model="qwen3-max-2026-01-23",
input="Find the Alibaba Cloud website and extract key information",
# For best results, enable all the built-in tools
tools=[
{"type": "web_search"},
{"type": "code_interpreter"},
{"type": "web_extractor"}
],
extra_body={"enable_thinking": True}
)
# Uncomment the line below to see the intermediate output
# print(response.output)
print(response.output_text)
Node.js
import OpenAI from "openai";
const openai = new OpenAI({
apiKey: process.env.DASHSCOPE_API_KEY,
baseURL: "https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1"
});
async function main() {
const response = await openai.responses.create({
model: "qwen3-max-2026-01-23",
input: "Find the Alibaba Cloud website and extract key information",
tools: [
{ type: "web_search" },
{ type: "code_interpreter" },
{ type: "web_extractor" }
],
enable_thinking: true
});
for (const item of response.output) {
if (item.type === "reasoning") {
console.log("Model is thinking...");
} else if (item.type === "web_search_call") {
console.log(`Search query: ${item.action.query}`);
} else if (item.type === "web_extractor_call") {
console.log("Extracting web content...");
} else if (item.type === "message") {
console.log(`Response: ${item.content[0].text}`);
}
}
}
main();curl
curl -X POST https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1/responses \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "qwen3-max-2026-01-23",
"input": "Find the Alibaba Cloud website and extract key information",
"tools": [
{
"type": "web_search"
},
{
"type": "code_interpreter"
},
{
"type": "web_extractor"
}
],
"enable_thinking": true
}'Response example
{
"id": "69258b21-5099-9d09-92e8-8492b1955xxx",
"object": "response",
"status": "completed",
"output": [
{
"type": "reasoning",
"summary": [
{
"type": "summary_text",
"text": "The user wants to find the Alibaba Cloud official website and extract information..."
}
]
},
{
"type": "web_search_call",
"status": "completed",
"action": {
"query": "Alibaba Cloud official website",
"type": "search",
"sources": [
{
"type": "url",
"url": "https://cn.aliyun.com/"
},
{
"type": "url",
"url": "https://www.alibabacloud.com/zh"
}
]
}
},
{
"type": "reasoning",
"summary": [
{
"type": "summary_text",
"text": "The search results show the URL of the Alibaba Cloud official website..."
}
]
},
{
"type": "web_extractor_call",
"status": "completed",
"goal": "Extract key information from the Alibaba Cloud official website home page",
"output": "Qwen LLM, complete product system, AI solutions...",
"urls": [
"https://cn.aliyun.com/"
]
},
{
"type": "message",
"role": "assistant",
"status": "completed",
"content": [
{
"type": "output_text",
"text": "Key information from the Alibaba Cloud official website: Qwen LLM, cloud computing services..."
}
]
}
],
"usage": {
"input_tokens": 40836,
"output_tokens": 2106,
"total_tokens": 42942,
"output_tokens_details": {
"reasoning_tokens": 677
},
"x_tools": {
"web_extractor": {
"count": 1
},
"web_search": {
"count": 1
}
}
}
}Migrate from Chat Completions API to Responses API
If you use the OpenAI Chat Completions API, follow these steps to migrate to the Responses API. The Responses API is compatible with the Chat Completions API but offers a simpler interface and more powerful features.
1. Update the endpoint and base_url
Update both of the following:
Endpoint path: Update the path from
/v1/chat/completionsto/v1/responses.base_url:
Singapore: Update the URL from
https://dashscope-intl.aliyuncs.com/compatible-mode/v1tohttps://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1.
Python
# Chat Completions API
completion = client.chat.completions.create(
model="qwen3-max-2026-01-23",
messages=[
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "Hello!"}
]
)
print(completion.choices[0].message.content)
# Responses API - can use the same message format
response = client.responses.create(
model="qwen3-max-2026-01-23",
input=[
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "Hello!"}
]
)
print(response.output_text)
# Responses API - or use a more concise format
response = client.responses.create(
model="qwen3-max-2026-01-23",
input="Hello!"
)
print(response.output_text)Node.js
// Chat Completions API
const completion = await client.chat.completions.create({
model: "qwen3-max-2026-01-23",
messages: [
{ role: "system", content: "You are a helpful assistant." },
{ role: "user", content: "Hello!" }
]
});
console.log(completion.choices[0].message.content);
// Responses API - can use the same message format
const response = await client.responses.create({
model: "qwen3-max-2026-01-23",
input: [
{ role: "system", content: "You are a helpful assistant." },
{ role: "user", content: "Hello!" }
]
});
console.log(response.output_text);
// Responses API - or use a more concise format
const response2 = await client.responses.create({
model: "qwen3-max-2026-01-23",
input: "Hello!"
});
console.log(response2.output_text);curl
# Chat Completions API
curl -X POST https://dashscope-intl.aliyuncs.com/compatible-mode/v1/chat/completions \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "qwen3-max-2026-01-23",
"messages": [
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "Hello!"}
]
}'
# Responses API - use a more concise format
curl -X POST https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1/responses \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "qwen3-max-2026-01-23",
"input": "Hello!"
}'2. Update response handling
The response structure of the Responses API is different. Use the output_text shortcut method to retrieve the text output or access the details through the output array.
Response comparison
| |
3. Simplify multi-turn management
In the Chat Completions API, you must manually manage the message history array. In contrast, the Responses API provides previous_response_id to automatically link the context. Currently, each id is valid for 7 days.
Python
| |
Node.js
| |
4. Use built-in tools
The Responses API has multiple built-in tools that you do not need to implement. You can specify them in the tools parameter. The code interpreter and web extractor tools are currently free for a limited time. For more information, see Web search, Code interpreter, and Web extractor.
Python
| |
Node.js
| |
curl
| |
FAQ
Q: How to pass the context for a multi-turn conversation?
A: To start a new turn in the conversation, just pass the id from the previous response as previous_response_id.