OpenAI Responses API Reference - Alibaba Cloud Model Studio

Alibaba Cloud Model Studio’s Qwen models support the OpenAI-compatible Responses API. As an evolution of the Chat Completions API, the Responses API delivers native agent capabilities in a simpler way.

Advantages over the OpenAI Chat Completions API:

- Built-in tools: Provides web search, web scraping, code interpreter, text-to-image search, and image-to-image search, which deliver better results for complex tasks. For details, see Call built-in tools.
- More flexible input: Accepts a plain string as model input, or an array of messages in Chat format.
- Simplified context management: Pass the previous_response_id from the previous response instead of manually constructing a full message history array.

For input and output parameter details, see OpenAI Responses API reference.

Prerequisites

First, get an API key and set it as an environment variable. If you call the API through the OpenAI SDK, install the SDK as well.
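
As a minimal sketch of the environment-variable step, the helper below reads DASHSCOPE_API_KEY (the variable name used by the examples on this page) and fails early when it is missing. The function name load_api_key and the error text are hypothetical and not part of any SDK.

```python
import os


def load_api_key(env=None, var_name="DASHSCOPE_API_KEY"):
    """Return the API key from the environment, or raise if it is unset or empty.

    load_api_key is a hypothetical helper for illustration only.
    """
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"{var_name} is not set; export it before calling the API")
    return key
```

Passing the result to the client (api_key=load_api_key()) surfaces a missing key before any request is sent.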

Supported models

qwen3-max, qwen3-max-2026-01-23, qwen3.5-plus, qwen3.5-plus-2026-02-15, qwen3.5-flash, qwen3.5-flash-2026-02-23, qwen3.5-397b-a17b, qwen3.5-122b-a10b, qwen3.5-27b, qwen3.5-35b-a3b, qwen-plus, qwen-flash, qwen3-coder-plus, qwen3-coder-flash.

Service endpoints

Singapore

base_url for SDK: https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1

HTTP endpoint: POST https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1/responses

China (Beijing)

base_url for SDK: https://dashscope.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1

HTTP endpoint: POST https://dashscope.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1/responses
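
The relationship between the SDK base_url and the HTTP endpoint can be sketched as a small lookup. The URLs are copied from the list above; the region keys ("singapore", "beijing") and the function name are assumptions made for this illustration.

```python
# Base URLs copied from the "Service endpoints" list; the dictionary keys are
# illustrative labels, not official region identifiers.
BASE_URLS = {
    "singapore": "https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1",
    "beijing": "https://dashscope.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1",
}


def responses_endpoint(region):
    """Return (base_url, full HTTP endpoint) for a region key in BASE_URLS."""
    try:
        base_url = BASE_URLS[region.lower()]
    except KeyError:
        raise ValueError(f"unknown region: {region!r}") from None
    # The HTTP endpoint is the base_url with "/responses" appended.
    return base_url, base_url + "/responses"
```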

Code examples

Basic call

The simplest way to call the API: send one message and obtain the model’s reply.

Python

import os
from openai import OpenAI

client = OpenAI(
    # If the environment variable is not set, replace with: api_key="sk-xxx"
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1",
)

response = client.responses.create(
    model="qwen3.5-plus",
    input="What can you do?"
)

# Get model response
# print(response.model_dump_json())
print(response.output_text)

Node.js

import OpenAI from "openai";

const openai = new OpenAI({
    // If the environment variable is not set, replace with: apiKey: "sk-xxx"
    apiKey: process.env.DASHSCOPE_API_KEY,
    baseURL: "https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1"
});

async function main() {
    const response = await openai.responses.create({
        model: "qwen3.5-plus",
        input: "What can you do?"
    });

    // Get model response
    console.log(response.output_text);
}

main();

curl

curl -X POST https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1/responses \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "qwen3.5-plus",
    "input": "What can you do?"
}'

Response example

Below is the full API response.

{
    "created_at": 1771226624,
    "id": "bf0d5c2e-f14b-9ad7-bc0d-ee0c8c9ee2d8",
    "model": "qwen3-max-2026-01-23",
    "object": "response",
    "output": [
        {
            "content": [
                {
                    "annotations": [],
                    "text": "Hi there!  I'm actually quite ......",
                    "type": "output_text"
                }
            ],
            "id": "msg_1e17fdb2-5fc3-4c78-a9e9-cbd78eb043f0",
            "role": "assistant",
            "status": "completed",
            "type": "message"
        }
    ],
    "parallel_tool_calls": false,
    "status": "completed",
    "tool_choice": "auto",
    "tools": [],
    "usage": {
        "input_tokens": 37,
        "input_tokens_details": {
            "cached_tokens": 0
        },
        "output_tokens": 220,
        "output_tokens_details": {
            "reasoning_tokens": 0
        },
        "total_tokens": 257,
        "x_details": [
            {
                "input_tokens": 37,
                "output_tokens": 220,
                "total_tokens": 257,
                "x_billing_type": "response_api"
            }
        ]
    }
}

Multi-turn conversation

Use the previous_response_id parameter to link context automatically, so you don’t need to build the message history manually. Each response id is valid for 7 days.

Pass the top-level id (for example, f0dbb153-117f-9bbf-8176-5284b47f3xxx, in UUID format) from the previous response as previous_response_id. Do not use the id (for example, msg_56c860c4-3ad8-4a96-8553-d2f94c259xxx) of a message inside the output array.
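
A client-side guard can catch the message-id mistake before a request is sent. The sketch below is a hypothetical helper: it only checks for the "msg_" prefix seen in the examples on this page and does not verify the id with the server.

```python
def check_previous_response_id(candidate):
    """Reject ids that look like message ids rather than response ids.

    Hypothetical guard: it relies only on the "msg_" prefix shown in the
    examples on this page and performs no server-side validation.
    """
    if not isinstance(candidate, str) or not candidate:
        raise ValueError("previous_response_id must be a non-empty string")
    if candidate.startswith("msg_"):
        raise ValueError(
            "this looks like a message id from the output array; "
            "use the top-level response id instead"
        )
    return candidate
```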

Python

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1",
)

# First round
response1 = client.responses.create(
    model="qwen3.5-plus",
    input="My name is John, please remember it."
)
print(f"First response: {response1.output_text}")

# Second round - use previous_response_id to link context
# The response id expires in 7 days
response2 = client.responses.create(
    model="qwen3.5-plus",
    input="Do you remember my name?",
    previous_response_id=response1.id
)
print(f"Second response: {response2.output_text}")

Node.js

import OpenAI from "openai";

const openai = new OpenAI({
    apiKey: process.env.DASHSCOPE_API_KEY,
    baseURL: "https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1"
});

async function main() {
    // First round
    const response1 = await openai.responses.create({
        model: "qwen3.5-plus",
        input: "My name is John, please remember it."
    });
    console.log(`First response: ${response1.output_text}`);

    // Second round - use previous_response_id to link context
    // The response id expires in 7 days
    const response2 = await openai.responses.create({
        model: "qwen3.5-plus",
        input: "Do you remember my name?",
        previous_response_id: response1.id
    });
    console.log(`Second response: ${response2.output_text}`);
}

main();

curl

# First round
curl -X POST https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1/responses \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "qwen3.5-plus",
    "input": "My name is John, please remember it."
}'

# Second round - use the id from first response as previous_response_id
curl -X POST https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1/responses \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "qwen3.5-plus",
    "input": "Do you remember my name?",
    "previous_response_id": "response_id_from_first_round"
}'

Second-round response example

{
  "id": "f0dbb153-117f-9bbf-8176-5284b47f3xxx",
  "created_at": 1769173209.0,
  "model": "qwen3.5-plus",
  "object": "response",
  "status": "completed",
  "output": [
    {
      "id": "msg_56c860c4-3ad8-4a96-8553-d2f94c259xxx",
      "type": "message",
      "role": "assistant",
      "status": "completed",
      "content": [
        {
          "type": "output_text",
          "text": "Yes, John! I remember your name. How can I assist you today?",
          "annotations": []
        }
      ]
    }
  ],
  "usage": {
    "input_tokens": 78,
    "output_tokens": 16,
    "total_tokens": 94,
    "input_tokens_details": {
      "cached_tokens": 0
    },
    "output_tokens_details": {
      "reasoning_tokens": 0
    }
  }
}

Note: The second-round input_tokens count is 78, which includes the context from the first round. The model successfully remembered the name "John."
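
For conversations longer than two rounds, the same linking pattern can be wrapped in a loop. This is a minimal sketch, assuming a client that exposes responses.create(...) like the OpenAI SDK client used above; run_conversation is a hypothetical helper name.

```python
def run_conversation(client, model, prompts):
    """Send prompts in order, linking each request to the previous response id.

    `client` is any object exposing responses.create(...) like the OpenAI SDK
    client used in the examples above. Returns the list of reply texts.
    """
    replies = []
    previous_id = None
    for prompt in prompts:
        kwargs = {"model": model, "input": prompt}
        if previous_id is not None:
            kwargs["previous_response_id"] = previous_id
        response = client.responses.create(**kwargs)
        replies.append(response.output_text)
        # Keep the top-level response id, not the id of a message in output.
        previous_id = response.id
    return replies
```

For example, run_conversation(client, "qwen3.5-plus", ["My name is John, please remember it.", "Do you remember my name?"]) sends the same two rounds as the example above.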

Streaming output

Use streaming output to receive model-generated content in real time. This is ideal for long-text generation scenarios.

Python

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1",
)

stream = client.responses.create(
    model="qwen3.5-plus",
    input="Please briefly introduce artificial intelligence.",
    stream=True
)

print("Receiving stream output:")
for event in stream:
    # print(event.model_dump_json())  # Uncomment to see raw event response
    if event.type == 'response.output_text.delta':
        print(event.delta, end='', flush=True)
    elif event.type == 'response.completed':
        print("\nStream completed")
        print(f"Total tokens: {event.response.usage.total_tokens}")

Node.js

import OpenAI from "openai";

const openai = new OpenAI({
    apiKey: process.env.DASHSCOPE_API_KEY,
    baseURL: "https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1"
});

async function main() {
    const stream = await openai.responses.create({
        model: "qwen3.5-plus",
        input: "Please briefly introduce artificial intelligence.",
        stream: true
    });

    console.log("Receiving stream output:");
    for await (const event of stream) {
        // console.log(JSON.stringify(event));  // Uncomment to see raw event response
        if (event.type === 'response.output_text.delta') {
            process.stdout.write(event.delta);
        } else if (event.type === 'response.completed') {
            console.log("\nStream completed");
            console.log(`Total tokens: ${event.response.usage.total_tokens}`);
        }
    }
}

main();

curl

curl -X POST https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1/responses \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "qwen3.5-plus",
    "input": "Please briefly introduce artificial intelligence.",
    "stream": true
}'

Response example

{"response":{"id":"47a71e7d-868c-4204-9693-ef8ff9058xxx","created_at":1769417481.0,"error":null,"incomplete_details":null,"instructions":null,"metadata":null,"model":"","object":"response","output":[],"parallel_tool_calls":false,"temperature":null,"tool_choice":"auto","tools":[],"top_p":null,"background":null,"completed_at":null,"conversation":null,"max_output_tokens":null,"max_tool_calls":null,"previous_response_id":null,"prompt":null,"prompt_cache_key":null,"prompt_cache_retention":null,"reasoning":null,"safety_identifier":null,"service_tier":null,"status":"queued","text":null,"top_logprobs":null,"truncation":null,"usage":null,"user":null},"sequence_number":0,"type":"response.created"}
{"response":{"id":"47a71e7d-868c-4204-9693-ef8ff9058xxx","created_at":1769417481.0,"error":null,"incomplete_details":null,"instructions":null,"metadata":null,"model":"","object":"response","output":[],"parallel_tool_calls":false,"temperature":null,"tool_choice":"auto","tools":[],"top_p":null,"background":null,"completed_at":null,"conversation":null,"max_output_tokens":null,"max_tool_calls":null,"previous_response_id":null,"prompt":null,"prompt_cache_key":null,"prompt_cache_retention":null,"reasoning":null,"safety_identifier":null,"service_tier":null,"status":"in_progress","text":null,"top_logprobs":null,"truncation":null,"usage":null,"user":null},"sequence_number":1,"type":"response.in_progress"}
{"item":{"id":"msg_16db29d6-c1d3-47d7-9177-0fba81964xxx","content":[],"role":"assistant","status":"in_progress","type":"message"},"output_index":0,"sequence_number":2,"type":"response.output_item.added"}
{"content_index":0,"item_id":"msg_16db29d6-c1d3-47d7-9177-0fba81964xxx","output_index":0,"part":{"annotations":[],"text":"","type":"output_text","logprobs":null},"sequence_number":3,"type":"response.content_part.added"}
{"content_index":0,"delta":"Artificial intelligence","item_id":"msg_16db29d6-c1d3-47d7-9177-0fba81964xxx","logprobs":[],"output_index":0,"sequence_number":4,"type":"response.output_text.delta"}
{"content_index":0,"delta":" (Art","item_id":"msg_16db29d6-c1d3-47d7-9177-0fba81964xxx","logprobs":[],"output_index":0,"sequence_number":5,"type":"response.output_text.delta"}
{"content_index":0,"delta":"ificial Intelligence, ","item_id":"msg_16db29d6-c1d3-47d7-9177-0fba81964xxx","logprobs":[],"output_index":0,"sequence_number":6,"type":"response.output_text.delta"}
{"content_index":0,"delta":"or AI)","item_id":"msg_16db29d6-c1d3-47d7-9177-0fba81964xxx","logprobs":[],"output_index":0,"sequence_number":7,"type":"response.output_text.delta"}
... (intermediate events omitted) ...
{"content_index":0,"delta":"fields, profoundly changing our","item_id":"msg_16db29d6-c1d3-47d7-9177-0fba81964xxx","logprobs":[],"output_index":0,"sequence_number":38,"type":"response.output_text.delta"}
{"content_index":0,"delta":" work and daily lives","item_id":"msg_16db29d6-c1d3-47d7-9177-0fba81964xxx","logprobs":[],"output_index":0,"sequence_number":39,"type":"response.output_text.delta"}
{"content_index":0,"delta":".","item_id":"msg_16db29d6-c1d3-47d7-9177-0fba81964xxx","logprobs":[],"output_index":0,"sequence_number":40,"type":"response.output_text.delta"}
{"content_index":0,"item_id":"msg_16db29d6-c1d3-47d7-9177-0fba81964xxx","logprobs":[],"output_index":0,"sequence_number":41,"text":"Artificial intelligence (Artificial Intelligence, or AI) refers to the technology and science of simulating human-like intelligent behavior using computer systems. xxxx","type":"response.output_text.done"}
{"content_index":0,"item_id":"msg_16db29d6-c1d3-47d7-9177-0fba81964xxx","output_index":0,"part":{"annotations":[],"text":"Artificial intelligence (Artificial Intelligence, or AI) refers to the technology and science of simulating human-like intelligent behavior using computer systems. xxx","type":"output_text","logprobs":null},"sequence_number":42,"type":"response.content_part.done"}
{"item":{"id":"msg_16db29d6-c1d3-47d7-9177-0fba81964xxx","content":[{"annotations":[],"text":"Artificial intelligence (Artificial Intelligence, or AI) refers to the technology and science of simulating human-like intelligent behavior using computer systems. It aims to enable machines to perform tasks that typically require human intelligence, such as:\n\n- **Learning** (e.g., training models with data)  \n- **Reasoning** (e.g., logical judgment and problem solving)  \n- **Perception** (e.g., recognizing images, speech, or text)  \n- **Language understanding** (e.g., natural language processing)  \n- **Decision-making** (e.g., making optimal choices in complex environments)\n\nAI can be categorized into **weak AI** (focused on specific tasks, like voice assistants or recommendation systems) and **strong AI** (possessing general intelligence similar to humans, which has not yet been achieved).\n\nToday, AI is widely used in healthcare, finance, transportation, education, entertainment, and many other fields, profoundly changing our work and daily lives.","type":"output_text","logprobs":null}],"role":"assistant","status":"completed","type":"message"},"output_index":0,"sequence_number":43,"type":"response.output_item.done"}
{"response":{"id":"47a71e7d-868c-4204-9693-ef8ff9058xxx","created_at":1769417481.0,"error":null,"incomplete_details":null,"instructions":null,"metadata":null,"model":"qwen3.5-plus","object":"response","output":[{"id":"msg_16db29d6-c1d3-47d7-9177-0fba81964xxx","content":[{"annotations":[],"text":"Artificial intelligence (Artificial Intelligence, or AI) isxxxxxx","type":"output_text","logprobs":null}],"role":"assistant","status":"completed","type":"message"}],"parallel_tool_calls":false,"temperature":null,"tool_choice":"auto","tools":[],"top_p":null,"background":null,"completed_at":null,"conversation":null,"max_output_tokens":null,"max_tool_calls":null,"previous_response_id":null,"prompt":null,"prompt_cache_key":null,"prompt_cache_retention":null,"reasoning":null,"safety_identifier":null,"service_tier":null,"status":"completed","text":null,"top_logprobs":null,"truncation":null,"usage":{"input_tokens":37,"input_tokens_details":{"cached_tokens":0},"output_tokens":166,"output_tokens_details":{"reasoning_tokens":0},"total_tokens":203},"user":null},"sequence_number":44,"type":"response.completed"}
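
To show how the events above fit together, the following sketch rebuilds the reply text from the delta events and reads usage from the final response.completed event. It assumes one JSON object per line, as printed by the commented-out dump line in the SDK examples; it does not handle a "data:" prefix or other server-sent-event framing, and collect_stream_text is a hypothetical helper name.

```python
import json


def collect_stream_text(lines):
    """Rebuild the reply text and usage from JSON-encoded stream events.

    `lines` is an iterable of strings, one JSON event per line, shaped like
    the events in the response example above.
    """
    parts = []
    usage = None
    for line in lines:
        line = line.strip()
        if not line.startswith("{"):
            continue  # skip blank lines and placeholders for omitted events
        event = json.loads(line)
        kind = event.get("type")
        if kind == "response.output_text.delta":
            parts.append(event["delta"])
        elif kind == "response.completed":
            usage = event["response"].get("usage")
    return "".join(parts), usage
```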

Call built-in tools

Enable built-in tools to achieve better results for complex tasks. Web scraping and the code interpreter are currently free for a limited time. For a list of supported tools, see Tool calling.

Python

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1",
)

response = client.responses.create(
    model="qwen3.5-plus",
    input="Find the Alibaba Cloud website and extract key information",
    # For best results, enable all the built-in tools
    tools=[
        {"type": "web_search"},
        {"type": "code_interpreter"},
        {"type": "web_extractor"}
    ],
    extra_body={"enable_thinking": True}
)

# Uncomment the line below to see the intermediate output
# print(response.output)
print(response.output_text)

Node.js

import OpenAI from "openai";

const openai = new OpenAI({
    apiKey: process.env.DASHSCOPE_API_KEY,
    baseURL: "https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1"
});

async function main() {
    const response = await openai.responses.create({
        model: "qwen3.5-plus",
        input: "Find the Alibaba Cloud website and extract key information",
        tools: [
            { type: "web_search" },
            { type: "code_interpreter" },
            { type: "web_extractor" }
        ],
        enable_thinking: true
    });

    for (const item of response.output) {
        if (item.type === "reasoning") {
            console.log("Model is thinking...");
        } else if (item.type === "web_search_call") {
            console.log(`Search query: ${item.action.query}`);
        } else if (item.type === "web_extractor_call") {
            console.log("Extracting web content...");
        } else if (item.type === "message") {
            console.log(`Response: ${item.content[0].text}`);
        }
    }
}

main();

curl

curl -X POST https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1/responses \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "qwen3.5-plus",
    "input": "Find the Alibaba Cloud website and extract key information",
    "tools": [
        {
            "type": "web_search"
        },
        {
            "type": "code_interpreter"
        },
        {
            "type": "web_extractor"
        }
    ],
    "enable_thinking": true
}'

Response example

{
    "id": "69258b21-5099-9d09-92e8-8492b1955xxx",
    "object": "response",
    "status": "completed",
    "output": [
        {
            "type": "reasoning",
            "summary": [
                {
                    "type": "summary_text",
                    "text": "The user asked to find the Alibaba Cloud website and extract information..."
                }
            ]
        },
        {
            "type": "web_search_call",
            "status": "completed",
            "action": {
                "query": "Alibaba Cloud official website",
                "type": "search",
                "sources": [
                    {
                        "type": "url",
                        "url": "https://cn.aliyun.com/"
                    },
                    {
                        "type": "url",
                        "url": "https://www.alibabacloud.com/zh"
                    }
                ]
            }
        },
        {
            "type": "reasoning",
            "summary": [
                {
                    "type": "summary_text",
                    "text": "Search results show the Alibaba Cloud website URLs..."
                }
            ]
        },
        {
            "type": "web_extractor_call",
            "status": "completed",
            "goal": "Extract key information from the Alibaba Cloud homepage",
            "output": "Qwen large models, complete product portfolio, AI solutions...",
            "urls": [
                "https://cn.aliyun.com/"
            ]
        },
        {
            "type": "message",
            "role": "assistant",
            "status": "completed",
            "content": [
                {
                    "type": "output_text",
                    "text": "Key information from the Alibaba Cloud website: Qwen large models, cloud computing services..."
                }
            ]
        }
    ],
    "usage": {
        "input_tokens": 40836,
        "output_tokens": 2106,
        "total_tokens": 42942,
        "output_tokens_details": {
            "reasoning_tokens": 677
        },
        "x_tools": {
            "web_extractor": {
                "count": 1
            },
            "web_search": {
                "count": 1
            }
        }
    }
}
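
The output array interleaves reasoning, tool calls, and the final message. As a sketch of how to walk it in Python, the helpers below operate on the parsed JSON body as a plain dict shaped like the example above. The function names are hypothetical, and only the item types shown in that example are handled explicitly.

```python
def summarize_output(response):
    """Return one human-readable line per item in a response's output array."""
    lines = []
    for item in response.get("output", []):
        kind = item.get("type")
        if kind == "reasoning":
            lines.append("reasoning step")
        elif kind == "web_search_call":
            lines.append(f"web search: {item['action']['query']}")
        elif kind == "web_extractor_call":
            lines.append(f"web extraction: {', '.join(item.get('urls', []))}")
        elif kind == "message":
            text = "".join(
                part["text"]
                for part in item.get("content", [])
                if part.get("type") == "output_text"
            )
            lines.append(f"message: {text}")
        else:
            lines.append(f"other item: {kind}")
    return lines


def tool_call_counts(response):
    """Read per-tool call counts from usage.x_tools, if present."""
    x_tools = (response.get("usage") or {}).get("x_tools") or {}
    return {name: info.get("count", 0) for name, info in x_tools.items()}
```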

Session cache

In multi-turn conversation scenarios, session cache allows the server to automatically cache the context. This reduces inference latency and usage costs. You do not need to manage the cache manually. Simply make calls as you would in a normal multi-turn conversation.

Usage: To enable the session cache, add x-dashscope-session-cache: enable to the request header. To disable it, set the value to disable.

Supported models: qwen3-max, qwen3.5-plus, qwen3.5-flash, qwen-plus, qwen-flash, qwen3-coder-plus, and qwen3-coder-flash

The minimum prompt length for the session cache is 1,024 tokens, and the cache validity period is 5 minutes. The constraints are the same as those for explicit caching.

Python

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1",
    # Enable session cache through default_headers
    default_headers={"x-dashscope-session-cache": "enable"}
)

# Construct a long text of over 1,024 tokens to ensure cache creation. If the text has fewer than 1,024 tokens, the cache is created once the accumulated context exceeds 1,024 tokens.
long_context = "Artificial intelligence is an important branch of computer science, dedicated to researching and developing theories, methods, technologies, and application systems that can simulate, extend, and expand human intelligence." * 50

# First round
response1 = client.responses.create(
    model="qwen3.5-plus",
    input=long_context + "\n\nBased on the background knowledge above, please briefly introduce the random forest algorithm in machine learning.",
)
print(f"First response: {response1.output_text}")

# Second round: Link context using previous_response_id. The cache is handled automatically by the server.
response2 = client.responses.create(
    model="qwen3.5-plus",
    input="What are the main differences between it and GBDT?",
    previous_response_id=response1.id,
)
print(f"Second response: {response2.output_text}")

# Check the cache hit status
usage = response2.usage
print(f"Input Tokens: {usage.input_tokens}")
print(f"Cached Tokens: {usage.input_tokens_details.cached_tokens}")

Node.js

import OpenAI from "openai";

const openai = new OpenAI({
    apiKey: process.env.DASHSCOPE_API_KEY,
    baseURL: "https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1",
    // Enable session cache through defaultHeaders
    defaultHeaders: {"x-dashscope-session-cache": "enable"}
});

// Construct a long text of over 1,024 tokens to ensure cache creation. If the text has fewer than 1,024 tokens, the cache is created once the accumulated context exceeds 1,024 tokens.
const longContext = "Artificial intelligence is an important branch of computer science, dedicated to researching and developing theories, methods, technologies, and application systems that can simulate, extend, and expand human intelligence.".repeat(50);

async function main() {
    // First round
    const response1 = await openai.responses.create({
        model: "qwen3.5-plus",
        input: longContext + "\n\nBased on the background knowledge above, please briefly introduce the random forest algorithm in machine learning, including its basic principles and application scenarios."
    });
    console.log(`First response: ${response1.output_text}`);

    // Second round: Link context using previous_response_id. The cache is handled automatically by the server.
    const response2 = await openai.responses.create({
        model: "qwen3.5-plus",
        input: "What are the main differences between it and GBDT?",
        previous_response_id: response1.id
    });
    console.log(`Second response: ${response2.output_text}`);

    // Check the cache hit status
    console.log(`Input Tokens: ${response2.usage.input_tokens}`);
    console.log(`Cached Tokens: ${response2.usage.input_tokens_details.cached_tokens}`);
}

main();

curl

# First round
# Replace the input with a long text of over 1,024 tokens to ensure cache creation.
curl -X POST https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1/responses \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-H "x-dashscope-session-cache: enable" \
-d '{
    "model": "qwen3.5-plus",
    "input": "Artificial intelligence is an important branch of computer science, dedicated to researching and developing theories, methods, technologies, and application systems that can simulate, extend, and expand human intelligence. Artificial intelligence is an important branch of computer science, dedicated to researching and developing theories, methods, technologies, and application systems that can simulate, extend, and expand human intelligence. Artificial intelligence is an important branch of computer science, dedicated to researching and developing theories, methods, technologies, and application systems that can simulate, extend, and expand human intelligence. Artificial intelligence is an important branch of computer science, dedicated to researching and developing theories, methods, technologies, and application systems that can simulate, extend, and expand human intelligence. Artificial intelligence is an important branch of computer science, dedicated to researching and developing theories, methods, technologies, and application systems that can simulate, extend, and expand human intelligence. Artificial intelligence is an important branch of computer science, dedicated to researching and developing theories, methods, technologies, and application systems that can simulate, extend, and expand human intelligence. Artificial intelligence is an important branch of computer science, dedicated to researching and developing theories, methods, technologies, and application systems that can simulate, extend, and expand human intelligence. Artificial intelligence is an important branch of computer science, dedicated to researching and developing theories, methods, technologies, and application systems that can simulate, extend, and expand human intelligence. Artificial intelligence is an important branch of computer science, dedicated to researching and developing theories, methods, technologies, and application systems that can simulate, extend, and expand human intelligence. Artificial intelligence is an important branch of computer science, dedicated to researching and developing theories, methods, technologies, and application systems that can simulate, extend, and expand human intelligence. Artificial intelligence is an important branch of computer science, dedicated to researching and developing theories, methods, technologies, and application systems that can simulate, extend, and expand human intelligence. Artificial intelligence is an important branch of computer science, dedicated to researching and developing theories, methods, technologies, and application systems that can simulate, extend, and expand human intelligence. Artificial intelligence is an important branch of computer science, dedicated to researching and developing theories, methods, technologies, and application systems that can simulate, extend, and expand human intelligence. Artificial intelligence is an important branch of computer science, dedicated to researching and developing theories, methods, technologies, and application systems that can simulate, extend, and expand human intelligence. Artificial intelligence is an important branch of computer science, dedicated to researching and developing theories, methods, technologies, and application systems that can simulate, extend, and expand human intelligence. Artificial intelligence is an important branch of computer science, dedicated to researching and developing theories, methods, technologies, and application systems that can simulate, extend, and expand human intelligence. Artificial intelligence is an important branch of computer science, dedicated to researching and developing theories, methods, technologies, and application systems that can simulate, extend, and expand human intelligence. Artificial intelligence is an important branch of computer science, dedicated to researching and developing theories, methods, technologies, and application systems that can simulate, extend, and expand human intelligence. Artificial intelligence is an important branch of computer science, dedicated to researching and developing theories, methods, technologies, and application systems that can simulate, extend, and expand human intelligence. Artificial intelligence is an important branch of computer science, dedicated to researching and developing theories, methods, technologies, and application systems that can simulate, extend, and expand human intelligence.\n\nBased on the background knowledge above, please briefly introduce the random forest algorithm in machine learning, including its basic principles and application scenarios."
}'

# Second round - use the ID from the first response as previous_response_id
curl -X POST https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1/responses \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-H "x-dashscope-session-cache: enable" \
-d '{
    "model": "qwen3.5-plus",
    "input": "What are the main differences between it and GBDT?",
    "previous_response_id": "response_id_from_first_round"
}'
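
To judge whether the session cache is taking effect, the cached share of the input can be computed from the usage fields printed in the examples above. This is a sketch under the assumption that cached_tokens counts a subset of input_tokens; cache_hit_ratio is a hypothetical helper that takes the usage object as a plain dict.

```python
def cache_hit_ratio(usage):
    """Return the fraction of input tokens reported as served from cache.

    `usage` is the usage section of a response as a dict. Assumes
    cached_tokens is a subset of input_tokens.
    """
    input_tokens = usage.get("input_tokens") or 0
    cached = (usage.get("input_tokens_details") or {}).get("cached_tokens") or 0
    if input_tokens <= 0:
        return 0.0
    return cached / input_tokens
```

A ratio of 0.0 on the second round suggests that no cache was hit, for example because the accumulated context was still below 1,024 tokens or the 5-minute validity period had passed.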

Migrate from Chat Completions to Responses API

If you currently use the OpenAI Chat Completions API, follow these steps to migrate to the Responses API. The Responses API offers a simpler interface and more powerful features while maintaining compatibility with Chat Completions.

1. Update endpoint URL and base_url

Update both of the following:

Endpoint path: Change from /v1/chat/completions to /v1/responses
base_url:
- China (Beijing): Change from https://dashscope.aliyuncs.com/compatible-mode/v1 to https://dashscope.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1
- Singapore: Change from https://dashscope-intl.aliyuncs.com/compatible-mode/v1 to https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1

Python

# Chat Completions API
completion = client.chat.completions.create(
    model="qwen3.5-plus",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"}
    ]
)
print(completion.choices[0].message.content)

# Responses API - can use the same message format
response = client.responses.create(
    model="qwen3.5-plus",
    input=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"}
    ]
)
print(response.output_text)

# Responses API - or use a more concise format
response = client.responses.create(
    model="qwen3.5-plus",
    input="Hello!"
)
print(response.output_text)

Node.js

// Chat Completions API
const completion = await client.chat.completions.create({
    model: "qwen3.5-plus",
    messages: [
        { role: "system", content: "You are a helpful assistant." },
        { role: "user", content: "Hello!" }
    ]
});
console.log(completion.choices[0].message.content);

// Responses API - can use the same message format
const response = await client.responses.create({
    model: "qwen3.5-plus",
    input: [
        { role: "system", content: "You are a helpful assistant." },
        { role: "user", content: "Hello!" }
    ]
});
console.log(response.output_text);

// Responses API - or use a more concise format
const response2 = await client.responses.create({
    model: "qwen3.5-plus",
    input: "Hello!"
});
console.log(response2.output_text);

curl

# Chat Completions API
curl -X POST https://dashscope-intl.aliyuncs.com/compatible-mode/v1/chat/completions \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "qwen3.5-plus",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"}
    ]
}'

# Responses API - use a more concise format
curl -X POST https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1/responses \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "qwen3.5-plus",
    "input": "Hello!"
}'

2. Update response handling

The Responses API uses a different response structure. Use the output_text shortcut to obtain text output, or access detailed information through the output array.

Response comparison

# Chat Completions Response
{
  "id": "chatcmpl-416b0ea5-e362-9fec-97c5-0a60b5d7xxx",
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "logprobs": null,
      "message": {
        "content": "Hello! I'm happy to see you~  How can I help you?",
        "refusal": null,
        "role": "assistant",
        "function_call": null,
        "tool_calls": null
      }
    }
  ],
  "created": 1769416269,
  "model": "qwen3.5-plus",
  "object": "chat.completion",
  "service_tier": null,
  "system_fingerprint": null,
  "usage": {
    "completion_tokens": 14,
    "prompt_tokens": 22,
    "total_tokens": 36,
    "prompt_tokens_details": {
      "cached_tokens": 0
    }
  }
}

# Responses API Response
{
  "id": "d69c735d-0f5e-4b6c-9c2a-8cab5eb14xxx",
  "created_at": 1769416269.0,
  "model": "qwen3.5-plus",
  "object": "response",
  "status": "completed",
  "output": [
    {
      "id": "msg_3426d3e5-8da7-4dd8-a6a5-7c2cd866xxx",
      "type": "message",
      "role": "assistant",
      "status": "completed",
      "content": [
        {
          "type": "output_text",
          "text": "Hello! Today is Monday, January 26, 2026. How can I help you? ",
          "annotations": []
        }
      ]
    }
  ],
  "usage": {
    "input_tokens": 34,
    "output_tokens": 25,
    "total_tokens": 59,
    "input_tokens_details": {
      "cached_tokens": 0
    },
    "output_tokens_details": {
      "reasoning_tokens": 0
    }
  }
}
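
During migration, code that still expects Chat Completions-style fields can be fed through a small adapter. The sketch below works on the raw JSON body shown above and relies only on the fields visible in that comparison (`output`, `content`, `output_text`, and the renamed usage counters). `to_chat_style` is a hypothetical name, and joining all text parts with an empty string is an assumption about how the parts should be combined.

```python
from typing import Any


def to_chat_style(response: dict[str, Any]) -> dict[str, Any]:
    """Flatten a Responses API JSON body into Chat Completions-style fields."""
    # Collect every output_text part of every message item, in order.
    texts = [
        part["text"]
        for item in response.get("output", [])
        if item.get("type") == "message"
        for part in item.get("content", [])
        if part.get("type") == "output_text"
    ]
    usage = response.get("usage", {})
    return {
        "content": "".join(texts),
        # Usage fields are renamed: input_tokens -> prompt_tokens,
        # output_tokens -> completion_tokens.
        "usage": {
            "prompt_tokens": usage.get("input_tokens"),
            "completion_tokens": usage.get("output_tokens"),
            "total_tokens": usage.get("total_tokens"),
        },
    }


if __name__ == "__main__":
    sample = {
        "output": [
            {
                "type": "message",
                "content": [{"type": "output_text", "text": "Hello!"}],
            }
        ],
        "usage": {"input_tokens": 34, "output_tokens": 25, "total_tokens": 59},
    }
    print(to_chat_style(sample))
```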

3. Simplify multi-turn conversation management

Chat Completions requires you to maintain the message history array yourself. With the Responses API, pass the id of the previous response as the previous_response_id parameter and the context is linked automatically. A response id remains valid for 7 days.

Python

# Chat Completions - manual message history management
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the capital of France?"}
]
res1 = client.chat.completions.create(
    model="qwen3.5-plus",
    messages=messages
)

# Manually add response to history
messages.append(res1.choices[0].message)
messages.append({"role": "user", "content": "What is its population?"})

res2 = client.chat.completions.create(
    model="qwen3.5-plus",
    messages=messages
)

# Responses API - automatic linking with previous_response_id
res1 = client.responses.create(
    model="qwen3.5-plus",
    input="What is the capital of France?"
)

# Just pass the previous response ID
res2 = client.responses.create(
    model="qwen3.5-plus",
    input="What is its population?",
    previous_response_id=res1.id
)

Node.js

// Chat Completions - manual message history management
let messages = [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "What is the capital of France?" }
];
// Named chatRes1/chatRes2 so they do not clash with res1/res2 below
const chatRes1 = await client.chat.completions.create({
    model: "qwen3.5-plus",
    messages
});

// Manually add response to history
messages = messages.concat([chatRes1.choices[0].message]);
messages.push({ role: "user", content: "What is its population?" });

const chatRes2 = await client.chat.completions.create({
    model: "qwen3.5-plus",
    messages
});

// Responses API - automatic linking with previous_response_id
const res1 = await client.responses.create({
    model: "qwen3.5-plus",
    input: "What is the capital of France?"
});

// Just pass the previous response ID
const res2 = await client.responses.create({
    model: "qwen3.5-plus",
    input: "What is its population?",
    previous_response_id: res1.id
});
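
The ID-passing pattern above can be wrapped so that callers never handle IDs directly. This is a minimal sketch under stated assumptions: `Conversation` is a hypothetical class, `client` is any object exposing `responses.create` like the OpenAI SDK client, and the choice to start a fresh, context-free turn once the 7-day window has passed is a design assumption rather than documented behavior.

```python
import time
from typing import Any, Optional

SEVEN_DAYS = 7 * 24 * 60 * 60  # validity window stated above, in seconds


class Conversation:
    """Hypothetical wrapper that threads previous_response_id between turns."""

    def __init__(self, client: Any, model: str) -> None:
        self.client = client
        self.model = model
        self.last_id: Optional[str] = None
        self.last_time = 0.0  # local time at which last_id was received

    def ask(self, text: str) -> str:
        kwargs = {"model": self.model, "input": text}
        # Link to the previous turn only while its ID should still be valid.
        if self.last_id and time.time() - self.last_time < SEVEN_DAYS:
            kwargs["previous_response_id"] = self.last_id
        response = self.client.responses.create(**kwargs)
        self.last_id = response.id
        self.last_time = time.time()
        return response.output_text
```

With a configured client, `chat = Conversation(client, "qwen3.5-plus")` followed by `chat.ask("What is the capital of France?")` and `chat.ask("What is its population?")` would reproduce the two-turn exchange shown above.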

4. Use built-in tools

The Responses API includes multiple built-in tools, so you don’t need to implement them yourself: list the ones you want in the tools parameter. The code interpreter and web scraper are currently free for a limited time. For details, see Tool calling.

Python

# Chat Completions - need to implement tool functions yourself
import requests

def web_search(query):
    # You must implement the web search logic yourself (placeholder endpoint)
    # params= lets requests URL-encode the query string
    r = requests.get("https://api.example.com/search", params={"q": query})
    return r.json().get("results", [])

completion = client.chat.completions.create(
    model="qwen3.5-plus",
    messages=[{"role": "user", "content": "Who is the current president of France?"}],
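    # "functions" is deprecated in Chat Completions; declare the function under "tools"
    tools=[{
        "type": "function",
        "function": {
            "name": "web_search",
            "description": "Search the web for information",
            "parameters": {
                "type": "object",
                "properties": {"query": {"type": "string"}},
                "required": ["query"]
            }
        }
    }]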
)

# Responses API - use built-in tools directly
response = client.responses.create(
    model="qwen3.5-plus",
    input="Who is the current president of France?",
    tools=[{"type": "web_search"}]  # Enable web search directly
)
print(response.output_text)

Node.js

// Chat Completions - need to implement tool functions yourself
async function web_search(query) {
    const fetch = (await import('node-fetch')).default;
    // Encode the query so spaces and special characters survive in the URL
    const res = await fetch(`https://api.example.com/search?q=${encodeURIComponent(query)}`);
    const data = await res.json();
    return data.results;
}

const completion = await client.chat.completions.create({
    model: "qwen3.5-plus",
    messages: [{ role: "user", content: "Who is the current president of France?" }],
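    // "functions" is deprecated in Chat Completions; declare the function under "tools"
    tools: [{
        type: "function",
        function: {
            name: "web_search",
            description: "Search the web for information",
            parameters: {
                type: "object",
                properties: { query: { type: "string" } },
                required: ["query"]
            }
        }
    }]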
});

// Responses API - use built-in tools directly
const response = await client.responses.create({
    model: "qwen3.5-plus",
    input: "Who is the current president of France?",
    tools: [{ type: "web_search" }]  // Enable web search directly
});
console.log(response.output_text);

curl

# Chat Completions - need to implement tools yourself
# Example of calling an external search API
curl https://api.example.com/search \
  -G \
  --data-urlencode "q=current president of France" \
  --data-urlencode "key=$SEARCH_API_KEY"

# Responses API - use built-in tools directly
curl -X POST https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1/responses \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "qwen3.5-plus",
    "input": "Who is the current president of France?",
    "tools": [{"type": "web_search"}]
}'
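
When a built-in tool is enabled, it can be useful to check what the response actually contains besides the final text. The sketch below simply counts the items of the `output` array by their `type` field. It assumes that tool activity shows up as additional items in that array, as it does in OpenAI's Responses API; the `web_search_call` type name in the example is illustrative and not confirmed for Model Studio. `summarize_output` is a hypothetical helper.

```python
from collections import Counter
from typing import Any


def summarize_output(response: dict[str, Any]) -> Counter:
    """Count the items of a Responses API JSON body by their "type" field."""
    return Counter(
        item.get("type", "unknown") for item in response.get("output", [])
    )


if __name__ == "__main__":
    # Illustrative body only; "web_search_call" is an assumed type name.
    body = {"output": [{"type": "web_search_call"}, {"type": "message"}]}
    print(summarize_output(body))
```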

FAQ

Q: How do I pass context for multi-turn conversations?

A: When starting a new conversation turn, pass the id from the previous successful model response as the previous_response_id parameter.

Q: Why does printing output_text fail?

A: Some versions of the OpenAI Python SDK, such as 1.99.2, erroneously removed the output_text property. To resolve this issue, update the SDK to the latest version.
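
If upgrading is not immediately possible, the text can also be rebuilt from the output array. The following is a hedged sketch rather than an official workaround: `get_output_text` is a hypothetical helper that prefers `output_text` when it exists and otherwise concatenates the `output_text` parts of each message item, matching the response structure shown in the migration section.

```python
from typing import Any


def get_output_text(response: Any) -> str:
    """Return response.output_text, or rebuild it from response.output."""
    text = getattr(response, "output_text", None)
    if isinstance(text, str):
        return text
    chunks = []
    # Fallback: walk message items and collect their output_text parts.
    for item in getattr(response, "output", None) or []:
        if getattr(item, "type", None) != "message":
            continue
        for part in getattr(item, "content", None) or []:
            if getattr(part, "type", None) == "output_text":
                chunks.append(part.text)
    return "".join(chunks)
```

Replace `print(response.output_text)` with `print(get_output_text(response))` in the examples to use it.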
