
Platform for AI: Appendix: Status codes

Last Updated: Feb 05, 2024

This topic describes the status codes returned by Machine Learning Platform for AI (PAI).


200

The request is successful.

400

The processor cannot process the request.

Causes:

  • The request format is invalid. This may occur with general processors such as TensorFlow and PyTorch.

  • The request encounters an exception. This may occur with custom processors.

    Note

    You can also customize status codes for custom processors.

404

The server cannot find the requested resource.

Causes: The service name is invalid, or the request is sent to an invalid endpoint.

401

The request fails to be authenticated.

Causes: The token for accessing the service is not specified or is invalid.

Solutions

  1. Use the token that was automatically generated or manually specified when the service was deployed. When a service is deployed, the system automatically generates a token that you can use to access the service. Alternatively, the first time you deploy the service, you can add the Token parameter to the service configuration file to specify a fixed token value. This way, the token value does not change even if you update the service.

  2. Obtain the token value in the console. Log on to the PAI console, find the workspace that you want to manage, and go to the details page of the workspace. In the left-side navigation pane, click Model Serving (EAS). On the Elastic Algorithm Service page, find the service that you want to manage, click Debug in the Operating column, and then obtain the token value of the service. You can also run the eascmd desc service_name command to obtain the token value.

  3. Access the service by using the obtained token value.

    • Add the token value to the request by using an HTTP header when you run a curl command to test the service. Sample command: curl -H 'Authorization: NWMyN2UzNjBiZmI2YT***' http://xxx.cn-shanghai.aliyuncs.com/api/predict/echo

    • Invoke a SetToken() function when you use an SDK to access the service. For more information, see SDK for Java.
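The curl example above can also be mirrored in code. The following is a minimal Python sketch using only the standard library; the endpoint and token values are placeholders for the values you obtain in step 2, not real ones:

```python
import urllib.request

def build_predict_request(endpoint: str, token: str, payload: bytes) -> urllib.request.Request:
    """Build a prediction request that carries the service token.

    The token goes in the Authorization header, matching the curl sample.
    Obtain the real token from the EAS console (Debug) or by running
    eascmd desc service_name.
    """
    return urllib.request.Request(
        endpoint,
        data=payload,
        headers={"Authorization": token},  # raw token value, no scheme prefix
    )

# Placeholder endpoint and token; replace with your service's values.
req = build_predict_request(
    "http://example-endpoint/api/predict/echo",
    "NWMyN2UzNjBiZmI2YT***",
    b"hello",
)
# To actually send the request (requires a reachable service):
# with urllib.request.urlopen(req) as resp:
#     print(resp.status, resp.read())
```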

450

The request is denied because too many requests are queued on the instance.

Causes: The request queue of the instance is full. When an instance receives a request, it places the request in a queue. The request is processed when instance workers become available. By default, each instance can have a maximum of five workers. You can modify this limit by using the metadata.rpc.worker_threads parameter in the JSON file used in the create command. For more information about the create command, see Create a service.

If workers are occupied for extended periods of time, requests in the queue begin to pile up. When the queue reaches its upper limit, new requests are denied and status code 450 is returned. This keeps the number of pending requests manageable, avoids runaway response times, and thus preserves service availability. By default, each queue can hold a maximum of 64 requests. You can modify this limit by using the metadata.rpc.max_queue_size parameter in the JSON file used in the create command.

Note

To some extent, the upper limit imposed on a queue serves as a throttling measure to prevent cascading failures caused by traffic spikes.
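The two queue-related limits above are set in the service configuration. The following is a minimal sketch of the relevant fragment of the JSON file used in the create command; the values are illustrative, not the defaults:

```json
{
  "metadata": {
    "rpc": {
      "worker_threads": 10,
      "max_queue_size": 128
    }
  }
}
```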

Solutions

  • Reschedule the request to an idle instance to prevent service interruption. This solution is applicable to scenarios where status code 450 is returned for a small number of requests. If you use this method on a large number of requests, the throttling measure fails to serve its purpose and may lead to cascading failures.

  • Debug the processor code. If a code error causes an exception in the processor, all workers of an instance can deadlock and become unavailable, and status code 450 is returned for all requests.
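The first solution above, retrying so that the gateway can route the request to an idle instance, can be sketched as follows. This is a minimal client-side example under stated assumptions: send is a hypothetical callable that issues the request and returns the HTTP status code, and the retry count is deliberately small, because retrying a large share of traffic defeats the throttling measure:

```python
import time

def send_with_limited_retries(send, payload, max_retries=2, delay_s=0.1):
    """Retry a request denied with status code 450 a bounded number of times.

    send(payload) is assumed to issue the request and return the status
    code; in practice the EAS gateway may route a retry to a different,
    less busy instance. Keep max_retries small: aggressive retries on many
    requests can cause the cascading failures the queue limit prevents.
    """
    status = send(payload)
    for _ in range(max_retries):
        if status != 450:
            break
        time.sleep(delay_s)  # brief pause before retrying
        status = send(payload)
    return status
```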

408

The request times out.

Causes: The request was not processed within the specified timeout period. All requests are given the same timeout period. If a request is not processed in time, the request times out, the TCP connection is closed, and status code 408 is returned. By default, the timeout period is 5 seconds. You can modify this limit by using the metadata.rpc.keepalive parameter in the JSON file used in the create command. For more information about the create command, see Create a service.

Note

The request processing time includes the amount of time the processor spends performing computing operations, the system spends receiving network packets, and the request spends queuing.
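The timeout can be raised in the same JSON file used in the create command. The following is a minimal fragment; it assumes that metadata.rpc.keepalive is specified in milliseconds (an assumption to verify against Create a service), so 10000 would roughly double the default 5-second timeout:

```json
{
  "metadata": {
    "rpc": {
      "keepalive": 10000
    }
  }
}
```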

499

The client closes the connection.

Causes: The client closes the connection, which stops the processing of the affected requests. When a client closes a connection, this status code is only recorded on the server and is not returned to the client. For example, assume that the timeout period for receiving an HTTP response is set to 30 milliseconds on the client and the timeout period for processing an HTTP request is set to 50 milliseconds on the server. If the client receives no response within 30 milliseconds after it sends a request, it closes the connection. In this case, status code 499 is recorded on the server.

429

The request is throttled because too many requests are sent.

Causes: Elastic Algorithm Service (EAS) provides a QPS-based throttling feature. When the feature is enabled and the number of requests sent to the server exceeds the specified upper limit, subsequent requests are denied and status code 429 is returned. You can enable the feature by using the metadata.rpc.rate_limit parameter in the JSON file used in the create command. For more information about the create command, see Create a service.
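The throttling feature is enabled in the same JSON file used in the create command. The following is a minimal fragment; the limit value is illustrative:

```json
{
  "metadata": {
    "rpc": {
      "rate_limit": 200
    }
  }
}
```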

503

The service is unavailable.

Causes: The server is not ready to handle the request. When you use a gateway to access a service and none of the backend service instances are ready, the gateway returns status code 503.

You may also encounter the following common scenario: a service is deployed and enters the Running state, and all instances of the service are ready. However, after a request is initiated, status code 503 is returned. In most cases, this happens because the request triggers a bug in the service code that causes the backend instance to crash, so the request cannot be processed. In this case, the gateway returns status code 503 to the client.