Multi-language services provided by Voice Moderation 2.0 - Content Moderation

Voice Moderation 2.0 uses optimized voice moderation algorithms. It can moderate Chinese audio, English audio, and audio that contains both Chinese and English content. To suit global business, Voice Moderation 2.0 adopts separate moderation policies and a separate internationalization labeling system. This topic describes the multi-language services provided by Voice Moderation 2.0 and how to use them.

Feature description

Compared with Voice Moderation 1.0, Voice Moderation 2.0 utilizes separate moderation policies and a separate internationalization labeling system to meet the requirements of global business. In addition, it provides more features to simplify service usage and assist in manual review.

Comparison item	Voice Moderation 1.0	Voice Moderation 2.0
Multi-language capability	Supports only Chinese audio by default.	Supports Chinese audio, English audio, and audio that contains both Chinese and English content.
Moderation capability	Utilizes a single model and incorporates language characteristics. The moderation policies take into account accuracy and recall. Does not support moaning models by default.	Utilizes multiple models and incorporates language and regional characteristics. The moderation policies are more accurate. Adds a moaning model to detect non-semantic characteristics.
Labeling system	Inherits the labeling system from earlier versions and supports only one single risk label for a moderation task.	Utilizes a separate internationalization labeling system for global business, adds internationalization labels such as profanities and regions, and supports multiple risk labels and subcategory labels for a moderation task.
Operation	Uses a semantic segmentation scheme. The duration of audio segments usually ranges from several seconds to dozens of seconds. Returns information about only audio segments that may violate content policies and does not provide temporary access addresses of audio segments.	Uses a segmentation scheme that is adjustable. The duration of audio segments is fixed. The fixed duration can improve the efficiency of manual review. Returns information about all audio segments, text obtained after transcription, and temporary access addresses of audio segments. Manual review requires those addresses.

Internationalization labels

The multi-language services of Voice Moderation 2.0 adopt an internationalization labeling system. If content contains multiple types of risks, multiple labels can be returned. Label categories include, but are not limited to, those listed in the following table.

Label type	Category
Level-1 labels	violence: violence; contraband: contraband; sexuality: pornography; profanity: profanity and abuse; pullinTraffic: advertising diversion; regional: regional opposition; C_customized: custom library
Subcategory labels (riskTips)	Subcategory labels are returned in the `xxx_yyy` format. Example: `contraband_Drugs`.

Service performance

Voice Moderation 2.0 uses a high-performance core engine that can schedule dozens of models and policies at high concurrency to return results in a more timely manner.

Service performance	Description
File size	In Voice Moderation 2.0, the upper limit of the audio file size is increased from 200 MB to 500 MB.
File format	The following audio file formats are supported: MP3, WAV, AAC, WMA, OGG, M4A, and AMR. The following video file formats are supported: AVI, FLV, MP4, MPG, ASF, WMV, MOV, RMVB, and RM.
Queries per second (QPS)	The upper limit of QPS is increased from 50 to 100 requests per second.
Maximum number of concurrent moderation tasks	In Voice Moderation 2.0, the default upper limit of concurrent moderation tasks is increased from 20 to 50.

Note

The QPS of voice moderation refers to the number of API requests allowed per second. The number of concurrent moderation tasks refers to the number of audio files or audio streams that can be concurrently moderated by the moderation service.
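The limits in the preceding table can be checked on the client before a task is submitted. The following Python sketch is illustrative only: the function name `preflight_check` is hypothetical, it assumes that the file extension reflects the actual format, and it assumes that 1 MB equals 1024 × 1024 bytes, which this topic does not specify.

```python
from pathlib import Path

# Formats and size limit copied from the service performance table.
AUDIO_FORMATS = {"mp3", "wav", "aac", "wma", "ogg", "m4a", "amr"}
VIDEO_FORMATS = {"avi", "flv", "mp4", "mpg", "asf", "wmv", "mov", "rmvb", "rm"}
MAX_FILE_BYTES = 500 * 1024 * 1024  # assumption: 1 MB = 1024 * 1024 bytes


def preflight_check(filename: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means none were found."""
    problems = []
    # Assumption: the extension is a reliable hint of the container format.
    ext = Path(filename).suffix.lower().lstrip(".")
    if ext not in AUDIO_FORMATS | VIDEO_FORMATS:
        problems.append(f"unsupported format: {ext or '(none)'}")
    if size_bytes > MAX_FILE_BYTES:
        problems.append("file larger than 500 MB")
    return problems
```

A check like this only avoids obviously invalid submissions. The service remains the authority and reports oversized or unsupported files through its response codes.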

Billing methods

Voice Moderation 2.0 supports the pay-as-you-go billing method.

Pay-as-you-go

After you activate the Voice Moderation 2.0 service, the default billing method is pay-as-you-go. Fees are calculated daily based on the actual usage. If the service is not called, no fees are incurred.

Moderation type	Supported moderation service	Unit price
Standard voice moderation (audio_standard)	Multi-language voice moderation of audio and video files: audio_multilingual_global	USD 9.0 per 1,000 minutes
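As a rough illustration of the unit price, the following Python sketch estimates the fee for a given number of moderated minutes. The function name is hypothetical, and the sketch assumes that fees scale linearly with duration. This topic does not describe rounding rules or minimum billing increments, so treat the result as an estimate only.

```python
UNIT_PRICE_USD_PER_1000_MIN = 9.0  # from the pricing table above


def estimate_cost_usd(total_minutes: float) -> float:
    """Estimate the pay-as-you-go fee, assuming purely linear pricing."""
    if total_minutes < 0:
        raise ValueError("total_minutes must not be negative")
    return total_minutes / 1000 * UNIT_PRICE_USD_PER_1000_MIN
```

For example, under this assumption 250 minutes of audio would cost about USD 2.25.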

Access guide

Step 1: Activate Voice Moderation 2.0

Open the service activation page to activate the Voice Moderation 2.0 service.

Step 2: Grant permissions to a RAM user

Before you call API operations or use SDKs as a RAM user, you must grant permissions to the RAM user. You can create an AccessKey pair for your Alibaba Cloud account or for the RAM user. When you call API operations, the AccessKey pair is used for identity verification. For information about how to obtain an AccessKey pair, see Obtain an AccessKey pair.

1. Log on to the RAM console by using an Alibaba Cloud account or a RAM user that has administrative rights.
2. Create a RAM user. For more information, see Create a RAM user.
3. Grant the AliyunYundunGreenWebFullAccess system policy to the RAM user. For more information, see Grant permissions to a RAM user.

After completing the preceding operations, you can call the Content Moderation API as the RAM user.

Step 3: Install and use SDKs

The following table provides the supported regions.

Region	Public endpoint	Internal endpoint
Singapore	green-cip.ap-southeast-1.aliyuncs.com	green-cip-vpc.ap-southeast-1.aliyuncs.com

Note

If you need SDK sample code in other programming languages, you can call operations in OpenAPI Explorer. OpenAPI Explorer dynamically generates the sample code of the operations for different SDKs.

API

Usage notes

Service address: https://green-cip.{region}.aliyuncs.com.
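The service address pattern can be filled in from a region ID. The following Python sketch is a minimal illustration with a hypothetical helper name. The internal (VPC) host pattern is inferred from the single Singapore row in the region table and is an assumption for any other region.

```python
def service_address(region: str, vpc: bool = False) -> str:
    """Build the HTTPS service address for a region ID such as ap-southeast-1."""
    # Assumption: internal endpoints follow the green-cip-vpc.{region} pattern
    # shown for Singapore in the region table.
    host = "green-cip-vpc" if vpc else "green-cip"
    return f"https://{host}.{region}.aliyuncs.com"
```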

You can call the VoiceModeration operation to create a voice moderation task. For more information about how to construct an HTTPS request, see Calls over HTTPS. You can also use a ready-made HTTPS request. For more information, see Guide of using Voice Moderation 2.0.

Operations

- Submit a moderation task: VoiceModeration
- Query the results of a moderation task: VoiceModerationResult

Billing method

You are charged for calling the VoiceModeration operation. You are charged by using the pay-as-you-go billing method only for requests whose HTTP status code is 200.

Submit a moderation task

Request parameters

Parameter	Type	Required	Example	Description
Service	String	Yes	audio_multilingual_global	The type of the moderation service. Valid value: audio_multilingual_global
ServiceParameters	JSONString	Yes		The parameters required by the moderation service. The value is a JSON string. For more information about each parameter, see Table 1 ServiceParameters.

Table 1 ServiceParameters

Parameter	Type	Required	Example	Description
url	String	Yes	http://aliyundoc.com/test.flv	The HTTP or HTTPS URL of the object that you want to moderate
callback	String	No	http://aliyundoc.com	The callback URL to which moderation results are pushed. HTTP and HTTPS URLs are supported. If you do not set this parameter, you must poll for moderation results periodically. If you set this parameter, make sure that the specified URL supports the POST method, uses UTF-8 to encode the transmitted data, and supports the checksum and content parameters. Content Moderation sends the checksum and content parameters in each callback notification based on the following rules. checksum: a string that is generated by applying the Secure Hash Algorithm 256 (SHA-256) to a string in the `UID + seed + content` format. UID indicates the user ID of your Alibaba Cloud account. You can query the ID in the Alibaba Cloud Management Console. To detect data tampering, generate the same SHA-256 string when your server receives a callback notification and compare it with the received checksum parameter. Note: UID must be the user ID of your Alibaba Cloud account, not the ID of a RAM user. content: a JSON-formatted string that you can convert to a JSON object. For more information about the format of the content parameter, see the sample success responses of the operation that you call to query moderation results. Note: If your server receives a callback notification, it must return HTTP status code 200 to Content Moderation. Any other HTTP status code indicates that the notification was not received, and Content Moderation pushes the notification again until your server receives it, up to 16 times. After 16 times, Content Moderation stops pushing the callback notification. In this case, we recommend that you check the status of the callback URL.
seed	String	No	abc****	A random string that is used to generate a signature for the callback notification request. The string can be up to 64 characters in length and can contain letters, digits, and underscores (_). You can customize this string. It is used to verify the callback notification request when Content Moderation pushes callback notifications to your server. Note This parameter is required if you set the callback parameter.
cryptType	String	No	SHA256	The encryption algorithm that is used to encrypt the callback notification content when you enable callback notifications. Content Moderation encrypts the string in the UID + seed + content format by using the algorithm that you specify and then sends the encrypted string to the callback URL. Valid values: SHA256 (default): the SHA256 encryption algorithm is used; SM3: the HMAC-SM3 encryption algorithm is used, and a hexadecimal string that consists of lowercase letters and digits is returned. For example, 66c7f0f462eeedd9d1f2d46bdc10e4e24167c4875cf2f7a2297da02b8f4ba8e0 is returned after you encrypt abc by using the HMAC-SM3 encryption algorithm.

Response parameters

Parameter	Type	Example	Description
Code	Integer	200	The returned status code. For more information, see Response codes.
Data	JSONObject	{"taskId": "AAAAA-BBBBB"}	The data returned for the request. It contains the ID of the moderation task.
Message	String	OK	The message that is returned in response to the request.
RequestId	String	AAAAAA-BBBB-CCCCC-DDDD-EEEEEEEE****	The request ID.

Example

Sample requests

{
  "service":"audio_multilingual_global",
  "serviceParameters":"{\"cryptType\":\"SHA256\",\"seed\":\"abc***123\",\"callback\":\"https://aliyun.com/callback\",\"url\":\"http://aliyundoc.com/test.flv\"}"
}

Sample success responses

{
  "code":200,
  "data":{
    "taskId":"AAAAA-BBBBB"
  },
  "message":"SUCCESS",
  "requestId":"AAAAAA-BBBB-CCCCC-DDDD-EEEEEEEE****"
}
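Because ServiceParameters is a JSON string nested inside the request, hand-written escaping is easy to get wrong. The following Python sketch shows one way to build the request body with `json.dumps`. The function name is hypothetical, the lowercase keys `service` and `serviceParameters` follow the sample request rather than the parameter table, and the seed check only mirrors the constraints stated in Table 1. Signing and sending the request are out of scope here.

```python
import json
import re
from typing import Optional


def build_submit_request(url: str,
                         callback: Optional[str] = None,
                         seed: Optional[str] = None,
                         crypt_type: Optional[str] = None) -> dict:
    """Build the body for a VoiceModeration call (illustrative only)."""
    params = {"url": url}
    if callback is not None:
        # Table 1: seed is required whenever callback is set.
        if seed is None:
            raise ValueError("seed is required when callback is set")
        # Table 1: up to 64 characters; letters, digits, and underscores.
        if not re.fullmatch(r"[A-Za-z0-9_]{1,64}", seed):
            raise ValueError("seed must be 1-64 letters, digits, or underscores")
        params["callback"] = callback
        params["seed"] = seed
        if crypt_type is not None:
            params["cryptType"] = crypt_type
    return {
        "service": "audio_multilingual_global",
        # json.dumps produces the nested JSON string with correct escaping.
        "serviceParameters": json.dumps(params),
    }
```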

Query the results of a moderation task

After a moderation task is complete, information about all audio segments is returned when you query the results of the moderation task.

Request parameters

Parameter	Type	Required	Example	Description
Service	String	Yes	audio_multilingual_global	The type of the moderation service.
ServiceParameters	JSONString	Yes		The parameters required by the moderation service. The value is a JSON string. For more information about each parameter, see Table 2 ServiceParameters.

Table 2 ServiceParameters

Parameter	Type	Required	Example	Description
taskId	String	Yes	AAAAA-BBBBB	The ID returned by the operation of submitting the moderation task.

Response parameters

Parameter	Type	Example	Description
Code	Integer	200	The returned status code. For more information, see Response codes.
Data	JSONObject	{"url":xxxx,"sliceDetails":xxx}	The moderation results returned in JSON format. For more information, see Table 3 Data.
Message	String	OK	The message that is returned in response to the request.
RequestId	String	AAAAAA-BBBB-CCCCC-DDDD-EEEEEEEE****	The request ID.

Table 3 Data

Parameter	Type	Example	Description
url	String	https://aliyundoc.com	The URL of the moderation object.
sliceDetails	JSONArray		The details about the audio segments. For more information, see sliceDetails.

Table 4 sliceDetails

Parameter	Type	Example	Description
startTime	Integer	0	The start time of the text after audio-to-text conversion. Unit: seconds.
endTime	Integer	4065	The end time of the text after audio-to-text conversion. Unit: seconds.
startTimestamp	Integer	1678854649720	The start timestamp of the segment. Unit: milliseconds.
endTimestamp	Integer	1678854649720	The end timestamp of the segment. Unit: milliseconds.
text	String	Disgusting	The text converted from voice.
url	String	https://aliyundoc.com	The temporary access address of the audio segment. The URL is valid for 30 minutes. If you need the audio segment after that period, save it to your own storage as soon as possible.
labels	String	pullinTraffic	The details of the labels. Multiple labels are separated by commas (,). Valid values: violence: violence; contraband: contraband; sexuality: pornography; profanity: profanity and abuse; pullinTraffic: advertising diversion; regional: regional opposition; C_customized: custom library
riskWords	String	AAA, BBB, CCC	The risk words that are hit. Multiple words are separated by commas (,).
riskTips	String	sexuality_Suggestive	Subcategory labels. Multiple labels are separated by commas (,).
extend	String	{\"riskTips\":\"sexuality_Suggestive\",\"riskWords\":\"pxxxxy\"}	A reserved parameter.
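The labels, riskWords, and riskTips fields are comma-separated strings, so a client usually splits them before use. The following Python sketch is one possible approach. The helper names are hypothetical, and treating a segment with no labels as not flagged is an assumption, because this topic does not state what the fields contain for segments without risks.

```python
def split_csv(value) -> list:
    """Split a comma-separated field into trimmed, non-empty items."""
    if not value:
        return []
    return [item.strip() for item in value.split(",") if item.strip()]


def summarize_slice(slice_detail: dict) -> dict:
    """Turn one sliceDetails entry into lists that are easier to work with."""
    labels = split_csv(slice_detail.get("labels"))
    return {
        "labels": labels,
        "riskTips": split_csv(slice_detail.get("riskTips")),
        "riskWords": split_csv(slice_detail.get("riskWords")),
        # Assumption: a segment without labels carries no detected risk.
        "flagged": bool(labels),
    }
```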

Example

Sample requests

{
  "service":"audio_multilingual_global",
  "serviceParameters":"{\"taskId\":\"AAAAA-BBBBB\"}"
}

Sample success responses

{
  "code":200,
  "data":{
    "sliceDetails":[
      {
        "endTime":4065,
        "labels":"pullinTraffic",
        "startTime":0,
        "text":"pxxxxy xxxxxx",
        "riskTips":"sexuality_Suggestive",
        "riskWords":"pxxxxy",
        "url":"https://aliyundoc.com"
      }
    ]
  },
  "message":"OK",
  "requestId":"AAAAAA-BBBB-CCCCC-DDDD-EEEEEEEE****"
}
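If no callback URL is set, results must be polled. The following Python sketch shows a simple polling loop that relies on response code 280 (moderation in progress) from the response code table. The `query` argument is a hypothetical caller-supplied function that performs the VoiceModerationResult call and returns the parsed response as a dict with the lowercase keys used in the sample response. The interval and attempt limit are arbitrary assumptions.

```python
import time
from typing import Callable

SUCCESS = 200
IN_PROGRESS = 280  # "The moderation is in progress."


def wait_for_result(query: Callable[[str], dict],
                    task_id: str,
                    interval_s: float = 5.0,
                    max_attempts: int = 60,
                    sleep: Callable[[float], None] = time.sleep) -> dict:
    """Poll until the task finishes and return the data object."""
    for _ in range(max_attempts):
        response = query(task_id)
        code = response.get("code")
        if code == SUCCESS:
            return response["data"]
        if code != IN_PROGRESS:
            # Any other code is treated as a failure in this sketch.
            raise RuntimeError(
                f"query failed: code={code}, message={response.get('message')}"
            )
        sleep(interval_s)
    raise TimeoutError(f"task {task_id} did not finish in {max_attempts} attempts")
```

Passing `sleep` as a parameter keeps the loop easy to test without real delays. Keep the polling rate well below the QPS limit of your account.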

Callback notification parameters

The callback notification is in JSON format. The following table describes the related fields.

Field	Data type	Description
checksum	String	The verification code. It is a string in the `UID + seed + content` format and is generated by using the SHA256 algorithm. UID indicates the user ID of your Alibaba Cloud account. You can query the ID in the Alibaba Cloud Management Console. To prevent data tampering, you can use the SHA-256 algorithm to generate a string when your server receives a callback notification and verify the string against the received checksum parameter. Note UID must be the user ID of an Alibaba Cloud account, but not the ID of a RAM user.
taskId	String	The task ID of the callback response.
content	String	The serialized moderation results. The value is a JSON string that you can parse into a JSON object. The format of the results in the content field is the same as that of the results returned by the VoiceModerationResult operation.
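The checksum verification described in the table can be sketched in Python as follows. The sketch covers only the default SHA256 setting of cryptType. It assumes that UID, seed, and content are concatenated without separators, encoded as UTF-8, and that the checksum is a lowercase hexadecimal digest. This topic does not confirm these details, so verify them against a real callback notification. The function names are hypothetical.

```python
import hashlib
import hmac


def expected_checksum(uid: str, seed: str, content: str) -> str:
    """SHA-256 hex digest of UID + seed + content (assumed plain concatenation)."""
    return hashlib.sha256((uid + seed + content).encode("utf-8")).hexdigest()


def verify_callback(uid: str, seed: str, content: str, checksum: str) -> bool:
    """Return True if the received checksum matches the locally computed one."""
    expected = expected_checksum(uid, seed, content).encode("ascii")
    received = checksum.strip().lower().encode("utf-8")
    # compare_digest avoids leaking the position of the first mismatch.
    return hmac.compare_digest(expected, received)
```

Use the content string exactly as received. Parsing it and serializing it again can change whitespace or key order and cause the comparison to fail.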

Response codes

The following table describes the response codes. You are charged by using the pay-as-you-go billing method only for requests whose response code is 200.

Code	Description
200	The request is successful.
280	The moderation is in progress.
400	Not all request parameters are configured.
401	The values specified for one or more request parameters are invalid.
402	The length of one or more request parameter values is invalid. Check and modify the values and try again.
403	The QPS of requests exceeds the upper limit. Check and modify the number of requests.
404	The specified file failed to be downloaded. Check the URL of the file or try again.
405	Downloading the specified file timed out. The possible cause is that the file cannot be accessed. Check and adjust the file and try again.
406	The specified file is excessively large. Check and change the file size and try again.
407	The format of the specified file is not supported. Check and change the file format and try again.
408	You do not have the required permissions. The possible cause is that this account is not activated, has overdue payments, or is not authorized to call this API operation.
480	The number of concurrent moderation tasks exceeds the upper limit. Check and change the number of concurrent moderation tasks.
500	A system exception occurred.
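A client can map these response codes to a handling strategy. The following Python sketch shows one possible policy and is not prescribed by the service. The `Action` names and the grouping of codes are assumptions based on the descriptions in the table.

```python
from enum import Enum


class Action(Enum):
    DONE = "done"                # results are available
    POLL_AGAIN = "poll_again"    # moderation still in progress
    BACK_OFF = "back_off"        # QPS or concurrency limit exceeded
    RETRY_LATER = "retry_later"  # possibly transient failure
    FIX_REQUEST = "fix_request"  # request, file, or account must be corrected


def suggest_action(code: int) -> Action:
    """Suggest how a client might react to a response code."""
    if code == 200:
        return Action.DONE
    if code == 280:
        return Action.POLL_AGAIN
    if code in (403, 480):
        return Action.BACK_OFF
    # Assumption: download failures, timeouts, and system exceptions
    # may succeed on a later attempt.
    if code in (404, 405, 500):
        return Action.RETRY_LATER
    return Action.FIX_REQUEST
```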