User guide for x-oss-process - Intelligent Media Management

Intelligent Media Management (IMM) capabilities are integrated into Object Storage Service (OSS), allowing you to utilize various data processing and analysis features of IMM within OSS. This topic describes how to use the x-oss-process capability of OSS to access IMM features.

Prerequisites

IMM-related features are available only in regions where IMM is available. For more information, see Endpoints.
Note
The APIs and SDKs for the new data processing feature are available in all regions that support IMM. Access to features of the new version from the console is in phased testing and is available only to select users in the Germany (Frankfurt) and China (Qingdao) regions.
IMM is activated.

Billing

Using IMM features incurs costs. For more information, see Billing overview.

Procedure

Step 1: Create an IMM project

Log on to the IMM Console.
In the lower part of the left-side navigation pane, click Try New Version. If Switch to Old Version is displayed in the lower part of the navigation pane, skip this step.
In the left-side navigation pane, click Project List.
Click Create Project.

Step 2: Bind an OSS bucket

Call the AttachOSSBucket operation to bind the project to an OSS bucket. This example calls the operation from OpenAPI Explorer.

Log on to OpenAPI Explorer. In the upper-left corner of the page, select the IMM API version and region.

Important

To reduce cross-network latency and costs, bind the project only to an OSS bucket in the same region.

On the Parameters tab, enter the project name in the ProjectName field and the name of the bucket to bind in the OSSBucket field, and then click Initiate Call.
View the response on the right.
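
As a rough illustration of what the Parameters tab collects, the following Python sketch assembles the two fields named above into a request payload. It does not call IMM or any SDK. The function name and the sample project and bucket names are hypothetical, and treating both fields as mandatory is an assumption based on the step above.

```python
def attach_bucket_params(project_name: str, oss_bucket: str) -> dict:
    """Build the AttachOSSBucket parameters entered in OpenAPI Explorer.

    Hypothetical helper: it only assembles the two fields named in this
    step and does not send any request.
    """
    # Assumption: both fields must be non-empty before the call is made.
    if not project_name or not oss_bucket:
        raise ValueError("ProjectName and OSSBucket must both be set")
    return {"ProjectName": project_name, "OSSBucket": oss_bucket}


# Placeholder names, not real resources.
params = attach_bucket_params("example-project", "examplebucket")
print(params)
```

The resulting dictionary mirrors the field names shown in OpenAPI Explorer, so it can be compared directly with the values entered there.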


Step 3: Grant permissions

For more information about permissions, see Permissions.

Note

Permission configurations are required only for a RAM user or RAM role. If you are using an Alibaba Cloud account, skip this step.

Supported features

Note

Anonymous access is not supported. For more information about how to generate signed URLs for access, see sign.
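
The following Python sketch shows how a value from the Parameter column of the tables in this section might be attached to an object URL as the x-oss-process query parameter. It is a minimal illustration under stated assumptions: the host, object key, and function name are placeholders, and the URL it builds is unsigned, so it would be rejected until it is signed as described in the note above.

```python
from urllib.parse import quote, urlencode


def process_url(host: str, object_key: str, process: str) -> str:
    """Return an unsigned object URL carrying an x-oss-process value.

    Hypothetical helper for illustration only. Anonymous access is not
    supported, so a real request must use a signed URL instead.
    """
    # Keep "/" and "," readable in the parameter value (for example "video/info").
    query = urlencode({"x-oss-process": process}, safe="/,")
    return f"https://{host}/{quote(object_key)}?{query}"


# Placeholder bucket host and object key.
url = process_url(
    "examplebucket.oss-cn-hangzhou.aliyuncs.com",
    "videos/demo.mp4",
    "video/info",
)
print(url)
# https://examplebucket.oss-cn-hangzhou.aliyuncs.com/videos/demo.mp4?x-oss-process=video/info
```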

Media processing

For more information, see Audio and video processing.

Feature	Parameter	Description
Video transcoding	video/convert	Converts video files in OSS to the required format.
Video-to-animated-image conversion	video/animation	Converts video files in OSS to an animated image format, such as GIF or WebP.
Generate CSS sprites from video snapshots	video/sprite	Captures frames from a video file in OSS, stitches them into a sprite sheet, and converts it to the required image format.
Frame capture	video/snapshots	Captures frames from a video file in OSS and converts them to the required image format.
Video merging	video/concat	Concatenates video files in OSS into a single video and converts it to the required format.
Audio transcoding	audio/convert	Converts audio files in OSS to the required format.
Audio merging	audio/concat	Concatenates audio files in OSS into a single audio file and converts it to the required format.
Extract audio information	audio/info	Extracts media format and stream information from an audio file in OSS.
Extract video information	video/info	Extracts media format and stream information from a video file in OSS.

Document processing

For more information about the parameters, see Document processing and Intelligent document processing.

Feature	Parameter	Description
Online document preview with WebOffice	doc/preview	Previews documents in OSS.
WebOffice online editing	doc/edit	Collaboratively edits documents in OSS.
Document snapshot	doc/snapshot	Creates a snapshot of a document in OSS.
Document format conversion	doc/convert	Converts the format of documents in OSS.
Intelligent document translation	doc/translate	Translates text into multiple languages, such as Chinese and English.
Intelligent document polishing	doc/polish	Polishes the content of a document.
Intelligent document summarization	doc/summarize	Automatically generates a brief summary of a document.
Intelligent document continuation	doc/continue	Automatically generates coherent and logical follow-up content based on a given starting text, topic, and style.
Intelligent document enrichment	doc/enrich	Optimizes the language and style of an existing document.
Intelligent document tone rephrasing	doc/rephrase	Adjusts the tone and optimizes the expression of a document.

Image intelligence

For more information about the parameters, see IMG parameters.

Feature	Parameter	Description
Face detection	image/faces	Detects the locations of faces in an image and analyzes facial attributes.
Human body detection	image/bodies	Detects the locations of human bodies in an image.
Vehicle detection	image/cars	Detects and analyzes vehicles and license plates in an image.
QR code recognition	image/codes	Recognizes QR codes in an image.
Image label detection	image/labels	Recognizes labels for scenes, objects, and events in an image.
Image quality assessment	image/score	Provides a comprehensive score for the aesthetic quality (such as color and saturation) of an image.
Blind watermarking	image/blindwatermark	Adds a text-based blind watermark to an image.
Blind watermark extraction	image/deblindwatermark	Extracts a text-based blind watermark from an image.
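
Because a mistyped parameter is easy to miss, a client might check a value against the image intelligence table before sending a request. The sketch below does this in Python. The set is copied from the Parameter column above, and the function name is hypothetical. Any options that a feature accepts after the parameter are outside the scope of this check.

```python
# Parameter values copied from the image intelligence table.
IMAGE_PARAMETERS = frozenset({
    "image/faces",
    "image/bodies",
    "image/cars",
    "image/codes",
    "image/labels",
    "image/score",
    "image/blindwatermark",
    "image/deblindwatermark",
})


def check_image_parameter(parameter: str) -> str:
    """Return the parameter unchanged if the table lists it, else raise.

    Hypothetical client-side guard; it does not contact OSS or IMM.
    """
    if parameter not in IMAGE_PARAMETERS:
        raise ValueError(f"not an image intelligence parameter: {parameter!r}")
    return parameter


print(check_image_parameter("image/labels"))
```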