In audio and video systems, transcoding consumes a large amount of computing power. You can use Function Compute and CloudFlow to build an elastic and highly available audio and video processing system on a serverless architecture. This topic compares serverless audio and video processing systems with conventional solutions in terms of engineering efficiency, O&M, performance, and costs. This topic also describes how to build and use a serverless audio and video processing system.
Background
You can use dedicated cloud-based transcoding services. However, you may want to build your own transcoding service in the following scenarios:
Auto scaling
You need to improve the elasticity of video processing.
For example, you have deployed a video processing service on a virtual machine or container platform by using FFmpeg, and you want to improve the elasticity and availability of the service.
Engineering efficiency
You need to batch-process a large number of oversized videos with high efficiency.
For example, hundreds of 1080p videos, each more than 4 GB in size, are generated every Friday and must be processed within a few hours.
Custom file processing
You have advanced custom processing requirements.
For example, you want to record the transcoding details in your database each time a video is transcoded. Or you may want popular videos to be automatically prefetched to Alibaba Cloud CDN (CDN) points of presence (PoPs) after the videos are transcoded to relieve pressure on origin servers.
You want to convert formats of audio files, customize the sample rates of audio streams, or reduce noise in audio streams.
You need to directly read and process source files.
For example, your video files are stored in File Storage NAS (NAS) or on disks that are attached to Elastic Compute Service (ECS) instances. If you build a custom video processing system, the system can directly read and process your video files, without migrating them to Object Storage Service (OSS).
You want to convert the formats of videos and add more features.
For example, your video processing system can be used to transcode videos, add watermarks to videos, and generate GIF images based on video thumbnails. You want to add more features to your video processing system, such as adjusting the parameters used for transcoding. You also hope that the existing online services provided by the system are not affected when new features are released.
Cost control
You want to obtain high cost efficiency in simple transcoding or lightweight media processing scenarios.
For example, you want to generate a GIF image based on the first few frames of a video, or query the duration of an audio file or a video file. In this case, building a custom media processing system is cost-effective.
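For lightweight tasks like these, a few FFmpeg and ffprobe invocations inside a function are usually enough. The following sketch builds the commands that such a function might run. The helper names and parameters are illustrative assumptions, and it is also assumed that an ffmpeg/ffprobe binary ships with the function (for example, in the code package or a layer), because built-in runtimes do not bundle FFmpeg:

```python
# Sketch: build FFmpeg/ffprobe commands for lightweight media tasks.
# Assumes an ffmpeg/ffprobe binary is available in the function environment.

def probe_duration_cmd(src: str) -> list:
    """Command that prints a media file's duration in seconds."""
    return [
        "ffprobe", "-v", "error",
        "-show_entries", "format=duration",
        "-of", "default=noprint_wrappers=1:nokey=1",
        src,
    ]

def gif_from_head_cmd(src: str, dst: str, seconds: float = 3.0, fps: int = 10) -> list:
    """Command that turns the first few seconds of a video into a GIF."""
    return [
        "ffmpeg", "-y",
        "-t", str(seconds),                   # read only the first N seconds
        "-i", src,
        "-vf", "fps=%d,scale=320:-1" % fps,   # downsample to keep the GIF small
        dst,
    ]

# A function handler would run these with subprocess, for example:
#   subprocess.run(probe_duration_cmd("/tmp/in.mp4"),
#                  capture_output=True, check=True)
```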
You can build your own transcoding service on a conventional self-managed architecture or with the serverless solution. This topic compares the two solutions and describes the procedure of the serverless solution.
Solutions
Conventional self-managed architecture
As IT technology develops, the services of cloud vendors keep improving. You can use Alibaba Cloud services, such as ECS, OSS, and CDN, to build an audio and video processing system that stores audio and video files and accelerates playback of the files.
Serverless architecture
Simple video processing system
The following figure shows the architecture of a solution that you can use to perform simple processing on videos.
When a user uploads a video to OSS, an OSS trigger automatically triggers a function that calls FFmpeg to transcode the video and then saves the transcoded video to OSS.
For more information about the demo and procedure of the simple video processing system, see Simple video processing system.
Video processing workflow system
If you need to accelerate the transcoding of large videos or perform complex operations on videos, you can use CloudFlow to orchestrate functions and build a powerful video processing system. The following figure shows the architecture of the solution.
When a user uploads an .mp4 video to OSS, OSS triggers the execution of a function, and the function then invokes CloudFlow to transcode the video. You can use this solution to meet the following business requirements:
A video can be transcoded to various formats at the same time and processed based on custom requirements, such as watermarking or synchronizing the updated information about the processed video to a database.
When multiple files are uploaded to OSS at the same time, Function Compute performs auto scaling to process the files in parallel and transcodes the files into multiple formats at the same time.
You can transcode excessively large videos by using NAS and video segmentation. An excessively large video is first segmented, the segments are transcoded in parallel, and the results are then merged. A proper segmentation interval greatly improves the transcoding speed of large videos.
Note: Video segmentation refers to splitting a video stream into multiple segments at specified time intervals and recording the information about the segments in a generated index file.
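The split and merge steps around parallel transcoding can be sketched with FFmpeg's segment muxer and concat demuxer. This is an illustrative outline rather than the exact code used by the workflow; the segment interval, file paths, and use of stream copy are assumptions:

```python
# Sketch of the split / merge steps that bracket parallel transcoding.
# Assumes an ffmpeg binary is available in the function environment.

def split_cmd(src: str, seg_pattern: str, interval: int = 30) -> list:
    """Split a video into ~interval-second segments without re-encoding."""
    return [
        "ffmpeg", "-y", "-i", src,
        "-c", "copy", "-map", "0",
        "-f", "segment", "-segment_time", str(interval),
        seg_pattern,                 # e.g. /mnt/nas/seg_%03d.mp4
    ]

def merge_cmd(list_file: str, dst: str) -> list:
    """Concatenate transcoded segments listed in a concat list file."""
    return [
        "ffmpeg", "-y",
        "-f", "concat", "-safe", "0",
        "-i", list_file,             # lines like: file 'seg_000.mp4'
        "-c", "copy", dst,
    ]
```

Each segment in between would be transcoded by a separate, parallel function execution, which is why a shared file system such as NAS is needed to hold the intermediate files.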
For more information about the demo and procedure of the video processing workflow system, see Video processing workflow system.
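The split, parallel-transcode, and merge stages above map naturally onto a CloudFlow flow definition. The following snippet is an illustrative outline only: the step names, the `resourceArn` placeholders, and the `$.segments` input path are assumptions, and the exact Flow Definition Language (FDL) field names should be verified against the current CloudFlow documentation.

```yaml
version: v1
type: flow
steps:
  - type: task            # split the uploaded video into segments
    name: split
    resourceArn: acs:fc:<region>:<account-id>:services/<service>/functions/split
  - type: foreach         # transcode the segments in parallel
    name: transcodeSegments
    iterationMapping:
      collection: $.segments
      item: segment
    steps:
      - type: task
        name: transcode
        resourceArn: acs:fc:<region>:<account-id>:services/<service>/functions/transcode
  - type: task            # merge the transcoded segments
    name: merge
    resourceArn: acs:fc:<region>:<account-id>:services/<service>/functions/merge
```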
Benefits of a serverless solution
Improved engineering efficiency
| Item | Serverless solution | Conventional self-managed architecture |
| --- | --- | --- |
| Infrastructure | None. | You must purchase and manage infrastructure resources. |
| Development efficiency | You can focus on the development of business logic and use Serverless Devs to orchestrate and deploy resources. | In addition to developing business logic, you must build an online runtime environment, which includes software installation, service configuration, and security updates. |
| Parallel and distributed video processing | You can use CloudFlow to process multiple videos in parallel and process a large video in a distributed manner. Alibaba Cloud is responsible for stability and monitoring. | Strong development capabilities and a sound monitoring system are required to ensure the stability of the video processing system. |
| Training costs | You only need to write code in a programming language of your choice and be familiar with FFmpeg. | In addition to a programming language and FFmpeg, you may also need to use services such as Kubernetes and ECS, which requires learning more services, terms, and parameters. |
| Business cycle | Approximately 3 man-days, including 2 man-days for development and debugging and 1 man-day for stress testing and verification. | Approximately 30 man-days (excluding business logic development) for hardware procurement, software and environment configuration, system development, testing, monitoring and alerting, and canary release. |
Auto scaling and O&M-free
| Item | Serverless solution | Conventional self-managed architecture |
| --- | --- | --- |
| Elasticity and high availability | Function Compute scales out the underlying resources within milliseconds to handle traffic surges, delivers excellent transcoding performance, and eliminates the need for O&M. | To implement auto scaling, you must create a Server Load Balancer (SLB) instance, and scaling is slower than with Function Compute. |
| Monitoring, alerting, and queries | CloudFlow and Function Compute provide fine-grained execution metrics. You can query the latency and logs of each function execution, and a comprehensive monitoring and alerting mechanism is available. | Only metrics on auto scaling or containers are available. |
Excellent transcoding performance
For example, a cloud service takes 188 seconds to perform regular transcoding on an 89-second MOV video file to convert it to the .mp4 format. This duration of 188 seconds is used as the reference value T. The following formula is used to calculate the performance acceleration rate: Performance acceleration rate = T/Function Compute transcoding duration × 100%.
| Video segmentation interval (s) | Transcoding duration of Function Compute (s) | Performance acceleration rate (%) |
| --- | --- | --- |
| 45 | 160 | 117.5 |
| 25 | 100 | 188 |
| 15 | 70 | 268.6 |
| 10 | 45 | 417.8 |
| 5 | 35 | 537.1 |
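The acceleration rates in the table follow directly from the formula. A quick check, using the durations from the table:

```python
T = 188  # reference transcoding duration of the cloud service, in seconds

def acceleration_rate(fc_duration: float) -> float:
    """Performance acceleration rate as a percentage, one decimal place."""
    return round(T / fc_duration * 100, 1)

# Durations taken from the table above.
rates = {d: acceleration_rate(d) for d in (160, 100, 70, 45, 35)}
# 160 -> 117.5, 100 -> 188.0, 70 -> 268.6, 45 -> 417.8, 35 -> 537.1
```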
Higher cost-effectiveness
In specific scenarios, video processing in Function Compute costs less. The serverless solution also costs less than the video transcoding services of other cloud vendors.
This section uses the conversion between .mp4 and .flv files to compare the costs of using Function Compute and another cloud service. In this example, the memory of the functions in Function Compute is set to 3 GB. The following tables compare the costs. In the tables, Cost reduction rate = (Cost of another cloud service - Function Compute cost)/Cost of another cloud service × 100%.
Table 1. .mp4 to .flv
| Resolution | Bitrate | Frame rate | Transcoding duration of Function Compute | Transcoding cost of Function Compute | Cost of another cloud service | Cost reduction rate |
| --- | --- | --- | --- | --- | --- | --- |
| Standard definition (SD): 640 × 480 pixels | 889 KB/s | 24 | 11.2s | 0.003732288 | 0.032 | 88.3% |
| High definition (HD): 1280 × 720 pixels | 1963 KB/s | 24 | 20.5s | 0.00683142 | 0.065 | 89.5% |
| Ultra HD: 1920 × 1080 pixels | 3689 KB/s | 24 | 40s | 0.0133296 | 0.126 | 89.4% |
| 4K: 3840 × 2160 pixels | 11185 KB/s | 24 | 142s | 0.04732008 | 0.556 | 91.5% |
Table 2. .flv to .mp4
| Resolution | Bitrate | Frame rate | Transcoding duration of Function Compute | Transcoding cost of Function Compute | Cost of another cloud service | Cost reduction rate |
| --- | --- | --- | --- | --- | --- | --- |
| Standard definition (SD): 640 × 480 pixels | 712 KB/s | 24 | 34.5s | 0.01149678 | 0.032 | 64.1% |
| High definition (HD): 1280 × 720 pixels | 1806 KB/s | 24 | 100.3s | 0.033424 | 0.065 | 48.6% |
| Ultra HD: 1920 × 1080 pixels | 3911 KB/s | 24 | 226.4s | 0.0754455 | 0.126 | 40.1% |
| 4K: 3840 × 2160 pixels | 15109 KB/s | 24 | 912s | 0.30391488 | 0.556 | 45.3% |
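The cost reduction rates in both tables can be reproduced from the formula above. For example, using the SD rows of the two tables:

```python
def cost_reduction_rate(other_cost: float, fc_cost: float) -> float:
    """Cost reduction rate as a percentage, rounded to one decimal place."""
    return round((other_cost - fc_cost) / other_cost * 100, 1)

# SD row of Table 1 (.mp4 to .flv)
sd_mp4_to_flv = cost_reduction_rate(0.032, 0.003732288)  # 88.3
# SD row of Table 2 (.flv to .mp4)
sd_flv_to_mp4 = cost_reduction_rate(0.032, 0.01149678)   # 64.1
```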
In this example, the other cloud service charges based on the billing rules for regular transcoding, with a minimum billable duration of 1 minute for each video. The test videos are 2 minutes long. The cost reduction rate fluctuates by less than 10% even if a 1.5-minute video is used.
The preceding tables describe the costs of transcoding videos between the .flv and .mp4 formats. Transcoding .flv videos is more complex than transcoding .mp4 videos. As the tables show, the solution based on Function Compute and CloudFlow is more cost-effective in terms of computing resources than the solutions of other cloud service providers. In practice, the actual cost reduction is even more pronounced than the tables suggest, for the following reasons:
The test videos have high bitrates, whereas most videos in actual use are of SD or lower quality and have lower bitrates. Such videos require fewer computing resources, so transcoding in Function Compute takes less time and costs less. However, the pricing policies of general cloud transcoding services do not vary with video quality or bitrate.
If general cloud transcoding services are used, costs are higher for specific resolutions. For example, if you transcode a video that has a resolution of 856 × 480 pixels or 1368 × 768 pixels, the video is billed at the next higher resolution tier: an 856 × 480 video is billed as an HD (1280 × 720) video, and a 1368 × 768 video is billed as an ultra HD (1920 × 1080) video. The unit price for transcoding increases greatly, whereas the extra computing work is probably less than 30%. Function Compute avoids this issue because you pay only for the computing resources that you consume.
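The tier effect can be made concrete with a small sketch. The tier boundaries below mirror the example in the text (SD/HD/ultra HD billed at the 480p/720p/1080p lines) and are assumptions for illustration, not any specific vendor's published price list:

```python
# Hypothetical tiered billing: a video is billed at the lowest tier whose
# resolution bounds it, so 856x480 lands in the HD (1280x720) tier.

TIERS = [  # (name, max_width, max_height)
    ("SD", 640, 480),
    ("HD", 1280, 720),
    ("Ultra HD", 1920, 1080),
]

def billed_tier(width: int, height: int) -> str:
    """Return the billing tier for a video of the given resolution."""
    for name, w, h in TIERS:
        if width <= w and height <= h:
            return name
    return "4K"

# 856x480 exceeds the SD width limit, so it is billed as HD.
# 1368x768 exceeds the HD limits, so it is billed as Ultra HD.
```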
Operations and deployment
This section describes how to deploy a simple video processing system and a video processing workflow system in the serverless solution.
Simple video processing system
Prerequisites
Function Compute: Activate Function Compute.
OSS: Create buckets.
Procedure
Create a service.
Log on to the Function Compute console. In the left-side navigation pane, click Services & Functions.
In the top navigation bar, select a region. On the Services page, click Create Service.
In the Create Service panel, enter a service name and description, configure the parameters based on your business requirements, and then click OK.
In this example, the service role is set to AliyunFCDefaultRole, and the AliyunOSSFullAccess policy is attached to the role. For more information about how to create a service, see Create a service.
Create a function.
On the Functions page, click Create Function.
On the Create Function page, select a method to create the function, configure the following parameters, and then click Create.
Method to create the function: Use Built-in Runtime.
Basic Settings: Configure the basic information about the function, including Function Name and Handler Type. Set Handler Type to Event Handler.
Code: Set Runtime to Python 3.9 and Code Upload Method to Use Sample Code.
Advanced Settings: The processing of video files is time-consuming. In this example, vCPU Capacity is set to 4 vCPUs, Memory Capacity is set to 8 GB, Size of Temporary Disk is set to 10 GB, and Execution Timeout Period is set to 7200 seconds. The preceding settings are configured based on the size of videos to process.
Retain the default values for other parameters. For more information about how to create a function, see Create a function.
Create an OSS trigger.
On the function details page, click the Triggers tab, select a version or alias from the Version or Alias drop-down list, and then click Create Trigger.
In the Create Trigger panel, specify related parameters and click OK.
| Parameter | Description | Example |
| --- | --- | --- |
| Trigger Type | The type of the trigger. In this example, OSS is selected. | OSS |
| Name | The name of the trigger. | oss-trigger |
| Version or Alias | The version or alias. The default value is LATEST. If you want to create a trigger for another version or alias, select it from the Version or Alias drop-down list on the function details page. For more information about versions and aliases, see Manage versions and Manage aliases. | LATEST |
| Bucket Name | The bucket that you created. | testbucket |
| Object Prefix | The prefix of the object names that you want to match. To prevent unexpected costs generated by nested loops, we recommend that you configure both Object Prefix and Object Suffix. If you specify the same event type for different triggers of a bucket, the prefix or suffix must be unique. For more information, see Rules for triggering native OSS triggers. Important: The object prefix cannot start with a forward slash (/). Otherwise, the OSS trigger cannot be triggered. | source |
| Object Suffix | The suffix of the object names that you want to match. We recommend that you configure Object Prefix and Object Suffix to prevent additional costs generated by nested loops. If you specify the same event type for different triggers in a bucket, the object prefix or suffix must be unique. For more information, see Rules for triggering native OSS triggers. | mp4 |
| Trigger Event | One or more trigger events. For more information about the event types of OSS, see OSS events. | oss:ObjectCreated:PutObject, oss:ObjectCreated:PostObject, oss:ObjectCreated:CompleteMultipartUpload |
| Role Name | The name of a Resource Access Management (RAM) role. In this example, AliyunOSSEventNotificationRole is selected. Note: After you configure the preceding parameters, click OK. The first time you create a trigger of this type, click Authorize Now in the message that appears. | AliyunOSSEventNotificationRole |
Write function code.
On the function details page, click the Code tab to write code in the code editor.
The function converts .mp4 files to the .flv format and stores the .flv files in the `dest` directory of an OSS bucket. In this example, Python is used. The following sample code provides an example:

```python
# -*- coding: utf-8 -*-
import logging
import oss2
import os
import json
import subprocess
import shutil

logging.getLogger("oss2.api").setLevel(logging.ERROR)
logging.getLogger("oss2.auth").setLevel(logging.ERROR)
LOGGER = logging.getLogger()


def get_fileNameExt(filename):
    (_, tempfilename) = os.path.split(filename)
    (shortname, extension) = os.path.splitext(tempfilename)
    return shortname, extension


def handler(event, context):
    LOGGER.info(event)
    evt = json.loads(event)
    evt = evt["events"]
    oss_bucket_name = evt[0]["oss"]["bucket"]["name"]
    object_key = evt[0]["oss"]["object"]["key"]
    output_dir = "dest"
    dst_format = "flv"
    shortname, _ = get_fileNameExt(object_key)
    creds = context.credentials
    auth = oss2.StsAuth(creds.accessKeyId,
                        creds.accessKeySecret, creds.securityToken)
    oss_client = oss2.Bucket(auth, 'oss-%s-internal.aliyuncs.com' %
                             context.region, oss_bucket_name)
    exist = oss_client.object_exists(object_key)
    if not exist:
        raise Exception("object {} does not exist".format(object_key))
    input_path = oss_client.sign_url('GET', object_key, 6 * 3600)
    # M3U8 requires special handling.
    rid = context.request_id
    if dst_format == "m3u8":
        return handle_m3u8(rid, oss_client, input_path, shortname, output_dir)
    else:
        return handle_common(rid, oss_client, input_path, shortname,
                             output_dir, dst_format)


def handle_m3u8(request_id, oss_client, input_path, shortname, output_dir):
    ts_dir = '/tmp/ts'
    if os.path.exists(ts_dir):
        shutil.rmtree(ts_dir)
    os.mkdir(ts_dir)
    transcoded_filepath = os.path.join('/tmp', shortname + '.ts')
    split_transcoded_filepath = os.path.join(ts_dir, shortname + '_%03d.ts')
    cmd1 = ['ffmpeg', '-y', '-i', input_path,
            '-c:v', 'libx264', transcoded_filepath]
    cmd2 = ['ffmpeg', '-y', '-i', transcoded_filepath, '-c', 'copy',
            '-map', '0', '-f', 'segment',
            '-segment_list', os.path.join(ts_dir, 'playlist.m3u8'),
            '-segment_time', '10', split_transcoded_filepath]
    try:
        subprocess.run(
            cmd1, stdout=subprocess.PIPE, stderr=subprocess.PIPE, check=True)
        subprocess.run(
            cmd2, stdout=subprocess.PIPE, stderr=subprocess.PIPE, check=True)
        for filename in os.listdir(ts_dir):
            filepath = os.path.join(ts_dir, filename)
            filekey = os.path.join(output_dir, shortname, filename)
            oss_client.put_object_from_file(filekey, filepath)
            os.remove(filepath)
            print("Uploaded {} to {}".format(filepath, filekey))
    except subprocess.CalledProcessError as exc:
        # If transcoding fails, the exception triggers the dest-fail function.
        raise Exception(request_id + " transcode failure, detail: " + str(exc))
    finally:
        if os.path.exists(ts_dir):
            shutil.rmtree(ts_dir)
        # Remove the .ts file.
        if os.path.exists(transcoded_filepath):
            os.remove(transcoded_filepath)
    return {}


def handle_common(request_id, oss_client, input_path, shortname,
                  output_dir, dst_format):
    transcoded_filepath = os.path.join('/tmp', shortname + '.' + dst_format)
    if os.path.exists(transcoded_filepath):
        os.remove(transcoded_filepath)
    cmd = ["ffmpeg", "-y", "-i", input_path, transcoded_filepath]
    try:
        subprocess.run(
            cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, check=True)
        oss_client.put_object_from_file(
            os.path.join(output_dir, shortname + '.' + dst_format),
            transcoded_filepath)
    except subprocess.CalledProcessError as exc:
        # If transcoding fails, the exception triggers the dest-fail function.
        raise Exception(request_id + " transcode failure, detail: " + str(exc))
    finally:
        if os.path.exists(transcoded_filepath):
            os.remove(transcoded_filepath)
    return {}
```
Click Deploy.
Test the function code.
You can configure function input parameters to simulate OSS events and verify the code. In actual operations, the function is automatically triggered when a specified OSS event occurs.
On the function details page, click the Code tab, click the icon to the right of Test Function, and select Configure Test Parameters from the drop-down list.
In the Configure Test Parameters panel, click the Create New Test Event or Modify Existing Test Event tab, configure Event Name and the event content, and then click OK.
The following sample code provides an example of event configurations. For more information about the event parameter, see Step 2: Configure input parameters of the function.
{ "events": [ { "eventName": "oss:ObjectCreated:CompleteMultipartUpload", "eventSource": "acs:oss", "eventTime": "2022-08-13T06:45:43.000Z", "eventVersion": "1.0", "oss": { "bucket": { "arn": "acs:oss:cn-hangzhou:123456789:testbucket", "name": "testbucket", "ownerIdentity": "164901546557****" }, "object": { "deltaSize": 122539, "eTag": "688A7BF4F233DC9C88A80BF985AB****", "key": "source/a.mp4", "size": 122539 }, "ossSchemaVersion": "1.0", "ruleId": "9adac8e253828f4f7c0466d941fa3db81161****" }, "region": "cn-hangzhou", "requestParameters": { "sourceIPAddress": "140.205.XX.XX" }, "responseElements": { "requestId": "58F9FF2D3DF792092E12044C" }, "userIdentity": { "principalId": "164901546557****" } } ] }
Click Test Function. On the Code tab, view the execution result.
Video processing workflow system
Prerequisites
Activate Function Compute and create an OSS bucket
Function Compute: Activate Function Compute.
OSS: Create buckets.
CloudFlow: Activate CloudFlow.
NAS: Activate NAS.
Virtual Private Cloud (VPC): Activate VPC.
Configure a service role
AliyunFcDefaultRole: The default role of Function Compute. You must configure this role when you create a service. Attach the AliyunOSSFullAccess, AliyunFnFFullAccess, and AliyunFCInvocationAccess policies to AliyunFcDefaultRole so that the service can invoke functions, manage workflows, and manage OSS.
AliyunOSSEventNotificationRole: The default role that is used by OSS to send event notifications.
fnf-execution-default-role: The role required to create and manage workflows. You must attach the AliyunFCInvocationAccess and AliyunFnFFullAccess policies to the fnf-execution-default-role role.
Install and configure Serverless Devs
Procedure
This solution uses CloudFlow to orchestrate functions and implement a video processing system. The code and workflows of multiple functions need to be configured. In this example, Serverless Devs is used to deploy the system.
Run the following command to initialize an application:
s init video-process-flow -d video-process-flow
The following table describes the required configuration items. Specify the configuration items based on your business requirements.
| Parameter | Example |
| --- | --- |
| Region ID | cn-hangzhou |
| Service | video-process-flow-demo |
| Alibaba Cloud Resource Name (ARN) of a RAM role | acs:ram::10343546****:role/aliyunfcdefaultrole |
| OSS bucket name | testBucket |
| Prefix | source |
| Path to save transcoded videos | dest |
| ARN of the RAM role associated with an OSS trigger | acs:ram::10343546****:role/aliyunosseventnotificationrole |
| Segmentation interval | 30 |
| Video format after transcoding | .mp4, .flv, and .avi |
| Name of a workflow | video-process-flow |
| ARN of the RAM role associated with the workflow | acs:ram::10343546****:role/fnf-execution-default-role |
| please select credential alias | default |
Run the following command to go to the project and deploy the project:
cd video-process-flow && s deploy -y
The following content is returned if the deployment is successful.
```
[2023-08-31 13:22:21] [INFO] [S-CORE] - Project video-demo-flow successfully to execute

fc-video-demo-split:
  region: cn-hangzhou
  service:
    name: video-process-flow-wg76
  function:
    name: split
    runtime: python3
    handler: index.handler
    memorySize: 3072
    timeout: 600
fc-video-demo-transcode:
  region: cn-hangzhou
  service:
    name: video-process-flow-wg76
  function:
    name: transcode
    runtime: python3
    handler: index.handler
    memorySize: 3072
    timeout: 600
fc-video-demo-merge:
  region: cn-hangzhou
  service:
    name: video-process-flow-wg76
  function:
    name: merge
    runtime: python3
    handler: index.handler
    memorySize: 3072
    timeout: 600
fc-video-demo-after-process:
  region: cn-hangzhou
  service:
    name: video-process-flow-wg76
  function:
    name: after-process
    runtime: python3
    handler: index.handler
    memorySize: 512
    timeout: 120
fc-oss-trigger-trigger-fnf:
  region: cn-hangzhou
  service:
    name: video-process-flow-wg76
  function:
    name: trigger-fnf
    runtime: python3
    handler: index.handler
    memorySize: 128
    timeout: 120
  triggers:
    - type: oss
      name: oss-t
video-demo-flow:
  RegionId: cn-hangzhou
  Name: video-process-flow
```
Test the project.
Log on to the OSS console. Go to the source directory of testBucket and upload an .mp4 file.
Log on to the CloudFlow console. On the Flows page, click the target workflow. On the Executions tab, click the name of the execution to view the execution process and execution status.
If the status of the execution is Succeeded, you can go to the dest directory of testBucket to view the transcoded files.
If you see the transcoded file, the service of the video processing system is running as expected.
References
FAQ
I have deployed a video processing system on a virtual machine or container platform by using FFmpeg. How can I improve the elasticity and availability of the system?
You can easily migrate a system that is built on FFmpeg from a virtual machine or container platform to Function Compute. Function Compute supports running FFmpeg commands, the reconstruction cost is low, and the migrated system inherits the elasticity and high availability of Function Compute.
What can I do if I need to process a large number of videos in parallel?
For more information about the solution, see Video processing workflow system. When multiple videos are uploaded to OSS at the same time, Function Compute automatically scales out the resources to process videos in parallel. For more information, see refine fc-fnf-video-processing.
Hundreds of 1080p videos, each of which is more than 4 GB in size, are regularly generated every Friday, and I want to complete the processing of the videos within a few hours. How can I process such a large number of oversized videos at a time with high efficiency?
You can specify the size of video segments to ensure that the original oversized video has adequate computing resources for transcoding. Video segmentation can greatly improve transcoding efficiency. For more information about the solution, see refine fc-fnf-video-processing.
I want to record the transcoding details in my database each time a video is transcoded. I also want the popular videos to be automatically prefetched to CDN points of presence (POPs) after the videos are transcoded to relieve pressure on the origin server. How can I use such advanced custom processing features?
For more information about the solution, see Video processing workflow system. You can perform specific custom operations during media processing or perform additional operations based on the process. For example, you can add preprocessing steps before the process begins or add subsequent steps.
My custom video processing workflow contains multiple operations, such as transcoding videos, adding watermarks to videos, and generating GIF images based on video thumbnails. After that, I want to add more features to my video processing system, such as adjusting the parameters used for transcoding. I also hope that the existing online services provided by the system are not affected when new features are released. How can I achieve this goal?
For more information about the deployment scheme, see Video processing workflow system. CloudFlow orchestrates and calls only functions. Therefore, you need to only update the corresponding functions. In addition, versions and aliases can be used to perform canary releases of functions. For more information, see Manage versions.
I require only simple transcoding services or lightweight media processing services. For example, I want to obtain a GIF image that is generated based on the first few frames of a video, or query the duration of an audio file or a video file. In this case, building a custom media processing system is cost-effective. How can I achieve this goal?
Function Compute supports custom features. You can run specific FFmpeg commands to achieve your goal. For more information about the typical sample project, see fc-oss-ffmpeg.
My source video files are stored in NAS or on disks that are attached to ECS instances. I want to build a custom video processing system that can directly read and process my mezzanine video files, without migrating them to OSS. How can I achieve this goal?
You can integrate Function Compute with NAS to allow Function Compute to process the files that are stored in NAS. For more information, see Configure a NAS file system.