By Alibaba Cloud Edge Plus
The Tokyo Olympics have come to an end. During the Games, hundreds of millions of viewers around the world flocked to broadcasting platforms to watch the events, which made the live broadcast capabilities of these platforms particularly important. Alibaba Cloud, the technology foundation of ApsaraVideo Live, has advantages in product technology, bandwidth resources, and service assurance. On this basis, it can provide end-to-end technical support and assurance for major live broadcast platforms and help deliver the best possible viewing experience. This article introduces how Alibaba Cloud implements its live broadcast technology.
According to a forecast by iiMedia Research, a third-party consultancy, the live video broadcast industry was in a stage of high-speed development from 2017 to 2020. In 2020, the industry generated over 1 trillion yuan in market revenue and covered a total of 526 million users.
The application scope of live video broadcast has expanded from pan-Internet industries, including video entertainment and e-commerce, to traditional industries such as online education, video security, radio and television media, and medical services. "Live broadcast +" has become a new trend. With its huge market potential, the live video broadcast industry is highly competitive and involves many participants. To attract more users, live broadcast providers must refine their live content, enrich their live broadcast scenarios, and innovate their marketing models. To achieve this, live broadcast platforms need to incorporate real-time interaction and short videos for a better overall effect.
Live broadcast providers that build their own live broadcast platforms face great challenges:
Alibaba Cloud ApsaraVideo Live is an audio and video live broadcast platform based on leading technologies, including content access and distribution network and large-scale distributed real-time video processing. It features easy access, low latency, and high concurrency, providing high-definition and smooth audio and video live broadcast services.
As shown in the preceding figure, a caster captures live content with capture devices and then uses the stream ingest SDK to push the live stream. The ApsaraVideo Live service pushes the live stream to the live center of Alibaba Cloud through edge stream ingest, and the video stream is accelerated through CDN edge nodes to ensure stable uplink transmission. After the video stream reaches the live center, the caster can process it as needed, for example, by transcoding the stream, time shifting it, recording it, or capturing snapshots of it.
The processed stream is delivered to client devices for playback through CDN nodes. Mobile players can be developed by integrating the player SDK provided by Alibaba Cloud. In addition to transcoding live streams and capturing snapshots, users can deliver recorded live streams to ApsaraVideo VOD by using the Live-to-VOD feature. In ApsaraVideo VOD, users can edit the recordings online into short videos and offer them as on-demand videos. This process links live streaming with the production and distribution of short videos.
Alibaba Cloud has over 2,800 edge cloud nodes and nine live centers around the world, which supports seamless business expansion overseas. Backed by Alibaba Cloud's global real-time transport network (GRTN) for audio and video, live streams from anywhere in the world can be ingested at the nearest access point and quickly transmitted to designated live centers through Express Connect for content delivery.
Alibaba Cloud's Narrowband HD technology intelligently analyzes the scenes, actions, content, textures, and other details in videos. In a football match, for example, different encoding optimization strategies are applied to different content such as the ball, the players, and the grass. This reduces the bitrate without degrading the output image and cuts bandwidth costs by 20% to 40%.
The image on the left shows normal transcoding, while the image on the right shows Narrowband HD transcoding. When the audience sees this picture, the focus is on the human face. With intelligent analysis, the system allocates more bits to the human face so that its details and texture are rendered more clearly. Now, let's look at the bitrate analysis. If the video image on the left is complex, the bitrate is between 1.5 Mbps and 2 Mbps.
When there are fewer details in the video image, for example, during the halftime break of a football match, intelligent recognition can be used to reduce bitrate consumption. With this technology, the overall bandwidth is reduced by 30% to 40% on average. In other words, bandwidth is saved while the images become clearer. This is Alibaba Cloud's Narrowband HD 2.0 technology.
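To make the savings concrete, the arithmetic can be sketched as follows. This is only an illustration: the function names, the 2,000 kbps starting bitrate, and the one million concurrent viewers are assumptions chosen for the example, not figures from Alibaba Cloud.

```python
def bitrate_after_saving(original_kbps: float, saving_percent: float) -> float:
    """Return the bitrate that remains after a percentage reduction."""
    if not 0 <= saving_percent <= 100:
        raise ValueError("saving_percent must be between 0 and 100")
    return original_kbps * (100 - saving_percent) / 100


def total_bandwidth_gbps(bitrate_kbps: float, viewers: int) -> float:
    """Aggregate delivery bandwidth in Gbps for a number of concurrent viewers."""
    return bitrate_kbps * viewers / 1_000_000


# Hypothetical scenario: a 2,000 kbps stream watched by 1,000,000 viewers.
baseline = total_bandwidth_gbps(2000, 1_000_000)
optimized = total_bandwidth_gbps(bitrate_after_saving(2000, 30), 1_000_000)
print(f"baseline: {baseline:.0f} Gbps, with a 30% saving: {optimized:.0f} Gbps")
```

Because delivery bandwidth scales linearly with both bitrate and audience size, a per-stream reduction translates directly into the same percentage reduction in total bandwidth.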
Alibaba Cloud has also developed a high-performance real-time video encoder called Ali S265, which supports high-quality real-time H.265 1080p transcoding, video enhancement algorithms, and image enhancement. Encoding in live video broadcast scenarios has a critical prerequisite: it must be real-time, which means that a one-hour video must be transcoded within one hour. More precisely, each second of video content must be transcoded within one second, continuously, so that transcoding keeps pace with the live stream.
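The real-time requirement can be expressed as a simple ratio check. The sketch below is not part of Ali S265; the function names and the sample timings are assumptions made for illustration.

```python
from typing import Iterable, Tuple


def real_time_factor(processing_seconds: float, media_seconds: float) -> float:
    """Ratio of time spent transcoding to the duration of the media transcoded."""
    if media_seconds <= 0:
        raise ValueError("media_seconds must be positive")
    return processing_seconds / media_seconds


def keeps_up(segment_timings: Iterable[Tuple[float, float]]) -> bool:
    """True only if every segment was transcoded at least as fast as it plays.

    Each item is (processing_seconds, media_seconds). Checking every segment,
    rather than the average, reflects the per-second requirement in the text.
    """
    return all(real_time_factor(p, m) <= 1.0 for p, m in segment_timings)


# Hypothetical timings for three one-second segments.
print(keeps_up([(0.80, 1.0), (0.95, 1.0), (0.70, 1.0)]))  # every segment is fast enough
print(keeps_up([(0.50, 1.0), (1.20, 1.0), (0.50, 1.0)]))  # one segment falls behind
```

The second call shows why an average is not sufficient: the three segments take 2.2 seconds in total for 3 seconds of video, yet the middle segment alone would stall a live stream.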
Ali S265 can achieve 1080p high-quality real-time transcoding for videos and use an image enhancement algorithm to enhance the image quality. In the example above, you can see that the details of the snowflakes on the tree behind the animal have been enhanced after being processed by Ali S265. On the basis of ensuring real-time transcoding and image quality, the image is processed by an enhancement algorithm to be clearer and more layered.
Based on ApsaraVideo Live, Real-time Streaming (RTS) optimizes underlying technologies such as end-to-end latency monitoring, CDN protocol transformation, and UDP. By integrating with ApsaraVideo Player SDK, it achieves millisecond-level latency among nodes in scenarios with tens of millions of concurrent requests. This improves on the 3 to 6 seconds of latency in traditional live broadcasts and provides low latency, less stalling, fast startup, and a smooth viewing experience. RTS has multiple technical advantages and can be widely used in various industry scenarios. Proven in practice with hundreds of customers, RTS brings great value to businesses.
Based on ApsaraVideo Live and ApsaraVideo Media Processing (MTS), the Production Studio service by Alibaba Cloud moves traditional video production tools to the cloud. It enriches the effects of directed videos by integrating video AI recognition, bilingual translation, and various interaction features. You can use the Production Studio service on demand without purchasing extra hardware. The service provides a production console, APIs, and Web SDKs, which you can use directly or build on for secondary development. The console is easy to use, which reduces learning costs.
In addition to live stream and on-demand video sources, multiple types of content sources, such as pictures, documents, and web pages, are supported. A maximum of six videos can be mixed and encoded at the same time. Capabilities such as multi-view, real-time image and text packaging components, multi-language subtitles, and video AI are provided. They help to package and produce live broadcasts at any time and synchronize them online with one click, creating a wonderful and immersive live broadcast experience.
The multi-location function combines and switches among multiple streams from multiple locations on different sites of the event. Videos from different locations are transmitted through video frame-level synchronous playback, enabling users to have multiple viewing angles at the same time and helping them enjoy all wonderful scenes. The virtual studio is realized by using the real-time automatic matting technology based on the depth algorithm, which supports multi-device, multi-location, and remote broadcast. Through cloud matting and synthesis capabilities, broadcast scenes such as dual screens, split screens, and picture in picture are realized, creating an immersive live broadcast experience.
This feature is used to gather multiple video programs, create live broadcast rooms similar to carousel studios, and diversify live broadcast scenarios and program forms. Users can add, remove, modify, and search for programs in an episode list and modify program content. Users can use this feature to implement business scenarios in a flexible, easy, and collaborative manner.
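The add, remove, modify, and search operations on an episode list can be pictured with a small in-memory model. This is a hypothetical sketch, not the Production Studio API: the `Program` and `EpisodeList` classes, their fields, and the sample URLs are all assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Program:
    program_id: str
    title: str
    source_url: str


@dataclass
class EpisodeList:
    programs: List[Program] = field(default_factory=list)

    def add(self, program: Program, position: Optional[int] = None) -> None:
        """Append a program, or insert it at a given position in the carousel."""
        if any(p.program_id == program.program_id for p in self.programs):
            raise ValueError(f"duplicate program id: {program.program_id}")
        if position is None:
            self.programs.append(program)
        else:
            self.programs.insert(position, program)

    def remove(self, program_id: str) -> bool:
        """Remove a program by ID and report whether anything was removed."""
        before = len(self.programs)
        self.programs = [p for p in self.programs if p.program_id != program_id]
        return len(self.programs) < before

    def modify(self, program_id: str, title: Optional[str] = None,
               source_url: Optional[str] = None) -> bool:
        """Update the title and/or source of an existing program."""
        for program in self.programs:
            if program.program_id == program_id:
                if title is not None:
                    program.title = title
                if source_url is not None:
                    program.source_url = source_url
                return True
        return False

    def search(self, keyword: str) -> List[Program]:
        """Case-insensitive title search."""
        needle = keyword.lower()
        return [p for p in self.programs if needle in p.title.lower()]


episodes = EpisodeList()
episodes.add(Program("p1", "Opening Ceremony", "https://example.com/opening.m3u8"))
episodes.add(Program("p2", "Football Final", "https://example.com/final.m3u8"))
episodes.modify("p2", title="Football Final (Replay)")
print([p.title for p in episodes.search("football")])
```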
Production Studio real-time subtitles, which integrate Production Studio with Damo Academy ASR and translation services, provide real-time multi-language speech-to-subtitle conversion for live streams. The feature supports long-term storage of translated subtitles during live recording and lets you set parameters such as font, background, effect, and display time. Templates are also available in multiple languages such as Chinese, English, French, Spanish, and Russian. As the live broadcast audio is converted to text, the translation is overlaid in real time and displayed in the live stream as subtitles.
Production Studio also supports the integration of live video clips, on-demand video clips, images, texts, dynamic H5 component materials, and AI capabilities. In this way, it reshapes the production process of video content, displays data in multiple dimensions, enriches content, expands traffic exposure, and generates revenue through advertisements.
The Video Review service is realized based on massive labeled data and deep learning algorithms. This service can accurately identify prohibited content in media files, including pornography, violence, terrorism, advertising, and unhealthy scenarios, in several dimensions, including voices, texts, and visual display. This service also supports the content review of videos, images, and files to ensure content security.
The stream ingest SDK of Alibaba Cloud is a powerful audio and video broadcast service based on the Content Delivery Network (CDN) and real-time audio and video communication technologies of Alibaba Cloud. It provides easy-to-use open APIs, a smooth and network-adaptive playback experience, multi-node low-latency optimization, and real-time retouching. Intelligent retouching is a technology that detects and recognizes large numbers of human faces based on intelligent vision algorithms. It provides capabilities such as retouching, face shaping, and makeup, as well as shooting filters and stickers.
The exclusive locating technology for facial key positions covers 106 basic positions and 280 high-accuracy positions, which makes effects realistic. The intelligent vision algorithm and real-time rendering technology are optimized on a regular basis for a better user experience. Face retouching and shaping effects, filters, stickers, and materials are constantly upgraded and enriched to make images more enjoyable. Comprehensive developer support ensures quick response to customer needs as well as excellent and reliable services.
ApsaraVideo Live supports access control, such as Referer and User-Agent (UA) blacklists/whitelists and IP blacklists/whitelists. It also supports playback center authentication and business remote authentication. Playback center authentication includes URL authentication for stream ingest and playback. Secure URL authentication supports customized authentication keys and expiration times, which are used to dynamically generate authentication URLs. Business remote authentication forwards the business request information to the customer's own authentication center for a validity check.
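The idea of generating an authentication URL from a key and an expiration time can be illustrated with a generic signing scheme. The sketch below is an assumption for teaching purposes and does not reproduce the actual ApsaraVideo Live URL format: the `auth_key` layout, the use of HMAC-SHA256, the function names, and the example domain are all hypothetical.

```python
import hashlib
import hmac
import time
from typing import Optional


def _signature(path: str, expires_at: str, secret_key: str) -> str:
    """HMAC-SHA256 over the path and expiry, hex encoded (hypothetical layout)."""
    message = f"{path}-{expires_at}".encode("utf-8")
    return hmac.new(secret_key.encode("utf-8"), message, hashlib.sha256).hexdigest()


def sign_url(base_url: str, path: str, secret_key: str, expires_at: int) -> str:
    """Append an auth_key of the form '<expiry>-<signature>' to the URL."""
    signature = _signature(path, str(expires_at), secret_key)
    return f"{base_url}{path}?auth_key={expires_at}-{signature}"


def verify_url(path: str, auth_key: str, secret_key: str,
               now: Optional[int] = None) -> bool:
    """Reject expired or tampered auth keys."""
    expiry_text, _, signature = auth_key.partition("-")
    if not expiry_text.isdigit():
        return False
    current = int(time.time()) if now is None else now
    if current > int(expiry_text):
        return False
    expected = _signature(path, expiry_text, secret_key)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8"))


# Hypothetical ingest URL that stays valid for 30 minutes.
expiry = int(time.time()) + 1800
url = sign_url("rtmp://push.example.com", "/live/stream1", "my-secret-key", expiry)
auth_key = url.split("auth_key=", 1)[1]
print(url)
print(verify_url("/live/stream1", auth_key, "my-secret-key"))
```

Because the signature covers both the path and the expiration time, a leaked URL cannot be reused for another stream or after it expires.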
Reliable and stable live broadcast is achieved by switching between active and standby streams, and the switching process is simple and easy to operate. ApsaraVideo Live also supports customized authentication through EdgeScript. Users can write authentication scripts that match their business requirements and then deploy and publish them quickly. EdgeScript runs on the CDN edge nodes used for live broadcast, so users do not need to manage the hardware configuration, regional deployment, scheduling, or automatic scaling of the machines. After a script is uploaded, it can be deployed to ApsaraVideo Live edge cloud nodes around the globe, and requests from all over the world are processed on these edge nodes according to the script logic.
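Switching between active and standby streams can be sketched as a small state machine driven by health reports. This is an illustrative assumption rather than the ApsaraVideo Live mechanism: the `StreamSelector` class, the stream names, and the failure threshold are hypothetical.

```python
class StreamSelector:
    """Pick between two ingest streams based on consecutive health failures."""

    def __init__(self, active: str, standby: str, max_failures: int = 3) -> None:
        self.current = active
        self._other = standby
        self.max_failures = max_failures
        # Both streams are assumed healthy until a failure is reported.
        self._failures = {active: 0, standby: 0}

    def report(self, stream: str, healthy: bool) -> str:
        """Record one health check and return the stream that should be served."""
        if stream not in self._failures:
            raise KeyError(f"unknown stream: {stream}")
        self._failures[stream] = 0 if healthy else self._failures[stream] + 1
        current_failed = self._failures[self.current] >= self.max_failures
        other_healthy = self._failures[self._other] == 0
        if current_failed and other_healthy:
            # Swap roles: the standby becomes active and vice versa.
            self.current, self._other = self._other, self.current
        return self.current


selector = StreamSelector("main", "backup", max_failures=2)
selector.report("main", healthy=False)
print(selector.report("main", healthy=False))  # the second failure reaches the threshold
```

Requiring several consecutive failures before switching avoids flapping on a single dropped health check, and the selector only switches when the other stream has no outstanding failures.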
Live video encryption is a cloud-device integrated video encryption solution that uses a proprietary cryptography algorithm to ensure the security of video stream transmission. It supports general-purpose DRM encryption, as well as multi-terminal, multi-platform, and comprehensive copyright protection. This encryption solution uses independent encryption keys to avoid a wide range of security problems caused by the leakage of a single key. It supports encryption transcoding and decryption for playback. With dynamic key management, this solution provides better protection for video resources and effectively prevents video leaks and hotlinks. With the application of digital watermarking technology in live videos, we can obtain evidence, trace the source, and investigate the responsible persons for infringement of copyrights in live broadcasts of major sports events.
ApsaraVideo Live provides real-time monitoring of the quality of live stream ingest, views, error status, viewers, playback traffic bandwidth, and playback quality in seconds. Users can detect the exceptions in the live broadcast process in a timely manner with ultra-low latency. Real-time log delivery is designed to deliver the logs of domains in ApsaraVideo Live to Log Service. Users can also analyze the logs to detect and identify issues related to stream ingest or formulate operation strategies based on the analysis of live stream audience.
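As an example of the kind of analysis that delivered logs make possible, the following sketch flags moments when an ingest stream's frame rate drops. The record fields (`stream`, `ts`, `fps`) and the threshold are assumptions for illustration and do not reflect the actual ApsaraVideo Live log schema.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Mapping, Union

Record = Mapping[str, Union[str, int, float]]


def find_low_fps(records: Iterable[Record], min_fps: float = 15.0) -> Dict[str, List[int]]:
    """Group the timestamps at which each stream's frame rate fell below min_fps."""
    flagged: Dict[str, List[int]] = defaultdict(list)
    for record in records:
        if record["fps"] < min_fps:
            flagged[record["stream"]].append(record["ts"])
    return dict(flagged)


# Hypothetical per-second ingest samples.
samples = [
    {"stream": "live/match1", "ts": 1, "fps": 25.0},
    {"stream": "live/match1", "ts": 2, "fps": 9.5},
    {"stream": "live/match2", "ts": 1, "fps": 30.0},
    {"stream": "live/match1", "ts": 3, "fps": 12.0},
]
print(find_low_fps(samples))
```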
Based on their applications, typical live video broadcast scenarios include live broadcasts of large sports events, pan-entertainment (shows, games, and social media), e-commerce, party activities, online education, and enterprises.
Disclaimer: The views expressed herein are for reference only and don't necessarily represent the official views of Alibaba Cloud.