By Defu, a wireless development expert at Alibaba Entertainment
For mobile developers, improving video startup speed has always been a focal point of client optimization. How can we achieve this? In addition to improving the network, raising the server bandwidth, and optimizing the frame rate and bit rate of the video files, we can optimize further in the following two ways:
1) By preloading video data. The mobile client preloads a certain amount of video data so that the player can immediately read local video data and start playing within seconds.
2) By caching while playing. The mobile client saves the video being played to the local cache. When the user replays the video, the player reads the video data from the local cache instead of downloading it from the network again, which reduces the video startup delay.
For convenience, most video players implement video data download and caching internally. These players do not expose their data callback interfaces, so the video data is unavailable to the service layer and the optimizations above cannot be applied.
This is where the local video proxy and cache solution comes in. This solution moves the player's built-in download logic to the service layer, leaving the player responsible only for receiving data, playback, and playback control.
The system player AVPlayer provides resourceLoader (exposed on the AVURLAsset that the player plays), which can take over the video data download as described previously.
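resourceLoader was originally intended for custom resource loading. When the system cannot process the resource of an AVURLAsset, it calls a series of delegate callbacks of resourceLoader to let the service layer handle the resource download. The key callback is the AVAssetResourceLoaderDelegate method resourceLoader:shouldWaitForLoadingOfRequestedResource:, which hands an AVAssetResourceLoadingRequest to the service layer.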
The AVAssetResourceLoadingRequest is a data request from AVURLAsset and contains request-related information, such as the requested data range. What we need to do is retrieve the corresponding data according to the request information, and then fill the data into the AVAssetResourceLoadingRequest to complete the request.
Through this process, the data download logic is handed over to the service layer. The service layer creates its own network requests to download data. The downloaded data is not only filled into the AVAssetResourceLoadingRequest for the player to play but also written to the local video cache. Each time an AVAssetResourceLoadingRequest is received, the service layer first tries to read the requested data from the local cache, so that cached video can start playing without waiting for the network.
The preceding solution is based on the system player API and depends on the interfaces provided by the system player. However, in actual business scenarios, other players may also be used in addition to the system player, such as the open-source player ijkplayer. These players do not necessarily provide the corresponding interfaces for the service layer to implement video data caching. For this reason, a solution that is independent of the player is especially important.
We therefore decided to create a local server for the Taopiaopiao client to implement video data caching. The client converts the network address of a video file into a local address and then passes it to the player. When the player requests data, the local server receives the request and takes over the data download and related logic.
The original player process is shown in Figure 1.
The process with the local server added is shown in Figure 2.
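The address conversion step can be illustrated with a minimal sketch. The host, port, path, and query parameter name below are hypothetical, since the article does not describe the actual URL scheme used by the local server.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical values: the article does not specify the port, path, or parameter name.
PROXY_HOST = "127.0.0.1"
PROXY_PORT = 8080
PROXY_PATH = "/video"
ORIGIN_PARAM = "origin"


def to_local_url(remote_url: str) -> str:
    """Wrap a remote video URL in a local proxy URL that the player can open."""
    query = urlencode({ORIGIN_PARAM: remote_url})
    return f"http://{PROXY_HOST}:{PROXY_PORT}{PROXY_PATH}?{query}"


def to_remote_url(local_url: str) -> str:
    """Recover the original URL when the local server receives a request."""
    params = parse_qs(urlsplit(local_url).query)
    return params[ORIGIN_PARAM][0]
```

The player only ever sees the local address. The local server calls something like to_remote_url to find out which remote file the request refers to.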
Based on this idea, we developed the AliSmartVideoCache components.
AliSmartVideoCache is a set of components that includes a local video proxy, a cache, and preloading. The overall architecture is shown in Figure 3.
AliSmartVideoCache exposes two components to the business side: the preload component and the local proxy component. When the device is idle, the preload component can download video data in advance. It lets you specify the amount of data to preload and the preload policy for different network environments. Videos to be preloaded are processed as a queue.
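As a rough illustration of a queue-based preloader with a per-network policy, consider the following sketch. The class names, network labels, and byte budgets are all assumptions made for the example and are not taken from AliSmartVideoCache.

```python
from collections import deque
from dataclasses import dataclass

# Hypothetical byte budgets per network type; real values depend on the business side.
PRELOAD_BYTES = {"wifi": 1024 * 1024, "cellular": 256 * 1024, "offline": 0}


@dataclass
class PreloadTask:
    url: str
    max_bytes: int


class PreloadQueue:
    """FIFO queue of preload tasks, sized according to the current network."""

    def __init__(self) -> None:
        self._tasks = deque()

    def enqueue(self, url: str, network: str) -> bool:
        budget = PRELOAD_BYTES.get(network, 0)
        if budget == 0:
            return False  # policy says: do not preload on this network
        self._tasks.append(PreloadTask(url, budget))
        return True

    def next_task(self):
        """Return the oldest pending task, or None when the queue is empty."""
        return self._tasks.popleft() if self._tasks else None
```

A worker that runs while the device is idle could repeatedly call next_task and download at most max_bytes for each video.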
The local proxy component is responsible for converting video addresses, determining the video resource format and the corresponding local proxy policy, and maintaining the local server.
Caching a video file is more complex because the cached data of a video file is often discontinuous. If a user plays a video from the beginning, the file is downloaded from the beginning, and the cached data is continuous. However, if the user drags the progress bar to an arbitrary position during playback, the downloaded data in the cache becomes discontinuous.
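To make the discontinuity concrete, the sketch below tracks cached data as half-open byte ranges and computes which parts of a requested range are still missing. The function names are illustrative and not part of the component described in the article.

```python
def merge_ranges(ranges):
    """Merge overlapping or adjacent half-open (start, end) byte ranges."""
    merged = []
    for start, end in sorted(ranges):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def missing_ranges(cached, start, end):
    """Return the parts of [start, end) that are not covered by cached ranges."""
    gaps = []
    cursor = start
    for c_start, c_end in merge_ranges(cached):
        if c_end <= cursor:
            continue  # this cached range lies entirely before the cursor
        if c_start >= end:
            break  # this and all later cached ranges lie after the request
        if c_start > cursor:
            gaps.append((cursor, c_start))
        cursor = max(cursor, c_end)
    if cursor < end:
        gaps.append((cursor, end))
    return gaps
```

For example, if playback cached bytes 0 to 1000 and a seek then cached bytes 5000 to 6000, a request for bytes 0 to 8000 leaves two gaps that still have to be downloaded.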
To keep as much downloaded data as possible in the cache, we developed the video caching component based on a single file. By maintaining a mapping table in the file header (see Figure 4), we established the mapping between the offset of the data in the original file and its offset in the local cache file. This allows discontinuous video data to be stored contiguously in the cache (similar to sparse files).
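The mapping idea can be sketched as a small in-memory index. This is only an illustration under assumed field names. The article does not describe the actual header layout, and serialization of the table into the file header is omitted here.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    source_offset: int  # position of the data in the original video file
    cache_offset: int   # position of the same data in the local cache file
    length: int


@dataclass
class CacheIndex:
    """In-memory form of the mapping table; header serialization is omitted."""
    segments: list = field(default_factory=list)
    next_cache_offset: int = 0

    def append(self, source_offset: int, length: int) -> Segment:
        """Record a newly downloaded block; it is stored at the end of the cache file."""
        seg = Segment(source_offset, self.next_cache_offset, length)
        self.segments.append(seg)
        self.next_cache_offset += length
        return seg

    def locate(self, source_offset: int):
        """Map an offset in the original file to an offset in the cache file, or None."""
        for seg in self.segments:
            if seg.source_offset <= source_offset < seg.source_offset + seg.length:
                return seg.cache_offset + (source_offset - seg.source_offset)
        return None
```

Because every new block is appended at the end of the cache file, the file itself has no holes even when the blocks come from scattered positions in the original video.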
After the local proxy receives a video download request, it first determines the format of the video based on the URL rule specified by the business side. For HLS video, the local proxy parses the playlist file and replaces the segment addresses. For other video formats, the local proxy searches the local cache, returns the existing data directly, and initiates a network request for the missing data. After obtaining data from the network, the local proxy returns the data to the player and at the same time decides whether to write the data to the local cache according to the local policy.
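The HLS branch can be sketched as a playlist rewrite. The example assumes that replacing a segment address means pointing it at the local proxy, and it uses a hypothetical proxy endpoint and parameter name. It handles only plain URI lines in a media playlist, not URIs embedded in tags such as EXT-X-KEY.

```python
from urllib.parse import urljoin, urlencode

# Hypothetical local proxy endpoint; not specified in the article.
PROXY_PREFIX = "http://127.0.0.1:8080/video?"


def rewrite_playlist(playlist_text: str, playlist_url: str) -> str:
    """Point every segment URI in an HLS media playlist at the local proxy."""
    out = []
    for line in playlist_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            out.append(line)  # tags, comments, and blank lines stay unchanged
            continue
        absolute = urljoin(playlist_url, stripped)  # resolve relative URIs
        out.append(PROXY_PREFIX + urlencode({"origin": absolute}))
    return "\n".join(out)
```

With the rewritten playlist, each segment request from the player also reaches the local proxy, which can then apply the same cache lookup as for other formats.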
Some details and special cases of the preceding approach need extra attention. For example, caching the beginning of a video helps improve the video startup speed. However, in some videos the moov box, which holds the metadata the player needs before it can start, is located at the end of the file. In this case, the local proxy has to jump to the end of the file, download a certain amount of data, and parse the moov metadata before the player can start playing. This reduces the startup gain achieved by caching the beginning of the video. To resolve this problem, we coordinate with the server side to perform a second transcoding pass that moves the moov box to the beginning of the file.
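Whether a file has this problem can be checked by walking its top-level MP4 boxes. The following is a minimal sketch that assumes the bytes passed in start at the beginning of the file. Each box starts with a 4-byte big-endian size and a 4-byte type. The function names are illustrative.

```python
import struct


def top_level_boxes(data: bytes):
    """Yield (type, offset, size) for each top-level MP4 box in data."""
    offset = 0
    while offset + 8 <= len(data):
        size, box_type = struct.unpack_from(">I4s", data, offset)
        if size == 1:  # 64-bit size stored after the type field
            if offset + 16 > len(data):
                break
            (size,) = struct.unpack_from(">Q", data, offset + 8)
        elif size == 0:  # box extends to the end of the file
            size = len(data) - offset
        if size < 8:
            break  # malformed box; stop instead of looping forever
        yield box_type.decode("ascii", "replace"), offset, size
        offset += size


def moov_before_mdat(data: bytes):
    """Return True/False if both boxes are seen, or None if one of them is missing."""
    order = [t for t, _, _ in top_level_boxes(data) if t in ("moov", "mdat")]
    if "moov" in order and "mdat" in order:
        return order.index("moov") < order.index("mdat")
    return None
```

If only the first part of a file is available and the result is None after an mdat box has been seen, the moov box most likely sits at the end of the file, which is the slow-start case described above.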
In addition, as mentioned earlier, dragging the progress bar changes the download policy. One case to consider here is a server that does not support range requests and never returns 206 Partial Content. In this case, the client cannot specify the required data position through the Range request header, so the response always starts from the beginning of the file. The download policy needs special handling for this situation.
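One possible way to handle this case is sketched below. The client always asks for a byte range, then checks whether the server answered with 206 or with a plain 200, and works out how many leading bytes of the body lie before the wanted position. The article does not say which policy the component actually uses, so the helper names and the skip-ahead strategy are assumptions.

```python
import re


def range_header(start: int, end=None) -> str:
    """Build the Range request header value for bytes start..end (inclusive)."""
    return f"bytes={start}-" if end is None else f"bytes={start}-{end}"


def bytes_to_skip(status: int, content_range, wanted_start: int) -> int:
    """Return how many leading body bytes lie before the wanted offset.

    206: the server honored the range; the body starts at the Content-Range start.
    200: the server ignored the range; the body starts at byte 0 of the file.
    """
    if status == 206:
        match = re.match(r"bytes (\d+)-(\d+)/(\d+|\*)", content_range or "")
        if not match:
            raise ValueError("206 response without a usable Content-Range")
        body_start = int(match.group(1))
    elif status == 200:
        body_start = 0
    else:
        raise ValueError(f"unexpected status {status}")
    if body_start > wanted_start:
        raise ValueError("response starts after the requested offset")
    return wanted_start - body_start
```

When the result is non-zero, the proxy could write the leading bytes to the cache instead of discarding them, since they belong to the beginning of the file.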
Also, careful handling is required for exceptions such as network request failure, retry, data error, and failures in creating or expanding cache objects due to insufficient disk space on the device.
After the local proxy component was combined with preloading logic at the service layer, both the system player and third-party players showed a significant increase in the rate of video startup within one second and in the effective view-through rate. The rate of video startup within one second now exceeds 95%.