This topic describes how to record videos for custom 2D avatars. Before you record a video for a custom 2D avatar, read the guide to familiarize yourself with the specific requirements on equipment, site, model, recording, and video processing. This ensures that the custom 2D avatar meets your business requirements.
Preparations
Site
Find a quiet location for recording. Avoid sites where construction work is scheduled nearby on the day of the recording.
Make sure that the lighting equipment provides even illumination with sufficient brightness. Avoid shadows or overexposure.
If you need to perform chroma keying during post-production, use a green screen as the background for recording, and make sure that the green screen is smooth and free of wrinkles. We recommend that the model be more than 2.5 meters away from the green screen. If you need to record a full body, cover the floor with a green screen.
If you record the video on location, make sure that the background is static and free of real or lifelike human figures to avoid affecting the final output.
Equipment
Select a smartphone or camera that can record videos at a resolution of 1080p or higher and a frame rate of at least 30 FPS.
Use a professional tripod or phone mount to secure your smartphone or camera, and turn off the auto-focus feature to ensure the footage is stable and in focus.
If you use a camera, you can install a tablet-based or professional teleprompter. If you use a smartphone, you can download a teleprompter app. Make sure that the model maintains eye contact with the camera during the recording.
Model
The model must wear clean clothes made of non-reflective fabric. To reduce the difficulty of chroma keying, the color of the clothing cannot be similar to the green screen.
The hairstyle of the model must be neat and cannot cover the face, features, or neck of the model. Use hair gel or other hair products when necessary to keep hair in place and to avoid affecting the chroma keying effect.
The makeup of the model must be clean and free of shine. If glasses are required, use contact lenses or small-framed glasses to prevent reflection on the lenses during the recording.
Before the recording, the model must read through the script (approximately 1,500 words) to avoid frequent pauses and stumbles during the recording. We recommend that the model use a familiar script to ensure a natural performance. Make sure that the content of the script for the on-site recording is not repetitive and can be read continuously at a normal pace for more than 3 minutes.
Recording process
Test recording
Before the official recording, you can conduct a test recording. Check the following items:
Framing: The model must be proportionally placed in the frame, with a level gaze (not looking up or down, and not tilted) and clear facial features. If hand gestures are required, the model must remain within the frame.
Teleprompter: The speed of the teleprompter must match the speaking pace of the model. The teleprompter and the camera lens must be positioned in the same place to prevent the eyes of the model from wandering.
Audio: The background noise must be low, the voice must be clear, and the audio must be in sync with the video.
Performance: The model must be formal and natural and avoid stiffness in expressions or movements.
Note: If the model does not have extensive experience in recording talking videos, we recommend that the model be seated, as shown in the following figure. This helps the model maintain a natural and relaxed state and avoid stiffness.
Official recording
Make sure the site is quiet and clear of irrelevant personnel before you start the recording. The recording process is expected to last 5 to 15 minutes. The recording must be done in one continuous take without pauses or post-production editing.
The following section describes the recording steps:
15 seconds of silence: After the camera is turned on, the model must remain silent for about 15 seconds, place his or her hands in front of the body, and maintain direct eye contact with the camera. Blinking is allowed, but other facial or hand movements, such as opening the mouth, are not allowed.
5 to 15 minutes of speaking: Start the teleprompter. The model must look directly into the camera to record the speaking video. The enunciation must be clear. The facial expressions and the head and hand movements of the model can be designed based on the requirements of the final video. The amplitude and frequency of head movements (turning left and right, nodding up and down) cannot be excessively large. Avoid actions with clear connotations, such as shaking the head, waving, or shrugging to show helplessness. If the model needs to raise hands, make sure that the face or neck of the model is not blocked. Avoid licking lips, sticking out the tongue, pouting, or other exaggerated facial expressions. During pauses in speaking, keep the lips closed. If it is difficult to make gestures, place the hands naturally in front of the body or on a table.
Note: The facial expressions and movements of the final avatar will be exactly the same as those of the model in the video.
Post-recording processing
If the position of the model at the beginning of the recording is different from the position at the end of the recording, make sure that the unusable parts are edited out before you submit the video. No editing is allowed in the middle of the video.
If you use a green screen as the background and want to overlay a custom background on the avatar for the final video output, remove the background color. Requirements for removing the background:
We recommend that you do not rely entirely on automatic keying tools, including various software, SDKs, APIs, and web-based tools. After you use an automatic keying tool, perform a manual check and correction. Potential issues of automatic keying include over or under keying, jagged edges, and reduced video clarity.
The edges of the keyed-out area must be clear and smooth. The foreground and the background must be separated.
The transparency (alpha) channel must be as close to binary as possible: the foreground must be all white (255), and the background must be all black (0). Semi-transparent values are acceptable only where they cannot be avoided, such as around the hair, and must be kept to a minimum.
Maintain continuity between frames so that no flickering occurs, in which one frame has more or less of the edge than the adjacent frames. This is a common issue with automatic keying.
Adjust the color tone as needed. If the green screen reflection on the face and body of the model is severe, correct the color tone after keying.
After keying, use video editing software to overlay solid color background images and confirm that the keying result is precise and meets the requirements. You can use a light color, red, or dark color background. We recommend that you try different backgrounds.
If excessive noise exists, import the video into editing software to reduce noise before uploading.
Do not use effects that alter the face shape or features, such as face slimming or eye enlargement. When you export the video, pay attention to parameters such as the bitrate to ensure the exported video maintains high clarity.
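The alpha channel and frame continuity requirements above can be spot-checked with a short script after keying. The following is a minimal sketch, not a prescribed tool. It assumes that you have already extracted the alpha matte of each frame as rows of integer values from 0 to 255. The function names and the 2% flicker threshold are illustrative assumptions, not values from this guide.

```python
from typing import List, Sequence

# One frame's alpha matte: rows of integer alpha values in the range 0..255.
Matte = Sequence[Sequence[int]]


def semi_transparent_ratio(matte: Matte) -> float:
    """Return the fraction of pixels whose alpha is neither 0 nor 255."""
    total = 0
    soft = 0
    for row in matte:
        for alpha in row:
            total += 1
            if 0 < alpha < 255:
                soft += 1
    return soft / total if total else 0.0


def foreground_area(matte: Matte) -> int:
    """Count the fully opaque (255) pixels in one frame."""
    return sum(1 for row in matte for alpha in row if alpha == 255)


def flicker_frames(mattes: Sequence[Matte], max_jump: float = 0.02) -> List[int]:
    """Return indices of frames whose foreground area differs from the
    previous frame by more than max_jump (relative change).

    max_jump is a hypothetical threshold; tune it for your footage.
    """
    flagged = []
    for i in range(1, len(mattes)):
        prev = foreground_area(mattes[i - 1])
        cur = foreground_area(mattes[i])
        if prev and abs(cur - prev) / prev > max_jump:
            flagged.append(i)
    return flagged
```

A high semi-transparent ratio or a long list of flagged frames suggests that the keying result needs manual correction. A real gesture also changes the foreground area, so treat flagged frames as candidates for visual review rather than as confirmed defects.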
Submitted files
The files that you must submit consist of a complete video file and a preview image (the avatar file).
Video file
The video file must have an aspect ratio of 16:9 (landscape) or 9:16 (portrait), a resolution of 1080p, a frame rate of 30 FPS, and a duration of 5 to 15 minutes. The file size cannot exceed 40 GB.
If you need to overlay the video on a custom background, remove the original background and export the video file with an alpha channel in WebM or MOV format. Note: MOV files must be ProRes encoded. If you use the background from the on-location recording, export the video in MP4 format without an alpha channel.
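The video file requirements above can be expressed as a pre-submission check. The following is a minimal sketch under stated assumptions: VideoInfo is a hypothetical record that you fill in from any media probing tool, 1080p is interpreted as 1920x1080 (landscape) or 1080x1920 (portrait), 40 GB is read as 40 GiB, and a tolerance of 0.5 FPS is allowed around 30 FPS. None of these interpretations is confirmed by the platform.

```python
from dataclasses import dataclass
from typing import List

GIB = 1024 ** 3  # assumption: "40 GB" is read as 40 GiB


@dataclass
class VideoInfo:
    """Hypothetical metadata record; fill it from any probing tool."""
    width: int
    height: int
    fps: float
    duration_s: float
    size_bytes: int
    container: str  # "mp4", "mov", or "webm"
    codec: str      # for example "h264", "vp9", or "prores"
    has_alpha: bool


def check_video(info: VideoInfo) -> List[str]:
    """Return a list of problems; an empty list means all checks pass."""
    problems = []
    if (info.width, info.height) not in {(1920, 1080), (1080, 1920)}:
        problems.append("resolution must be 1920x1080 or 1080x1920")
    if abs(info.fps - 30.0) > 0.5:  # tolerance is an assumption
        problems.append("frame rate must be 30 FPS")
    if not 300 <= info.duration_s <= 900:
        problems.append("duration must be 5 to 15 minutes")
    if info.size_bytes > 40 * GIB:
        problems.append("file size must not exceed 40 GB")
    container = info.container.lower()
    if info.has_alpha:
        # Keyed video for a custom background: WebM or ProRes-encoded MOV.
        if container not in {"webm", "mov"}:
            problems.append("alpha video must be WebM or MOV")
        elif container == "mov" and "prores" not in info.codec.lower():
            problems.append("MOV files must be ProRes encoded")
    elif container != "mp4":
        # On-location background: MP4 without an alpha channel.
        problems.append("video without alpha must be MP4")
    return problems
```

For example, check_video(VideoInfo(1920, 1080, 30.0, 600.0, 10 * GIB, "mp4", "h264", False)) returns an empty list, because a 10-minute landscape MP4 without an alpha channel satisfies every rule in this sketch.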
Avatar file
The avatar file is required for previewing and matching avatars. The avatar image must have an aspect ratio of 1:1.
If you use a green screen for recording, the background must be removed and the image must be exported as a PNG file with an alpha channel.
If you record videos on location, you do not need to remove the background.
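The avatar image rules above (1:1 aspect ratio, and a PNG with an alpha channel when a green screen was used) can be checked by reading the PNG header. The following is a minimal sketch that uses only the Python standard library. The function names are illustrative, and the sketch assumes that the alpha channel is stored as PNG color type 4 (grayscale with alpha) or 6 (RGBA). Palette images with a transparency chunk are not handled.

```python
import struct
from typing import List, Tuple

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_png_header(data: bytes) -> Tuple[int, int, int]:
    """Return (width, height, color_type) from the IHDR chunk of PNG bytes."""
    if len(data) < 26 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    # IHDR data starts at byte 16: width, height, bit depth, color type.
    width, height, _bit_depth, color_type = struct.unpack(">IIBB", data[16:26])
    return width, height, color_type


def check_avatar(data: bytes, green_screen: bool) -> List[str]:
    """Return a list of problems; an empty list means all checks pass."""
    width, height, color_type = read_png_header(data)
    problems = []
    if width != height:
        problems.append("avatar image must have a 1:1 aspect ratio")
    # Color types 4 and 6 carry a full alpha channel.
    if green_screen and color_type not in (4, 6):
        problems.append("keyed avatar image must have an alpha channel")
    return problems
```

To use the sketch, read the image file in binary mode and pass its bytes to check_avatar, setting green_screen to True only if the background was removed.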
Video file checklist
Before you submit a video, check each item to ensure that the video meets the requirements.
Image
The head of the model must be upright in the video, with the top of the head toward the top of the frame. The video cannot be rotated or upside down.
The expression and posture of the model must be natural and relaxed.
No other face (including the faces of real people or images of people on objects) can appear in the video.
The body and head of the model cannot make significant movements.
The head and hand movements of the model must remain within the frame. The hands cannot cover the face or neck.
The face of the model must be evenly lit, with clear features and facial contours, and not obscured by hair.
The model must look directly at the camera, with normal blinking frequency, and no wandering or squinting.
During the first 15 seconds of silence, the lips of the model must be closed, and the facial expression must be natural.
When speaking, the model cannot stutter and the speech must be clear, with natural lip movement and visible teeth. Lips must be closed during pauses.
The video cannot be edited or spliced. No obvious frame skipping can occur.
Significant effects that alter the appearance, such as face slimming or eye enlargement, cannot be used.
If background removal is performed, no over-removal or mis-removal can occur. A solid color background must be added, with no color gradients at the edges of the removed area.
Audio
The audio must be clear and synchronized with the video. The audio must have low noise and no noticeable echo or reverberation.
The model should not cough, clear his or her throat, or make other distracting sounds. No other voices can be heard.
The pronunciation must be standard. Dialects cannot be used.