Attribute name | Attribute type | Attribute value | Required | Description |
voice | String | The name of the voice that can be called. The value of the voice attribute can only contain lowercase letters, such as siyue. | No | This attribute is included in the proprietary tag of Alibaba Cloud for speech synthesis. This attribute specifies the voice that is used for speech synthesis. The specified voice has a higher priority than the voice that is specified by the voice parameter in an API request. For more information, see Intelligent voice samples. |
encodeType | String | PCM/WAV/MP3 | No | This attribute is included in the proprietary tag of Alibaba Cloud for speech synthesis. This attribute specifies the audio file format for speech synthesis. The specified audio file format has a higher priority than the audio file format that is specified by the format parameter in an API request. |
sampleRate | String | 8000/16000/24000/48000 | No | This attribute is included in the proprietary tag of Alibaba Cloud for speech synthesis. This attribute specifies the audio sampling rate for speech synthesis. The specified audio sampling rate has a higher priority than the audio sampling rate that is specified by the sample_rate parameter in an API request. |
rate | String | Valid values: an integer ranging from -500 to 500. Default value: 0. | No | This attribute is included in the proprietary tag of Alibaba Cloud for speech synthesis. This attribute specifies the audio speed for speech synthesis. The specified audio speed has a higher priority than the audio speed that is specified by the speech_rate parameter in an API request. |
pitch | String | Valid values: an integer ranging from -500 to 500. Default value: 0. | No | This attribute is included in the proprietary tag of Alibaba Cloud for speech synthesis. This attribute specifies the audio pitch for speech synthesis. The specified audio pitch has a higher priority than the audio pitch that is specified by the pitch_rate parameter in an API request. |
volume | String | Valid values: an integer ranging from 0 to 100. Default value: 50. | No | This attribute is included in the proprietary tag of Alibaba Cloud for speech synthesis. This attribute specifies the audio volume for speech synthesis. The specified audio volume has a higher priority than the audio volume that is specified by the volume parameter in an API request. |
effect | String | robot/lolita/lowpass/echo/eq/lpfilter/hpfilter | No | This attribute is included in the proprietary tag of Alibaba Cloud for speech synthesis. This attribute can be used to produce various sound effects for the synthesized speech. Valid values: robot (robot voice), lolita (little girl voice), lowpass (low-pass effect), echo (echo effect), eq (equalizer), lpfilter (low-pass filter), hpfilter (high-pass filter). Note: The eq, lpfilter, and hpfilter values specify advanced filters. If you set this attribute to eq, lpfilter, or hpfilter, you can configure the effectValue attribute to specify a custom effect for the specified filter. An SSML structure supports only one sound effect. You cannot set this attribute to multiple values. If you configure this attribute, the system latency may increase. |
effectValue | String | The effect of a specific filter. If you set the effect attribute to eq, lpfilter, or hpfilter, you can configure this attribute to modify the default effect of the specified filter. | No | eq: specifies the equalizer. The system provides eight default bands. Frequencies: ["40Hz", "100Hz", "200Hz", "400Hz", "800Hz", "1600Hz", "4000Hz", "12000Hz"]; bandwidths: ["1.0q", "1.0q", "1.0q", "1.0q", "1.0q", "1.0q", "1.0q", "1.0q"]. If you configure this attribute, you must specify a gain for each band. The input value is a string of eight integers separated by spaces, for example 1 1 1 1 1 1 1 1. Each gain ranges from -20 dB to 20 dB. The value 0 indicates that the gain of the band is not adjusted. lpfilter: the frequency of the low-pass filter. The value is an integer in the range (0, requested sampling rate/2]. For example, you can set the effectValue attribute to 800. hpfilter: the frequency of the high-pass filter. The value is an integer in the range (0, requested sampling rate/2]. For example, you can set the effectValue attribute to 1200. |
bgm | String | The name of the background music (BGM) that can be called online. For more information, see the description of the bgm attribute. | No | This attribute is included in the proprietary tag of Alibaba Cloud for speech synthesis. This attribute specifies the BGM of the synthesized speech. |
backgroundMusicVolume | String | Valid values: an integer ranging from 0 to 100. Default value: 50. | No | This attribute is included in the proprietary tag of Alibaba Cloud for speech synthesis. This attribute specifies the volume of the BGM. |
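
The value constraints in the table above can be checked on the client before a request is sent. The following Python sketch is illustrative only and is not part of any Alibaba Cloud SDK. It assumes that these attributes are set on the SSML root element <speak>, which this table does not state explicitly. The names build_speak, check_effect_value, and request_sample_rate are hypothetical. request_sample_rate stands in for the sample_rate value of the API request and is used for the filter range check only when the sampleRate attribute is not set.

```python
import re
from xml.sax.saxutils import escape, quoteattr

# Enumerated values and integer ranges copied from the attribute table.
ENUMS = {
    "encodeType": {"PCM", "WAV", "MP3"},
    "sampleRate": {"8000", "16000", "24000", "48000"},
    "effect": {"robot", "lolita", "lowpass", "echo", "eq", "lpfilter", "hpfilter"},
}
INT_RANGES = {
    "rate": (-500, 500),
    "pitch": (-500, 500),
    "volume": (0, 100),
    "backgroundMusicVolume": (0, 100),
}
ALLOWED = set(ENUMS) | set(INT_RANGES) | {"voice", "effectValue", "bgm"}


def check_effect_value(effect, value, sample_rate):
    """Validate effectValue for the eq, lpfilter, and hpfilter effects."""
    if effect == "eq":
        gains = [int(g) for g in value.split()]
        if len(gains) != 8 or not all(-20 <= g <= 20 for g in gains):
            raise ValueError("eq expects eight integer gains in [-20, 20]")
    elif effect in ("lpfilter", "hpfilter"):
        if not 0 < int(value) <= sample_rate / 2:
            raise ValueError("filter frequency must be in (0, sample_rate/2]")
    else:
        raise ValueError("effectValue requires effect to be eq, lpfilter, or hpfilter")


def build_speak(text, request_sample_rate=16000, **attrs):
    """Hypothetical helper: validate attributes and render a <speak> element.

    Assumes <speak> is the element that carries these attributes.
    """
    for name, raw in attrs.items():
        value = str(raw)
        if name not in ALLOWED:
            raise ValueError(f"unknown attribute: {name}")
        if name in ENUMS and value not in ENUMS[name]:
            raise ValueError(f"invalid {name}: {value}")
        if name in INT_RANGES:
            low, high = INT_RANGES[name]
            if not low <= int(value) <= high:
                raise ValueError(f"{name} must be in [{low}, {high}]")
        if name == "voice" and not re.fullmatch(r"[a-z]+", value):
            raise ValueError("voice may contain only lowercase letters")
    if "effectValue" in attrs:
        # Use the sampleRate attribute if present, else the assumed request value.
        rate = int(attrs.get("sampleRate", request_sample_rate))
        check_effect_value(attrs.get("effect"), str(attrs["effectValue"]), rate)
    rendered = "".join(f" {k}={quoteattr(str(v))}" for k, v in attrs.items())
    return f"<speak{rendered}>{escape(text)}</speak>"


print(build_speak("Hello", voice="siyue", rate=100))
print(build_speak("Hello", effect="eq", effectValue="1 1 1 1 1 1 1 1"))
```

In this sketch, an out-of-range value such as volume=101, or an hpfilter frequency of 9000 at a sampling rate of 16000, raises ValueError instead of producing markup. Whether the service itself rejects or clamps such values is not covered by the table.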