Welcome to Alibaba Cloud Intelligent Speech Interaction!
Intelligent Speech Interaction provides you with the following services and features:
Real-time speech recognition: recognizes long-running speech data streams. This service applies to uninterrupted speech recognition scenarios such as conference speeches and live streaming.
Short sentence recognition: recognizes short speech that lasts up to 1 minute. This service applies to short speech recognition scenarios such as chat conversations and voice command control.
Recording file recognition: recognizes the recording files that you upload. This service applies to scenarios where real-time recognition is not required.
Speech synthesis: converts text to natural-sounding speech. This service provides a variety of speakers in different languages, dialects, and voices. You can specify the speaker of the synthesized speech based on your business requirements. This service applies to virtual conversation scenarios such as intelligent customer services and outbound voice calls.
Long-text-to-speech synthesis: converts long text that contains up to 100,000 characters to natural-sounding speech. This service provides a variety of speakers in different languages, dialects, and voices. You can specify the speaker of the synthesized speech based on your business requirements. You can also cache and reuse the synthesized speech. This service applies to scenarios where you need the system to read content such as literature and news aloud.
Self-learning platform: provides hotword training and custom language models to help you improve the accuracy of the preceding recognition services.
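The following sketch shows one way to choose among the services above based on the limits stated in this topic. It is an illustration only and does not call Intelligent Speech Interaction. The function names, the constant names, and the returned labels are hypothetical and are not part of any SDK.

```python
from typing import Optional

# Limits taken from the service descriptions in this topic.
SHORT_SENTENCE_MAX_SECONDS = 60    # short sentence recognition: up to 1 minute
LONG_TEXT_MAX_CHARS = 100_000      # long-text-to-speech synthesis: up to 100,000 characters


def pick_recognition_service(duration_seconds: Optional[float], realtime: bool) -> str:
    """Return a label for the recognition service that matches the scenario.

    duration_seconds is None when the length of the audio is not known in
    advance, for example a live stream. The labels are illustrative names,
    not API identifiers.
    """
    if not realtime:
        # Uploaded recordings that do not need an immediate result.
        return "recording file recognition"
    if duration_seconds is not None and duration_seconds <= SHORT_SENTENCE_MAX_SECONDS:
        # Short utterances such as chat messages or voice commands.
        return "short sentence recognition"
    # Long or open-ended streams such as conference speeches and live streaming.
    return "real-time speech recognition"


def fits_long_text_synthesis(text: str) -> bool:
    """Check whether the text is within the documented 100,000-character limit."""
    return len(text) <= LONG_TEXT_MAX_CHARS
```

For example, pick_recognition_service(5, realtime=True) selects short sentence recognition for a five-second voice command, and pick_recognition_service(None, realtime=True) selects real-time speech recognition for a stream of unknown length.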
The Quick Start document describes how to activate Intelligent Speech Interaction, create a project, and use an SDK to call Intelligent Speech Interaction services. We recommend that you read the topics of this document in the following sequence: