Qwen-Audio is a large audio language model developed by Alibaba Cloud. It accepts a variety of audio inputs, including human speech, natural sounds, music, and singing, and produces text output. Qwen-Audio-Chat is an AI voice assistant built on Qwen-Audio by applying alignment techniques to the underlying large language model. It supports flexible interaction such as multiple audio inputs, multi-round question answering, and creative tasks.
This article describes how to quickly build an AI voice assistant service on an Alibaba Cloud AMD-based ECS instance using the OpenAnolis AI container service.
When you create the ECS instance, select the instance type based on the size of the model. Inference requires a significant amount of computing resources and runtime memory, so the recommended instance type is ecs.g8a.4xlarge. Qwen-Audio-Chat also needs to download multiple model files, so allocate at least 100 GB of storage when you create the instance. To speed up environment installation and model downloads, set the instance bandwidth to 100 Mbit/s.
The operating system chosen is Alibaba Cloud Linux 3.2104 LTS 64-bit.
For more information about how to install Docker on Alibaba Cloud Linux 3, see Install and use Docker (Linux). After the installation is completed, make sure that the Docker daemon has been enabled.
systemctl status docker
The OpenAnolis community provides a variety of container images based on Anolis OS, including AMD-optimized PyTorch images. You can use these images to create a PyTorch runtime environment.
docker pull registry.openanolis.cn/openanolis/pytorch-amd:1.13.1-23-zendnn4.1
docker run -d -it --name pytorch-amd --net host -v $HOME:/root registry.openanolis.cn/openanolis/pytorch-amd:1.13.1-23-zendnn4.1
The commands above pull the container image and then use it to create a container named pytorch-amd. The container runs in detached mode (-d), shares the host network (--net host), and mounts your home directory at /root so that your work is preserved outside the container.
After the PyTorch container is created and run, run the following command to access the container environment:
docker exec -it -w /root pytorch-amd /bin/bash
You must run all subsequent commands in the container environment. If you exit unexpectedly, re-enter the container environment. To check whether the current environment is a container, run the following command:
cat /proc/1/cgroup | grep docker
# Any output indicates that you are inside the container environment
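The grep check above depends on how the host exposes cgroups. On hosts that use cgroup v2, /proc/1/cgroup inside a container may not mention docker at all, so the command can print nothing even though you are inside the container. The following is a minimal sketch of a slightly more tolerant check that also looks for the /.dockerenv marker file that Docker places in a container's root filesystem. The function name in_container and its two optional arguments are hypothetical and exist only for illustration.

```shell
# Hypothetical helper: succeed if the environment looks like a Docker container.
# Both arguments are optional overrides, mainly useful for testing.
in_container() {
    cgroup_file=${1:-/proc/1/cgroup}
    marker_file=${2:-/.dockerenv}
    # Docker creates /.dockerenv in the root filesystem of each container.
    [ -e "$marker_file" ] && return 0
    # On cgroup v1 hosts, the cgroup paths of PID 1 contain "docker".
    grep -q docker "$cgroup_file" 2>/dev/null
}

if in_container; then
    echo "inside a container"
else
    echo "not inside a container (or detection is inconclusive)"
fi
```

If the check reports that you are not inside a container, re-enter it with the docker exec command shown earlier before continuing.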
Before deploying Qwen-Audio-Chat, you need to install some required software.
yum install -y git git-lfs wget xz gperftools-libs anolis-epao-release
Downloading the pre-trained model requires Git LFS, so enable it first.
git lfs install
Download the GitHub project source code and the pre-trained model.
git clone https://github.com/QwenLM/Qwen-Audio.git
git clone https://www.modelscope.cn/qwen/Qwen-Audio-Chat.git qwen-audio-chat
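If Git LFS was not enabled when the model repository was cloned, the large weight files are left as small text pointer files instead of real content, and the model then fails to load. The sketch below shows one way to spot this. The helper name is_lfs_pointer and the MODEL_DIR variable are hypothetical; the directory defaults to the qwen-audio-chat clone target used above.

```shell
# Hypothetical helper: succeed if FILE is a Git LFS pointer rather than real content.
is_lfs_pointer() {
    # A pointer file is a short text file whose first line names the LFS spec.
    head -c 64 "$1" 2>/dev/null | grep -q '^version https://git-lfs.github.com/spec/v1'
}

# Assumed location of the cloned model; adjust if you cloned it elsewhere.
model_dir=${MODEL_DIR:-$HOME/qwen-audio-chat}

for f in "$model_dir"/*; do
    [ -f "$f" ] || continue
    if is_lfs_pointer "$f"; then
        echo "not downloaded yet: $f"
    fi
done
```

If any files are listed, running git lfs pull inside the model directory should fetch the real content.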
Before deploying the Python environment, you can change the pip download source to speed up the download of dependency packages.
mkdir -p ~/.config/pip && cat > ~/.config/pip/pip.conf <<EOF
[global]
index-url=http://mirrors.cloud.aliyuncs.com/pypi/simple/
[install]
trusted-host=mirrors.cloud.aliyuncs.com
EOF
Install Python runtime dependencies.
yum install -y python3-transformers python-einops
pip install typing_extensions==4.5.0 tiktoken transformers_stream_generator accelerate gradio
Install ffmpeg.
wget https://johnvansickle.com/ffmpeg/releases/ffmpeg-6.1-amd64-static.tar.xz
tar -xf ffmpeg-6.1-amd64-static.tar.xz
cp ffmpeg-6.1-amd64-static/{ffmpeg,ffprobe} /usr/local/bin
rm -rf ffmpeg-6.1-amd64-static*
To ensure that ZenDNN can make full use of the CPU's computing power, set two environment variables: OMP_NUM_THREADS and GOMP_CPU_AFFINITY.
cat > /etc/profile.d/env.sh <<EOF
export OMP_NUM_THREADS=\$(nproc --all)
export GOMP_CPU_AFFINITY=0-\$(( \$(nproc --all) - 1 ))
EOF
source /etc/profile
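The profile script above sets the thread count to the number of logical CPUs and pins the threads to CPUs 0 through N-1. To make the resulting values concrete, here is a small sketch that prints the two exports for a given CPU count. The helper name zendnn_env is hypothetical, and 16 is used only as an example value.

```shell
# Hypothetical helper: print the two exports for a given number of logical CPUs.
zendnn_env() {
    n=$1
    echo "export OMP_NUM_THREADS=$n"
    # CPU IDs are zero-based, so the last ID is n - 1.
    echo "export GOMP_CPU_AFFINITY=0-$((n - 1))"
}

# Example: a machine with 16 logical CPUs.
zendnn_env 16
# export OMP_NUM_THREADS=16
# export GOMP_CPU_AFFINITY=0-15
```

After sourcing /etc/profile, you can compare these values with the output of echo $OMP_NUM_THREADS and echo $GOMP_CPU_AFFINITY in the container.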
A web demo is provided in the project source code, which can be used to interact with Qwen-Audio-Chat in real time.
cd ~/Qwen-Audio
export LD_PRELOAD=/usr/lib64/libtcmalloc.so.4
python3 web_demo_audio.py -c=${HOME}/qwen-audio-chat/ --cpu-only --server-name=0.0.0.0 --server-port=7860
After the service is deployed, you can go to http://<ECS public IP address>:7860 to access the service.
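Loading the model on CPU can take a while, so the port may not accept connections immediately after the demo is started. The following sketch polls the port from a second shell on the instance until it opens. The helper name wait_for_port is hypothetical, and the check assumes that bash (with its /dev/tcp redirection feature) and the coreutils timeout command are available.

```shell
# Hypothetical helper: poll HOST:PORT until a TCP connection succeeds
# or the given number of attempts (default 30) is used up.
wait_for_port() {
    host=$1
    port=$2
    attempts=${3:-30}
    i=1
    while [ "$i" -le "$attempts" ]; do
        # /dev/tcp is a bash feature; timeout guards against a hanging connect.
        if timeout 2 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null; then
            return 0
        fi
        # Pause between attempts, but not after the last one.
        if [ "$i" -lt "$attempts" ]; then
            sleep 2
        fi
        i=$((i + 1))
    done
    return 1
}

# Usage (run manually): wait_for_port 127.0.0.1 7860 && echo "web demo is listening"
```

This only confirms that something is listening locally on the port. It does not verify that the port is reachable from outside the instance.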