Qwen-VL is a large vision-language model developed by Alibaba Cloud. It accepts images, text, and bounding boxes as input, and produces text and bounding boxes as output. Qwen-VL-Chat is an AI vision assistant built on Qwen-VL using alignment techniques. It supports flexible interaction, including multi-image input, multi-round question answering, and content creation. It natively handles multilingual dialogue (for example, in English and Chinese), comparison across multiple images, questions about a specified image, and creative writing based on multiple images.
This article shows how to quickly build a personal AI vision assistant service on Alibaba Cloud AMD servers with the OpenAnolis AI container service.
When you create an ECS instance, select an instance type that matches the size of the model. Inference consumes substantial computing resources and a large amount of memory at run time. To keep the model running stably, select the ecs.g8a.4xlarge instance type. Running Qwen-VL-Chat also requires downloading multiple model files, which take up considerable storage, so allocate at least 100 GB of disk space when you create the instance. Finally, to speed up environment installation and model download, set the instance bandwidth to 100 Mbit/s.
Select Alibaba Cloud Linux 3.2104 LTS 64-bit as the instance operating system.
For more information about how to install Docker on Alibaba Cloud Linux 3, see Install and use Docker (Linux). After the installation is complete, make sure that the Docker daemon is running.
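Once the instance is running, a quick preflight check can confirm that it has the resources the sizing advice above calls for. The following is a minimal sketch, not part of the original procedure: the function names meets_minimum and preflight are hypothetical, and the thresholds you pass to preflight are assumptions to adjust for your own setup.

```shell
# Preflight sketch: compare this machine's resources with minimums you choose.
# Function names and thresholds are illustrative assumptions.

# meets_minimum LABEL ACTUAL REQUIRED
# Prints a status line; returns 1 when ACTUAL is below REQUIRED.
meets_minimum() {
    local label="$1" actual="$2" required="$3"
    if [ "$actual" -ge "$required" ]; then
        echo "OK   $label: $actual (need >= $required)"
    else
        echo "LOW  $label: $actual (need >= $required)"
        return 1
    fi
}

# preflight MIN_CPUS MIN_MEM_GIB MIN_FREE_DISK_GIB
preflight() {
    local cpus mem_gib disk_gib status=0
    cpus=$(nproc --all)
    # MemTotal in /proc/meminfo is reported in kB.
    mem_gib=$(awk '/^MemTotal:/ {print int($2 / 1024 / 1024)}' /proc/meminfo)
    # Column 4 of POSIX-format df output is the available space in 1 KiB blocks.
    disk_gib=$(df -Pk "$HOME" | awk 'NR == 2 {print int($4 / 1024 / 1024)}')
    meets_minimum "CPUs" "$cpus" "$1" || status=1
    meets_minimum "memory (GiB)" "$mem_gib" "$2" || status=1
    meets_minimum "free disk (GiB)" "$disk_gib" "$3" || status=1
    return "$status"
}
```

For example, `preflight 16 60 50` checks for 16 CPUs, 60 GiB of memory, and 50 GiB of free disk space. These numbers are placeholders chosen for illustration, not values taken from the instance type's specification.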
systemctl status docker
The OpenAnolis community provides a variety of container images based on Anolis OS, including AMD-optimized PyTorch images. You can use these images to create a PyTorch runtime environment.
docker pull registry.openanolis.cn/openanolis/pytorch-amd:1.13.1-23-zendnn4.1
docker run -d -it --name pytorch-amd --net host -v $HOME:/root registry.openanolis.cn/openanolis/pytorch-amd:1.13.1-23-zendnn4.1
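The commands above first pull the container image, then use it to create a container named pytorch-amd that runs in detached mode (-d). The user's home directory is mounted at /root inside the container so that your work is preserved, and because the container shares the host network (--net host), services started inside it listen directly on the host's ports.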
After the PyTorch container is created and run, run the following command to access the container environment:
docker exec -it -w /root pytorch-amd /bin/bash
Run all subsequent commands in the container environment. If you exit unexpectedly, re-enter the container. To check whether you are currently inside the container, run the following command.
grep docker /proc/1/cgroup
# Any output indicates that you are inside the container
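The cgroup check depends on how the host organizes control groups: on hosts that use cgroup v2, /proc/1/cgroup inside a container may not mention docker at all. Docker also creates a /.dockerenv file in the containers it starts, so a slightly more robust check can test for both. The following is a minimal sketch; the function name in_container is hypothetical, and its optional root-prefix argument exists only to make the function easy to try out.

```shell
# in_container [ROOT_PREFIX]
# Returns 0 if the environment looks like a Docker container.
# ROOT_PREFIX is a hypothetical testing aid; leave it empty in normal use.
in_container() {
    local root="${1:-}"
    # Either marker is enough: Docker's marker file, or "docker" in PID 1's cgroup.
    [ -e "$root/.dockerenv" ] || grep -qs docker "$root/proc/1/cgroup"
}
```

A typical call would be `in_container && echo "inside the container" || echo "on the host"`.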
Before deploying Qwen-VL-Chat, install the required software.
yum install -y git git-lfs wget gperftools-libs anolis-epao-release
Downloading the pre-trained model requires Git LFS, so enable it first.
git lfs install
Download the GitHub project source code and the pre-trained model.
git clone https://github.com/QwenLM/Qwen-VL.git
git clone https://www.modelscope.cn/qwen/Qwen-VL-Chat.git qwen-vl-chat
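If Git LFS was not active during the clone, large files are left behind as small text pointer files whose first line starts with "version https://git-lfs.github.com/spec/v1". The sketch below looks for such leftovers before you try to load the model. The function names are hypothetical, and the scan covers only the top level of the directory, on the assumption that the weight files sit there.

```shell
# is_lfs_pointer FILE
# Returns 0 if FILE looks like a Git LFS pointer rather than real content.
is_lfs_pointer() {
    head -c 200 "$1" 2>/dev/null | grep -q '^version https://git-lfs.github.com/spec/v1'
}

# list_lfs_pointers DIR
# Prints every top-level file in DIR that is still an LFS pointer.
# Returns 0 if at least one was found, 1 otherwise.
list_lfs_pointers() {
    local dir="$1" f status=1
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        if is_lfs_pointer "$f"; then
            echo "$f"
            status=0
        fi
    done
    return "$status"
}
```

Running `list_lfs_pointers ~/qwen-vl-chat` should print nothing when every file was downloaded in full. If it lists files, running `git lfs pull` inside that directory fetches the missing content.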
Before setting up the Python environment, you can point pip at a mirror to speed up the download of dependency packages.
mkdir -p ~/.config/pip && cat > ~/.config/pip/pip.conf <<EOF
[global]
index-url=http://mirrors.cloud.aliyuncs.com/pypi/simple/
[install]
trusted-host=mirrors.cloud.aliyuncs.com
EOF
Install Python runtime dependencies.
yum install -y python3-transformers python-einops
pip install tiktoken transformers_stream_generator accelerate gradio
To let ZenDNN make full use of the CPU, set two environment variables: OMP_NUM_THREADS and GOMP_CPU_AFFINITY.
cat > /etc/profile.d/env.sh <<EOF
export OMP_NUM_THREADS=\$(nproc --all)
export GOMP_CPU_AFFINITY=0-\$(( \$(nproc --all) - 1 ))
EOF
source /etc/profile
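To see what these two variables evaluate to, the sketch below prints the same export lines for a given core count. The function name zendnn_thread_env is hypothetical; the arithmetic mirrors the env.sh file above, with one OpenMP thread per CPU and the threads bound to CPUs 0 through N-1.

```shell
# zendnn_thread_env [CORES]
# Prints the export lines that env.sh produces for CORES CPUs.
# Defaults to the number of CPUs on this machine.
zendnn_thread_env() {
    local cores="${1:-$(nproc --all)}"
    echo "export OMP_NUM_THREADS=$cores"
    echo "export GOMP_CPU_AFFINITY=0-$((cores - 1))"
}
```

For a 16-CPU machine, `zendnn_thread_env 16` prints OMP_NUM_THREADS=16 and GOMP_CPU_AFFINITY=0-15. You can compare this with the output of `echo $OMP_NUM_THREADS $GOMP_CPU_AFFINITY` in your shell to confirm that the profile script took effect.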
The project source code includes a web demo that you can use to interact with Qwen-VL-Chat.
cd ~/Qwen-VL
export LD_PRELOAD=/usr/lib64/libtcmalloc.so.4
python3 web_demo_mm.py -c=${HOME}/qwen-vl-chat/ --cpu-only --server-name=0.0.0.0 --server-port=7860
After the service is deployed, go to http://<ECS public IP address>:7860 to access the service.
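The demo may take a while to start answering while the model loads. The sketch below polls the service from a second shell on the instance until it responds. It is an illustration only: the function name wait_for_demo and its default retry values are hypothetical, and it assumes curl is installed, which the steps above do not cover.

```shell
# wait_for_demo URL [TRIES] [DELAY_SECONDS]
# Polls URL until it answers without an HTTP error status or TRIES runs out.
wait_for_demo() {
    local url="$1" tries="${2:-30}" delay="${3:-10}" i=1
    while [ "$i" -le "$tries" ]; do
        # -f: treat HTTP errors as failures; -s: silent; -o: discard the body
        if curl -fs -o /dev/null --max-time 5 "$url"; then
            echo "demo is answering at $url"
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    echo "no answer from $url after $tries attempts" >&2
    return 1
}
```

For example, `wait_for_demo http://127.0.0.1:7860/` checks the service locally. If the local check succeeds but the public address is unreachable from your browser, the instance's security group rules for port 7860 are worth checking.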