×
Community Blog Tongyi Wanxiang - An Alibaba Cloud SaaS for Multimodal Content Generation

Tongyi Wanxiang - An Alibaba Cloud SaaS for Multimodal Content Generation

In this blog we will see about the capabilities of Tongyi Wanxiang, a mutimodal SaaS for Generative AI.

Alibaba Cloud has various SaaS products under the Tongyi ecosystem. The various SaaS products of Tongyi are as follows:
Tongyi Qianwen - Qwen LLMs to generate text outcome based on text prompts and Qwen VL models to respond based on images.
Tongyi Wanxiang - Generate images based on text prompts, edit foreground and background of a base image using reference images and prompts and video generation based on prompts.
Tongyi Lingma - A co-pilot code generation plugin which is available for IDEs like VS Code and Jet Beans.
Tongyi Tingwu - A voice to text translation and transliteration from live audio or stored audio.

In this blog we will see about the capabilities of Tongyi Wanxiang, a mutimodal SaaS for Generative AI. Alibaba Cloud has model studio which works with the functionality of Model as a Service (MaaS). This has the vision language model of Qwen VL. We will use this primarily for the following scenario:

Azeez is an architect based in dubai and specializes in designing sky scrappers in a newly assigned project. He browsed through the internet and found a reference image for his project. He can't use the image as is due to possible copyright claim and it is not 100 percent satisfactory to his thoughts. So he wants to generate an image as similar to the one downloaded from the internet. He is not from a technical background to craft good prompt but has knowledge about using Alibaba Cloud's Model Studio. The proceedure on how he transformed this image into a non copyrighted AI generated image and convincing enough for his project productivity.

Enter the console of Model Studio.
1
Click on "Use Now".
2
Click on Playground.
3
Browse through the models and select Qwen-VL-Plus or Qwen-VL-Max.
4
Click on the image icon to select the picture he downloaded.
5
Enter the prompt as "Create a prompt to generate a picture as same as this image". Click on the button to the bottom right to enter.
6
Copy the prompt and open the Tongyi Wanxiang SaaS portal.
7
Enter the prompt copied from Qwen VL Max and click on "Generate a painting".
8
The architect chooses the second image generated as it looks close to his imagination. Click on that image.
9
Click to download the image. Now the image needs a dynamic video footage. Go to the video generation page. This video generation feature got released in the recent Apsara conference 2024.
10
Click on figure video.
11
Click on the highlighted area and select the image and click finish.
12
Enter the prompt to float with the imagination. Click on Generate Video.
13
It will take a while to generate the video.
14
The Tongyi app is available to be used in iOS and Android.
15
Download the video and it plays as follows.


For other prompts, we got some videos created and shared for reference.
Generate a video where this man is playing badminton wearing a cyborg outfit.

A beautiful Indian girl wearing a blue traditional attire spinning a yarn to make silk mat which is red in colour with 9:16.

A beautiful Indian girl wearing a blue traditional attire spinning a yarn to make silk mat which is red in colour.
0 1 0
Share on

ferdinjoe

21 posts | 131 followers

You may also like

Comments