×
Community Blog Deploy OCR in Function Compute Using QWEN VL and Model Studio

Deploy OCR in Function Compute Using QWEN VL and Model Studio

This article describes how to deploy an OCR solution using QWEN VL, Alibaba Cloud's Model Studio and Function Compute.

Written by M Fakhri Darmawan, Solution Architect Alibaba Cloud Indonesia

In today's data-driven world, Optical Character Recognition (OCR) plays a crucial role in automating the extraction of text from images or scanned documents. Whether you're dealing with invoices, receipts, forms, or any other type of document, OCR can significantly streamline workflows by converting unstructured data into structured, machine-readable formats.

In this blog post, we'll walk through how to deploy an OCR solution using QWEN VL, Alibaba Cloud's Model Studio and Function Compute.

Step 1

Activate Model Studio with no cost using this LINK. Next is create API Key.

Step 2

Create and Get API Key

1.  Hover your cursor over 1 in the upper-right corner of the page and select API-KEY.

2

2.  In the left-side navigation pane, select All API Keys or My API Key to view or create API keys. Click Create My API Key or View existing API keys.

3

Note

  • You can only use the All API Keys page if you are using the Alibaba Cloud account. The Alibaba Cloud account can obtain the API keys of all RAM users, but a RAM user can only obtain its own API key.
  • Keep your API key confidential to avoid security risks or financial losses caused by unauthorized usage.
  • You API key is an important asset, make sure to keep it safe. If you click Delete in the Actions column to delete an existing API Key, you cannot use the key to use Model Studio services again. All applications or services associated with this key will fail.

Step 3

Deploy OCR Sample Application

Prerequisite

Steps

1.  Download sample code from this github repository.

2.  Unzip the code

3.  Change code in s.yaml file

edition: 3.0.0
name: fc3-example
access: default
resources:
  fcDemo:
    component: fc3
    props:
      region: ap-southeast-5
      handler: handler
      role: acs:ram::XXXX:role/aliyunfcdefaultrole # put your main account uid in XXXX
      disableOndemand: false
      description: OCR
      timeout: 120
      diskSize: 512
      internetAccess: true
      customRuntimeConfig:
        port: 9000
        command:
          - python3
          - app.py
      # logConfig: # define log project in sls to enable log for your FC
      #   enableRequestMetrics: true
      #   enableInstanceMetrics: true
      #   logBeginRule: DefaultRegex
      #   project: serverless-ap-southeast-5-xxx
      #   logstore: default-logs
      functionName: ocr-qwen
      runtime: custom.debian10
      cpu: 1
      instanceConcurrency: 100
      memorySize: 1024
      environmentVariables:
        PATH: >-
          /var/fc/lang/python3.10/bin:/usr/local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/local/ruby/bin:/opt/bin:/code:/code/bin
        PYTHONPATH: /opt/python:/code/python:/code
        OSS_BUCKET: yourossbucket # your oss bucket name
        TZ: Asia/Jakarta
        USER_PASSWORD: '' # use this if you enable login page
        DASHSCOPE_API_KEY: sk-xxx #put your API Key here or in FC Configuration -> Environment variable
        USER_NAME: '' # use this if you enable login page
        OSS_ENDPOINT: https://oss-ap-southeast-1.aliyuncs.com # OSS singapore endpoint
        ENABLE_LOGIN: 'false'
      code: ./ocr-qwen
      triggers:
        - triggerConfig:
            methods:
              - GET
              - POST
              - PUT
              - DELETE
            authType: anonymous
            disableURLInternet: false
          triggerName: httpTrigger
          description: ''
          qualifier: LATEST
          triggerType: http

Code need to change:

  • Line 10. role: acs:ram::XXXX:role/aliyunfcdefaultrole # put your main account uid in XXXX
  • Line 36. OSS_BUCKET: yourossbucket # put your oss bucket name
  • Line 39. DASHSCOPE_API_KEY: sk-xxx # put your API Key here or in FC Configuration -> Environment variable

4.  Deploy the code using serverless deployment

s deploy s.yaml

Result if the deployment succeeds

[2025-03-14 12:24:59][WARN][s_cli] It is not recommended to run the command as root user.
(node:46363) [DEP0060] DeprecationWarning: The `util._extend` API is deprecated. Please use Object.assign() instead.
(Use `node --trace-deprecation ...` to show where the warning was created)
⌛  Steps for [deploy] of [fc3-example]
====================
Downloading[/v3/packages/fc3-domain/zipball/0.0.26]...
Download fc3-domain successfully
(node:46363) [DEP0040] DeprecationWarning: The `punycode` module is deprecated. Please use a userland alternative instead.

TIPS:

You can use "s info" get more detail

✔ [fcDemo] completed (9.51s)

🚀  Result for [deploy] of [fc3-example]
====================
region:               ap-southeast-5
cpu:                  1
customRuntimeConfig: 
  command: 
    - python3
    - app.py
  port:    9000
description:          OCR
diskSize:             512
environmentVariables: 
  DASHSCOPE_API_KEY: sk-xxx
  ENABLE_LOGIN:      false
  OSS_BUCKET:        yourossbucket
  OSS_ENDPOINT:      https://oss-ap-southeast-1.aliyuncs.com
  PATH:              /var/fc/lang/python3.10/bin:/usr/local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/local/ruby/bin:/opt/bin:/code:/code/bin
  PYTHONPATH:        /opt/python:/code/python:/code
  TZ:                Asia/Jakarta
  USER_NAME:         
  USER_PASSWORD:     
functionArn:          acs:fc:ap-southeast-5:5846842125303498:functions/ocr-qwen
functionName:         ocr-qwen
handler:              handler
instanceConcurrency:  100
internetAccess:       true
memorySize:           1024
role:                 acs:ram::5846842125303498:role/acs:ram::xxxx:role/aliyunfcdefaultrole
runtime:              custom.debian10
timeout:              120
triggers: 
  - 
    description:   
    qualifier:     LATEST
    triggerConfig: 
      methods: 
        - GET
        - POST
        - PUT
        - DELETE
      authType:           anonymous
      disableURLInternet: false
    triggerName:   httpTrigger
    triggerType:   http
url: 
  system_url:          https://your-url
  system_intranet_url: https://your-url
__component:          fc3

5.  Ensure the FC was deployed and environment variable was configured correctly

4
5

6.  Add your domain

6

Put your domain, select your function and version then create

7

Configure your domain and pointing to CNAME from Function Compute

7.  Go to your domain and test the application

8

8.  Try using your own image and specific the prompt

9

By combining Qwen-VL for OCR, Model Studio and Function Compute for serverless execution, you can build a scalable, efficient, and cost-effective OCR solution tailored to various business needs. This approach leverages the strengths of both Qwen-VL and Function Compute to deliver a robust OCR system with minimal infrastructure management.

0 1 0
Share on

Alibaba Cloud Indonesia

108 posts | 19 followers

You may also like

Comments

Alibaba Cloud Indonesia

108 posts | 19 followers

Related Products

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

Get Started for Free Get Started for Free