Quickly Deploy a Llama 3 Model in EAS

This article describes how to quickly deploy a Llama 3 model and use the deployed web application in Elastic Algorithm Service (EAS) of Platform for AI (PAI).

Background Information

Llama 3 is available in pretrained and instruction-tuned versions at the 8B and 70B parameter sizes, which suit a variety of scenarios. Llama 3 retains the overall architecture of Llama 2 but increases the context length from 4K to 8K tokens. In performance evaluations, both the pretrained and instruction-tuned Llama 3 models demonstrated significant improvements over the previous generation in capabilities such as subject-matter knowledge, reasoning, and comprehension.

Deploy a Model Service in EAS

1.  Go to the Elastic Algorithm Service (EAS) page.

  1. Log on to the PAI console.
  2. In the left-side navigation pane, click Workspaces. On the Workspaces page, click the name of the workspace in which you want to deploy the model.
  3. In the left-side navigation pane, choose Model Deployment > Elastic Algorithm Service (EAS) to go to the Elastic Algorithm Service (EAS) page.

2.  On the Elastic Algorithm Service (EAS) page, click Deploy Service. In the Scenario-based Model Deployment section, click LLM Deployment.

3.  On the LLM Deployment page, configure the parameters. The following list describes the key parameters. Use the default values for the other parameters. These settings are also mirrored in the configuration sketch that follows these steps.

  - Service Name: The name of the service. In this example, chat_llama3_demo is used.
  - Model Source: Select Open Source Model.
  - Model Type: Select llama3-8b.
  - Resource Configuration: We recommend that you select the ml.gu7i.c8m30.1-gu30 instance type in the China (Beijing) region. If this instance type is unavailable, you can use the ecs.gn6i-c24g1.12xlarge instance type instead.

4.  Click Deploy. The model deployment requires approximately 3 minutes.

When the Service Status changes to Running, the service is deployed.
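For reference, the following Python sketch mirrors the console settings from step 3 as a plain dictionary. The field names are illustrative assumptions rather than the exact schema that EAS uses for JSON-based deployment; consult the EAS documentation for the authoritative format.

```python
import json

# Illustrative sketch only: the field names below are assumptions that mirror
# the console parameters from step 3; they are not the exact EAS JSON schema.
llm_service_config = {
    "service_name": "chat_llama3_demo",                   # Service Name
    "model_source": "open_source_model",                  # Model Source
    "model_type": "llama3-8b",                            # Model Type
    "instance_type": "ml.gu7i.c8m30.1-gu30",              # Recommended in China (Beijing)
    "fallback_instance_type": "ecs.gn6i-c24g1.12xlarge",  # Alternative if unavailable
}

# Print the settings as JSON so they can be adapted to a JSON-based
# deployment workflow or kept under version control.
print(json.dumps(llm_service_config, indent=2))
```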

Use the Web Application to Perform Model Inference

1.  Find the service that you deployed and click View Web App in the Service Type column.

2.  Perform model inference by using the web application.

Enter a prompt in the input text box, such as Give me a plan for learning the basics of personal finance. Then, click Send.
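Because the web application is backed by an HTTP service, you can also send prompts to the deployed service programmatically. The following Python sketch uses the requests library and assumes that you obtain the endpoint URL and token from the invocation information of the service in the EAS console; the payload format ({"prompt": ...}) is an assumption and may differ from the actual API of the deployed web application.

```python
import requests

# Replace these placeholders with the endpoint URL and token shown in the
# invocation information of your service in the EAS console.
SERVICE_URL = "https://<your-endpoint>/api/predict/chat_llama3_demo"
SERVICE_TOKEN = "<your-service-token>"

def ask(prompt: str) -> str:
    """Send a prompt to the deployed service and return the raw response text.

    The JSON payload {"prompt": ...} is an assumption; check the API of the
    deployed web application for the exact request format.
    """
    response = requests.post(
        SERVICE_URL,
        headers={"Authorization": SERVICE_TOKEN},
        json={"prompt": prompt},
        timeout=120,
    )
    response.raise_for_status()
    return response.text

if __name__ == "__main__":
    print(ask("Give me a plan for learning the basics of personal finance."))
```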
