DeepSeek-R1 excels at math, coding, and reasoning tasks. DeepSeek also open-sourced six dense models distilled from DeepSeek-R1 and based on the Llama and Qwen architectures. This topic describes how to fine-tune DeepSeek-R1-Distill-Qwen-7B in PAI Model Gallery.
Supported models
Model Gallery supports LoRA supervised fine-tuning (SFT) for all six distill models. The following table lists the minimum resource configuration required to train each model with the default hyperparameters:

| Distill model | Base model | Training method | Minimum configuration |
| --- | --- | --- | --- |
| DeepSeek-R1-Distill-Qwen-1.5B | Qwen2.5-Math-1.5B | LoRA supervised fine-tuning | 1 x A10 (24 GB GPU memory) |
| DeepSeek-R1-Distill-Qwen-7B | Qwen2.5-Math-7B | LoRA supervised fine-tuning | 1 x A10 (24 GB GPU memory) |
| DeepSeek-R1-Distill-Llama-8B | Llama-3.1-8B | LoRA supervised fine-tuning | 1 x A10 (24 GB GPU memory) |
| DeepSeek-R1-Distill-Qwen-14B | Qwen2.5-14B | LoRA supervised fine-tuning | 1 x GU8IS (48 GB GPU memory) |
| DeepSeek-R1-Distill-Qwen-32B | Qwen2.5-32B | LoRA supervised fine-tuning | 2 x GU8IS (48 GB GPU memory) |
| DeepSeek-R1-Distill-Llama-70B | Llama-3.3-70B-Instruct | LoRA supervised fine-tuning | 8 x GU100 (80 GB GPU memory) |
Train the model
- Go to the Model Gallery page:
  - Log on to the PAI console.
  - In the upper-left corner, select a region.
  - In the left pane, click Workspaces. On the Workspaces page, click the name of the target workspace.
  - In the left pane, choose QuickStart > Model Gallery.
- On the Model Gallery page, click the DeepSeek-R1-Distill-Qwen-7B model card to go to the model details page. The details page describes how to deploy and train the model, the required SFT data format, and how to invoke the deployed service.
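The model details page documents the exact SFT data format that the training job expects. As an illustration only, a common instruction-tuning layout is a JSON file of instruction/output pairs; the field names and file name below are assumptions, so verify them against the model card before uploading the data to OSS.

```python
# Illustrative sketch only: write a tiny SFT dataset in a common
# instruction/output JSON layout. The schema shown here is an assumption;
# the authoritative format is documented on the model details page.
import json

records = [
    {
        "instruction": "Solve step by step: what is 17 * 24?",
        "output": "17 * 24 = 17 * 20 + 17 * 4 = 340 + 68 = 408.",
    },
    {
        "instruction": "Write a Python function that reverses a string.",
        "output": "def reverse_string(s):\n    return s[::-1]",
    },
]

with open("train.json", "w", encoding="utf-8") as f:
    json.dump(records, f, ensure_ascii=False, indent=2)
```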

- Click Train in the upper-right corner and configure the following key parameters:
  - Dataset Configuration: Upload your prepared training data (in the format described on the model details page; see the sketch above) to an OSS bucket and select it here.
  - Computing Resources: The minimum configuration for each model is listed in Supported models. Increasing memory-sensitive hyperparameters such as per_device_train_batch_size or max_length may require more GPU memory than the minimum configuration provides.
  - Hyperparameters: Adjust the following LoRA SFT hyperparameters based on your data and available resources. For details, see Guide to fine-tuning LLMs.
| Hyperparameter | Type | Default value (7B model) | Description |
| --- | --- | --- | --- |
| learning_rate | float | 5e-6 | Learning rate, which controls the magnitude of each weight update. |
| num_train_epochs | int | 6 | Number of training epochs, that is, full passes over the training dataset. |
| per_device_train_batch_size | int | 2 | Number of samples processed per GPU in each step. Larger values improve throughput but consume more GPU memory. |
| gradient_accumulation_steps | int | 2 | Number of steps over which gradients are accumulated before a weight update. The effective batch size is per_device_train_batch_size x gradient_accumulation_steps x number of GPUs. |
| max_length | int | 1024 | Maximum number of tokens per training sample. Samples that exceed this limit are discarded. |
| lora_rank | int | 8 | Rank (dimension) of the LoRA adapter matrices. |
| lora_alpha | int | 32 | LoRA scaling factor. |
| lora_dropout | float | 0 | Dropout rate applied to the LoRA layers during training to prevent overfitting. |
| lorap_lr_ratio | float | 16 | LoRA+ learning rate ratio λ = ηB/ηA, which applies different learning rates to adapter matrices A and B. Set to 0 to use standard LoRA. |
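For orientation, the LoRA-specific hyperparameters above correspond to fields of the Hugging Face PEFT LoraConfig. The sketch below is not the code that Model Gallery runs; it only illustrates what lora_rank, lora_alpha, and lora_dropout control. The target_modules list is an assumption based on typical Qwen-style attention projections, and lorap_lr_ratio (LoRA+) has no direct LoraConfig equivalent.

```python
# Rough PEFT equivalent of the LoRA hyperparameters above. This is an
# illustration under stated assumptions, not the Model Gallery trainer.
from peft import LoraConfig

lora_config = LoraConfig(
    r=8,                 # lora_rank: dimension of the low-rank adapter matrices
    lora_alpha=32,       # scaling factor; the adapter update is scaled by lora_alpha / r
    lora_dropout=0.0,    # dropout applied inside the LoRA layers
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumption: typical attention projections
    task_type="CAUSAL_LM",
)
```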
- Click Train. The training page shows the job status and logs.
  - If training succeeds, the model is automatically registered in AI Asset Management > Models and can be deployed from there. See Register and manage models.
  - If training fails, click the icon next to Status or check the Task log tab for error details. For common errors, see FAQ and Model Gallery FAQ.
  - The Metric Curve tab shows how the training loss evolves.

- After training completes, click Deploy to deploy the fine-tuned model as an EAS service. The service is invoked in the same way as the original distill model. For details, see the model details page or One-click deployment of DeepSeek-V3 and DeepSeek-R1 models.
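Once the EAS service is running, it can be called in the same way as the original distill model. The sketch below assumes an OpenAI-compatible chat completions endpoint; the endpoint URL, token, and model name are placeholders, so copy the actual invocation information from the EAS service details in the console.

```python
# Minimal sketch for calling the deployed EAS service, assuming it exposes an
# OpenAI-compatible API. The URL, token, and model name are placeholders.
from openai import OpenAI

client = OpenAI(
    base_url="<EAS_SERVICE_URL>/v1",   # placeholder: endpoint from the EAS invocation details
    api_key="<EAS_SERVICE_TOKEN>",     # placeholder: token from the EAS invocation details
)

response = client.chat.completions.create(
    model="DeepSeek-R1-Distill-Qwen-7B",  # placeholder: use the model name shown by the service
    messages=[{"role": "user", "content": "Explain the Pythagorean theorem step by step."}],
    max_tokens=512,
)
print(response.choices[0].message.content)
```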

Billing
Model Gallery training jobs run on Deep Learning Containers (DLC) and are billed by job duration. Resources stop automatically when the job ends. See Billing of Deep Learning Containers (DLC).
FAQ
Why does my Model Gallery training job fail?
- Cause: max_length is too small. Samples that exceed this limit are discarded, and if too much data is discarded, the training or validation dataset may become empty, which causes the job to fail.
  Solution: Increase max_length, or shorten your samples so that they fit within the limit. To estimate how many samples exceed the limit, see the token-length check sketch after this list.
- Error: failed to compose dlc job specs, resource limiting triggered, you are trying to use more GPU resources than the threshold
  Solution: Training is limited to 2 GPUs in use at the same time. Wait for ongoing training jobs to finish, or submit a ticket to request a higher quota.
- Error: the specified vswitch vsw-**** cannot create the required resource ecs.gn7i-c32g1.8xlarge, zone not match
  Solution: The requested instance type is not available in the zone of the specified vSwitch. Try one of the following:
  - Leave the vSwitch field empty. DLC automatically selects a vSwitch based on resource inventory.
  - Switch to a different instance type.
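To estimate how many samples would be discarded by a given max_length before submitting a job, you can tokenize the data locally. The sketch below assumes the JSON instruction/output layout from the data-format example above and uses the public DeepSeek-R1-Distill-Qwen-7B tokenizer; adjust the field names to match your actual data.

```python
# Sketch: count how many samples exceed max_length, assuming an
# instruction/output JSON layout. Field names and file name are assumptions.
import json
from transformers import AutoTokenizer

MAX_LENGTH = 1024  # keep in sync with the max_length hyperparameter

tokenizer = AutoTokenizer.from_pretrained("deepseek-ai/DeepSeek-R1-Distill-Qwen-7B")

with open("train.json", "r", encoding="utf-8") as f:
    records = json.load(f)

too_long = sum(
    1
    for record in records
    if len(tokenizer(record["instruction"] + record["output"])["input_ids"]) > MAX_LENGTH
)
print(f"{too_long} of {len(records)} samples exceed max_length={MAX_LENGTH}")
```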
How do I download the trained model from Model Gallery?
Set the model output path to an OSS directory when creating the training job, then download the model from OSS.
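If the output path points to OSS, the trained model files can be pulled with the ossutil command-line tool or the OSS Python SDK. The sketch below uses the oss2 SDK; the endpoint, bucket name, access keys, and output prefix are placeholders.

```python
# Sketch: download all trained model files under an OSS prefix with the oss2
# SDK. Endpoint, bucket, credentials, and prefix are placeholders.
import os
import oss2

auth = oss2.Auth("<ACCESS_KEY_ID>", "<ACCESS_KEY_SECRET>")
bucket = oss2.Bucket(auth, "https://oss-cn-hangzhou.aliyuncs.com", "<your-bucket>")

prefix = "model-output/"      # placeholder: the OSS output path set for the training job
local_dir = "downloaded_model"

for obj in oss2.ObjectIterator(bucket, prefix=prefix):
    if obj.key.endswith("/"):
        continue  # skip directory placeholder objects
    local_path = os.path.join(local_dir, obj.key[len(prefix):])
    os.makedirs(os.path.dirname(local_path), exist_ok=True)
    bucket.get_object_to_file(obj.key, local_path)
```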

How can I improve poor model performance after fine-tuning?
Try the following approaches:
- Use a larger model with better baseline performance, such as a DeepSeek or Qwen3 model with a higher parameter count.
- Refine your prompts.
- Increase max_tokens at inference time so that responses are not truncated.
- Break complex tasks into smaller subtasks and have the model handle them separately, as sketched after this list.
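For the last point, one way to decompose a task is to send each subtask as its own request while carrying the previous answers forward as context. The sketch below reuses the placeholder OpenAI-compatible EAS client from the deployment step; the endpoint, token, and model name remain placeholders.

```python
# Sketch: split a complex request into subtasks and query the deployed service
# step by step, feeding each answer back as context for the next subtask.
from openai import OpenAI

client = OpenAI(base_url="<EAS_SERVICE_URL>/v1", api_key="<EAS_SERVICE_TOKEN>")  # placeholders

subtasks = [
    "List the assumptions needed to estimate a coffee shop's monthly revenue.",
    "Using those assumptions, show the revenue calculation step by step.",
]

messages = []
for task in subtasks:
    messages.append({"role": "user", "content": task})
    result = client.chat.completions.create(
        model="DeepSeek-R1-Distill-Qwen-7B",  # placeholder
        messages=messages,
        max_tokens=512,
    )
    answer = result.choices[0].message.content
    messages.append({"role": "assistant", "content": answer})
    print(answer)
```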