全部產品

Platform For AI：ChatLLM-WebUI版本發布詳情

更新時間：Sep 21, 2024

本文為您介紹ChatLLM-WebUI的重要版本發布資訊。

重要版本發布資訊

日期	鏡像版本	內建庫版本	更新內容
2024.6.21	eas-registry.cn-hangzhou.cr.aliyuncs.com/pai-eas/chat-llm-webui:3.0.4 Tag：chat-llm-webui:3.0 eas-registry.cn-hangzhou.cr.aliyuncs.com/pai-eas/chat-llm-webui:3.0.4-flash-attn eas-registry.cn-hangzhou.cr.aliyuncs.com/pai-eas/chat-llm-webui:3.0.4-vllm Tag：chat-llm-webui:3.0-vllm eas-registry.cn-hangzhou.cr.aliyuncs.com/pai-eas/chat-llm-webui:3.0.4-vllm-flash-attn eas-registry.cn-hangzhou.cr.aliyuncs.com/pai-eas/chat-llm-webui:3.0.4-blade Tag: chat-llm-webui:3.0-blade	Torch：2.3.0 Torchvision：0.18.0 Transformers：4.41.2 vLLM：0.5.0.post1 vllm-flash-attn：2.5.9 Blade：0.7.0	支援Rerank模型部署。支援Embedding、Rerank、LLM多模型同時或單獨部署。 Transformers後端支援Deepseek-V2、Yi1.5和Qwen2。更改Qwen1.5的model type為qwen1.5。 vLLM後端支援Qwen2。 BladeLLM後端支援Llama3和Qwen2。 HuggingFace後端支援batch輸入。 BladeLLM後端支援OpenAI Chat。 BladeLLM Metrics訪問修正。 Transformers後端支援FP8模型部署。 Transformers後端支援多量化工具：AWQ、HQQ和Quanto等。 vLLM後端支援FP8。 vLLM&Blade推理參數支援設定stop words。 Transformers後端適配H20顯卡。
2024.4.30	eas-registry.cn-hangzhou.cr.aliyuncs.com/pai-eas/chat-llm-webui:3.0.3 eas-registry.cn-hangzhou.cr.aliyuncs.com/pai-eas/chat-llm-webui:3.0.3-flash-attn eas-registry.cn-hangzhou.cr.aliyuncs.com/pai-eas/chat-llm-webui:3.0.3-vllm eas-registry.cn-hangzhou.cr.aliyuncs.com/pai-eas/chat-llm-webui:3.0.3-vllm-flash-attn eas-registry.cn-hangzhou.cr.aliyuncs.com/pai-eas/chat-llm-webui:3.0.3-blade	Torch：2.3.0 Torchvision：0.18.0 Transformers：4.40.2 vllm：0.4.2 Blade：0.5.1	支援Embedding模型部署。 vLLM後端支援Token Usage返回。支援Sentence-Transformers模型部署。 Transformers後端支援yi-9B、qwen2-moe、llama3、qwencode、qwen1.5-32G/110B、phi-3以及gemma-1.1-2/7B。 vLLM後端支援yi-9B、qwen2-moe、SeaLLM、llama3以及phi-3。 Blade後端支援qwen1.5和SeaLLM。支援LLM與Embedding多模型部署。 Transformers後端發布flash-attn鏡像。 vLLM後端發布flash-attn鏡像。
2024.3.28	eas-registry.cn-hangzhou.cr.aliyuncs.com/pai-eas/chat-llm-webui:3.0.2 eas-registry.cn-hangzhou.cr.aliyuncs.com/pai-eas/chat-llm-webui:3.0.2-vllm eas-registry.cn-hangzhou.cr.aliyuncs.com/pai-eas/chat-llm-webui:3.0.2-blade	Torch：2.1.2 Torchvision：0.16.2 Transformers：4.38.2 Vllm：0.3.3 Blade：0.4.8	添加blad推理後端：支援單機多卡和量化配置。 Transformers後端基於tokenizer chat template模板做推理。 HF後端已支援Multi-LoRA推理。 Blade支援量化模型部署。 Blade自動拆分模型。 Transformers後端支援Deepseek和Gemma。 vLLM後端支援Deepseek和Gemma。 Blade後端支援qwen1.5和yi模型。 vLLM和Blade鏡像開放/metrics訪問。 Transformers後端流式返回支援Token統計。
2024.2.22	eas-registry.cn-hangzhou.cr.aliyuncs.com/pai-eas/chat-llm-webui:3.0.1 eas-registry.cn-hangzhou.cr.aliyuncs.com/pai-eas/chat-llm-webui:3.0.1-vllm	Torch：2.1.2 Torchvision：0.16.0 Transformers：4.37.2 vLLM：0.3.0	vLLM擴充參數配置：支援推理時更改vLLM所有推理參數。 vLLM支援Multi-LoRA。 vLLM支援量化模型部署。 vLLM鏡像不依賴LangChain示範。 Transformers推理後端支援qwen1.5和qwen2模型。 vLLM推理後端支援qwen-1.5和qwen-2模型。
2024.1.23	eas-registry.cn-hangzhou.cr.aliyuncs.com/pai-eas/chat-llm-webui:3.0 eas-registry.cn-hangzhou.cr.aliyuncs.com/pai-eas/chat-llm-webui:3.0-vllm	Torch：2.1.2 Torchvision：0.16.2 Transformers：4.37.2 vLLM：0.2.6	拆分後端鏡像，後端獨立編譯&發布：新添加BladeLLM後端。支援標準的OpenAI API。 Baichuan等模型支援效能統計指標。支援yi-6b-chat、yi-34b-chat以及secgpt等模型。 openai/v1/chat/completions適配chatglm3 history-format。非同步流式最佳化。 vLLM支援模型與HuggingFace拉齊。後端調用介面最佳化。完善報錯日誌。
2023.12.6	eas-registry.cn-hangzhou.cr.aliyuncs.com/pai-eas/chat-llm-webui:2.1 Tag: chat-llm-webui:2.1	Torch：2.0.1 Torchvision：0.15.2 Transformers：4.33.3 vLLM：0.2.0	Huggingface後端支援mistral、zephyr、yi-6b、yi-34b、qwen-72b、qwen-1.8b、qwen7b-int4、qwen14b-int4、qwen7b-int8、qwen14b-int8、qwen-72b-int4、qwen-72b-int8、qwen-1.8b-int4和qwen-1.8b-int8模型。 vLLM後端支援Qwen和ChatGLM1/2/3模型。 Huggingface推理後端支援flash attention。 ChatGLM系列模型支援效能統計指標。添加命令列參數--history-format支援設定角色。 LangChain支援示範Qwen模型。最佳化fastapi流式提供者。
2023.9.13	eas-registry.cn-hangzhou.cr.aliyuncs.com/pai-eas/chat-llm-webui:2.0 Tag: chat-llm-webui:2.0	Torch：2.0.1+cu117 Torchvision：0.15.2+cu117 Transformers：4.33.3 vLLM：0.2.0	支援多後端：vLLM和Huggingface; 支援LangChain示範ChatLLM與Llama2模型支援Baichuan、Baichuan2、Qwen、Falcon、Llama2、ChatGLM、ChatGLM2、ChatGLM3以及yi等模型。添加http和webscoket支援對話流式。非流式返回結果包含產生Token數。所有模型支援多輪對話。支援對話記錄匯出。支援System Prompt設定及無模板輸入Prompt拼接。推理參數可配置支援日誌Debug模式：支援推理時間輸出 vLLM後端單機多卡預設支援TP並行方案。支援Float32、Float16、Int8以及Int4等精度的模型部署。

相關文檔

EAS為ChatLLM提供了情境化部署方式，您只需配置幾個參數，即可輕鬆部署流行的開源LLM大語言模型服務應用。關於部署和調用LLM大語言模型服務的更詳細內容介紹，請參見LLM大語言模型部署。