Purchase an instance
To purchase an instance, see "Purchase an OpenSearch Vector Search Edition instance".
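Configure the instance
On the details page of a newly purchased instance, the instance status is "Pending Configuration". The system automatically deploys an empty instance whose query nodes and data nodes match the purchased number and specifications. You then configure the instance in this order: table information > data synchronization > field configuration > index schema. Once the index rebuild completes, the instance can serve searches.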
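1. Basic table information
In Table Management, click "Add Table", enter a table name, set the number of data partitions and the number of data update resources, select the scenario template you need, and click Next:
Configuration notes:
Table name: user-defined.
Number of data partitions: enter a positive integer no greater than 256. Partitioning is used to improve full build speed and single-query performance. (For some existing instances, all index tables must still use the same number of partitions, or at least one index table must have 1 partition while the remaining index tables share the same number.)
Number of data update resources: the number of resources used for data updates. By default, each index is given two free update resources with 4 cores and 8 GB of memory each. Resources beyond the free quota are billed. For details, see the billing documentation for Vector Search Edition on the international site.
Scenario template: Vector Search Edition provides three built-in templates: General, Vector: Image Search, and Vector: Text Semantic Search.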
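The limits in the configuration notes above can be checked before you fill in the console form. The following is a minimal sketch, not part of the product: the function names `validate_partition_count` and `billable_update_resources` are hypothetical, and the sketch assumes that billing applies to every update resource beyond the two free ones, as the notes suggest.

```python
MAX_PARTITIONS = 256        # upper bound stated in the configuration notes
FREE_UPDATE_RESOURCES = 2   # free 4-core 8 GB update resources per index


def validate_partition_count(count):
    """Return count if it is a positive integer no greater than 256."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(count, bool) or not isinstance(count, int):
        raise ValueError("partition count must be an integer")
    if not 1 <= count <= MAX_PARTITIONS:
        raise ValueError(
            f"partition count must be between 1 and {MAX_PARTITIONS}"
        )
    return count


def billable_update_resources(requested):
    """Return how many update resources exceed the free quota (assumption)."""
    if requested < 0:
        raise ValueError("requested resources cannot be negative")
    return max(0, requested - FREE_UPDATE_RESOURCES)
```

For example, `billable_update_resources(5)` reports three resources beyond the free quota, while any request of two or fewer reports none.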
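2. Data synchronization
Configure the data source. (The currently supported data sources are MaxCompute and API push.) This example uses MaxCompute: select MaxCompute as the data source type, set Project, AccesskeyID, AccesskeySecret, Table, the partition key (partition), and the timestamp, and choose whether to enable "Automatic Index Rebuild" as needed. After you finish, you can run the verification. Once it passes, click Next:
MaxCompute data source reference
API data source reference
OSS data source reference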
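3. Field configuration
OpenSearch presets the relevant fields based on the scenario template you selected, and automatically imports the fields from the full data source (if any) into the field list:
Configure the fields. At least two fields are required: a primary key field and a vector field (the vector field must be set to the multi-value float type):
If you need vectors with categories, add a category field between the primary key field and the vector field.
Field configuration notes:
Required fields: the primary key field and the vector field. The primary key field must be of type int or string with the primary key option selected. The vector field must be of type float with the vector field option selected.
The vector field defaults to the multi-value float type. By default, values are split with the ha3 separator ^] (UTF encoding \x1D). You can also enter a custom multi-value separator.
Vector search imposes an ordering requirement when you define fields: create them in the order primary key field, namespace field (optional), vector field. (As shown in the figure above.)
When a field is missing from the data or is empty, the system automatically fills in a default value: 0 for numeric types and an empty string for the STRING type. Custom default values are supported.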
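The multi-value separator and default-value rules above can be illustrated with a short sketch. It assumes that a vector is written into the data source as a single string of floats joined by \x1D, and it mimics the default filling on the client side purely for illustration. The helper names and the type labels ("INT", "FLOAT", "STRING") are hypothetical, not part of any OpenSearch SDK.

```python
HA3_SEPARATOR = "\x1d"  # default multi-value separator, displayed as ^]


def encode_vector(values, separator=HA3_SEPARATOR):
    """Join float values into one multi-value field string."""
    return separator.join(repr(float(v)) for v in values)


def decode_vector(text, separator=HA3_SEPARATOR):
    """Split a multi-value field string back into a list of floats."""
    if not text:
        return []
    return [float(part) for part in text.split(separator)]


def fill_defaults(record, field_types, custom_defaults=None):
    """Fill missing or empty fields: 0 for numeric types, "" for STRING.

    field_types maps a field name to a hypothetical type label.
    custom_defaults optionally overrides the built-in default per field.
    """
    custom_defaults = custom_defaults or {}
    filled = dict(record)
    for name, field_type in field_types.items():
        if filled.get(name) in (None, ""):
            if name in custom_defaults:
                filled[name] = custom_defaults[name]
            else:
                filled[name] = "" if field_type == "STRING" else 0
    return filled
```

A custom separator can be passed as the second argument of `encode_vector` and `decode_vector` if the field is configured with one.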
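4. Index schema
4.1. Vector index
OpenSearch automatically builds indexes on the primary key field and the vector field, and each index name is the same as its field name. You only need to configure the vector index in the console:
Advanced configuration: the vector index requires its own parameters. For details, see the general vector index configuration.
The primary key field and the vector field are required. The namespace field is optional and can be left empty.
Only the three fixed fields can be selected. Adding more fields is not supported.
The system automatically fills in the vector index configuration parameters. If you have no special requirements, click "OK" to finish the configuration quickly.
Namespace field: for instances whose engine version is vector service 1.0.2 or earlier, the namespace field does not support the string type. For instances whose engine version is vector service 1.0.2 or later, this restriction does not apply.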
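5. Confirm creation
After the index configuration is complete, click Confirm Creation.
6. Change history
Under Instance Management > Change History > Data Source Changes, you can view all FSMs for table creation, index addition, and the full build. After all of them complete, the engine is set up and you can start query testing: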
7. Query test
Sample query:
{
"vector": [0.0019676427,0.005902928,0.021644069,0.21644068,0.12199384,0.043288138,0.007870571,0.0,0.08460863,0.041320495,0.043288138,0.035417568,0.011805856,0.055093993,0.12592913,0.017708784,0.021644069,0.0019676427,0.0,0.0,0.0019676427,0.078705706,0.1987319,0.041320495,0.039352853,0.0039352854,0.007870571,0.0039352854,0.0039352854,0.017708784,0.035417568,0.06886749,0.0019676427,0.0019676427,0.013773498,0.049191065,0.2125054,0.22824654,0.123961486,0.0039352854,0.0,0.0,0.021644069,0.14560555,0.078705706,0.1987319,0.22824654,0.005902928,0.064932205,0.0019676427,0.0019676427,0.021644069,0.027546996,0.035417568,0.22824654,0.22824654,0.1337997,0.023611711,0.009838213,0.007870571,0.0039352854,0.0039352854,0.017708784,0.20069954,0.033449925,0.005902928,0.019676426,0.035417568,0.015741142,0.029514639,0.13183205,0.123961486,0.029514639,0.0,0.027546996,0.22824654,0.15741141,0.0,0.0039352854,0.043288138,0.18889369,0.072802775,0.055093993,0.17315255,0.08460863,0.0019676427,0.007870571,0.035417568,0.22824654,0.10034977,0.009838213,0.021644069,0.062964566,0.027546996,0.015741142,0.04525578,0.086576276,0.033449925,0.023611711,0.017708784,0.0,0.0,0.03738521,0.072802775,0.16724962,0.035417568,0.031482283,0.20463483,0.043288138,0.011805856,0.0039352854,0.051158708,0.023611711,0.11412327,0.13183205,0.16134669,0.049191065,0.023611711,0.0039352854,0.0039352854,0.049191065,0.035417568,0.015741142,0.0039352854,0.03738521,0.08264099,0.094446845,0.021644069],
"topK": 10,
"includeVector": true
}
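vector: the vector to query.
topK: the number of top results to return.
includeVector: whether to return the vector information stored in the documents.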
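The query body described by these three parameters can also be assembled programmatically. The following is a minimal sketch that only builds and serializes the JSON payload. The function name `build_query` is hypothetical, and how the payload is sent (endpoint, authentication, SDK) is deliberately left out because it is not covered here. The sample query uses a 128-value vector, so the sketch assumes the query vector should have the same number of dimensions as the indexed vectors.

```python
import json


def build_query(vector, top_k=10, include_vector=False):
    """Build a query body with the vector, topK, and includeVector keys."""
    if not vector:
        raise ValueError("vector must contain at least one value")
    if top_k < 1:
        raise ValueError("topK must be a positive integer")
    return {
        "vector": [float(v) for v in vector],
        "topK": int(top_k),
        "includeVector": bool(include_vector),
    }


# A short vector is used for readability; a real query would pass
# as many values as the vector index has dimensions (assumption).
body = json.dumps(build_query([0.1, 0.2, 0.3], top_k=10, include_vector=True))
```

The resulting `body` string has the same shape as the sample query and can be pasted into the query test page.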
Sample result:
For the detailed query syntax, see the syntax description below.