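OpenAPI encapsulates the DDL and DML operations for vectors in AnalyticDB for PostgreSQL, so you can manage vector data through API calls. This topic uses the SDK for Java to show how to import and query vector data through the API.
Prerequisites
An AnalyticDB for PostgreSQL V6.0 instance in elastic storage mode is created. For more information, see Create an instance.
Vector search engine optimization is enabled. For more information, see Enable or disable vector search engine optimization.
An initial account is created. For more information, see Create a database account.
If you use a RAM user, the RAM user must be granted the required permissions. For more information, see Example of using OpenAPI.
Procedure
Initialize the vector database
Before you use vector search, initialize the knowledgebase database and the full-text search features.
Sample call: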
InitVectorDatabaseRequest request = new InitVectorDatabaseRequest();
request.setDBInstanceId("gp-bp1c62r3l489****");
request.setManagerAccount("myaccount");
request.setManagerAccountPassword("myaccount_password");
request.setRegionId("ap-southeast-1");
InitVectorDatabaseResponse response = client.getAcsResponse(request);
System.out.println(new Gson().toJson(response));
For parameter descriptions, see InitVectorDatabase (initialize the vector database).
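The sample above passes the account name and password as string literals. As a minimal sketch, the values could instead be read from environment variables before they are handed to setManagerAccount and setManagerAccountPassword. The class name CredentialsFromEnv and the variable names ADBPG_MANAGER_ACCOUNT and ADBPG_MANAGER_PASSWORD are assumptions made for this illustration and are not defined by the SDK.

```java
import java.util.Map;

/** Hypothetical helper: resolves account credentials from environment variables. */
final class CredentialsFromEnv {
    // Variable names are assumptions chosen for this sketch, not SDK conventions.
    static final String ACCOUNT_VAR = "ADBPG_MANAGER_ACCOUNT";
    static final String PASSWORD_VAR = "ADBPG_MANAGER_PASSWORD";

    /** Returns the value of a required variable, or fails with a clear message. */
    static String require(Map<String, String> env, String name) {
        String value = env.get(name);
        if (value == null || value.isEmpty()) {
            throw new IllegalStateException("Missing environment variable: " + name);
        }
        return value;
    }

    public static void main(String[] args) {
        Map<String, String> env = System.getenv();
        try {
            String account = require(env, ACCOUNT_VAR);
            String password = require(env, PASSWORD_VAR);
            // These values would then be passed to request.setManagerAccount(account)
            // and request.setManagerAccountPassword(password).
            System.out.println("Loaded credentials for account: " + account
                    + " (password length " + password.length() + ")");
        } catch (IllegalStateException e) {
            System.err.println(e.getMessage());
        }
    }
}
```

Passing the environment as a Map keeps the lookup easy to exercise with test values.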
Create a namespace
A namespace provides schema isolation. Before you use vectors, create at least one namespace or use the public namespace.
Sample call:
CreateNamespaceRequest request = new CreateNamespaceRequest();
request.setDBInstanceId("gp-bp1c62r3l489****");
request.setManagerAccount("myaccount");
request.setManagerAccountPassword("myaccount_password");
request.setNamespace("vector_test");
request.setNamespacePassword("vector_test_password");
request.setRegionId("ap-southeast-1");
CreateNamespaceResponse response = client.getAcsResponse(request);
System.out.println(new Gson().toJson(response));
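For parameter descriptions, see CreateNamespace (create a namespace).
After the namespace is created, you can view the corresponding schema in the knowledgebase database of the instance.
SELECT schema_name FROM information_schema.schemata;
Create a collection
A collection stores vector data and is isolated by namespace.
Sample call: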
Map<String,String> metadata = new HashMap<>();
metadata.put("title", "text");
metadata.put("link", "text");
metadata.put("content", "text");
metadata.put("pv", "int");
List<String> fullTextRetrievalFields = Arrays.asList("title", "content");
CreateCollectionRequest request = new CreateCollectionRequest();
request.setDBInstanceId("gp-bp1c62r3l489****");
request.setManagerAccount("myaccount");
request.setManagerAccountPassword("myaccount_password");
request.setNamespace("vector_test");
request.setCollection("document");
request.setDimension(10L);
request.setFullTextRetrievalFields(StringUtils.join(fullTextRetrievalFields, ","));
request.setMetadata(new Gson().toJson(metadata));
request.setParser("zh_cn");
request.setRegionId("ap-southeast-1");
CreateCollectionResponse response = client.getAcsResponse(request);
System.out.println(new Gson().toJson(response));
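For parameter descriptions, see CreateCollection (create a vector collection).
After the collection is created, you can view the corresponding table in the knowledgebase database of the instance.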
SELECT tablename FROM pg_tables WHERE schemaname='vector_test';
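The CreateCollection sample sends two derived strings: the metadata map serialized as JSON and the full-text fields joined by commas. The sketch below rebuilds both with the standard library only, so the values are easy to inspect before the call. CollectionParams and its methods are hypothetical helpers, and the check that every full-text field is also a metadata key is an assumption based on the sample (title and content both appear in the metadata).

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical helper that prepares the string parameters of CreateCollection. */
final class CollectionParams {

    /** Quotes a string as a JSON string, escaping only backslashes and double quotes. */
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /** Serializes a flat name-to-type map as a JSON object in the map's iteration order. */
    static String toJson(Map<String, String> fields) {
        StringJoiner json = new StringJoiner(",", "{", "}");
        for (Map.Entry<String, String> e : fields.entrySet()) {
            json.add(quote(e.getKey()) + ":" + quote(e.getValue()));
        }
        return json.toString();
    }

    /** Joins full-text fields with commas after checking each one is a metadata key. */
    static String joinFullTextFields(Map<String, String> metadata, List<String> fields) {
        for (String field : fields) {
            if (!metadata.containsKey(field)) {
                throw new IllegalArgumentException("Not a metadata field: " + field);
            }
        }
        return String.join(",", fields);
    }

    public static void main(String[] args) {
        Map<String, String> metadata = new LinkedHashMap<>();
        metadata.put("title", "text");
        metadata.put("link", "text");
        metadata.put("content", "text");
        metadata.put("pv", "int");
        // Candidate values for setMetadata(...) and setFullTextRetrievalFields(...).
        System.out.println(toJson(metadata));
        System.out.println(joinFullTextFields(metadata, Arrays.asList("title", "content")));
    }
}
```

A LinkedHashMap is used here only to keep the printed field order stable; the sample's HashMap works equally well for the request itself.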
Upload vector data
Upload the prepared embedding vectors to the target collection.
Sample call:
UpsertCollectionDataRequest request = new UpsertCollectionDataRequest();
request.setDBInstanceId("gp-bp1c62r3l489****");
request.setCollection("document");
request.setNamespace("vector_test");
request.setNamespacePassword("vector_test_password");
request.setRegionId("ap-southeast-1");
List<UpsertCollectionDataRequest.UpsertCollectionDataRequestRows> rows = new ArrayList<>();
UpsertCollectionDataRequest.UpsertCollectionDataRequestRows row = new UpsertCollectionDataRequest.UpsertCollectionDataRequestRows();
row.setId("0CB55798-ECF5-4064-B81E-FE35B19E01A6");
row.setVector(Arrays.asList(0.2894745251078251,0.5364747050266715,0.1276845661831275,0.22528871956822372,0.7009319238651552,0.40267406135256123,0.8873626696379067,0.1248525955774931,0.9115507046412368,0.2450859133174706));
Map<String, String> rowsMetadata = new HashMap<>();
rowsMetadata.put("title", "测试文档");
rowsMetadata.put("content","测试内容");
rowsMetadata.put("link","http://127.0.0.1/document1");
rowsMetadata.put("pv","1000");
row.setMetadata(rowsMetadata);
rows.add(row);
request.setRows(rows);
UpsertCollectionDataResponse response = client.getAcsResponse(request);
System.out.println(new Gson().toJson(response));
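For parameter descriptions, see UpsertCollectionData (upload vector data).
After the upload completes, you can view the data in the knowledgebase database of the instance.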
SELECT * FROM vector_test.document;
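The collection in this walkthrough is created with setDimension(10L), and the uploaded vector has exactly ten components. A client-side check along the following lines could catch a mismatched vector before the request is sent. This is a sketch only: VectorDimensionCheck is a hypothetical helper, not part of the SDK, and how the service itself responds to a wrong dimension is not covered here.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical pre-upload check for vectors passed to row.setVector(...). */
final class VectorDimensionCheck {

    /** Throws if the vector is null, has the wrong length, or holds a non-finite value. */
    static void requireDimension(List<Double> vector, int expectedDimension) {
        if (vector == null || vector.size() != expectedDimension) {
            int actual = (vector == null) ? 0 : vector.size();
            throw new IllegalArgumentException(
                    "Expected " + expectedDimension + " components but got " + actual);
        }
        for (Double component : vector) {
            if (component == null || component.isNaN() || component.isInfinite()) {
                throw new IllegalArgumentException("Vector contains a non-finite component");
            }
        }
    }

    public static void main(String[] args) {
        // Same vector as in the upload sample; the collection dimension is 10.
        List<Double> vector = Arrays.asList(
                0.2894745251078251, 0.5364747050266715, 0.1276845661831275,
                0.22528871956822372, 0.7009319238651552, 0.40267406135256123,
                0.8873626696379067, 0.1248525955774931, 0.9115507046412368,
                0.2450859133174706);
        requireDimension(vector, 10);
        System.out.println("Vector has the expected dimension");
    }
}
```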
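Retrieve vector data
Prepare the query vector or the full-text search content that you want to retrieve, and then call the query operation.
Sample call: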
QueryCollectionDataRequest request = new QueryCollectionDataRequest();
request.setDBInstanceId("gp-bp1c62r3l489****");
request.setCollection("document");
request.setNamespace("vector_test");
request.setNamespacePassword("vector_test_password");
request.setContent("测试");
request.setFilter("pv > 10");
request.setTopK(10L);
request.setVector(Arrays.asList(0.7152607422256894,0.5524872066437732,0.1168505269851303,0.704130971473022,0.4118874999967596,0.2451574619214022,0.18193414783144812,0.3050522957905741,0.24846180714868163,0.0549715380856951));
request.setRegionId("ap-southeast-1");
QueryCollectionDataResponse response = client.getAcsResponse(request);
System.out.println(new Gson().toJson(response));
The following result is returned:
{
  "Matches": {
    "match": [
      {
        "Id": "0CB55798-ECF5-4064-B81E-FE35B19E01A6",
        "Metadata": {
          "title": "测试文档",
          "content": "测试内容",
          "link": "http://127.0.0.1/document1",
          "pv": "1000"
        },
        "Values": [
          0.2894745251078251,
          0.5364747050266715,
          0.1276845661831275,
          0.22528871956822372,
          0.7009319238651552,
          0.40267406135256123,
          0.8873626696379067,
          0.1248525955774931,
          0.9115507046412368,
          0.2450859133174706
        ]
      }
    ]
  },
  "RequestId": "ABB39CC3-4488-4857-905D-2E4A051D0521",
  "Status": "success"
}
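Vector retrieval ranks stored vectors by how close they are to the query vector. To give a feel for what "close" means, the sketch below computes the Euclidean (L2) distance and the cosine similarity between the query vector and the stored vector from this walkthrough. Which metric the collection actually uses is not stated in this topic, so both are shown purely as assumptions, and VectorSimilarity is a hypothetical helper that runs locally rather than in the service.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical local illustration of two common vector similarity measures. */
final class VectorSimilarity {

    private static void requireSameSize(List<Double> a, List<Double> b) {
        if (a.size() != b.size()) {
            throw new IllegalArgumentException("Vectors must have the same dimension");
        }
    }

    /** Euclidean distance: smaller means more similar. */
    static double l2Distance(List<Double> a, List<Double> b) {
        requireSameSize(a, b);
        double sum = 0.0;
        for (int i = 0; i < a.size(); i++) {
            double diff = a.get(i) - b.get(i);
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }

    /** Cosine similarity: closer to 1 means more similar in direction. */
    static double cosineSimilarity(List<Double> a, List<Double> b) {
        requireSameSize(a, b);
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.size(); i++) {
            dot += a.get(i) * b.get(i);
            normA += a.get(i) * a.get(i);
            normB += b.get(i) * b.get(i);
        }
        if (normA == 0.0 || normB == 0.0) {
            throw new IllegalArgumentException("Cosine similarity is undefined for a zero vector");
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Query vector and stored vector taken from the samples in this topic.
        List<Double> query = Arrays.asList(
                0.7152607422256894, 0.5524872066437732, 0.1168505269851303,
                0.704130971473022, 0.4118874999967596, 0.2451574619214022,
                0.18193414783144812, 0.3050522957905741, 0.24846180714868163,
                0.0549715380856951);
        List<Double> stored = Arrays.asList(
                0.2894745251078251, 0.5364747050266715, 0.1276845661831275,
                0.22528871956822372, 0.7009319238651552, 0.40267406135256123,
                0.8873626696379067, 0.1248525955774931, 0.9115507046412368,
                0.2450859133174706);
        System.out.println("L2 distance: " + l2Distance(query, stored));
        System.out.println("Cosine similarity: " + cosineSimilarity(query, stored));
    }
}
```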