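After you import data, you can run vector analysis on the data in the table. This tutorial walks you through the process.
Prerequisites
Vector data has been imported as described in the Quick Start.
Vector analysis
This tutorial uses three metrics as examples: the squared Euclidean distance, the dot product distance, and the cosine similarity.
Get the Euclidean distance
Run a vector analysis that returns the squared Euclidean distance.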
SELECT id, l2_squared_distance(feature, array[0.495181661387,0.108697291209,0.181728549067,0.109680543346,0.19713082404,0.0197809514512,0.534227452778,0.442411970815,0.409909873031,0.0975687394505]::float4[]) AS distance
FROM vector_test.car_info
ORDER BY feature <-> array[0.495181661387,0.108697291209,0.181728549067,0.109680543346,0.19713082404,0.0197809514512,0.534227452778,0.442411970815,0.409909873031,0.0975687394505]::float4[]
LIMIT 10;
Sample output:
id | distance
------+--------------------
2 | 0
1331 | 0.0677967891097069
1543 | 0.079616591334343
5606 | 0.0892329216003418
6423 | 0.0894578248262405
1667 | 0.0903968289494514
8215 | 0.0936210229992867
7801 | 0.0952572822570801
2581 | 0.0965127795934677
2645 | 0.0987173467874527
(10 rows)
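To cross-check a result outside the database, the squared Euclidean distance can be sketched in plain Python. This is only an illustration under one assumption: that `l2_squared_distance` returns the sum of squared element-wise differences, as its name and the "squared" note above suggest. The helper name `l2_squared` is hypothetical and is not part of the database.

```python
# Offline sketch: squared Euclidean distance between two equal-length vectors.
# Assumption: this mirrors what l2_squared_distance computes in the query above.

# Query vector copied from the SQL statement above.
QUERY = [
    0.495181661387, 0.108697291209, 0.181728549067, 0.109680543346,
    0.19713082404, 0.0197809514512, 0.534227452778, 0.442411970815,
    0.409909873031, 0.0975687394505,
]


def l2_squared(a, b):
    """Return sum((a_i - b_i)^2). No square root is taken."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return sum((x - y) ** 2 for x, y in zip(a, b))


if __name__ == "__main__":
    # A vector compared with itself has distance 0, which is consistent
    # with the first row (id 2) of the sample output.
    print(l2_squared(QUERY, QUERY))
```

Because the value is squared, it preserves the ranking of the true Euclidean distance while skipping the square root.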
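Get the dot product distance (cosine similarity)
Run a vector analysis that returns the dot product distance. For normalized vectors, the dot product distance equals the cosine similarity.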
SELECT id, dp_distance(feature, array[0.495181661387,0.108697291209,0.181728549067,0.109680543346,0.19713082404,0.0197809514512,0.534227452778,0.442411970815,0.409909873031,0.0975687394505]::float4[]) AS similarity
FROM vector_test.car_info
ORDER BY feature <-> array[0.495181661387,0.108697291209,0.181728549067,0.109680543346,0.19713082404,0.0197809514512,0.534227452778,0.442411970815,0.409909873031,0.0975687394505]::float4[]
LIMIT 10;
Sample output:
id | similarity
------+-------------------
2 | 1
1331 | 0.966101586818695
1543 | 0.960191607475281
5606 | 0.955383539199829
6423 | 0.955271065235138
1667 | 0.954801559448242
8215 | 0.953189492225647
7801 | 0.95237135887146
2581 | 0.951743602752686
2645 | 0.950641334056854
(10 rows)
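For unit-length vectors a and b, the squared Euclidean distance and the dot product are linked by ||a - b||^2 = 2 - 2(a · b). The following Python sketch checks that identity against rows from the two sample outputs in this tutorial. It assumes that `dp_distance` returns the plain dot product and that the stored vectors are normalized; the helper names `dot` and `dot_from_l2_squared` are hypothetical.

```python
# Offline sketch: relate the dot product to the squared Euclidean distance.
# Assumptions: dp_distance is a plain dot product, and all vectors have unit length.

# Query vector copied from the SQL statement above.
QUERY = [
    0.495181661387, 0.108697291209, 0.181728549067, 0.109680543346,
    0.19713082404, 0.0197809514512, 0.534227452778, 0.442411970815,
    0.409909873031, 0.0975687394505,
]

# (squared distance, similarity) pairs for ids 1331, 1543, and 5606,
# copied from the two sample outputs in this tutorial.
SAMPLE = [
    (0.0677967891097069, 0.966101586818695),
    (0.079616591334343, 0.960191607475281),
    (0.0892329216003418, 0.955383539199829),
]


def dot(a, b):
    """Return the dot product sum(a_i * b_i)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return sum(x * y for x, y in zip(a, b))


def dot_from_l2_squared(distance):
    """For unit vectors, ||a - b||^2 = 2 - 2 * dot(a, b); solve for the dot product."""
    return 1.0 - distance / 2.0


if __name__ == "__main__":
    # The query vector has (approximately) unit length.
    print("squared norm of query:", dot(QUERY, QUERY))
    for distance, similarity in SAMPLE:
        print(distance, similarity, dot_from_l2_squared(distance))
```

Under these assumptions, rows sorted by ascending squared Euclidean distance are also sorted by descending similarity, which matches the order of the sample output.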
Fusion search query
To combine structured filters with unstructured (vector) search in a single query, use SQL such as the following.
SELECT id, dp_distance(feature, array[0.495181661387,0.108697291209,0.181728549067,0.109680543346,0.19713082404,0.0197809514512,0.534227452778,0.442411970815,0.409909873031,0.0975687394505]::float4[]) AS similarity
FROM vector_test.car_info
WHERE market_time >= '2020-10-30 00:00:00'
AND market_time < '2021-01-01 00:00:00'
AND color IN ('red', 'white', 'blue')
AND price < 100
ORDER BY feature <-> array[0.495181661387,0.108697291209,0.181728549067,0.109680543346,0.19713082404,0.0197809514512,0.534227452778,0.442411970815,0.409909873031,0.0975687394505]::float4[]
LIMIT 10;
Sample output:
id | similarity
------+-------------------
7645 | 0.922723233699799
8956 | 0.920517802238464
8219 | 0.91210675239563
8503 | 0.895939946174622
5113 | 0.895431876182556
7680 | 0.893448948860168
8433 | 0.893425941467285
3604 | 0.89293098449707
3945 | 0.891274154186249
7153 | 0.891128540039062
(10 rows)
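Conceptually, the fusion query keeps only the rows that satisfy the structured conditions and then ranks the remaining rows by vector distance. The Python sketch below imitates that logic on a handful of made-up in-memory rows. It illustrates the semantics only, not how the database plans or executes the query. The column names come from the SQL above; the function `fused_search`, the 2-dimensional vectors, and every row value are hypothetical.

```python
# Offline sketch of "filter, then rank by vector distance".
# All rows below are made-up illustration data, not real query results.
from datetime import datetime


def dot(a, b):
    """Return the dot product sum(a_i * b_i)."""
    return sum(x * y for x, y in zip(a, b))


def l2_squared(a, b):
    """Return the squared Euclidean distance sum((a_i - b_i)^2)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fused_search(rows, query, k):
    """Apply the structured filters, then return the k nearest (id, similarity) pairs."""
    start = datetime(2020, 10, 30)
    end = datetime(2021, 1, 1)  # upper bound is exclusive, as in the SQL
    matches = [
        r for r in rows
        if start <= r["market_time"] < end
        and r["color"] in {"red", "white", "blue"}
        and r["price"] < 100
    ]
    # Rank by ascending squared Euclidean distance to the query vector.
    matches.sort(key=lambda r: l2_squared(r["feature"], query))
    return [(r["id"], dot(r["feature"], query)) for r in matches[:k]]


# Hypothetical rows with 2-dimensional unit vectors.
ROWS = [
    {"id": 1, "feature": [1.0, 0.0], "market_time": datetime(2020, 11, 15), "color": "red", "price": 80},
    {"id": 2, "feature": [0.6, 0.8], "market_time": datetime(2020, 12, 1), "color": "white", "price": 90},
    {"id": 3, "feature": [1.0, 0.0], "market_time": datetime(2020, 11, 20), "color": "green", "price": 50},
    {"id": 4, "feature": [0.0, 1.0], "market_time": datetime(2020, 12, 31), "color": "blue", "price": 99},
    {"id": 5, "feature": [1.0, 0.0], "market_time": datetime(2021, 1, 1), "color": "red", "price": 10},
    {"id": 6, "feature": [0.8, 0.6], "market_time": datetime(2020, 10, 30), "color": "blue", "price": 100},
]

if __name__ == "__main__":
    # Rows 3, 5, and 6 fail the color, date, and price filters respectively.
    print(fused_search(ROWS, [1.0, 0.0], k=2))
```

Rows that fail any structured condition never reach the ranking step, so a very close vector can still be absent from the result, as with rows 3 and 5 in this toy data.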