匯入資料後,您可以對錶中的資料進行向量分析。本教程將指導您如何進行向量分析。
前提條件
已根據快速入門,完成了通過SQL匯入向量資料。
向量分析
本教程的向量分析以擷取歐氏距離(平方)、點積距離或者餘弦相似性為例。
擷取歐式距離
使用向量分析,並擷取歐氏距離(平方值)。
SELECT id, l2_squared_distance(feature, array[0.495181661387,0.108697291209,0.181728549067,0.109680543346,0.19713082404,0.0197809514512,0.534227452778,0.442411970815,0.409909873031,0.0975687394505]::float4[]) AS distance
FROM vector_test.car_info
ORDER BY feature <-> array[0.495181661387,0.108697291209,0.181728549067,0.109680543346,0.19713082404,0.0197809514512,0.534227452778,0.442411970815,0.409909873031,0.0975687394505]::float4[]
LIMIT 10;
返回樣本如下。
id | distance
------+--------------------
2 | 0
1331 | 0.0677967891097069
1543 | 0.079616591334343
5606 | 0.0892329216003418
6423 | 0.0894578248262405
1667 | 0.0903968289494514
8215 | 0.0936210229992867
7801 | 0.0952572822570801
2581 | 0.0965127795934677
2645 | 0.0987173467874527
(10 rows)
擷取點積距離(餘弦相似性)
使用向量分析,並擷取點積距離(在歸一化時,點積距離等於餘弦相似性)。
SELECT id, dp_distance(feature, array[0.495181661387,0.108697291209,0.181728549067,0.109680543346,0.19713082404,0.0197809514512,0.534227452778,0.442411970815,0.409909873031,0.0975687394505]::float4[]) AS similarity
FROM vector_test.car_info
ORDER BY feature <-> array[0.495181661387,0.108697291209,0.181728549067,0.109680543346,0.19713082404,0.0197809514512,0.534227452778,0.442411970815,0.409909873031,0.0975687394505]::float4[]
LIMIT 10;
返回樣本如下。
id | similarity
------+-------------------
2 | 1
1331 | 0.966101586818695
1543 | 0.960191607475281
5606 | 0.955383539199829
6423 | 0.955271065235138
1667 | 0.954801559448242
8215 | 0.953189492225647
7801 | 0.95237135887146
2581 | 0.951743602752686
2645 | 0.950641334056854
(10 rows)
融合檢索查詢
如需進行結構化與非結構化的融合,可以採用如下SQL進行查詢。
SELECT id, dp_distance(feature, array[0.495181661387,0.108697291209,0.181728549067,0.109680543346,0.19713082404,0.0197809514512,0.534227452778,0.442411970815,0.409909873031,0.0975687394505]::float4[]) AS similarity
FROM vector_test.car_info
WHERE market_time >= '2020-10-30 00:00:00'
AND market_time < '2021-01-01 00:00:00'
AND color in ('red', 'white', 'blue')
AND price < 100
ORDER BY feature <-> array[0.495181661387,0.108697291209,0.181728549067,0.109680543346,0.19713082404,0.0197809514512,0.534227452778,0.442411970815,0.409909873031,0.0975687394505]::float4[]
LIMIT 10;
返回樣本如下。
id | similarity
------+-------------------
7645 | 0.922723233699799
8956 | 0.920517802238464
8219 | 0.91210675239563
8503 | 0.895939946174622
5113 | 0.895431876182556
7680 | 0.893448948860168
8433 | 0.893425941467285
3604 | 0.89293098449707
3945 | 0.891274154186249
7153 | 0.891128540039062
(10 rows)