FAQs - Vector Retrieval Service - Alibaba Cloud Documentation Center

1. Which partition applies if no partition is specified for operations on documents?

When a collection is created, a partition named default is created automatically and cannot be deleted. If no partition is specified for an operation on documents, the operation applies to the default partition. For example, if no partition is specified for a document search, only documents in the default partition are searched.

2. What are the differences between inserting a document, updating a document, and upserting a document?

Inserting a document: If the ID of the document to be inserted already exists, the existing document is not overwritten, and the insert operation has no effect for that document.
Updating a document: This operation overwrites the existing document. If the ID of the document to be updated does not exist, the update operation has no effect for that document.
Upserting a document: If the ID of the document already exists, the document is updated. If the ID does not exist, the document is inserted.
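
The three behaviors above can be sketched with a small in-memory model. This is only an illustration of the semantics described in this answer: the ToyPartition class and its boolean return values are hypothetical and are not part of the DashVector SDK.

```python
class ToyPartition:
    """Hypothetical in-memory stand-in for one partition (not the DashVector SDK)."""

    def __init__(self):
        self.docs = {}  # maps document ID -> vector

    def insert(self, doc_id, vector):
        # Insert never overwrites: an existing ID leaves the stored document untouched.
        if doc_id in self.docs:
            return False
        self.docs[doc_id] = vector
        return True

    def update(self, doc_id, vector):
        # Update only overwrites: a missing ID means nothing is written.
        if doc_id not in self.docs:
            return False
        self.docs[doc_id] = vector
        return True

    def upsert(self, doc_id, vector):
        # Upsert behaves as update when the ID exists and as insert when it does not.
        self.docs[doc_id] = vector
        return True


partition = ToyPartition()
print(partition.insert("1", [0.1]))  # new ID, document is stored
print(partition.insert("1", [0.9]))  # existing ID, stored document is kept
print(partition.update("2", [0.2]))  # missing ID, nothing is written
print(partition.update("1", [0.3]))  # existing ID, document is overwritten
print(partition.upsert("2", [0.4]))  # missing ID, document is inserted
```

In this model, a return value of False marks the cases that the answer describes as having no effect.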

3. How do I clear a collection?

You cannot clear a collection. However, you can delete a collection and then create a new one.

4. How do I enable the asynchronous mode for operations on documents?

Set the async_req parameter to True to enable the asynchronous mode for operations such as inserting a document, updating a document, upserting a document, searching for a document, deleting a document, and obtaining a document. The following sample code provides an example.

import time

import numpy as np

# `collection` is assumed to be an existing collection object whose dimension is 20000.
# Perform 1,000 asynchronous writes with a batch size of 8.
batch_size = 8
loop = 1000
start = time.time()

async_results = [
    collection.insert(
        [(j + i * batch_size, np.random.rand(20000)) for j in range(batch_size)],
        async_req=True
    ) for i in range(loop)
]

# Wait until all the write operations are complete.
print([async_result.get() for async_result in async_results])

print(f"async insert {loop} times with batch-size = {batch_size}, cost = {time.time() - start}")

# Output:
# async insert 1000 times with batch-size = 8, cost = 31.13356590270996

# Synchronous writes (code omitted).
# sync insert 1000 times with batch-size = 8, cost = 408.63447427749634

Important

Asynchronous operations tend to consume more quota than the amounts specified in Limits. If you need a higher quota, apply for an increase.

5. Where must the ID of a document be unique? In a collection or a partition?

In a partition. Different partitions in the same collection may contain documents with the same ID.
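
As a rough illustration of this rule, and of the fallback to the default partition when no partition is specified, the following sketch models a collection as a dictionary of partitions. The ToyCollection class, its method names, and the partition name "shoes" are hypothetical and do not reflect the DashVector SDK.

```python
DEFAULT_PARTITION = "default"


class ToyCollection:
    """Hypothetical in-memory model of a collection (not the DashVector SDK)."""

    def __init__(self):
        # The default partition exists as soon as the collection is created.
        self.partitions = {DEFAULT_PARTITION: {}}

    def create_partition(self, name):
        self.partitions.setdefault(name, {})

    def insert(self, doc_id, vector, partition=None):
        # If no partition is specified, the default partition applies.
        docs = self.partitions[partition or DEFAULT_PARTITION]
        if doc_id in docs:
            return False  # IDs must be unique within a single partition
        docs[doc_id] = vector
        return True

    def fetch(self, doc_id, partition=None):
        return self.partitions[partition or DEFAULT_PARTITION].get(doc_id)


collection = ToyCollection()
collection.create_partition("shoes")

print(collection.insert("1", [0.1]))                     # stored in the default partition
print(collection.insert("1", [0.2], partition="shoes"))  # same ID, different partition
print(collection.insert("1", [0.3]))                     # duplicate within the default partition
print(collection.fetch("1"), collection.fetch("1", partition="shoes"))
```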

6. Why does a vector suffer precision loss after being inserted?

DashVector supports data of the single-precision floating-point type, which is also known as FP32 or float32. The following table describes its precision range.

If a value in an inserted vector carries more precision than FP32 can represent, the value is automatically rounded to the nearest value that FP32 can represent, which results in a precision loss.
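
The rounding effect can be reproduced locally without calling the service. The following sketch uses only the Python standard library to round a double-precision value to FP32 and back. The helper name to_fp32 is made up for this example, and the sketch only approximates what happens on storage. It does not show how DashVector performs the conversion internally.

```python
import struct


def to_fp32(value: float) -> float:
    """Round a Python float (FP64) to the nearest FP32 value and return it as a float."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


original = 0.123456789012345
stored = to_fp32(original)

# The round-tripped value differs slightly from the original.
print(f"original = {original!r}")
print(f"as FP32  = {stored!r}")
print(f"abs diff = {abs(original - stored):.3e}")

# Values that FP32 represents exactly, such as 0.5, survive unchanged.
print(to_fp32(0.5) == 0.5)
```

If a comparison against a fetched vector is needed, comparing with a small tolerance rather than exact equality avoids false mismatches caused by this rounding.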