This topic describes how to update documents in a collection by using the SDK for Python.
If no document with the specified ID exists, the update operation does not take effect for that document.
If you update only some fields, the remaining fields are set to None by default.
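Because fields that are omitted from an update are reset to None, one way to change a single field without losing the others is to merge the new values into the document's current fields and send the complete set. The following is a minimal sketch of the merge step only. merge_fields is a hypothetical helper, not part of the DashVector SDK, and the sketch assumes that you have already obtained the current fields of the document.

```python
from typing import Any, Dict


def merge_fields(current: Dict[str, Any], changes: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new dict with `changes` applied on top of `current`.

    Hypothetical helper for illustration; not part of the DashVector SDK.
    """
    merged = dict(current)   # copy so that the caller's dict is not modified
    merged.update(changes)   # new values override existing ones
    return merged


# Assumed current field values of the document (illustrative data).
current_fields = {'name': 'zhangsan', 'weight': 70.0, 'age': 30}

# Change only 'age' while keeping 'name' and 'weight'.
full_fields = merge_fields(current_fields, {'age': 31})
```

The merged dictionary can then be passed as the fields argument of the Doc object that you update.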
Prerequisites
A cluster is created. For more information, see Create a cluster.
An API key is obtained. For more information, see Manage API keys.
The SDK of the latest version is installed. For more information, see Install DashVector SDK.
API definition
Collection.update(
    docs: Union[Doc, List[Doc], Tuple, List[Tuple]],
    partition: Optional[str] = None,
    async_req: bool = False
) -> DashVectorResponse
Example
For the sample code to run properly, replace YOUR_API_KEY with your API key and YOUR_CLUSTER_ENDPOINT with the endpoint of your cluster.
Create a collection named quickstart in advance. For more information, see the "Example" section of the Create a collection topic.
import dashvector
from dashvector import Doc
import numpy as np

client = dashvector.Client(
    api_key='YOUR_API_KEY',
    endpoint='YOUR_CLUSTER_ENDPOINT'
)
collection = client.get(name='quickstart')
Update a document
# Use a Doc object to update a document.
ret = collection.update(
    Doc(
        id='1',
        vector=[0.1, 0.2, 0.3, 0.4]
    )
)
# Check whether the update operation is successful.
assert ret

# Simplified version: use a tuple to update a document.
ret = collection.update(
    ('2', [0.1, 0.1, 0.1, 0.1])  # (id, vector)
)
Update a document with fields
# Use a Doc object to update a document and specify the values of related fields.
ret = collection.update(
    Doc(
        id='3',
        vector=np.random.rand(4),
        fields={
            # Specify the values of fields that are predefined during collection creation.
            'name': 'zhangsan', 'weight': 70.0, 'age': 30,
            # Specify schema-free fields and values.
            'anykey1': 'str-value', 'anykey2': 1,
            'anykey3': True, 'anykey4': 3.1415926
        }
    )
)

# Use a tuple to update a document and specify the values of related fields.
ret = collection.update(
    ('4', np.random.rand(4), {'foo': 'bar'})  # (id, vector, fields)
)
Update multiple documents at a time
# Use Doc objects to update 10 documents at a time.
ret = collection.update(
    [
        Doc(id=str(i + 5), vector=np.random.rand(4)) for i in range(10)
    ]
)

# Simplified version: use tuples to update three documents at a time.
ret = collection.update(
    [
        ('15', [0.2, 0.7, 0.8, 1.3], {'age': 20}),
        ('16', [0.3, 0.6, 0.9, 1.2], {'age': 30}),
        ('17', [0.4, 0.5, 1.0, 1.1], {'age': 40})
    ]  # List[(id, vector, fields)]
)
# Check whether the batch update operation is successful.
assert ret
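When many documents need to be updated, it can be convenient to split them into smaller batches and call collection.update once per batch. The following is a minimal sketch of the splitting step only. chunked is a hypothetical helper, and the batch size of 100 is an arbitrary value chosen for illustration, not a documented limit.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar('T')


def chunked(items: Sequence[T], size: int) -> Iterator[List[T]]:
    """Yield consecutive slices of `items`, each with at most `size` elements.

    Hypothetical helper for illustration; not part of the DashVector SDK.
    """
    if size <= 0:
        raise ValueError('size must be positive')
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


# 250 documents in the (id, vector) tuple form (illustrative data).
docs = [(str(i), [0.1, 0.2, 0.3, 0.4]) for i in range(250)]

# Assumed batch size of 100; adjust it to suit your workload.
batches = list(chunked(docs, 100))

# Each batch could then be sent with: ret = collection.update(batch)
```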
Asynchronously update documents
# Asynchronously update 10 documents at a time.
ret_future = collection.update(
    [
        Doc(id=str(i + 18), vector=np.random.rand(4), fields={'name': 'foo' + str(i)}) for i in range(10)
    ],
    async_req=True
)
# Wait for the result of the asynchronous update operation.
ret = ret_future.get()
Update a document containing a sparse vector
ret = collection.update(
    Doc(
        id='28',
        vector=[0.1, 0.2, 0.3, 0.4],
        sparse_vector={1: 0.4, 10000: 0.6, 222222: 0.8}
    )
)
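The sparse_vector argument in the preceding example maps integer indices to float weights. As a minimal sketch, such a mapping can be built from a mostly-zero list of values as follows. to_sparse is a hypothetical helper, not part of the DashVector SDK.

```python
from typing import Dict, Sequence


def to_sparse(values: Sequence[float]) -> Dict[int, float]:
    """Keep only the non-zero entries of `values`, keyed by their position.

    Hypothetical helper for illustration; not part of the DashVector SDK.
    """
    return {index: float(value) for index, value in enumerate(values) if value != 0}


# Only positions 1 and 4 hold non-zero weights (illustrative data).
sparse = to_sparse([0.0, 0.4, 0.0, 0.0, 0.6])
```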
Request parameters
Parameter | Type | Default value | Description |
docs | Union[Doc, List[Doc], Tuple, List[Tuple]] | - | One or more documents to update. |
partition | Optional[str] | None | Optional. The name of the partition. |
async_req | bool | False | Optional. Specifies whether to enable the asynchronous mode. |
If the docs parameter is a tuple, its elements must be in the order (id, vector) or (id, vector, fields). In this case, the tuple is equivalent to a Doc object.
Each field in a Doc object can be set to a user-defined key-value pair. In a key-value pair, the key must be of the str type, and the value can be of the str, int, bool, or float type.
If a key is predefined during collection creation, the value must be of the predefined type.
If a key is not predefined during collection creation, the value can be of the str, int, bool, or float type.
For more information about predefining fields, see Schema-free.
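The key and value type rules above can be checked on the client side before a request is sent. The following is a minimal sketch under that assumption. validate_fields is a hypothetical helper, not part of the DashVector SDK, and it checks only the general rule (str keys, and str, int, bool, or float values). It does not know which fields were predefined during collection creation, so it cannot check predefined types.

```python
from typing import Any, Dict

# Value types listed in this topic for field values.
ALLOWED_VALUE_TYPES = (str, int, bool, float)


def validate_fields(fields: Dict[Any, Any]) -> None:
    """Raise TypeError if a key is not a str or a value has an unsupported type.

    Hypothetical helper for illustration; not part of the DashVector SDK.
    """
    for key, value in fields.items():
        if not isinstance(key, str):
            raise TypeError(f'field key {key!r} must be of the str type')
        if not isinstance(value, ALLOWED_VALUE_TYPES):
            raise TypeError(
                f'field {key!r} has unsupported value type {type(value).__name__}'
            )


# Passes: every key is a str and every value has an allowed type.
validate_fields({'name': 'zhangsan', 'weight': 70.0, 'age': 30, 'anykey3': True})

# A list value is not among the allowed types, so the check rejects it.
try:
    validate_fields({'tags': ['a', 'b']})
    rejected = False
except TypeError:
    rejected = True
```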
Response parameters
A DashVectorResponse object is returned, which contains the operation result, as described in the following table.
Parameter | Type | Description | Example |
code | int | The returned status code. For more information, see Status codes. | 0 |
message | str | The returned message. | success |
request_id | str | The unique ID of the request. | 19215409-ea66-4db9-8764-26ce2eb5bb99 |
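The response fields in the table can be used to report failures with enough detail for troubleshooting. The following is a minimal sketch. raise_for_status is a hypothetical helper, not part of the DashVector SDK. It assumes that a code of 0 indicates success, as in the Example column, and it uses a stand-in object with the same three attributes instead of a real DashVectorResponse. The failure code and message are made up for illustration.

```python
from types import SimpleNamespace


def raise_for_status(resp):
    """Return `resp` if its code is 0; otherwise raise RuntimeError.

    Hypothetical helper for illustration; not part of the DashVector SDK.
    Assumes `resp` exposes the code, message, and request_id attributes.
    """
    if resp.code != 0:
        raise RuntimeError(
            f'update failed: code={resp.code}, message={resp.message}, '
            f'request_id={resp.request_id}'
        )
    return resp


# Stand-in objects that mimic the attributes described in the table.
ok = SimpleNamespace(code=0, message='success',
                     request_id='19215409-ea66-4db9-8764-26ce2eb5bb99')
failed = SimpleNamespace(code=-1, message='illustrative failure',
                         request_id='00000000-0000-0000-0000-000000000000')

checked = raise_for_status(ok)  # returns the same object on success
```

With a real response, the same helper could be applied as raise_for_status(collection.update(...)), provided that the response exposes these attributes as described in the table.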