LinearBuilder
Parameter | Type | Default value | Description |
proxima.linear.builder.column_major_order | string | false | Specifies how the features of an index are arranged when the index is built. Valid values: false and true. false: the features are arranged row by row (row-major order). true: the features are arranged column by column (column-major order). |
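
The difference between the two orders can be illustrated with a small sketch. It assumes that "row by row" means each vector's features are kept contiguous, and that "column by column" means the same dimension of all vectors is kept contiguous. The sample data and function names are hypothetical and do not reflect the internal layout that Proxima uses.

```python
# Hypothetical sample data: three feature vectors with four dimensions each.
features = [
    [1.0, 2.0, 3.0, 4.0],
    [5.0, 6.0, 7.0, 8.0],
    [9.0, 10.0, 11.0, 12.0],
]


def flatten_row_major(vectors):
    """Lay out each vector contiguously, one vector after another."""
    return [value for vector in vectors for value in vector]


def flatten_column_major(vectors):
    """Lay out dimension 0 of every vector, then dimension 1, and so on."""
    return [value for column in zip(*vectors) for value in column]


row_major = flatten_row_major(features)        # column_major_order = false
column_major = flatten_column_major(features)  # column_major_order = true
```

In this sketch, a scan that reads the same dimension of many vectors touches adjacent values in the column-major layout, whereas the row-major layout keeps each whole vector together.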
QcBuilder
Parameter | Type | Default value | Description |
proxima.qc.builder.train_sample_count | uint32 | 0 | The volume of training data. If you set the value of this parameter to 0, all data of a document is specified as training data. |
proxima.qc.builder.thread_count | uint32 | 0 | The number of threads that can be used. If you set the value of this parameter to 0, the number of threads that can be used is equal to the number of CPU cores of OpenSearch Vector Search Edition. |
proxima.qc.builder.centroid_count | string | Optional | The number of cluster centroids. Hierarchical clustering is supported. Separate the levels with asterisks (*). Sample value for one level: 1000. Sample value for two levels: 100*100. For two levels, we recommend that you specify more centroids for the first level than for the second level, which produces better results than specifying fewer centroids for the first level. As a rule of thumb, use about 10 times as many centroids in the first level as in the second level. If you do not specify the number of centroids, the system automatically infers an appropriate number. We recommend that you let the system infer the number of centroids. |
proxima.qc.builder.cluster_class | string | OptKmeansCluster | A clustering method. For more information, see Proxima Cluster parameters. |
proxima.qc.builder.cluster_auto_tuning | bool | false | Specifies whether to automatically change the number of centroids. |
proxima.qc.builder.cluster_params_in_level_ | IndexParams | - | The parameters that are required to configure a clustering method. For more information, see Proxima Cluster parameters. Append the level number to the parameter name. You must specify parameters for each level, starting from the first level. Sample parameter name for the first level: proxima.qc.builder.cluster_params_in_level_1. |
proxima.qc.builder.optimizer_class | string | HcBuilder | The type of the builder optimizer that you want to use for centroids to improve the precision of classification. The type of builder optimizer decides the type of searcher optimizer by which queries are performed for candidate centroids in an online scenario. For example, if you set the parameter value to HcBuilder, HcSearcher is used to query candidate centroids in an online scenario. Valid values: HcBuilder, HnswBuilder, SsgBuilder, and LinearBuilder. |
proxima.qc.builder.optimizer_params | IndexParams | - | The parameters and parameter values for the builder optimizer and the searcher optimizer that are configured based on the value of the proxima.qc.builder.optimizer_class parameter. For example, if you set the value of the proxima.qc.builder.optimizer_class parameter to HnswBuilder, you can specify the following parameters and values: proxima.hnsw.builder.max_neighbor_count: 100; proxima.hnsw.searcher.max_scan_ratio: 0.1 |
proxima.qc.builder.converter_class | string | - | The converter class. If you set the value of the Measure parameter to InnerProduct, the engine performs the conversion automatically and OpenSearch Vector Search Edition uses the L2 norm to search documents. |
proxima.qc.builder.converter_params | IndexParams | - | The parameters for initializing proxima.qc.builder.converter_class. |
proxima.qc.builder.quantizer_class | string | - | The quantizer. By default, the system does not use quantizers. The valid values of this parameter are Int8QuantizerConverter, HalfFloatConverter, and DoubleBitConverter. In most cases, if you specify a value for this parameter, performance will be improved and the size of an index will be reduced. However, retrieval loss may occur in specific scenarios. |
proxima.qc.builder.quantizer_params | IndexParams | - | The parameters and parameter values for the quantizer that you specify by using the proxima.qc.builder.quantizer_class parameter. |
proxima.qc.builder.optimizer_quantizer_class | string | - | The name of the quantizer that is used to perform quantization on centroids. |
proxima.qc.builder.optimizer_quantizer_params | IndexParams | - | The parameters and parameter values for the quantizer that you specify by using the proxima.qc.builder.optimizer_quantizer_class parameter. |
proxima.qc.builder.quantize_by_centroid | bool | false | Specifies whether to perform quantization based on centroids if you specify a value for the proxima.qc.builder.quantizer_class parameter. This parameter takes effect only when you set the value of proxima.qc.builder.quantizer_class to Int8QuantizerConverter. |
proxima.qc.builder.store_original_features | bool | false | Specifies whether to retain raw features. If you specify a value for proxima.qc.builder.quantizer_class, IndexProvider returns the quantized features. To obtain the raw features, set the value of proxima.qc.builder.store_original_features to true. |
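
The following sketch shows one way to assemble and sanity-check a set of QcBuilder parameters before you submit them. The helper functions (parse_centroid_count, follows_recommendation, check_qc_params) and the use of a Python dictionary are hypothetical conveniences for illustration. They are not part of Proxima or OpenSearch Vector Search Edition, and the actual configuration format may differ.

```python
# Valid optimizer classes, as listed for proxima.qc.builder.optimizer_class.
VALID_OPTIMIZERS = {"HcBuilder", "HnswBuilder", "SsgBuilder", "LinearBuilder"}


def parse_centroid_count(value):
    """Split a centroid_count string such as "100*100" into per-level counts."""
    levels = [int(part) for part in value.split("*")]
    if any(count <= 0 for count in levels):
        raise ValueError(f"invalid centroid_count: {value!r}")
    return levels


def follows_recommendation(levels):
    """Return True unless a two-level setting gives level 1 no more centroids than level 2."""
    if len(levels) != 2:
        return True
    return levels[0] > levels[1]


def check_qc_params(params):
    """Return a list of human-readable warnings for a parameter dictionary."""
    warnings = []
    prefix = "proxima.qc.builder."
    centroid_count = params.get(prefix + "centroid_count")
    if centroid_count is not None:
        if not follows_recommendation(parse_centroid_count(centroid_count)):
            warnings.append("first level should have more centroids than second level")
    optimizer = params.get(prefix + "optimizer_class", "HcBuilder")
    if optimizer not in VALID_OPTIMIZERS:
        warnings.append(f"unknown optimizer_class: {optimizer}")
    quantizer = params.get(prefix + "quantizer_class")
    if params.get(prefix + "quantize_by_centroid") and quantizer != "Int8QuantizerConverter":
        warnings.append("quantize_by_centroid only takes effect with Int8QuantizerConverter")
    return warnings


# Hypothetical sample configuration that uses values mentioned in the table above.
qc_params = {
    "proxima.qc.builder.centroid_count": "1000*100",
    "proxima.qc.builder.optimizer_class": "HnswBuilder",
    "proxima.qc.builder.optimizer_params": {
        "proxima.hnsw.builder.max_neighbor_count": 100,
        "proxima.hnsw.searcher.max_scan_ratio": 0.1,
    },
    "proxima.qc.builder.quantizer_class": "Int8QuantizerConverter",
    "proxima.qc.builder.quantize_by_centroid": True,
}

warnings = check_qc_params(qc_params)
```

The checks only encode rules that the table states: the two-level recommendation for centroid_count, the list of valid optimizer classes, and the dependency of quantize_by_centroid on Int8QuantizerConverter.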
HnswBuilder
Parameter | Type | Default value | Description |
proxima.hnsw.builder.max_neighbor_count | uint32 | 100 | The maximum number of neighbors for a node in the graph. A larger value improves the connectivity of the graph but also increases the building cost and the index size. |
proxima.hnsw.builder.efconstruction | uint32 | 500 | The size of the neighboring area that can be scanned when a graph is being built. A larger value produces a higher-quality graph during offline building but slows down index building. We recommend 400 as an initial value. |
proxima.hnsw.builder.thread_count | uint32 | 0 | The number of threads that can be used. If you set the value of this parameter to 0, the number of threads that can be used is equal to the number of CPU cores of OpenSearch Vector Search Edition. |
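
The rule that a thread_count of 0 falls back to the number of CPU cores can be sketched as follows. The function resolve_thread_count and the dictionary layout are hypothetical. In the sketch, os.cpu_count() reports the cores of the local machine and only stands in for the CPU cores of the OpenSearch Vector Search Edition instance.

```python
import os


def resolve_thread_count(configured):
    """Return the effective thread count; 0 means "use all CPU cores"."""
    if configured < 0:
        raise ValueError("thread_count is a uint32 and cannot be negative")
    if configured == 0:
        # os.cpu_count() may return None, so fall back to a single thread.
        return os.cpu_count() or 1
    return configured


# Hypothetical sample values for the proxima.hnsw.builder.* parameters above.
hnsw_builder_params = {
    "proxima.hnsw.builder.max_neighbor_count": 100,
    "proxima.hnsw.builder.efconstruction": 400,
    "proxima.hnsw.builder.thread_count": 0,
}

effective_threads = resolve_thread_count(
    hnsw_builder_params["proxima.hnsw.builder.thread_count"]
)
```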