1. Clustering
1.1 KmeansCluster and BatchKmeansCluster
Parameter | Type | Default value | Description |
proxima.general.cluster.count | UINT32 | 0 | The number of centroids. |
proxima.kmeans.cluster.count | UINT32 | 0 | The number of centroids. The priority of this parameter is higher than the priority of the proxima.general.cluster.count parameter and lower than the priority of the K value of suggest. |
proxima.kmeans.cluster.shard_factor | FLOAT | 16.0f | The factor for tuning multi-thread concurrency. |
proxima.kmeans.cluster.epsilon | DOUBLE | FL_EPSILON | The precision of clustering convergence. |
proxima.kmeans.cluster.max_iterations | UINT32 | 20 | The maximum number of iterations. |
proxima.kmeans.cluster.purge_empty | BOOL | false | Specifies whether to delete empty centroids. |
proxima.kmeans.cluster.seeker_class | STRING | LinearSeeker | The class of the algorithm for seeking centroids. |
proxima.kmeans.cluster.seeker_params | IndexParams | | The parameters of the class of the algorithm for seeking centroids, specified as an IndexParams object. |
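The priority rule for the centroid count (the K value of suggest first, then the algorithm-specific count, then proxima.general.cluster.count) can be sketched as follows. This is only an illustration: `resolve_cluster_count` is a hypothetical helper, the plain dict is a stand-in for the real parameter container, and treating 0 or a missing key as "not set" is an assumption based on the default value of 0.

```python
def resolve_cluster_count(params, algo_key, suggested_k=None):
    """Return the effective centroid count, checking sources from
    highest to lowest priority (hypothetical helper, not a Proxima API)."""
    candidates = (
        suggested_k,                                   # K value of suggest
        params.get(algo_key),                          # e.g. proxima.kmeans.cluster.count
        params.get("proxima.general.cluster.count"),   # general fallback
    )
    for value in candidates:
        if value:  # assumption: 0 or a missing key means "not set"
            return value
    return 0


params = {
    "proxima.general.cluster.count": 100,
    "proxima.kmeans.cluster.count": 256,
}
count = resolve_cluster_count(params, "proxima.kmeans.cluster.count")
```

The same lookup order applies to the count parameter of each clustering method in this section; only the algorithm-specific key changes.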
1.2 GpuKmeansCluster
Parameter | Type | Default value | Description |
proxima.general.cluster.count | UINT32 | 0 | The number of centroids. |
proxima.kmeans.cluster.count | UINT32 | 0 | The number of centroids. The priority of this parameter is higher than the priority of the proxima.general.cluster.count parameter and lower than the priority of the K value of suggest. |
proxima.kmeans.cluster.epsilon | DOUBLE | FL_EPSILON | The precision of clustering convergence. |
proxima.kmeans.cluster.max_iterations | UINT32 | 100 | The maximum number of iterations. |
proxima.kmeans.cluster.purge_empty | BOOL | false | Specifies whether to delete empty centroids. |
1.3 MiniBatchKmeansCluster
Parameter | Type | Default value | Description |
proxima.general.cluster.count | UINT32 | 0 | The number of centroids. |
proxima.minibatchkmeans.cluster.count | UINT32 | 0 | The number of centroids. The priority of this parameter is higher than the priority of the proxima.general.cluster.count parameter and lower than the priority of the K value of suggest. |
proxima.minibatchkmeans.cluster.shard_factor | FLOAT | 16.0f | The factor for tuning multi-thread concurrency. |
proxima.minibatchkmeans.cluster.epsilon | DOUBLE | FL_EPSILON | The precision of clustering convergence. |
proxima.minibatchkmeans.cluster.max_iterations | UINT32 | 20 | The maximum number of iterations. |
proxima.minibatchkmeans.cluster.purge_empty | BOOL | false | Specifies whether to delete empty centroids. |
proxima.minibatchkmeans.cluster.try_count | UINT32 | 20 | The number of attempts. The minimum value is 1. |
proxima.minibatchkmeans.cluster.batch_count | UINT32 | 0 | The number of features that are sampled for batch training. If the parameter value is 0, the actual value is the total number of features divided by the number of attempts. |
proxima.minibatchkmeans.cluster.seeker_class | STRING | LinearSeeker | The class of the algorithm for seeking centroids. |
proxima.minibatchkmeans.cluster.seeker_params | IndexParams | | The parameters of the class of the algorithm for seeking centroids. |
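The fallback for proxima.minibatchkmeans.cluster.batch_count can be worked through with a small sketch. The function name is hypothetical, and the use of integer division is an assumption; the table only states that the total number of features is divided by the number of attempts.

```python
def effective_batch_count(total_features, batch_count=0, try_count=20):
    """Return the number of features sampled per batch
    (hypothetical helper, not a Proxima API)."""
    if try_count < 1:
        raise ValueError("try_count must be at least 1")
    if batch_count != 0:
        return batch_count               # an explicit value is used as-is
    return total_features // try_count   # assumption: integer division


# With the defaults, 1,000,000 features are split over 20 attempts.
default_batch = effective_batch_count(1_000_000)
explicit_batch = effective_batch_count(1_000_000, batch_count=4096)
```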
1.4 BikmeansCluster
Parameter | Type | Default value | Description |
proxima.general.cluster.count | UINT32 | 0 | The number of centroids. |
proxima.bikmeans.cluster.count | UINT32 | 0 | The number of centroids. The priority of this parameter is higher than the priority of the proxima.general.cluster.count parameter and lower than the priority of the K value of suggest. |
proxima.bikmeans.cluster.init_count | UINT32 | 0 | The number of centroids for clustering initialization in the first phase. If the parameter value is 0, the actual value is the total number of features divided by four. |
proxima.bikmeans.cluster.purge_empty | BOOL | false | Specifies whether to delete empty centroids. |
proxima.bikmeans.cluster.first_class | STRING | KmeansCluster | The clustering method in the first phase. |
proxima.bikmeans.cluster.first_params | IndexParams | | The parameters of the clustering method in the first phase. |
proxima.bikmeans.cluster.second_class | STRING | KmeansCluster | The clustering method in the second phase. |
proxima.bikmeans.cluster.second_params | IndexParams | | The parameters of the clustering method in the second phase. |
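As a worked example of the proxima.bikmeans.cluster.init_count fallback, the sketch below computes the first-phase centroid count. The helper name is hypothetical and integer division is an assumption; the table only says that the total number of features is divided by four.

```python
def effective_init_count(total_features, init_count=0):
    """Return the number of first-phase centroids
    (hypothetical helper, not a Proxima API)."""
    if init_count != 0:
        return init_count         # an explicit value is used as-is
    return total_features // 4    # assumption: integer division


fallback_count = effective_init_count(10_000)
explicit_count = effective_init_count(10_000, init_count=64)
```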
1.5 KmeansppCluster
Parameter | Type | Default value | Description |
proxima.general.cluster.count | UINT32 | 0 | The number of centroids. |
proxima.kmeanspp.cluster.count | UINT32 | 0 | The number of centroids. The priority of this parameter is higher than the priority of the proxima.general.cluster.count parameter and lower than the priority of the K value of suggest. |
proxima.kmeanspp.cluster.shard_factor | FLOAT | 16.0f | The factor for tuning multi-thread concurrency. |
proxima.kmeanspp.cluster.class | STRING | KmeansCluster | The clustering method that is called after the centroids are initialized. |
proxima.kmeanspp.cluster.params | IndexParams | | The parameters of the clustering method. |
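The way the class and params entries nest can be pictured with plain Python dicts. This is only a sketch: the dicts stand in for IndexParams objects, whose real API is not described here, and the numeric values are illustrative rather than recommended settings.

```python
# Parameters for the method named in proxima.kmeanspp.cluster.class.
# The dict is a stand-in for an IndexParams object (assumption).
kmeans_params = {
    "proxima.kmeans.cluster.max_iterations": 30,  # illustrative value
    "proxima.kmeans.cluster.purge_empty": True,   # illustrative value
}

# Top-level parameters for KmeansppCluster: centroids are initialized
# first, then the nested clustering method is called with its own params.
kmeanspp_params = {
    "proxima.kmeanspp.cluster.count": 1024,       # illustrative value
    "proxima.kmeanspp.cluster.class": "KmeansCluster",
    "proxima.kmeanspp.cluster.params": kmeans_params,
}
```

Keeping the nested parameters in their own object means the inner method is configured with its usual keys, independent of the wrapper that initializes the centroids.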
1.6 Kmc2Cluster/AFKmc2Cluster
Parameter | Type | Default value | Description |
proxima.general.cluster.count | UINT32 | 0 | The number of centroids. |
proxima.kmc2.cluster.count | UINT32 | 0 | The number of centroids. The priority of this parameter is higher than the priority of the proxima.general.cluster.count parameter and lower than the priority of the K value of suggest. |
proxima.kmc2.cluster.shard_factor | FLOAT | 2.5f | The factor for tuning multi-thread concurrency. |
proxima.kmc2.cluster.markov_chain_length | UINT32 | 0u | The length of the Markov chain. If the parameter value is 0, the actual value is the number of threads multiplied by the concurrency factor. |
proxima.kmc2.cluster.class | STRING | KmeansCluster | The clustering method that is called after the centroids are initialized. |
proxima.kmc2.cluster.params | IndexParams | | The parameters of the clustering method. |
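The fallback for proxima.kmc2.cluster.markov_chain_length can be illustrated with a short calculation. The helper name is hypothetical, and two points are assumptions: that the "concurrency factor" is the value of proxima.kmc2.cluster.shard_factor, and that the product is truncated to an integer.

```python
def effective_chain_length(thread_count, markov_chain_length=0, shard_factor=2.5):
    """Return the Markov chain length used for centroid initialization
    (hypothetical helper, not a Proxima API)."""
    if markov_chain_length != 0:
        return markov_chain_length           # an explicit value is used as-is
    # assumption: the product is truncated toward zero
    return int(thread_count * shard_factor)


fallback_length = effective_chain_length(8)
explicit_length = effective_chain_length(8, markov_chain_length=200)
```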
1.7 KmedoidsCluster
Parameter | Type | Default value | Description |
proxima.general.cluster.count | UINT32 | 0 | The number of centroids. |
proxima.kmedoids.cluster.count | UINT32 | 0 | The number of centroids. The priority of this parameter is higher than the priority of the proxima.general.cluster.count parameter and lower than the priority of the K value of suggest. |
proxima.kmedoids.cluster.shard_factor | FLOAT | 16.0f | The factor for tuning multi-thread concurrency. |
proxima.kmedoids.cluster.epsilon | DOUBLE | FL_EPSILON | The precision of clustering convergence. |
proxima.kmedoids.cluster.max_iterations | UINT32 | 20 | The maximum number of iterations. |
proxima.kmedoids.cluster.purge_empty | BOOL | false | Specifies whether to delete empty centroids. |
proxima.kmedoids.cluster.bench_ratio | FLOAT | 0.1f | The ratio of candidate points. |
proxima.kmedoids.cluster.only_means | BOOL | false | Specifies whether to use only the mean value as a candidate point. If set to true, the algorithm degrades to k-means. |
proxima.kmedoids.cluster.without_means | BOOL | false | Specifies whether to exclude the mean value from the candidate points. |
proxima.kmedoids.cluster.seeker_class | STRING | LinearSeeker | The class of the algorithm for seeking centroids. |
proxima.kmedoids.cluster.seeker_params | IndexParams | | The parameters of the class of the algorithm for seeking centroids, specified as an IndexParams object. |
1.8 StratifiedCluster
Parameter | Type | Default value | Description |
proxima.general.cluster.count | UINT32 | 0 | The total number of centroids at the second layer. |
proxima.stratified.cluster.count | UINT32 | 0 | The total number of centroids at the second layer. The priority of this parameter is higher than the priority of the proxima.general.cluster.count parameter and lower than the priority of the K value of suggest. |
proxima.stratified.cluster.first_class | STRING | KmeansCluster | The clustering method that you want to use at the first layer. |
proxima.stratified.cluster.second_class | STRING | KmeansCluster | The clustering method that you want to use at the second layer. |
proxima.stratified.cluster.first_count | UINT32 | 0 | The number of centroids that you want to cluster at the first layer. |
proxima.stratified.cluster.second_count | UINT32 | 0 | The number of centroids that you want to cluster at the second layer. |
proxima.stratified.cluster.first_params | IndexParams | | The parameters of the clustering method that you want to use at the first layer. |
proxima.stratified.cluster.second_params | IndexParams | | The parameters of the clustering method that you want to use at the second layer. |
proxima.stratified.cluster.auto_tuning | BOOL | false |
2. Clustering estimation
2.1 GapstatsClusterEstimater