Vector index
Warning
Supported only for row-oriented tables. Support for column-oriented tables is currently under development.
Alert
The following features are not supported:
- Index update: the main table can be modified, but the existing index will not be updated. To reflect the changes, a new index must be built. If necessary, the existing index can be atomically replaced with the newly built one.
- Building an index for vectors with bit quantization.
These limitations may be removed in future versions.
Warning
Creating an empty table with a vector index is pointless, because mutations are currently not allowed in tables with vector indexes.
Use the ALTER TABLE ... ADD INDEX command to add a vector index to an existing table instead.
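As a rough sketch of that approach, assume a user_articles table with an embedding column (like the one in the example on this page) already exists and is filled with data. The index name and parameter values here are illustrative assumptions, not recommendations:

```sql
-- Sketch: add a vector index to an existing, already populated table.
-- Table name, index name, and parameter values are assumptions for illustration.
ALTER TABLE user_articles
  ADD INDEX emb_cosine_idx
  GLOBAL USING vector_kmeans_tree
  ON (embedding)
  WITH (
    distance="cosine",
    vector_type="float",
    vector_dimension=512,
    clusters=128,
    levels=2
  );
```

Building the index this way means it is constructed from data that is already in the table.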
A vector index in a row-oriented table is created using the same syntax as for secondary indexes, by specifying vector_kmeans_tree as the index type. The subset of syntax available for vector indexes:
CREATE TABLE `<table_name>` (
...
INDEX `<index_name>`
GLOBAL
[SYNC]
USING vector_kmeans_tree
ON ( <index_columns> )
[COVER ( <cover_columns> )]
[WITH ( <parameter_name> = <parameter_value>[, ...])]
[, ...]
)
Where:
- <index_name> - unique index name for data access
- SYNC - indicates synchronous data writing to the index. This is the only currently available option, and it is used by default.
- <index_columns> - comma-separated list of table columns used for index searches (the last column is used as embedding, others as filtering columns)
- <cover_columns> - list of additional table columns stored in the index to enable retrieval without accessing the main table
- <parameter_name> and <parameter_value> - list of key-value parameters:
  - common parameters for all vector indexes:
    - vector_dimension - embedding vector dimensionality (16384 or less)
    - vector_type - vector value type (float, uint8, int8, or bit)
    - distance - distance function (cosine, manhattan, or euclidean), mutually exclusive with similarity
    - similarity - similarity function (inner_product or cosine), mutually exclusive with distance
  - specific parameters for vector_kmeans_tree (see Vector Index Type `vector_kmeans_tree`):
    - clusters - number of centroids for k-means algorithm (values greater than 1000 may degrade performance)
    - levels - number of levels in the tree
Warning
Vector indexes with vector_type=bit are not currently supported.
Example
CREATE TABLE user_articles (
article_id Uint64,
user String,
title String,
text String,
embedding String,
INDEX emb_cosine_idx GLOBAL SYNC USING vector_kmeans_tree
ON (user, embedding) COVER (title, text)
WITH (
distance="cosine",
vector_type="float",
vector_dimension=512,
clusters=128,
levels=2
),
PRIMARY KEY (article_id)
)
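To show how the filtering and covered columns might be used, here is a hedged sketch of a search against the index defined above. It assumes that the index is addressed with the VIEW clause, that the Knn::CosineDistance function is available, and that the $target parameter holds a query vector serialized in the same format as the embedding column. The parameter names $user and $target and the limit are illustrative:

```sql
-- Sketch: find the 10 articles of one user closest to a query vector.
-- $user and $target are hypothetical parameters supplied by the caller.
DECLARE $user AS String;
DECLARE $target AS String;  -- serialized query vector, same format as embedding

SELECT article_id, title
FROM user_articles VIEW emb_cosine_idx
WHERE user = $user                                  -- filtering column of the index
ORDER BY Knn::CosineDistance(embedding, $target)    -- matches distance="cosine"
LIMIT 10;
```

Because title is listed in COVER, a query like this can in principle be answered from the index without reading the main table.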