Learned partitioning of vectors (learning to hash, L2H)

In principle, one could apply machine learning to discover optimal hashing strategies for searching a particular collection of vectors. Such methods are often called “learning to hash” (L2H).

SPANN employs k-means clustering as a hashing strategy, where each vector is mapped to the cluster with the nearest centroid.

In practice, though, since vector databases are often updated frequently and training such a model is often costly, this approach is rarely used in mainstream VDBMSs.

David's raw ML reference notes

Explorer

Learned partitioning of vectors (learning to hash, L2H)

Graph View

Backlinks