In principle, one could apply machine learning to discover optimal hashing strategies for searching a particular collection of vectors. Such methods are often called “learning to hash” (L2H).

SPANN employs k-means clustering as a hashing strategy, where each vector is mapped to the cluster with the nearest centroid.

In practice, though, since vector databases are often updated frequently and training such a model is often costly, this approach is rarely used in mainstream VDBMSs.