Vector quantization (VQ) refers to any process by which vectors are mapped to a discrete representation with explicit reference to more than one dimension of the vector. By contrast, when each dimension of the vector is quantized separately, we call this scalar quantization.
-
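Vectors can be quantized using uniform quantization, that is, by laying the same uniform grid along every dimension. However, the number of possible levels then grows exponentially with the dimension: L levels per dimension give L^d distinct cells for a d-dimensional vector. For even moderately high-dimensional data, this can quickly negate any benefits of quantization.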
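
To make the growth concrete, the following minimal sketch counts the cells of a uniform grid, assuming the same number of levels on every dimension. The function name `uniform_grid_cells` and the choice of 16 levels are illustrative assumptions, not part of any library.

```python
def uniform_grid_cells(levels_per_dim: int, dims: int) -> int:
    """Count the distinct codes of a uniform grid (hypothetical helper).

    Assumes every dimension is quantized to the same number of levels,
    so the joint grid has levels_per_dim ** dims cells.
    """
    return levels_per_dim ** dims


# 16 levels (4 bits) per dimension, for a growing number of dimensions.
for dims in (1, 2, 8, 32):
    print(dims, uniform_grid_cells(16, dims))
```

At 32 dimensions the grid already has 2^128 cells, far more than any realistic data set could populate.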
-
Low-dimensional vectors are commonly quantized using k-means clustering. This can result in a far more manageable number of levels. However, for high-dimensional data, the curse of dimensionality makes it difficult to detect meaningful centroids. In these cases, naive clustering will result in excessive information loss.
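
As an illustration of k-means quantization, here is a minimal sketch in plain Python using Lloyd's iterations. The helper names (`sq_dist`, `nearest`, `kmeans`), the random initialization, the fixed iteration count, and the toy data are all assumptions made for this example; a production system would normally rely on an established library instead.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(codebook, v):
    """Index of the centroid in `codebook` closest to `v`."""
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i], v))


def kmeans(vectors, k, iters=20, seed=0):
    """Learn k centroids with Lloyd's algorithm (illustrative sketch)."""
    rng = random.Random(seed)
    # Assumption: initialize centroids from k distinct data points.
    codebook = [list(v) for v in rng.sample(vectors, k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in vectors:
            clusters[nearest(codebook, v)].append(v)
        for i, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                codebook[i] = [sum(col) / len(members) for col in zip(*members)]
    return codebook


# Toy 2-D data with two well-separated groups.
data = [(0.0, 0.0), (0.1, 0.1), (0.2, 0.2), (5.0, 5.0), (5.1, 5.1), (5.2, 5.2)]
codebook = kmeans(data, k=2)

# Quantizing a vector means storing only the index of its nearest centroid.
codes = [nearest(codebook, v) for v in data]
print(codes)
```

Each vector is replaced by a single integer index, so the number of levels equals the number of centroids k rather than growing with the dimension.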
-
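Hence, for higher-dimensional data, each vector is typically first split into sub-vectors, and each sub-vector position is quantized separately using its own k-means codebook. This is called product quantization, because the effective codebook for the full vector is the Cartesian product of the sub-vector codebooks.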
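
The idea can be sketched in plain Python as follows. This is a hedged illustration, not a reference implementation: the names `split`, `pq_train`, `pq_encode` and `pq_decode` are hypothetical, the k-means routine is a bare-bones Lloyd's loop, and the sketch assumes the vector length is divisible by the number of sub-vectors `m`.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(codebook, v):
    """Index of the centroid in `codebook` closest to `v`."""
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i], v))


def kmeans(vectors, k, iters=20, seed=0):
    """Bare-bones Lloyd's algorithm (illustrative sketch)."""
    rng = random.Random(seed)
    codebook = [list(v) for v in rng.sample(vectors, k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in vectors:
            clusters[nearest(codebook, v)].append(v)
        for i, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                codebook[i] = [sum(col) / len(members) for col in zip(*members)]
    return codebook


def split(v, m):
    """Cut `v` into m contiguous sub-vectors (assumes len(v) % m == 0)."""
    step = len(v) // m
    return [tuple(v[i * step:(i + 1) * step]) for i in range(m)]


def pq_train(vectors, m, k):
    """Learn one k-centroid codebook per sub-vector position."""
    per_position = zip(*(split(v, m) for v in vectors))
    return [kmeans(list(subs), k, seed=j) for j, subs in enumerate(per_position)]


def pq_encode(codebooks, v):
    """Encode `v` as one centroid index per sub-vector."""
    return [nearest(cb, sub) for cb, sub in zip(codebooks, split(v, len(codebooks)))]


def pq_decode(codebooks, code):
    """Approximate the original vector by concatenating the chosen centroids."""
    return [x for cb, idx in zip(codebooks, code) for x in cb[idx]]


# Toy 4-D data, split into m=2 sub-vectors of 2 dimensions each.
data = [
    (0.0, 0.0, 9.0, 9.0),
    (0.2, 0.2, 1.0, 1.0),
    (4.0, 4.0, 9.2, 9.2),
    (4.2, 4.2, 1.2, 1.2),
]
codebooks = pq_train(data, m=2, k=2)
codes = [pq_encode(codebooks, v) for v in data]
print(codes)
print(pq_decode(codebooks, codes[0]))
```

With m sub-vectors and k centroids each, only m * k centroids are stored, yet k^m distinct reconstructions are representable. In this toy example, four stored centroids are enough to give each of the four vectors its own code.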