An embedding is a projection of a vector into a lower-dimensional space, with the defining property that relative distances between vectors are (at least approximately) preserved in the lower dimension.
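One common way to make this distance requirement precise, in the spirit of the Johnson-Lindenstrauss guarantee for random linear maps (an illustrative formalization, not a claim made above), is to ask that an embedding f satisfy, for every pair of input vectors u, v and some small tolerance ε:

```latex
% Approximate distance preservation for an embedding f : R^d -> R^k, k << d:
(1 - \varepsilon)\,\lVert u - v \rVert^{2}
  \;\le\; \lVert f(u) - f(v) \rVert^{2}
  \;\le\; (1 + \varepsilon)\,\lVert u - v \rVert^{2}
```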
Embeddings are widely used to encode sparse, high-dimensional data (such as text or user behavior) into a more tractable vector size for statistical inference. In this context, the preservation of distances ensures that the lower-dimensional representation retains the salient characteristics relating inputs to outputs.
Embeddings are often generated by multiplying the original vector by a matrix, known as an embedding matrix.
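As a minimal sketch of this operation (the dimensions and the random matrix below are illustrative stand-ins for a learned embedding matrix), the embedding is just a matrix-vector product:

```python
import numpy as np

rng = np.random.default_rng(0)

vocab_size, embed_dim = 10_000, 64  # illustrative sizes, not from the text
E = rng.normal(size=(embed_dim, vocab_size))  # stand-in for a learned matrix

# A sparse, high-dimensional input: a one-hot vector for index 42.
x = np.zeros(vocab_size)
x[42] = 1.0

# The embedding is the matrix-vector product. For a one-hot input this
# reduces to selecting column 42 of E, which is why practical embedding
# layers are implemented as table lookups.
z = E @ x
print(z.shape)  # (64,)
```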
Such an embedding matrix is frequently learned as a fully connected feed-forward neural network layer with a linear activation function. Of course, embeddings can also be generated through a nonlinear mapping, such as a neural network with a nonlinear activation function; in that case, however, the distance-preservation property may be quite difficult to prove.
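As a hedged sketch of both cases (PyTorch, the layer sizes, and the ReLU choice are assumptions for illustration, not details from the text), the linear case is a single fully connected layer with no activation, while the nonlinear case interposes an activation between layers:

```python
import torch
import torch.nn as nn

vocab_size, embed_dim = 10_000, 64  # illustrative sizes

# Linear embedding: a fully connected layer with an identity (linear)
# activation, i.e. plain multiplication by a learned embedding matrix.
linear_embed = nn.Linear(vocab_size, embed_dim, bias=False)

# Nonlinear embedding: an activation between layers gives a nonlinear
# mapping whose distance-preservation behavior is much harder to analyze.
nonlinear_embed = nn.Sequential(
    nn.Linear(vocab_size, 256),
    nn.ReLU(),
    nn.Linear(256, embed_dim),
)

x = torch.zeros(vocab_size)
x[42] = 1.0  # one-hot input
print(linear_embed(x).shape, nonlinear_embed(x).shape)  # torch.Size([64]) twice
```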