Language models represent the input sequence (and often the output sequence) as indices into a fixed list of known tokens called a vocabulary. Often, there are important structural concepts that are not captured by ordinary words, such as the end of the sequence. In these cases, the concept is encoded as a special token in the vocabulary. Examples of common special tokens include: