In the context of autoregressive language models, beam search can be used to generate a sequence of tokens, each conditioned on the tokens before it. In this setting,
- The root of the search tree is the start-of-sequence token;
- The successor states of a node are the tokens of the vocabulary, each weighted by the model's predicted probability conditioned on the preceding tokens; and
- The global score of a candidate sequence is the product of the probabilities of its token selections (or, more commonly, the sum of their log probabilities).
As a defense against infinite sequences, any candidate that reaches a maximum length without producing the end-of-sequence token is pruned.
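The procedure can be made concrete with a minimal Python sketch. The interface is an illustrative assumption: `log_prob_fn` is a hypothetical stand-in for the model, returning log probabilities over the vocabulary for a given prefix, and `bos_id`/`eos_id` are the start- and end-of-sequence token ids.

```python
from typing import Callable, Sequence

def beam_search(
    log_prob_fn: Callable[[Sequence[int]], Sequence[float]],
    bos_id: int,
    eos_id: int,
    beam_width: int = 4,
    max_len: int = 50,
) -> list[int]:
    """Return the highest-scoring sequence found by beam search.

    A candidate's score is the sum of its per-token log probabilities.
    """
    # The root of the search tree is the start-of-sequence token.
    beams = [(0.0, [bos_id])]
    finished = []

    for _ in range(max_len):
        candidates = []
        for score, tokens in beams:
            # Successor states: every vocabulary token, weighted by the
            # model's predicted log probability given the prefix.
            for token_id, lp in enumerate(log_prob_fn(tokens)):
                candidates.append((score + lp, tokens + [token_id]))
        # Keep only the beam_width highest-scoring candidates.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = []
        for score, tokens in candidates[:beam_width]:
            if tokens[-1] == eos_id:
                finished.append((score, tokens))
            else:
                beams.append((score, tokens))
        if not beams:  # every candidate has emitted end-of-sequence
            break

    # Candidates that hit max_len without the end-of-sequence token are
    # pruned; fall back to them only if nothing finished in time.
    pool = finished if finished else beams
    return max(pool, key=lambda c: c[0])[1]
```

A production decoder would differ in the details: it would batch the model calls, apply length normalization so longer sequences are not unfairly penalized, and usually expand more than `beam_width` candidates per step so the beam stays full as candidates finish.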
There are several possible termination conditions, which can be combined:
- Terminate when all of the candidates contain the end-of-sequence token;
- Terminate when at least one candidate containing the end-of-sequence token exceeds a certain score;
- Terminate when the top scores cease to increase significantly after several iterations; and
- Terminate when the top sequences are too similar based on some measure of sequence diversity.
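These conditions might be combined into a single stopping check, as in the sketch below. The thresholds, the patience window, and the token-set overlap used as a diversity measure are all illustrative assumptions, not a standard recipe.

```python
def should_terminate(
    beams: list[tuple[float, list[int]]],      # active (score, tokens) pairs
    finished: list[tuple[float, list[int]]],   # candidates that emitted EOS
    best_history: list[float],                 # best score after each iteration
    score_threshold: float = -5.0,             # illustrative score cutoff
    patience: int = 3,                         # iterations without improvement
    min_improvement: float = 1e-3,
    max_overlap: float = 0.9,                  # illustrative similarity cutoff
) -> bool:
    # 1. All candidates contain the end-of-sequence token.
    if not beams:
        return True
    # 2. At least one finished candidate exceeds a certain score.
    if any(score > score_threshold for score, _ in finished):
        return True
    # 3. The top score has ceased to increase significantly.
    if len(best_history) > patience:
        if best_history[-1] - best_history[-patience - 1] < min_improvement:
            return True
    # 4. The top two sequences are too similar; token-set overlap
    #    (Jaccard) stands in for a real diversity measure here.
    if len(beams) >= 2:
        a, b = set(beams[0][1]), set(beams[1][1])
        if len(a & b) / max(len(a | b), 1) > max_overlap:
            return True
    return False
```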