Instead of just choosing one next token, maintain a tree of possible token sequences, where is the “beam width.” These are created by selecting the top tokens at each position, expanding each sequence with all possible next tokens. After each step, prune the resulting set of sequences to the most probable. This strategy involves making times the number of predictions required for the other techniques. This is a greedy technique; there is no guarantee of optimality.