Before applying softmax to the final layer output, set the logits of all but the $k$ most probable outputs to $-\infty$. (Equivalently, take the top-$k$ probabilities after softmax and then renormalize them.) This ensures that words outside the top $k$, which are the ones most likely to be inappropriate, are never selected, but $k$ can be difficult to tune.
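The procedure can be sketched in a few lines of plain Python. This is a minimal illustration rather than a reference implementation: the function names `top_k_filter`, `softmax`, and `sample_top_k` are hypothetical, logits are assumed to be a flat list of floats over the vocabulary, and ties among equal logits are broken by position.

```python
import math
import random


def top_k_filter(logits, k):
    """Return a copy of `logits` with all but the k largest set to -inf."""
    if not 1 <= k <= len(logits):
        raise ValueError("k must be between 1 and len(logits)")
    # Indices of the k largest logits (stable sort: earlier index wins ties).
    order = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    keep = set(order[:k])
    return [x if i in keep else -math.inf for i, x in enumerate(logits)]


def softmax(logits):
    """Numerically stable softmax; entries equal to -inf get probability 0."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]  # math.exp(-inf) == 0.0
    total = sum(exps)
    return [e / total for e in exps]


def sample_top_k(logits, k, rng=random):
    """Sample one index from the softmax of the top-k-filtered logits."""
    probs = softmax(top_k_filter(logits, k))
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]
```

Because the masked logits contribute exactly zero to the softmax sum, the resulting distribution matches the one obtained by taking the top-$k$ softmax probabilities and renormalizing, which is the equivalence stated above.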