Successive halving is an aggressive strategy for pruning hyperparameter combinations during naive methods such as grid search or randomized search. It involves iteratively abandoning a fixed fraction of the candidate combinations (half, in the basic scheme) according to some schedule, so only the best-performing ones survive to the later, more expensive rounds.
As implemented in scikit-learn, successive halving defaults to using the number of training samples as the budgeted resource. So in the first iteration, every candidate is evaluated on only a small subset of the training data; the best-scoring candidates then advance to later iterations with progressively more samples, until only a few remain competing on the full training set.
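As a minimal sketch of what this looks like in code (scikit-learn still exports the halving searches via an experimental module; the dataset, estimator, and parameter grid below are illustrative placeholders, not anything from this post):

```python
from sklearn.datasets import make_classification
from sklearn.experimental import enable_halving_search_cv  # noqa: F401
from sklearn.model_selection import HalvingGridSearchCV
from sklearn.svm import SVC

# Placeholder data and parameter grid for illustration only.
X, y = make_classification(n_samples=5000, random_state=0)
param_grid = {"C": [0.1, 1, 10, 100], "gamma": [1e-3, 1e-2, 1e-1]}

search = HalvingGridSearchCV(
    SVC(),
    param_grid,
    resource="n_samples",       # the budget that grows each iteration (the default)
    factor=3,                   # keep roughly the top 1/3 of candidates each round
    min_resources="smallest",   # start every candidate on a small data subset
    cv=5,
    random_state=0,
)
search.fit(X, y)
print(search.best_params_)
```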
This takes less time than naive grid search, but it’s still a crappy way to do it: the small training sets used in the early iterations inevitably produce high-variance score estimates, so you might prune really good parameter combinations that lose merely due to chance. Use Bayesian optimization instead.
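A minimal sketch of the Bayesian-optimization alternative, here using Optuna (one possible library choice; scikit-optimize and others work too). The estimator and search ranges are placeholders assumed for illustration:

```python
import optuna
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Placeholder data for illustration only.
X, y = make_classification(n_samples=5000, random_state=0)

def objective(trial):
    # Sample hyperparameters from the search space; Optuna's default TPE
    # sampler focuses later trials on regions that scored well so far.
    C = trial.suggest_float("C", 1e-3, 1e3, log=True)
    gamma = trial.suggest_float("gamma", 1e-4, 1e-1, log=True)
    # Each trial is scored by cross-validation on the full training set,
    # so the estimate isn't hampered by a tiny subsample.
    return cross_val_score(SVC(C=C, gamma=gamma), X, y, cv=5).mean()

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=50)
print(study.best_params)
```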