The learning rate, typically denoted $\eta$ (or $\alpha$), is a hyperparameter shared by all machine learning models that are optimized by following the gradient of a function. In day-to-day work, it comes up most often in gradient boosting and neural networks.

The learning rate primarily controls the step size, and hence the resolution, of the gradient descent process. At each iteration, gradient descent takes a step opposite the direction of the gradient $\nabla L(\theta)$ of the loss $L$ with respect to the parameters $\theta$. In vanilla gradient descent, the magnitude of this step is $\eta \lVert \nabla L(\theta) \rVert$, where $\eta$ is the learning rate. An excessively low learning rate greatly slows convergence and can also trap the algorithm in a local optimum, because its steps are too small to carry it out of small, shallow minima. Meanwhile, an excessively high learning rate can prevent convergence altogether by causing the algorithm to overshoot the optimum ad infinitum.
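The effect of the step size can be seen in a minimal sketch. It assumes a toy objective $f(x) = x^2$ (minimum at $x = 0$), and the names `gradient_descent` and `grad_f` are illustrative, not taken from any library.

```python
def gradient_descent(grad, x0, learning_rate, n_steps):
    """Run vanilla gradient descent and return every iterate."""
    xs = [x0]
    for _ in range(n_steps):
        x = xs[-1]
        # Step opposite the gradient, scaled by the learning rate.
        xs.append(x - learning_rate * grad(x))
    return xs


def grad_f(x):
    """Gradient of the toy objective f(x) = x**2 (an assumption of this sketch)."""
    return 2.0 * x


if __name__ == "__main__":
    # Too low, reasonable, oscillating forever, and diverging.
    for lr in (0.001, 0.1, 1.0, 1.1):
        final = gradient_descent(grad_f, x0=1.0, learning_rate=lr, n_steps=50)[-1]
        print(f"learning_rate={lr:<5} -> x after 50 steps = {final:.4g}")
```

On this objective each update multiplies $x$ by $(1 - 2 \cdot \text{learning rate})$, so a very small rate barely moves, a rate of exactly 1.0 flips the sign forever without getting closer, and anything above 1.0 grows without bound.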

The learning rate also plays an indirect role in regularization, albeit through competing tendencies. On the one hand, steep minima that cover a very small area of the loss surface often correspond to details of the training data, especially when surrounded by zones of much higher loss. A low learning rate can therefore make the optimizer less likely to reach such a minimum, which reduces the risk of overfitting. On the other hand, high learning rates can help the optimizer escape local minima, reducing the risk of underfitting. Learning rate scheduling can help balance these tendencies, in addition to its primary role of improving convergence.
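As a rough illustration of scheduling, the sketch below uses exponential decay, which is only one of many possible schedules. The toy objective $f(x) = x^2$ and the names `exponential_decay` and `scheduled_gradient_descent` are assumptions made for this example.

```python
def exponential_decay(initial_rate, decay, step):
    """Learning rate at a given step: initial_rate * decay**step."""
    return initial_rate * decay ** step


def scheduled_gradient_descent(grad, x0, initial_rate, decay, n_steps):
    """Gradient descent whose learning rate shrinks at every step."""
    x = x0
    for step in range(n_steps):
        x -= exponential_decay(initial_rate, decay, step) * grad(x)
    return x


def grad_f(x):
    """Gradient of the toy objective f(x) = x**2 (an assumption of this sketch)."""
    return 2.0 * x


if __name__ == "__main__":
    # decay=1.0 means a constant rate; decay=0.9 shrinks it by 10% per step.
    for decay in (1.0, 0.9):
        x = scheduled_gradient_descent(grad_f, 1.0, initial_rate=1.0, decay=decay, n_steps=50)
        print(f"decay={decay}: x after 50 steps = {x:.4g}")
```

With a constant rate of 1.0 the iterate bounces between $1$ and $-1$ indefinitely, whereas starting from the same aggressive rate and decaying it lets the early steps cover ground while the later, smaller steps settle near the minimum.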

At each stage, gradient boosting fits a new regression model to the residuals of the current ensemble, which in effect resembles the action of gradient descent while making much more informed and adaptive improvements to the overall predictor. In this context, the term “learning rate” refers to a shrinkage factor applied to each component in the ensemble. That is, “learning rate” defines the magnitude of the contribution from each regressor.
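The shrinkage role can be sketched with a deliberately small, pure-Python booster. It assumes squared-error loss, one-dimensional inputs with at least two distinct values, and decision stumps as the weak learners; `fit_stump`, `gradient_boost`, and `mse` are hypothetical helper names, not the API of any boosting library.

```python
def fit_stump(xs, residuals):
    """Fit a one-split regression stump to 1-D data by least squares."""
    best = None
    # Dropping the largest value keeps both sides of every split non-empty.
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = sum((r - left_mean) ** 2 for r in left)
        sse += sum((r - right_mean) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def gradient_boost(xs, ys, learning_rate, n_stages):
    """Boost stumps on residuals; each stump is scaled by learning_rate."""
    base = sum(ys) / len(ys)  # initial constant prediction
    stumps = []
    preds = [base] * len(xs)
    for _ in range(n_stages):
        # For squared error, the residuals are the negative gradient.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        # Shrinkage: only a fraction of each stump's output is added.
        preds = [p + learning_rate * stump(x) for p, x in zip(preds, xs)]

    def predict(x):
        return base + learning_rate * sum(s(x) for s in stumps)

    return predict


def mse(predict, xs, ys):
    """Mean squared error of a predictor on the given data."""
    return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


if __name__ == "__main__":
    xs = list(range(10))
    ys = [float(x * x) for x in xs]  # toy target, an assumption of this sketch
    for lr in (1.0, 0.1):
        model = gradient_boost(xs, ys, learning_rate=lr, n_stages=10)
        print(f"learning_rate={lr}: training MSE = {mse(model, xs, ys):.2f}")
```

For the same number of stages, the smaller shrinkage factor leaves a larger training error because each stump contributes less; in practice a low learning rate is usually paired with more stages.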

David's raw ML reference notes
