The most comprehensive way to estimate the gradient of the loss function is to accumulate the loss from all of the examples. You are incorporating all available information, so this approach is in some sense ideal.

However, this approach can make gradient descent infeasibly slow. Real-world problems involve high-dimensional gradients and many observations. Each observation adds to the work you must do before you can make a single parameter update, and each parameter adds another dimension to the gradient you must compute, so the cost of one update grows with both the size of the dataset and the size of the model.
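As a rough sketch of the cost involved, the snippet below runs full-batch gradient descent on a small least-squares problem. The specific model (linear regression), learning rate, and data are illustrative choices, not anything prescribed above; the point is that every single parameter update requires a pass over all `n` observations.

```python
import numpy as np

# Full-batch gradient descent on least-squares linear regression.
# Every update accumulates the gradient over ALL n observations,
# so one step costs O(n * d) work before the parameters move at all.

rng = np.random.default_rng(0)
n, d = 1000, 3                       # n observations, d parameters
X = rng.normal(size=(n, d))
true_w = np.array([2.0, -1.0, 0.5])  # ground-truth weights for this toy example
y = X @ true_w + 0.01 * rng.normal(size=n)

w = np.zeros(d)
learning_rate = 0.1
for _ in range(200):
    # Gradient of the mean squared error, summed over the full dataset:
    # this X.T @ (...) product touches every one of the n rows each step.
    grad = (2.0 / n) * X.T @ (X @ w - y)
    w -= learning_rate * grad
```

Doubling the number of observations doubles the cost of the matrix products in each iteration, which is exactly why large datasets push practitioners toward cheaper per-step gradient estimates.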