Gradient Descent
[Live readouts shown during the run: loss · |grad| · conf · step]
1
Initialize
Random weights are assigned. The model knows nothing — it starts at a random point on the loss surface.
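A minimal Python sketch of this step; the weight count, scale, and seed are illustrative assumptions, not values from the visualization:

```python
import random

def init_weights(n, scale=0.1, seed=0):
    """Start at a random point on the loss surface: small random
    values, with no knowledge of the data yet."""
    rng = random.Random(seed)
    return [rng.uniform(-scale, scale) for _ in range(n)]

w = init_weights(4)
```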
2
Compute gradient
At each position, the model measures the slope — which direction makes the error decrease fastest?
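The slope measurement can be sketched with central finite differences; the bowl-shaped example loss below is an assumption for illustration:

```python
def grad(loss, w, eps=1e-6):
    """Numerically estimate the slope of the loss at w
    (central finite differences, one coordinate at a time)."""
    g = []
    for i in range(len(w)):
        up = list(w); up[i] += eps
        dn = list(w); dn[i] -= eps
        g.append((loss(up) - loss(dn)) / (2 * eps))
    return g

# Example: a bowl-shaped loss with its minimum at (1, -2).
loss = lambda w: (w[0] - 1) ** 2 + (w[1] + 2) ** 2
g = grad(loss, [0.0, 0.0])  # the slope points away from the minimum
```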
3
Update weights
Weights shift in the direction of steepest descent. The learning rate controls how big each step is.
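The update rule described above, as a short sketch (the learning rate value is illustrative):

```python
def sgd_step(w, g, lr=0.1):
    """Move each weight against its gradient; lr sets the step size."""
    return [wi - lr * gi for wi, gi in zip(w, g)]

# One step from w=1.0 with gradient 2.0 and lr=0.5 lands at 0.0.
w_next = sgd_step([1.0], [2.0], lr=0.5)
```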
4
Momentum carries through
Like a ball rolling downhill, momentum helps the model push through small bumps and flat regions.
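A sketch of the classic momentum update; `beta` (how much past velocity is kept) and `lr` are assumed values:

```python
def momentum_step(w, g, v, lr=0.1, beta=0.9):
    """Accumulate a velocity from past gradients; like a rolling ball,
    it keeps the weights moving through flat regions and small bumps."""
    v = [beta * vi - lr * gi for vi, gi in zip(v, g)]
    w = [wi + vi for wi, vi in zip(w, v)]
    return w, v

# Starting from rest, the first step is just -lr * gradient.
w1, v1 = momentum_step([0.0], [1.0], [0.0])
```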
5
Approaching a minimum
The gradient shrinks. Steps get smaller. The model is fine-tuning — making precise adjustments to its parameters.
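The shrinking steps can be seen on a one-dimensional quadratic; the starting point and learning rate are illustrative:

```python
# On loss(w) = w**2 the gradient is 2*w, so with lr = 0.25 each
# update halves w: the gradient shrinks and the steps shrink with it.
w, lr = 8.0, 0.25
steps = []
for _ in range(4):
    g = 2 * w
    delta = lr * g
    steps.append(delta)
    w -= delta
# steps is [4.0, 2.0, 1.0, 0.5] — each step half the size of the last
```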
6
Converged
The model has settled into a valley. Weights are stable. This configuration minimizes error — the model has learned.
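Putting the steps together, a minimal convergence loop might look like this; the tolerance, learning rate, and example loss are assumptions:

```python
import math

def descend(grad, w, lr=0.1, tol=1e-6, max_iter=10_000):
    """Run gradient descent until the gradient norm falls below tol,
    i.e. the weights have settled into a valley and stopped moving."""
    for _ in range(max_iter):
        g = grad(w)
        if math.sqrt(sum(gi * gi for gi in g)) < tol:
            break
        w = [wi - lr * gi for wi, gi in zip(w, g)]
    return w

# The minimum of (w0 - 1)^2 + (w1 + 2)^2 is at (1, -2).
w_star = descend(lambda w: [2 * (w[0] - 1), 2 * (w[1] + 2)], [0.0, 0.0])
```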
Made by Zero State Reflex