🔄 1. Why Normalize Data?


⚡ 2–4. The Problems with SGD

Stochastic Gradient Descent (SGD) is simple but has some practical issues:

✅ Condition Number:

✅ Saddle Points & Local Minima:

✅ Stochastic Noise:
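
As a rough illustration of the condition-number issue listed above, the sketch below runs plain (full-batch, not stochastic) gradient descent on two toy quadratics. The functions `gradient_descent` and `run`, the starting point, and the step-size rule are assumptions chosen for the example, not something these notes specify. The only point is that when curvature differs a lot between directions, a step size that is safe for the steep direction makes progress along the flat direction very slow.

```python
def gradient_descent(curvatures, start, lr, steps):
    """Minimize f(x) = 0.5 * sum(c_i * x_i**2) with plain gradient descent."""
    x = list(start)
    for _ in range(steps):
        # The gradient of f along coordinate i is c_i * x_i.
        x = [xi - lr * c * xi for xi, c in zip(x, curvatures)]
    return x


def distance_to_minimum(x):
    # The minimum of f is at the origin, so this is just the Euclidean norm.
    return sum(xi ** 2 for xi in x) ** 0.5


def run(curvatures, steps=100):
    # Assumed step-size rule: 1 / (largest curvature), which is stable
    # for this kind of quadratic.
    lr = 1.0 / max(curvatures)
    final = gradient_descent(curvatures, [1.0, 1.0], lr, steps)
    return distance_to_minimum(final)


well_conditioned = run([1.0, 1.0])    # condition number 1
ill_conditioned = run([1.0, 100.0])   # condition number 100

print("well-conditioned distance:", well_conditioned)
print("ill-conditioned distance: ", ill_conditioned)
```

With equal curvatures the iterate reaches the minimum almost immediately, while with a 100x curvature gap the flat direction is still far from the minimum after the same number of steps.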


🚀 5. Momentum