diff --git a/_posts/2023-05-01-hitchhikers-momentum.md b/_posts/2023-05-01-hitchhikers-momentum.md
index 43c0c8c1..9de66480 100644
--- a/_posts/2023-05-01-hitchhikers-momentum.md
+++ b/_posts/2023-05-01-hitchhikers-momentum.md
@@ -265,7 +265,7 @@ difference between the current and the previous iterate $$(\xx_{t} - \xx_{t-1})$
 
 Despite its simplicity, gradient descent with momentum exhibits unexpectedly rich dynamics that we'll explore on this post.
 
-### History and relatex work
+### History and related work
 
 The origins of momentum can be traced back to Frankel's method in the 1950s for solving linear system of equations. It was later generalized by Boris Polyak to non-quadratic objectives. While the quadratic case is by now well understood, the general strongly convex case has instead had some fascinating developments in the last years.
 
@@ -564,4 +564,4 @@ Plotting the asymptotic rates for all regions we can see that Polyak momentum (t
 
 ## Reproducibility
 
-All plots in this post were generated using the following Jupyer notebook: [[HTML]]({{'assets/html/2023-05-01-hitchhikers-momentum/hitchhikers-momentum.html' | relative_url}}) [[IPYNB]]({{'assets/html/2023-05-01-hitchhikers-momentum/hitchhikers-momentum.ipynb' | relative_url}})
\ No newline at end of file
+All plots in this post were generated using the following Jupyer notebook: [[HTML]]({{'assets/html/2023-05-01-hitchhikers-momentum/hitchhikers-momentum.html' | relative_url}}) [[IPYNB]]({{'assets/html/2023-05-01-hitchhikers-momentum/hitchhikers-momentum.ipynb' | relative_url}})