There is a connection between gradient-descent-based optimizers and the dynamics of damped harmonic oscillators. What does that mean? It means we now have a better theory of how optimization algorithms behave.
In this episode I explain how all this works.
All the formulas I mention in the episode can be found in the post "The physics of optimization algorithms".
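To give a taste of the kind of formula involved, here is a minimal sketch of one standard way this connection is usually written (my own notation, not necessarily the one used in the post: m is a mass, gamma a friction coefficient, h a time step, f the loss, and alpha, beta the resulting learning rate and momentum):

```latex
% Damped dynamics of a particle of mass m in the loss landscape f,
% with friction (damping) coefficient gamma:
m\,\ddot{x}(t) + \gamma\,\dot{x}(t) + \nabla f\big(x(t)\big) = 0

% Finite-difference discretization with time step h
% (backward difference for the velocity term):
m\,\frac{x_{k+1} - 2x_k + x_{k-1}}{h^2}
  + \gamma\,\frac{x_k - x_{k-1}}{h}
  + \nabla f(x_k) = 0

% Solving for x_{k+1} gives the heavy-ball (momentum) update,
% with learning rate alpha = h^2/m and momentum beta = 1 - gamma*h/m:
x_{k+1} = x_k + \beta\,(x_k - x_{k-1}) - \alpha\,\nabla f(x_k)
```

In this picture, setting gamma to zero lets the iterate oscillate around a minimum indefinitely, while very heavy damping suppresses the momentum term and leaves plain gradient descent.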
Enjoy the show.
Top-3 ways to put machine learning models into production (Ep. 126)
Remove noise from data with deep learning (Ep. 125)
What is contrastive learning and why it is so powerful? (Ep. 124)
Neural search (Ep. 123)
Let's talk about federated learning (Ep. 122)
How to test machine learning in production (Ep. 121)
Why synthetic data cannot boost machine learning (Ep. 120)
Machine learning in production: best practices [LIVE from twitch.tv] (Ep. 119)
Testing in machine learning: checking deeplearning models (Ep. 118)
Testing in machine learning: generating tests and data (Ep. 117)
Why you care about homomorphic encryption (Ep. 116)
Test-First machine learning (Ep. 115)
GPT-3 cannot code (and never will) (Ep. 114)
Make Stochastic Gradient Descent Fast Again (Ep. 113)
What data transformation library should I use? Pandas vs Dask vs Ray vs Modin vs Rapids (Ep. 112)
[RB] It’s cold outside. Let’s speak about AI winter (Ep. 111)
Rust and machine learning #4: practical tools (Ep. 110)
Rust and machine learning #3 with Alec Mocatta (Ep. 109)
Rust and machine learning #2 with Luca Palmieri (Ep. 108)
Rust and machine learning #1 (Ep. 107)