Akshay Kiran Jose edited this page Oct 7, 2022 · 4 revisions

Learning to learn by gradient descent by gradient descent

For motivation, take a look at the plot below, or find the latest code here:

An LSTM-based optimizer learns to minimize functions, and does so better on average than tried-and-tested optimizers such as Adam, RMSProp, and SGD.
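The core idea can be sketched as follows. This is a minimal, untrained illustration (random LSTM weights, a toy quadratic objective, and hypothetical names like `TinyLSTMOptimizer`), not the repository's actual implementation: the optimizer is a small per-coordinate LSTM that reads a parameter's gradient and emits an update, carrying its own hidden state across optimization steps.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class TinyLSTMOptimizer:
    """Sketch of a coordinatewise LSTM optimizer: one scalar gradient in,
    one scalar parameter update out. Weights are random (untrained)."""

    def __init__(self, hidden=8):
        self.hidden = hidden
        # Gates for input [grad, h]: input, forget, output, candidate.
        self.W = rng.normal(scale=0.1, size=(4 * hidden, 1 + hidden))
        self.b = np.zeros(4 * hidden)
        # Linear readout mapping the hidden state to a scalar update.
        self.w_out = rng.normal(scale=0.1, size=hidden)

    def step(self, grad, state):
        h, c = state
        z = self.W @ np.concatenate(([grad], h)) + self.b
        i, f, o, g = np.split(z, 4)
        i, f, o, g = sigmoid(i), sigmoid(f), sigmoid(o), np.tanh(g)
        c = f * c + i * g          # new cell state
        h = o * np.tanh(c)         # new hidden state
        return self.w_out @ h, (h, c)

def optimize(theta, grad_fn, opt, steps=20):
    # One LSTM state per coordinate (the "coordinatewise" design).
    states = [(np.zeros(opt.hidden), np.zeros(opt.hidden)) for _ in theta]
    for _ in range(steps):
        g = grad_fn(theta)
        for j in range(len(theta)):
            update, states[j] = opt.step(g[j], states[j])
            theta[j] += update     # the LSTM's output *is* the update rule
    return theta

# Toy optimizee: f(theta) = ||theta||^2, so grad = 2 * theta.
theta = optimize(np.array([1.0, -2.0, 0.5]),
                 lambda t: 2.0 * t,
                 TinyLSTMOptimizer())
print(theta)
```

In the paper's setup, the LSTM's weights would themselves be trained by gradient descent on the optimizee's loss accumulated over an unrolled trajectory; the sketch above omits that outer training loop.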

*Figure: loss versus iterations*
