This is a simple example comparing popular optimizers used in deep learning (Adam etc.) with stochastic LBFGS.
The stochastic LBFGS optimizer is provided with the code. Further details are given in this paper and also this one. Also see this introduction.
Files included are:

- `lbfgsnew.py`: New LBFGS optimizer (usage sketch below)
- `lbfgsb.py`: Bound constrained LBFGS optimizer
- `run_calibration.py`: Run a simple calibration, e.g. `run_calibration.py --solver_name=LBFGSB`. Available options for `--solver_name` are `LBFGS`, `LBFGSB`, `SGD`, `ADAM`.
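For orientation, here is a minimal sketch of how an LBFGS-type optimizer such as the one in `lbfgsnew.py` is typically plugged into a PyTorch training loop. The class name `LBFGSNew`, its constructor arguments, and the toy model are assumptions and may differ from the actual code in this repository; the closure-based `step()` call is the standard pattern for LBFGS optimizers in PyTorch.

```python
# Minimal sketch: the class name LBFGSNew and its keyword arguments are
# assumptions and may not match the actual lbfgsnew.py implementation.
import torch
from lbfgsnew import LBFGSNew  # assumed import from lbfgsnew.py

# Toy model and data, only for illustration.
model = torch.nn.Linear(8, 1)
x = torch.randn(100, 8)
y = torch.randn(100, 1)

optimizer = LBFGSNew(model.parameters(), history_size=7, max_iter=4)

for epoch in range(3):
    def closure():
        # LBFGS-type optimizers may re-evaluate the loss several times
        # per step, so the forward/backward pass lives in a closure.
        optimizer.zero_grad()
        loss = torch.nn.functional.mse_loss(model(x), y)
        loss.backward()
        return loss

    optimizer.step(closure)
```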
Here is an image showing the reduction of calibration error (Student's T loss) with minibatch (CPU time) for LBFGS and Adam. Adam runs faster per minibatch but is slower to converge. In the image, LBFGS uses 1 epoch while Adam uses 4 epochs. The minibatch size is 1/10-th of the full dataset.
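For reference, a Student's T style robust loss is the negative log-likelihood of a Student's T distribution of the residuals. The sketch below is a generic formulation with an assumed degrees-of-freedom parameter `nu`; it is not necessarily the exact loss used in `run_calibration.py`.

```python
import torch

def students_t_loss(prediction, target, nu=2.0):
    """Generic Student's T style robust loss (a sketch; the degrees of
    freedom `nu` and any normalization in run_calibration.py may differ)."""
    residual = prediction - target
    # Negative log-likelihood of a Student's T distribution, up to constants.
    # The heavy tails make this less sensitive to outliers than squared error.
    return torch.sum(0.5 * (nu + 1.0) * torch.log1p(residual**2 / nu))
```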
For a much faster, C/CUDA version of the LBFGS optimizer, follow this link.
For a completely different calibration method, see also ManOpt.