
How many degrees of freedom do we need to train deep networks?

This repository contains the source code for the ICLR 2022 paper How many degrees of freedom do we need to train deep networks: a loss landscape perspective by Brett W. Larsen, Stanislav Fort, Nic Becker, and Surya Ganguli (arXiv version).

This code was developed and tested with JAX v0.1.74, JAXlib v0.1.52, and Flax v0.2.0. The authors intend to update the repository with additional versions of the scripts that work with the flax.linen module.
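For reproducibility, the pinned versions above can be installed with pip (a sketch assuming a CPU-only jaxlib wheel is acceptable; GPU builds of these older releases required platform-specific wheels):

```
pip install jax==0.1.74 jaxlib==0.1.52 flax==0.2.0
```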

Top-Level Scripts

  • burn_in_subspace.py: Script for the random affine subspace and burn-in affine subspace experiments. To run random affine subspaces (i.e., no burn-in phase), set the parameter init_iters to 0; see the sketch after this list.
  • lottery_subspace.py: Script for the lottery subspace experiments.
  • lottery_ticket.py: Script for the lottery ticket experiments.
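As a rough illustration of what these experiments do (a minimal sketch, not the repository's implementation; the toy loss, the dimensions, and the column scaling of A below are all assumptions), random affine subspace training pins the full D-dimensional weights to theta(z) = theta_0 + A z for a fixed random matrix A and optimizes only the d-dimensional coordinates z. Burn-in first trains the full weights for init_iters steps before fixing theta_0; setting init_iters to 0 makes theta_0 simply the random initialization.

```python
import jax
import jax.numpy as jnp

D, d = 1000, 20                                  # full and subspace dimensions (toy sizes)
k1, k2, k3 = jax.random.split(jax.random.PRNGKey(0), 3)

theta_0 = jax.random.normal(k1, (D,))            # random init (burn-in would train this first)
A = jax.random.normal(k2, (D, d)) / jnp.sqrt(D)  # fixed random projection (scaling is an assumption)
target = jax.random.normal(k3, (D,))             # toy regression target, stands in for a real loss

def loss(z):
    theta = theta_0 + A @ z                      # affine map from subspace coords to full weights
    return jnp.mean((theta - target) ** 2)

grad_fn = jax.jit(jax.grad(loss))
z = jnp.zeros(d)                                 # z = 0 starts training exactly at theta_0
for _ in range(500):
    z = z - 0.1 * grad_fn(z)                     # plain gradient descent on the subspace coords
print(loss(z))
```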

Sub-Functions

  • architectures.py: Model definitions.
  • data_utils.py: Functions for saving out data.
  • generate_data.py: Functions to set up datasets for training.
  • logging_tools.py: Logger setup; generates an automatic experiment name with a timestamp.
  • training_utils.py: Functions for projecting into and training within a subspace; a sketch of the flatten/project step follows this list.
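For intuition about the projection step, here is a hedged sketch (not necessarily how training_utils.py is written) of the flatten/unflatten plumbing that subspace training needs: the model's parameter pytree is raveled into a single vector so it can be offset by A @ z, then unraveled back into a pytree for the forward pass. jax.flatten_util.ravel_pytree is a standard JAX utility; the toy parameter shapes are assumptions.

```python
import jax
import jax.numpy as jnp
from jax.flatten_util import ravel_pytree

# Stand-in for a Flax parameter pytree; any nested dict of arrays works.
params = {"dense": {"kernel": jnp.ones((4, 3)), "bias": jnp.zeros(3)}}

theta_0, unravel = ravel_pytree(params)          # flat vector plus its inverse map
d = 5
A = jax.random.normal(jax.random.PRNGKey(0), (theta_0.size, d)) / jnp.sqrt(theta_0.size)

def params_at(z):
    return unravel(theta_0 + A @ z)              # subspace coords -> full parameter pytree

# At z = 0 this reproduces the original parameter shapes exactly.
print(jax.tree_util.tree_map(jnp.shape, params_at(jnp.zeros(d))))
```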

Citation

@inproceedings{LaFoBeGa22,
	title={How many degrees of freedom do we need to train deep networks: a loss landscape perspective},
	author={Brett W. Larsen and Stanislav Fort and Nic Becker and Surya Ganguli},
	booktitle={International Conference on Learning Representations},
	year={2022},
	url={https://openreview.net/forum?id=ChMLTGRjFcU}
}
