This notebook is a simple example of fine-tuning GPT-J-6B with limited memory. A detailed explanation of how it works can be found in this model card. It is heavily based on this Colab. Huge thanks to Hivemind!
You can also fine-tune GPT-Neo-2.7B, French GPT-J (Cedille's Boris), and T0-3B with limited memory.
Models trained with this method:
Sauge Divine: @saugedivine. Trained on philosophical, trippy, and mystical content.
La Voix du Bot: @lavoixdubot. Trained on French news.
LoRA: https://arxiv.org/abs/2106.09685
8-bit Optimizers: https://arxiv.org/abs/2110.02861
Twitter: @gustavecortal
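To make the memory savings concrete, here is a minimal sketch of the general recipe the two references above describe: keep the frozen base model in 8-bit, train only small LoRA adapters, and store optimizer state in 8-bit as well. This is not the notebook's exact code; it assumes the Hugging Face `transformers`, `peft`, and `bitsandbytes` packages, and the hyperparameters (rank, alpha, learning rate) are illustrative.

```python
import bitsandbytes as bnb
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-j-6B")
model = AutoModelForCausalLM.from_pretrained(
    "EleutherAI/gpt-j-6B",
    load_in_8bit=True,   # frozen base weights stored in 8-bit (bitsandbytes)
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# LoRA: add small trainable rank-decomposition matrices to the attention
# projections; only these adapters receive gradients.
lora_config = LoraConfig(
    r=8,                                  # adapter rank (illustrative)
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],  # GPT-J attention projections
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()        # well under 1% of the 6B weights

# 8-bit Adam keeps optimizer state in 8-bit, cutting its memory roughly 4x.
optimizer = bnb.optim.Adam8bit(model.parameters(), lr=1e-4)

# One hand-rolled training step on a toy batch.
batch = tokenizer("Bonjour tout le monde", return_tensors="pt").to(model.device)
loss = model(**batch, labels=batch["input_ids"]).loss
loss.backward()
optimizer.step()
optimizer.zero_grad()
```

Only the LoRA adapter weights carry gradients and optimizer state, while the 6B base parameters stay frozen and quantized, which is what lets the fine-tuning fit on a single GPU with limited memory.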