diff --git a/README.md b/README.md
index 830fe735..b7354ed6 100644
--- a/README.md
+++ b/README.md
@@ -33,6 +33,9 @@ We currently support a few LLM models targeting text generation scenarios:
 `pip install optimum-tpu -f https://storage.googleapis.com/libtpu-releases/index.html`

+`export PJRT_DEVICE=TPU`
+
+
 ## Inference

 `optimum-tpu` provides a set of dedicated tools and integrations in order to leverage Cloud TPUs for inference, especially
@@ -68,4 +71,3 @@ You can check the examples:
 - [Fine-Tune Gemma on Google TPU](https://github.com/huggingface/optimum-tpu/blob/main/examples/language-modeling/gemma_tuning.ipynb)
 - The [Llama fine-tuning script](https://github.com/huggingface/optimum-tpu/blob/main/examples/language-modeling/llama_tuning.md)
-
diff --git a/docs/source/_toctree.yml b/docs/source/_toctree.yml
index 898ff81e..9ea0d58e 100644
--- a/docs/source/_toctree.yml
+++ b/docs/source/_toctree.yml
@@ -12,6 +12,8 @@
       title: Deploying a Google Cloud TPU instance
     - local: howto/serving
       title: Deploying a TGI server on a Google Cloud TPU instance
+    - local: howto/training
+      title: Training on a Google Cloud TPU instance
     title: How-To Guides
   title: Optimum-TPU
   isExpanded: true
diff --git a/docs/source/howto/deploy.mdx b/docs/source/howto/deploy.mdx
index 6d5bce75..88ae211c 100644
--- a/docs/source/howto/deploy.mdx
+++ b/docs/source/howto/deploy.mdx
@@ -41,7 +41,7 @@ target TPUv5 VMs: `gcloud components install alpha`
 gcloud alpha compute tpus tpu-vm create optimum-tpu-get-started \
 --zone=us-west4-a \
 --accelerator-type=v5litepod-8 \
---version=v2-alpha-tpuv5
+--version=v2-alpha-tpuv5-lite
 ```

 ## Connecting to the instance
diff --git a/docs/source/howto/training.mdx b/docs/source/howto/training.mdx
new file mode 100644
index 00000000..a7e40042
--- /dev/null
+++ b/docs/source/howto/training.mdx
@@ -0,0 +1,37 @@
+# Training on TPU
+
+Welcome to the 🤗 Optimum-TPU training guide! This section covers how to fine-tune models using Google Cloud TPUs.
+
+## Currently Supported Models
+
+The following models have been tested and validated for fine-tuning on TPU v5e:
+
+- 🦙 LLaMA Family
+  - LLaMA-2 7B
+  - LLaMA-3 8B
+- 💎 Gemma Family
+  - Gemma 2B
+  - Gemma 7B
+
+## Getting Started
+
+### Prerequisites
+
+Before starting the training process, ensure you have:
+
+1. A configured Google Cloud TPU instance (see [Deployment Guide](./deploy))
+2. Optimum-TPU installed with PyTorch/XLA support:
+```bash
+pip install optimum-tpu -f https://storage.googleapis.com/libtpu-releases/index.html
+export PJRT_DEVICE=TPU
+```
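+
+With the prerequisites in place, a minimal fine-tuning run can be sketched as follows. This is an illustrative sketch only: the model, dataset, and hyperparameters are placeholders (gated models such as Gemma also require accepting their license on the Hub), and the example scripts below show the recommended setup, including FSDPv2 sharding and PEFT/LoRA.
+
+```python
+import torch
+from datasets import load_dataset
+from transformers import (
+    AutoModelForCausalLM,
+    AutoTokenizer,
+    DataCollatorForLanguageModeling,
+    Trainer,
+    TrainingArguments,
+)
+
+model_id = "google/gemma-2b"  # any of the supported models listed above
+tokenizer = AutoTokenizer.from_pretrained(model_id)
+model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
+
+# A small instruction dataset, used purely for illustration.
+dataset = load_dataset("databricks/databricks-dolly-15k", split="train[:1%]")
+
+def tokenize(batch):
+    # Concatenate instruction and response into a single training text.
+    texts = [f"{i}\n{r}" for i, r in zip(batch["instruction"], batch["response"])]
+    return tokenizer(texts, truncation=True, max_length=512)
+
+dataset = dataset.map(tokenize, batched=True, remove_columns=dataset.column_names)
+
+trainer = Trainer(
+    model=model,
+    args=TrainingArguments(
+        output_dir="./gemma-ft",          # checkpoints land here
+        per_device_train_batch_size=8,
+        num_train_epochs=1,
+        optim="adafactor",
+        logging_steps=10,
+    ),
+    train_dataset=dataset,
+    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
+)
+trainer.train()
+```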
+
+### Example Training Scripts
+
+We provide several example scripts to help you get started:
+
+1. Gemma Fine-tuning:
+   - See our [Gemma fine-tuning notebook](https://github.com/huggingface/optimum-tpu/blob/main/examples/language-modeling/gemma_tuning.ipynb) for a step-by-step guide
+
+2. LLaMA Fine-tuning:
+   - Check our [LLaMA fine-tuning script](https://github.com/huggingface/optimum-tpu/blob/main/examples/language-modeling/llama_tuning.md) for detailed instructions
\ No newline at end of file
diff --git a/docs/source/index.mdx b/docs/source/index.mdx
index 6bd85727..d1ea0f49 100644
--- a/docs/source/index.mdx
+++ b/docs/source/index.mdx
@@ -18,10 +18,7 @@ limitations under the License.

 Optimum TPU provides all the necessary machinery to leverage and optimize AI workloads running on [Google Cloud TPU devices](https://cloud.google.com/tpu/docs).

-The API provides the overall same user-experience as Hugging Face transformers with the minimum amount of changes required to target performance for inference.
-
-Training support is underway, stay tuned! 🚀
-
+The API provides the same overall user experience as Hugging Face Transformers, with minimal changes required to target performance for inference and training.

 ## Installation

@@ -30,7 +27,7 @@ As such, we provide a pip installable package to make sure everyone can get easi
 ### Run Cloud TPU with pip

 ```bash
-pip install optimum-tpu
+pip install optimum-tpu -f https://storage.googleapis.com/libtpu-releases/index.html
 ```

 ### Run Cloud TPU within Docker container
@@ -43,7 +40,7 @@ docker pull
 docker run -ti --rm --privileged --network=host ${TPUVM_IMAGE_URL}@sha256:${TPUVM_IMAGE_VERSION} bash
 ```

-From there you can install optimum-tpu through the pip instructions above.
+From there you can install optimum-tpu through the pip instructions mentioned above.
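+
+Once installed, you can run a quick, optional sanity check to confirm that PyTorch/XLA sees the TPU (a minimal illustration, not required for normal use):
+
+```python
+import torch
+import torch_xla.core.xla_model as xm
+
+device = xm.xla_device()              # resolves to the TPU when PJRT_DEVICE=TPU is set
+x = torch.randn(2, 2, device=device)  # allocate a tensor on the TPU
+print(device, torch.matmul(x, x).cpu())
+```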
diff --git a/docs/source/tutorials/overview.mdx b/docs/source/tutorials/overview.mdx
index fa07e5f3..98573bee 100644
--- a/docs/source/tutorials/overview.mdx
+++ b/docs/source/tutorials/overview.mdx
@@ -1,19 +1,38 @@
-
+### Text Generation
+Learn how to perform efficient inference for text generation tasks:
-
-# Overview
+- **Basic Generation Script** ([examples/text-generation/generation.py](https://github.com/huggingface/optimum-tpu/blob/main/examples/text-generation/generation.py))
+  - Demonstrates text generation using models like Gemma and Mistral
+  - Features greedy sampling implementation
+  - Shows how to use static caching for improved performance
+  - Includes performance measurement and timing analysis
+  - Supports custom model loading and configuration
+
+### Language Model Fine-tuning
+Explore how to fine-tune language models on TPU infrastructure:
+
+1. **Interactive Gemma Tutorial** ([examples/language-modeling/gemma_tuning.ipynb](https://github.com/huggingface/optimum-tpu/blob/main/examples/language-modeling/gemma_tuning.ipynb))
+   - Complete notebook showing Gemma fine-tuning process
+   - Covers environment setup and TPU configuration
+   - Demonstrates FSDPv2 integration for efficient model sharding
+   - Includes dataset preparation and PEFT/LoRA implementation
+   - Provides step-by-step training workflow
+
+2. **LLaMA Fine-tuning Guide** ([examples/language-modeling/llama_tuning.md](https://github.com/huggingface/optimum-tpu/blob/main/examples/language-modeling/llama_tuning.md))
+   - Detailed guide for fine-tuning LLaMA-2 and LLaMA-3 models
+   - Explains SPMD and FSDP concepts
+   - Shows how to implement efficient data parallel training
+   - Includes practical code examples and prerequisites
+
+## Additional Resources
+
+- Visit the [Optimum-TPU GitHub repository](https://github.com/huggingface/optimum-tpu) for more details
+- Explore the [Google Cloud TPU documentation](https://cloud.google.com/tpu/docs) for deeper understanding of TPU architecture
-
-Welcome to the 🤗 Optimum-TPU tutorials!
+
+For the latest updates and to contribute to these examples, visit our [GitHub repository](https://github.com/huggingface/optimum-tpu).
\ No newline at end of file
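+
+To give a concrete flavor of the basic generation flow described in the Text Generation section above, here is a minimal, illustrative sketch of greedy decoding with a static KV cache. The model name is a placeholder, and the full `examples/text-generation/generation.py` script additionally covers timing, batching, and model configuration:
+
+```python
+import torch
+import torch_xla.core.xla_model as xm
+from transformers import AutoModelForCausalLM, AutoTokenizer
+
+model_id = "google/gemma-2b"  # Mistral models can be used the same way
+device = xm.xla_device()
+
+tokenizer = AutoTokenizer.from_pretrained(model_id)
+model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16).to(device)
+# A static KV cache keeps tensor shapes fixed, which helps XLA avoid recompilation.
+model.generation_config.cache_implementation = "static"
+
+inputs = tokenizer("What are the benefits of running inference on TPUs?", return_tensors="pt").to(device)
+outputs = model.generate(**inputs, max_new_tokens=32, do_sample=False)  # greedy decoding
+print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+```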