Releases: huggingface/optimum-tpu
v0.1.5
v0.1.4
These changes focus on improving support for instruct models and resolve an issue that appeared when using those models through the web UI with invalid settings.
What's Changed
- Fix secret leak workflow by @tengomucho in #72
- Handle selector exception by @tengomucho in #73
- chore(tgi): update TGI base image by @tengomucho in #75
- Fix instruct models UI issue by @tengomucho in #78
Full Changelog: v0.1.3...v0.1.4
v0.1.3
Cleanup of previous fixes, plus a lower batch size to prevent memory issues on Inference Endpoints with some models.
What's Changed
- Few more Inference Endpoints fixes by @tengomucho in #69
- feat(cache): use optimized StaticCache class for XLA by @tengomucho in #70
- Lower TGI IE batch size by @tengomucho in #71
Full Changelog: v0.1.2...v0.1.3
v0.1.2
This release contains only a few small fixes, mainly for Inference Endpoints.
What's Changed
- Several Inference Endpoint fixes by @tengomucho in #66
- More Inference Endpoints features and fixes by @tengomucho in #68
Full Changelog: v0.1.1...v0.1.2
v0.1.1
First TPU release, making TPU Text Generation Inference and Inference Endpoints container images available.
What's Changed
- Basic TGI server on XLA by @tengomucho in #1
- Enable CI/CD by @tengomucho in #2
- Fix TGI Dockerfile by @shub-kris in #3
- Add static KV cache and test on Gemma-2B by @tengomucho in #4
- Small optimizations by @tengomucho in #5
- Enable compilation by @tengomucho in #6
- Revert "fix: attention mask should be 1 or 0" by @tengomucho in #8
- feat: use dynamic batching when generating by @tengomucho in #9
- Repo layout by @tengomucho in #10
- Add PyPI release workflow by @regisss in #11
- Xla parallel proxy by @tengomucho in #12
- Add documentation to the repository by @mfuntowicz in #13
- Adopt naming convention of transformers API by @mfuntowicz in #14
- Fix main doc build workflow by @regisss in #15
- Improve readme by @mfuntowicz in #16
- Fix layout in README by @mfuntowicz in #17
- Fix rule and instructions for TGI by @mfuntowicz in #18
- Fix typo in index.mdx by @mfuntowicz in #19
- Added some links to Cloud TPU documentation by @mikegre-google in #20
- Parallel sharding by @tengomucho in #21
- Bump version to 0.1.0.dev1 by @mfuntowicz in #24
- Bump version to 0.1.0.dev2 by @mfuntowicz in #25
- Fix TGI missing import by @mfuntowicz in #27
- Forward arguments from TGI launcher to the model by @mfuntowicz in #28
- Fix optimum-tpu pip install instructions by @mfuntowicz in #29
- Fix tests with do_sample=True by @tengomucho in #30
- Sharding in tgi by @tengomucho in #31
- Fix missing '=' to assign environment variables in the default case w… by @mfuntowicz in #33
- Include two different stages for building TGI image: by @mfuntowicz in #34
- Llama support by @tengomucho in #32
- chore(ci): added workflow for nightly tests by @tengomucho in #35
- fix(build): setup.py removed from build_dist dependencies by @tengomucho in #36
- Try again to fix nightly builds by @tengomucho in #37
- Basic Llama2 Tuning by @tengomucho in #39
- Bug doc builder by @pagezyhf in #40
- Fix typo ; Update llama_tuning.md by @furkanakkurt1335 in #42
- Update to Pytorch 2.3.0 and transformers v4.40.2 by @tengomucho in #41
- Fine tuning with FSDP v2 by @tengomucho in #44
- Minor fix for mispelled stage in TGI dockerfile. by @thealmightygrant in #46
- Align to Transformers 4.41.1 by @tengomucho in #45
- chore(training): Allow training on torch xla > 2.3.0, add warning by @tengomucho in #48
- fix(build): add missing setuptools_scm section by @tengomucho in #49
- fix(logging): correct logging usage by @tengomucho in #50
- fix(tests): fix decode sample expected outputs again by @tengomucho in #52
- fix(doc): update server and port when serving TGI by @tengomucho in #53
- fix(ci): correct secrets leak workflow check by @tengomucho in #55
- Add Mistral support 💨 by @tengomucho in #54
- Mistral nits by @tengomucho in #57
- chore: bump to version v0.1.0a1 by @tengomucho in #60
- feat(TGI): add release docker image build and push to registry workflow by @tengomucho in #62
- chore: bump to version v0.1.1 by @tengomucho in #63
New Contributors
- @tengomucho made their first contribution in #1
- @shub-kris made their first contribution in #3
- @regisss made their first contribution in #11
- @mfuntowicz made their first contribution in #13
- @mikegre-google made their first contribution in #20
- @pagezyhf made their first contribution in #40
- @furkanakkurt1335 made their first contribution in #42
- @thealmightygrant made their first contribution in #46
Full Changelog: https://github.com/huggingface/optimum-tpu/commits/v0.1.1