cudnn FE 1.0 pre-release 2
Pre-release
Release Notes:
Improvements over pre-release 1:
[Feature] Added missing python bindings for several pointwise ops.
[Feature] SDPA flash attention feature parity with the backend API.
[Bug fixes] Shape inference fixes for dgrad and wgrad, where the output dimension cannot be computed deterministically.
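The ambiguity behind the dgrad and wgrad shape inference fix can be illustrated with the standard forward-convolution size formula. The sketch below is plain Python and does not use the cudnn frontend API; the helper names `conv_output_size`, `candidate_input_sizes`, and `candidate_kernel_sizes` are hypothetical and exist only for this illustration. Because the forward formula uses floor division, several input sizes (the dgrad output) or filter sizes (the wgrad output) can map to the same forward output size, so the missing dimension cannot be recovered uniquely.

```python
def conv_output_size(in_size, kernel, stride=1, pad=0, dilation=1):
    """Forward-convolution output extent along one spatial dimension."""
    effective_kernel = dilation * (kernel - 1) + 1
    return (in_size + 2 * pad - effective_kernel) // stride + 1


def candidate_input_sizes(out_size, kernel, stride=1, pad=0, dilation=1, limit=64):
    """Input extents (the dgrad result) consistent with a given forward output extent."""
    effective_kernel = dilation * (kernel - 1) + 1
    return [
        n
        for n in range(1, limit + 1)
        if n + 2 * pad >= effective_kernel
        and conv_output_size(n, kernel, stride, pad, dilation) == out_size
    ]


def candidate_kernel_sizes(in_size, out_size, stride=1, pad=0, dilation=1):
    """Filter extents (the wgrad result) consistent with given input and output extents."""
    padded = in_size + 2 * pad
    return [
        k
        for k in range(1, padded + 1)
        if dilation * (k - 1) + 1 <= padded
        and conv_output_size(in_size, k, stride, pad, dilation) == out_size
    ]


# With stride 2, two input extents yield the same forward output extent.
dgrad_candidates = candidate_input_sizes(out_size=4, kernel=3, stride=2, pad=1)

# Likewise, two filter extents are consistent with the same input/output pair.
wgrad_candidates = candidate_kernel_sizes(in_size=8, out_size=4, stride=2, pad=1)
```

Whenever more than one candidate exists, the caller has to supply the intended dimension explicitly instead of relying on inference.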
Under investigation and development:
- We are still working on additional features for SDPA back prop.
- CPU overhead when using the python bindings is under investigation.
- Better error messages and logging
Miscellaneous updates to the v0.x API:
[Bug fix] Some tests were failing on Ampere GPUs because no plans with 0 size were available. This has been fixed.
[Bug fix] Median-of-three sampling sorted the results incorrectly when cudnnFind was used. This has been fixed.
[Feature] A Layer Norm API has been added and can be used with the v0.x API.
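For readers unfamiliar with the median-of-three sampling mentioned in the bug fix above, here is a minimal sketch of the general idea in plain Python. It is not the frontend's implementation; `replay`, `median_of_three`, and `rank_plans` are hypothetical names, and the timings are made-up values used only to keep the example deterministic. Each candidate plan is measured three times, the median is taken as its score, and plans are then sorted by that score in ascending order.

```python
import statistics


def replay(samples):
    """Return a callable that yields pre-recorded timings, one per call."""
    iterator = iter(samples)
    return lambda: next(iterator)


def median_of_three(measure_once):
    """Take three measurements and return their median."""
    samples = [measure_once() for _ in range(3)]
    return statistics.median(samples)


def rank_plans(plans):
    """Sort plan names from fastest to slowest by median-of-three time."""
    scored = [(median_of_three(measure), name) for name, measure in plans.items()]
    scored.sort(key=lambda pair: pair[0])  # ascending: smallest median first
    return [name for _, name in scored]


# Illustrative timings in milliseconds (invented for this example).
plans = {
    "plan_a": replay([5.0, 1.0, 9.0]),   # median 5.0, despite one fast outlier
    "plan_b": replay([3.0, 3.5, 2.5]),   # median 3.0
    "plan_c": replay([4.0, 50.0, 4.2]),  # median 4.2, despite one slow outlier
}
ranking = rank_plans(plans)
```

The median makes the ranking robust to a single outlier in either direction, which is why the final sort order matters: a wrong sort would return a slower plan as the best one.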
This release is experimental.