cudnn FE 1.0 pre-release 2

Pre-release
@Anerudhan Anerudhan released this 13 Sep 22:29

Release Notes:

Improvements over prerelease 1:
[Feature] Added missing Python bindings for several pointwise ops.

[Feature] SDPA flash attention feature parity with the backend API.

[Bug fixes] Shape inference fixes for dgrad and wgrad, where the output dimension cannot be computed deterministically.
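
To illustrate why a dgrad output dimension can be ambiguous, here is a minimal sketch based on the standard forward-convolution size formula. The function names are hypothetical and are not part of the cudnn FE API. With a stride greater than 1, several input sizes map to the same forward output size, so the input-gradient shape cannot be recovered from the output-gradient shape alone.

```python
def conv_output_size(in_size, kernel, stride=1, padding=0, dilation=1):
    """Standard forward-convolution output size along one spatial dimension."""
    effective_kernel = dilation * (kernel - 1) + 1
    return (in_size + 2 * padding - effective_kernel) // stride + 1


def candidate_input_sizes(out_size, kernel, stride=1, padding=0, dilation=1):
    """All positive input sizes that the forward formula maps to out_size.

    Hypothetical helper for illustration only; assumes out_size >= 1.
    """
    effective_kernel = dilation * (kernel - 1) + 1
    # Smallest input size that still produces out_size outputs.
    smallest = (out_size - 1) * stride + effective_kernel - 2 * padding
    # Floor division hides up to (stride - 1) extra input elements.
    return [s for s in range(smallest, smallest + stride) if s >= 1]


if __name__ == "__main__":
    # Kernel 3, stride 2, padding 1: two input sizes give the same output size.
    for size in candidate_input_sizes(4, kernel=3, stride=2, padding=1):
        print(size, "->", conv_output_size(size, kernel=3, stride=2, padding=1))
```

With stride 1 the list has a single entry, which is why the ambiguity only shows up for strided convolutions.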

Under investigation and development:

  • We are still working on additional features for SDPA backpropagation.
  • CPU overhead when using the Python bindings is under investigation.
  • Better error messages and logging.

Miscellaneous updates to the v0.x API:

[Bug fix] Some tests were failing on Ampere GPUs because no plans with 0 size were available. This has been fixed.

[Bug fix] Median-of-three sampling sorted the results incorrectly when cudnnFind was used. This has been fixed.
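
As a rough illustration of the median-of-three idea (not the cudnn FE implementation), the sketch below ranks candidate plans by the median of three timing samples. The plan names and timings are made up, and `rank_by_median_of_three` is a hypothetical helper. Using the median keeps a single outlier run from changing the order.

```python
from statistics import median


def rank_by_median_of_three(samples):
    """Return plan names sorted fastest first by the median of three timings.

    `samples` maps a plan name to exactly three timings in milliseconds.
    Hypothetical helper for illustration only.
    """
    for name, times in samples.items():
        if len(times) != 3:
            raise ValueError(f"{name}: expected 3 samples, got {len(times)}")
    medians = {name: median(times) for name, times in samples.items()}
    # Sort on the median value, not on any individual sample.
    return sorted(medians, key=medians.get)


if __name__ == "__main__":
    # Made-up timings: plan_a has one slow outlier, plan_c has one lucky fast run.
    timings = {
        "plan_a": [1.0, 9.0, 1.2],
        "plan_b": [1.5, 1.4, 1.6],
        "plan_c": [0.9, 3.0, 3.1],
    }
    print(rank_by_median_of_three(timings))
```

Sorting on the first or fastest sample instead would put `plan_c` first, even though it is the slowest plan in two of its three runs.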

[Feature] A Layer Norm API has been added and can be used with the v0.x API.
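
For readers unfamiliar with the operation, here is a plain-Python reference of what layer normalization computes over one feature vector. It does not show the v0.x API itself. The function name, argument names, and the default `epsilon` are assumptions chosen for illustration.

```python
import math


def layer_norm(x, scale, bias, epsilon=1e-5):
    """Reference layer normalization over a single feature vector.

    Normalizes x to zero mean and unit variance, then applies a per-element
    scale and bias. The epsilon default is an assumption, not a library value.
    """
    mean = sum(x) / len(x)
    variance = sum((v - mean) ** 2 for v in x) / len(x)
    inv_std = 1.0 / math.sqrt(variance + epsilon)
    return [(v - mean) * inv_std * g + b for v, g, b in zip(x, scale, bias)]


if __name__ == "__main__":
    y = layer_norm([1.0, 2.0, 3.0, 4.0], scale=[1.0] * 4, bias=[0.0] * 4)
    print(y)
```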

This release is experimental.