v0.9
[Enhancement] Added the ability to filter by tensor shape in the errata filter.
[Enhancement] Added ability to override the default feature vector in the opGraph manually.
[Enhancement] Added support for CUDNN_POINTWISE_RECIPROCAL pointwise operation.
[Enhancement] Added an option to limit the number of kernels benchmarked in find-plan.
[Bug Fix] Fixed "Scale Bias Conv BNGenstats" test case where the sum and square sum channel dimensions were incorrect.
[Bug Fix] Fixed a compiler error "dereferencing type-punned pointer will break strict-aliasing rules" seen with certain compilers when type-casting floating-point alpha/beta to int64_t.
[Bug Fix] Waived "ConvScaleBiasAct_int8 sample" on V100 because it lacks int8 support.
[Samples] Added BF16/FP16/FP8 Flash Attention Fprop/Bprop samples.
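
[Note] The strict-aliasing bug fix listed above concerns reinterpreting a floating-point alpha/beta value as int64_t through a pointer cast. Below is a minimal sketch of one aliasing-safe way to do that kind of conversion, using std::memcpy instead of a cast. The helper names pack_scalar_bits and unpack_scalar_bits are hypothetical and are not part of the library; whether the actual fix takes this approach is an assumption.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper (not a library API): copies the bytes of a float or
// double into the first bytes of a zero-initialized int64_t. std::memcpy
// avoids dereferencing a type-punned pointer, so strict-aliasing rules hold.
template <typename T>
int64_t pack_scalar_bits(T value) {
    static_assert(sizeof(T) <= sizeof(int64_t), "value must fit in 64 bits");
    int64_t bits = 0;
    std::memcpy(&bits, &value, sizeof(T));
    return bits;
}

// Hypothetical inverse: recovers the original value from the packed bytes.
template <typename T>
T unpack_scalar_bits(int64_t bits) {
    static_assert(sizeof(T) <= sizeof(int64_t), "value must fit in 64 bits");
    T value;
    std::memcpy(&value, &bits, sizeof(T));
    return value;
}
```

The pattern that typically triggers the warning is a cast such as *reinterpret_cast<int64_t*>(&alpha); copying the bytes sidesteps it, and compilers generally optimize the small fixed-size memcpy away.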