Single-precision support for HIP variants #93
Many thanks, this looks great and is confirmed to work on LUMI!
A few minor clean-up comments, but no show-stoppers.
@@ -10,17 +10,18 @@

#include "cloudsc_validate.h"

#include <float.h>
#include <dtype.h>
#include "dtype.h"
or redundant?
Was previously used for DBL_EPSILON, I think.
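For context, a minimal sketch of why `float.h` might still be wanted once the code is precision-agnostic: the epsilon has to follow the working precision. This assumes `dtype` is a typedef selected at build time; the `SINGLE` macro, the `DTYPE_EPSILON` name, and the `relative_error` helper are hypothetical and not taken from this PR.

```c
#include <float.h>

/* Hypothetical stand-in for dtype.h; the SINGLE macro is an assumption. */
#ifdef SINGLE
typedef float dtype;
#define DTYPE_EPSILON FLT_EPSILON
#else
typedef double dtype;
#define DTYPE_EPSILON DBL_EPSILON
#endif

/* Relative error, guarded against division by a near-zero reference. */
static dtype relative_error(dtype value, dtype reference)
{
    dtype diff = value - reference;
    dtype scale = reference < 0 ? -reference : reference;

    if (diff < 0) diff = -diff;
    /* Clamp the denominator to the epsilon of the working precision. */
    if (scale < DTYPE_EPSILON) scale = DTYPE_EPSILON;
    return diff / scale;
}
```

If nothing in the file refers to `DBL_EPSILON` or `FLT_EPSILON` any more, the `float.h` include can probably go.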
// #pragma omp parallel for default(shared) private(b, bsize, jk) \
// reduction(min:zminval) reduction(max:zmaxval,zmaxerr) reduction(+:zerrsum,zsum)
I think you re-enabled this in the other PR?
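For reference, a rough sketch of what the re-enabled reduction could look like on a flat 1D field. The `field_stats` struct and `validate_1d` function are hypothetical, `dtype` is stubbed as `double`, and the real validation routines loop over blocks rather than a flat array, so treat this as an illustration of the clause combination only.

```c
typedef double dtype;  /* assumption: stand-in for the typedef in dtype.h */

/* Hypothetical container for the validation statistics. */
typedef struct {
    double minval, maxval, maxerr, errsum;
} field_stats;

/* Assumes n >= 1. Without OpenMP enabled the pragma is ignored
 * and the loop simply runs serially. */
static field_stats validate_1d(const dtype *field, const dtype *reference, int n)
{
    double zminval = field[0], zmaxval = field[0];
    double zmaxerr = 0.0, zerrsum = 0.0;
    int i;

    /* min/max reductions in C need OpenMP 3.1 or later. */
    #pragma omp parallel for default(shared) private(i) \
        reduction(min:zminval) reduction(max:zmaxval,zmaxerr) reduction(+:zerrsum)
    for (i = 0; i < n; i++) {
        double val = (double) field[i];
        double err = val - (double) reference[i];

        if (err < 0.0) err = -err;
        if (val < zminval) zminval = val;
        if (val > zmaxval) zmaxval = val;
        if (err > zmaxerr) zmaxerr = err;
        zerrsum += err;
    }

    field_stats stats = { zminval, zmaxval, zmaxerr, zerrsum };
    return stats;
}
```

Note that the `+` reduction may sum in a different order from run to run, so `errsum` can differ in the last bits between thread counts, which matters more in single precision.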
// #pragma omp parallel for default(shared) private(b, bsize, jl, jk) \
// reduction(min:zminval) reduction(max:zmaxval,zmaxerr) reduction(+:zerrsum,zsum)
Same comment
// dtype (*field)[nlev][nlon] = (dtype (*)[nlev][nlon]) v_field;
// dtype (*reference)[nlev][nlon] = (dtype (*)[nlev][nlon]) v_ref;
debug leftover?
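As a reminder of what these casts do, here is a small sketch of the pointer-to-VLA idiom next to the equivalent flat indexing. The function names are hypothetical and `dtype` is stubbed as `double`; the idiom relies on C99 variably modified types, so it does not carry over to C++ translation units.

```c
typedef double dtype;  /* assumption: stand-in for the typedef in dtype.h */

/* View a flat buffer as field[nblocks][nlev][nlon] and sum one block. */
static dtype block_sum(const void *v_field, int nlon, int nlev, int block)
{
    /* Pointer to a variable-length 2D array: indexing does the strides. */
    const dtype (*field)[nlev][nlon] = (const dtype (*)[nlev][nlon]) v_field;
    dtype sum = 0;

    for (int jk = 0; jk < nlev; jk++)
        for (int jl = 0; jl < nlon; jl++)
            sum += field[block][jk][jl];
    return sum;
}

/* The same element reached by explicit index arithmetic. */
static dtype flat_at(const dtype *buf, int nlon, int nlev, int b, int jk, int jl)
{
    return buf[(b * nlev + jk) * nlon + jl];
}
```

If the code now uses flat indexing throughout, the commented-out casts no longer serve a purpose and can be removed.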
//dtype (*field)[nlon] = (dtype (*)[nlon]) v_field;
//dtype (*reference)[nlon] = (dtype (*)[nlon]) v_ref;
debug leftover?
// dtype (*field)[nclv][nlev][nlon] = (dtype (*)[nclv][nlev][nlon]) v_field;
// dtype (*reference)[nclv][nlev][nlon] = (dtype (*)[nclv][nlev][nlon]) v_ref;
debug leftover?
// #pragma omp parallel for default(shared) private(b, bsize, jl, jk, jm) \
// reduction(min:zminval) reduction(max:zmaxval,zmaxerr) reduction(+:zerrsum,zsum)
re-enable or remove?
ce87db3 to 50863dc
Great, many thanks. Tested in conjunction with #97 and seems to work fine.
HIP SP tested on LUMI via e.g.
./cloudsc-bundle build --clean --build-dir=build-sp-hip --arch=arch/eurohpc/lumi/cray-gpu/16.0.1 --with-hip --single-precision [--with-serialbox]