Longer lived cache for opencl_fft_app #134
This seems to work in practice too: a single time step went from roughly 25-26s to roughly 8-9s (everything else being the same).
(BEGIN UNRELATED TRAIN OF THOUGHT) I agree that shoving this in the CL context is not ideal, because it leads to a reference cycle that will never be garbage-collected. At some point (hopefully not too far into the future), array contexts will make their way into sumpy, at which point we could just use them here. Would you be willing to carry this diff yourself until that point?
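For illustration, here is a minimal sketch of the reference-cycle concern. `FakeContext` and `FakeFFTApp` are hypothetical stand-ins, not pyopencl or sumpy classes, and disabling the cycle collector is an assumption made only to mimic the situation described above, where the cycle is never collected.

```python
import gc
import weakref


class FakeContext:
    """Hypothetical stand-in for a CL context; not a pyopencl class."""


class FakeFFTApp:
    """Hypothetical stand-in for an expensive-to-build FFT app."""

    def __init__(self, context, weak=False):
        # A strong back-reference closes the cycle context -> app -> context.
        # A weak back-reference leaves the context free to be released.
        self.context = weakref.ref(context) if weak else context


def context_is_freed(weak):
    """Return True if the context is released once its last name is deleted."""
    gc.disable()  # assumption: rely on reference counting only
    try:
        ctx = FakeContext()
        # Cache stored directly on the context, keyed by (shape, dtype).
        ctx.fft_app_cache = {("shape", "dtype"): FakeFFTApp(ctx, weak=weak)}
        probe = weakref.ref(ctx)
        del ctx
        return probe() is None
    finally:
        gc.enable()
        gc.collect()  # clean up the cycle left by the strong case


strong_freed = context_is_freed(weak=False)
weak_freed = context_is_freed(weak=True)
print(strong_freed, weak_freed)
```

With the strong back-reference the context stays alive after `del ctx`; with the weak one it is released immediately.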
Yeah, I was thinking |
I'm confused as to why this would cause a performance improvement. POCL has a global cache and that should make clBuildProgram return quickly.
@isuruf My guess was that |
Maybe that is a bottleneck. I was looking at the profile you attached in a previous comment at #134 (comment) which shows that |
Hm, I used Would running it through |
Yes please. Is this the same code as #129 (comment)?
Nope, this is another code that does some time stepping with the Stokes velocity field. Running that should also show how long it takes to initialize I'll run it through |
@isuruf This should contain the Both of them seem to show that there's quite a bit of time spent in |
Thanks. I can see that it is indeed quite slow. I'll have a look to see if that can be fixed.
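For readers who want to reproduce this kind of measurement, here is a minimal sketch using Python's built-in `cProfile`. The profiler actually used in this thread is not recoverable from the comments, so `cProfile` is an assumption, and `build_program` and `time_step` are hypothetical placeholders for the real time-stepping code.

```python
import cProfile
import io
import pstats
import time


def build_program():
    """Hypothetical stand-in for an expensive per-step setup call."""
    time.sleep(0.01)


def time_step():
    """Hypothetical time step that redoes its setup on every call."""
    build_program()


def profile_steps(nsteps):
    """Profile nsteps time steps and return the report as a string."""
    profiler = cProfile.Profile()
    profiler.enable()
    for _ in range(nsteps):
        time_step()
    profiler.disable()

    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Sort by cumulative time so repeated setup work floats to the top.
    stats.sort_stats("cumulative").print_stats(10)
    return stream.getvalue()


report = profile_steps(3)
print(report)
```

If setup work that should be cached is redone every step, it shows up with a call count equal to the number of steps and a large cumulative time.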
9c52e77 to 25d7ae6
#131 and #132 fixed the long codegen times, but there seem to still be a few runtime issues around. This is the first one I found from a quick profile.
Caching on the `cl_context` doesn't seem like a particularly good idea though. Ideally we could bubble this up and let `pytential` cache it on the array context. Thoughts?

For some context: this seems to be happening because of my moving meshes, i.e. I keep recreating the `QBXLayerPotentialSource`, which recreates the `sumpy` wranglers and clears all the caches.