Hi! I find your repo very interesting and starred it without hesitation. I have been learning about the L2 cache recently, so I wonder where the code implements "immediate eviction" and "fetched from L2 cache". I guess it is related to discard_memory or the L2 persistence API?
Thank you!!
By the way, you mentioned that you used ncu to profile and analyze it; I am also interested in how that is done. Maybe you could publish this at a top conference!
Hi, the L2 cache is used implicitly whenever global memory is fetched; the immediate-eviction cache policy for weight loads is defined here. The key is that we want to keep activations, which are loaded many times, in the L2 cache, whereas each weight is accessed exactly once, so caching weights brings no benefit.
We are considering a write-up of this work; however, I am currently very busy, so this may take quite a while.
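To illustrate the idea of treating weight loads and activation loads differently, here is a minimal sketch, not the repo's actual code. It assumes a simple matrix-vector product and uses the CUDA `__ldcs` intrinsic, which marks a load as streaming (likely accessed once), as a stand-in for an evict-first policy; the repo may well use a different mechanism, such as a PTX cache policy. The names `matvec_streaming_weights` and `run_demo` are hypothetical.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// y[r] = sum_c W[r][c] * x[c]. One thread per row, written for clarity, not speed.
__global__ void matvec_streaming_weights(const float* __restrict__ W,
                                         const float* __restrict__ x,
                                         float* __restrict__ y,
                                         int rows, int cols) {
    int r = blockIdx.x * blockDim.x + threadIdx.x;
    if (r >= rows) return;
    float acc = 0.0f;
    for (int c = 0; c < cols; ++c) {
        // Weights are read exactly once: load them with a streaming hint so
        // they are the preferred candidates for eviction.
        float w = __ldcs(&W[(size_t)r * cols + c]);
        // Activations are reused by every row: load them with default caching.
        acc += w * x[c];
    }
    y[r] = acc;
}

// Runs the kernel and returns the number of mismatches against a CPU
// reference, or -1 if a CUDA call fails.
int run_demo() {
    const int rows = 256, cols = 128;
    std::vector<float> hW((size_t)rows * cols), hx(cols), hy(rows);
    // Small multiples of 0.5 and 0.25 keep every sum exactly representable.
    for (size_t i = 0; i < hW.size(); ++i) hW[i] = (float)(i % 7) * 0.5f;
    for (int c = 0; c < cols; ++c) hx[c] = (float)(c % 5) * 0.25f;

    float *dW = nullptr, *dx = nullptr, *dy = nullptr;
    if (cudaMalloc(&dW, hW.size() * sizeof(float)) != cudaSuccess) return -1;
    if (cudaMalloc(&dx, hx.size() * sizeof(float)) != cudaSuccess) return -1;
    if (cudaMalloc(&dy, hy.size() * sizeof(float)) != cudaSuccess) return -1;
    cudaMemcpy(dW, hW.data(), hW.size() * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(dx, hx.data(), hx.size() * sizeof(float), cudaMemcpyHostToDevice);

    const int threads = 128;
    matvec_streaming_weights<<<(rows + threads - 1) / threads, threads>>>(
        dW, dx, dy, rows, cols);
    if (cudaDeviceSynchronize() != cudaSuccess) return -1;
    cudaMemcpy(hy.data(), dy, hy.size() * sizeof(float), cudaMemcpyDeviceToHost);

    int mismatches = 0;
    for (int r = 0; r < rows; ++r) {
        float ref = 0.0f;
        for (int c = 0; c < cols; ++c) ref += hW[(size_t)r * cols + c] * hx[c];
        if (ref != hy[r]) ++mismatches;
    }
    cudaFree(dW); cudaFree(dx); cudaFree(dy);
    return mismatches;
}

int main() {
    int result = run_demo();
    printf("run_demo returned %d\n", result);
    return result == 0 ? 0 : 1;
}
```

Cache hints like this are only hints; whether they help depends on the GPU and the access pattern, so the effect is best confirmed with a profiler such as ncu.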
I see! Is it possible to make better use of the L2 cache? I know there is an API mentioned here, but I cannot find a way to use it well. For example, could some random accesses evict the useful data from L2? What do you think? Thanks!
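For context on the question above, here is a rough sketch of one way to protect a buffer from being evicted by unrelated traffic, assuming the API in question is the CUDA runtime's L2 access policy window (available on devices that report a non-zero `persistingL2CacheMaxSize`). The helper name `configure_persisting_window` is hypothetical, and the choice of `hitRatio` is just one reasonable heuristic.

```cuda
#include <algorithm>
#include <cstdio>
#include <cuda_runtime.h>

// Asks the runtime to treat accesses to [ptr, ptr + bytes) on `stream` as
// persisting in L2. Returns 0 on success, 1 if the device does not support
// L2 persistence, and -1 if a CUDA call fails.
int configure_persisting_window(cudaStream_t stream, void* ptr, size_t bytes) {
    if (ptr == nullptr || bytes == 0) return -1;
    int dev = 0;
    cudaDeviceProp prop;
    if (cudaGetDevice(&dev) != cudaSuccess) return -1;
    if (cudaGetDeviceProperties(&prop, dev) != cudaSuccess) return -1;
    if (prop.persistingL2CacheMaxSize <= 0 || prop.accessPolicyMaxWindowSize <= 0)
        return 1;  // no persisting L2 support on this device

    // Set aside part of L2 for persisting accesses.
    size_t carve = std::min(bytes, (size_t)prop.persistingL2CacheMaxSize);
    if (cudaDeviceSetLimit(cudaLimitPersistingL2CacheSize, carve) != cudaSuccess)
        return -1;

    // Describe the address window and how hits and misses should be treated.
    size_t window = std::min(bytes, (size_t)prop.accessPolicyMaxWindowSize);
    cudaStreamAttrValue attr = {};
    attr.accessPolicyWindow.base_ptr = ptr;
    attr.accessPolicyWindow.num_bytes = window;
    // If the window is larger than the set-aside region, mark only a fraction
    // of accesses as persisting to avoid thrashing the persisting lines.
    attr.accessPolicyWindow.hitRatio =
        (float)std::min(1.0, (double)carve / (double)window);
    attr.accessPolicyWindow.hitProp = cudaAccessPropertyPersisting;
    attr.accessPolicyWindow.missProp = cudaAccessPropertyStreaming;
    if (cudaStreamSetAttribute(stream, cudaStreamAttributeAccessPolicyWindow,
                               &attr) != cudaSuccess)
        return -1;
    return 0;
}

int main() {
    const size_t bytes = 1 << 20;  // 1 MiB buffer of frequently reused data
    void* buf = nullptr;
    cudaStream_t stream;
    if (cudaMalloc(&buf, bytes) != cudaSuccess) return 1;
    if (cudaStreamCreate(&stream) != cudaSuccess) return 1;

    int status = configure_persisting_window(stream, buf, bytes);
    printf("configure_persisting_window returned %d\n", status);

    // Kernels launched on `stream` after this point use the policy.
    cudaStreamDestroy(stream);
    cudaFree(buf);
    return status < 0 ? 1 : 0;
}
```

Kernels launched on that stream then have their accesses inside the window preferentially retained in the set-aside portion of L2, while accesses outside it use the normal policy. Whether this beats the default behavior for a given workload is something to measure rather than assume.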