[BUG] malloc error when using likwid-pin on ARM Jetson AGX Xavier (ARMv8) #488
Comments
It is hard to tell exactly where the problem is. It might be the build options for the Lua interpreter or something inside the LIKWID library. In order to find the location:
As soon as it stops, type |
Thanks, when trying this inside the likwid-5.2.2 folder, I get:
Does this help? |
Yes, it helps, thank you. But it is still not easy to find the problem. The only cause I can think of is that the detection of how many hardware threads the system has failed. This is fundamental information, of course, and I'm surprised the code gets that far without it. Can you please send me the content of |
icarus@ubuntu:/$ cat /proc/cpuinfo
processor : 0
model name : ARMv8 Processor rev 0 (v8l)
BogoMIPS : 62.50
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm dcpop
CPU implementer : 0x4e
CPU architecture: 8
CPU variant : 0x0
CPU part : 0x004
CPU revision : 0
MTS version : 55637613
processor : 1
model name : ARMv8 Processor rev 0 (v8l)
BogoMIPS : 62.50
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm dcpop
CPU implementer : 0x4e
CPU architecture: 8
CPU variant : 0x0
CPU part : 0x004
CPU revision : 0
MTS version : 55637613
processor : 2
model name : ARMv8 Processor rev 0 (v8l)
BogoMIPS : 62.50
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm dcpop
CPU implementer : 0x4e
CPU architecture: 8
CPU variant : 0x0
CPU part : 0x004
CPU revision : 0
MTS version : 55637613
processor : 3
model name : ARMv8 Processor rev 0 (v8l)
BogoMIPS : 62.50
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm dcpop
CPU implementer : 0x4e
CPU architecture: 8
CPU variant : 0x0
CPU part : 0x004
CPU revision : 0
MTS version : 55637613
processor : 4
model name : ARMv8 Processor rev 0 (v8l)
BogoMIPS : 62.50
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm dcpop
CPU implementer : 0x4e
CPU architecture: 8
CPU variant : 0x0
CPU part : 0x004
CPU revision : 0
MTS version : 55637613
processor : 5
model name : ARMv8 Processor rev 0 (v8l)
BogoMIPS : 62.50
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm dcpop
CPU implementer : 0x4e
CPU architecture: 8
CPU variant : 0x0
CPU part : 0x004
CPU revision : 0
MTS version : 55637613
processor : 6
model name : ARMv8 Processor rev 0 (v8l)
BogoMIPS : 62.50
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm dcpop
CPU implementer : 0x4e
CPU architecture: 8
CPU variant : 0x0
CPU part : 0x004
CPU revision : 0
MTS version : 55637613
processor : 7
model name : ARMv8 Processor rev 0 (v8l)
BogoMIPS : 62.50
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm dcpop
CPU implementer : 0x4e
CPU architecture: 8
CPU variant : 0x0
CPU part : 0x004
CPU revision : 0
MTS version : 55637613 |
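For reference, the hardware thread count can be cross-checked against this output by counting the `processor` entries, which run from 0 to 7 above. The following is only a minimal sketch of that check and not LIKWID's actual parser; the function name `count_processor_entries` is hypothetical, and it works on text already read into memory rather than opening the file itself.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of LIKWID): count the lines that begin
 * with "processor" in /proc/cpuinfo-style text held in memory. */
static int count_processor_entries(const char *text)
{
    assert(text != NULL);

    int count = 0;
    const char *line = text;

    while (line != NULL && *line != '\0') {
        /* Each hardware thread starts its block with a "processor" line. */
        if (strncmp(line, "processor", 9) == 0) {
            count++;
        }
        /* Advance to the start of the next line, or stop at the end. */
        const char *newline = strchr(line, '\n');
        line = (newline != NULL) ? newline + 1 : NULL;
    }
    return count;
}
```

Lines such as `model name : ARMv8 Processor rev 0 (v8l)` do not match, because the comparison is case-sensitive and anchored at the start of the line.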
I tested the current parser and found a bug but it is not relevant for your issue. Can you please do the And provide the content of the files As a last resort, comment out the line |
I get:
And
When commenting out
|
OK, very surprising. The failing line contains only It seems I broke the "disabling of hwloc" at some point in the past. Hwloc works on almost all systems, which is why disabling it is rarely, if ever, tested. |
Can you please run |
Here it is:
|
As I thought:
but it should be 1, since you seem to have a single L3 cache. Judging by the output, the architecture is quite unusual: there are four sockets, each with 2 cores, but all four sockets share a single L3 cache. That may be correct, but it goes against the current logic in LIKWID, which assumes that each socket has its own L3 cache. I'll check the cache domain detection. |
So, it seems the 4 sockets cause the problem. The cache domain detection divides "Total number of caches" by "Socket count" and truncates the result to an integer. With a single L3 cache and four sockets, this results in 0 cache domains per socket. This is hard to fix without access to such a system for testing, but I'll try to create a patch. |
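To illustrate the arithmetic described above, here is a minimal sketch in C. It is not the LIKWID source: both function names are hypothetical, and the guarded variant is only one possible workaround under the assumption that a cache shared by all sockets should still count as one domain per socket. It is not the actual patch.

```c
#include <assert.h>

/* Hypothetical reproduction of the calculation described above.
 * Integer division truncates toward zero, so 1 cache / 4 sockets == 0. */
static int cache_domains_per_socket_truncating(int total_caches, int sockets)
{
    assert(sockets > 0);
    return total_caches / sockets;
}

/* One possible guard (an assumption, not the real fix): if at least one
 * cache exists but the division truncates to zero, report one domain. */
static int cache_domains_per_socket_guarded(int total_caches, int sockets)
{
    assert(sockets > 0);

    int per_socket = total_caches / sockets;
    if (per_socket == 0 && total_caches > 0) {
        per_socket = 1;
    }
    return per_socket;
}
```

With one L3 cache and four sockets, the truncating version returns 0, while the guarded version returns 1. For systems where each socket has its own caches, both return the same value.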
With likwid-pin -- Version 5.2.2 (commit: 233ab943543480cd46058b34616c174198ba0459) I get the following error on an ARMv8 processor (running Linux) right at startup, before the program starts:
e.g. when calling
likwid-pin -c S0:0-3 ./executable