Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use clang with "-march=x86-64-v3" for building plugin #383

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

mikeoliphant
Copy link
Contributor

Using clang with "-march=x86-64-v3" optimizations results in a significant reduction in CPU usage.

On my machine, a single standard NAM model goes from ".15c" (15% of a CPU core) to ".11c"/".12c" - so a 20-25% CPU reduction.

I'd seen a small reduction in CPU by just switching to clang, but @38github keyed me into the additional benefit with the architecture-specific optimization.

This change has two downsides:

  • it requires clang for Visual Studio to be installed to build
  • it only works on "modern" CPUs (about Intel Haswell and later - so circa 2013)

I think the performance increases are significant enough to warrant using this method to build the default Windows plugin. It may also make since to provide a "compatibility" build for older systems.

@mikeoliphant
Copy link
Contributor Author

mikeoliphant commented Oct 23, 2023

Note that I have only made the change for the VST3 plugin project - not the standalone app.

**Edit:** I've now added support to the standalone app as well.

@mikeoliphant
Copy link
Contributor Author

mikeoliphant commented Oct 23, 2023

If you want to give it a try without having to compile it, you can get it from the artifacts section from the PR build:

https://github.com/sdatkinson/NeuralAmpModelerPlugin/actions/runs/6618954029

(download links are at the bottom of the page under "Artifacts")

@38github
Copy link

Also add the compiler flags to the LDFLAGS to gain even more performance. I added the following to my CMakeLists.txt when making the LV2 version:

set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Ofast -flto -march=x86-64-v3")
set(CMAKE_LDFLAGS "${CMAKE_LDFLAGS} -Ofast -flto -march=x86-64-v3")

@mikeoliphant
Copy link
Contributor Author

Also add the compiler flags to the LDFLAGS to gain even more performance.

It is a compiler directive, so I don't think it makes sense in LDFLAGS.

Happy to be proven wrong, but I'm not seeing any difference in the generated VS solution (or any performance improvement) by adding the directive to LDFLAGS.

@mikeoliphant
Copy link
Contributor Author

@sdatkinson Any thoughts on how to potentially move forward with this?

My current thinking is that because it is a significant improvement that will work on the vast majority of PCs, it probably makes sense to switch to it for the official build.

I'm thinking of adding a new build configuration (maybe "RELEASE-X64-V3"?) that has the x64 optimizations and switch the github build to use that for the official releases. Then anyone with an old PC can always build a normal RELEASE build if they need to.

@mikeoliphant
Copy link
Contributor Author

I went ahead and added a "Release-Clang-x64-v3" build configuration. Now this PR can be merged without impacting the current official release build.

To make the optimized build the official one for releases, I think you just need to change the build configuration here:

msbuild NeuralAmpModeler.sln /t:NeuralAmpModeler-app;NeuralAmpModeler-vst3 /p:configuration=release /p:platform=x64 /nologo /verbosity:minimal /fileLogger /m /flp:logfile=build-win.log;errorsonly;append

@DomMcsweeney
Copy link

DomMcsweeney commented Nov 4, 2023

Just tried loading this LSTM model into the above version of NAM. It basically doesn't work, doesn't allow a signal through, and produces a loud constant noise. Is anyone able to test?
(Windows 11>Reaper 7.02>vst3)

Deluxe Reverb Vibrato.zip

@38github
Copy link

38github commented Nov 4, 2023

Also this LSTM is problematic too, it causes Reaper to auto-mute. http://coginthemachine.ddns.net/mnt/nam/pedals_outboards_etc/moonball/moonball-pgm_4009_preamp-lead-ga10_vo05_ba05_md05_tr05/moonball-pgm_4009_preamp-lead-ga10_vo05_ba05_md05_tr05-lstm-nl2_hs34/

Interesting. I will have to try it out in Windows but the Linux LV2 handles it well. I wonder what causes the issue.

@DomMcsweeney
Copy link

Yeah no idea, works fine with standard wavenet NAMs, just as soon as I switched to LSTM models, it's not happy.

@38github
Copy link

38github commented Nov 5, 2023

The same happens on my computer under WIndows 10 and REAPER. Strange that the LV2 (x86-64-v3) works fine.

@mikeoliphant
Copy link
Contributor Author

Interesting. I wonder if this has anything to do with the issue we were seeing with the LV2 plugin for some LSTM models?

mikeoliphant/neural-amp-modeler-lv2#41

The behavior was different, but it is possible that there is some memory issue with the LSTM code that is only cause problems with certain combination of models/platform/compiler.

I'm traveling and haven't had a chance to try to reproduce the problem yet.

@38github
Copy link

38github commented Nov 6, 2023

The behavior was different, but it is possible that there is some memory issue with the LSTM code that is only cause problems with certain combination of models/platform/compiler.

The LV2 Windows version works fine on my computer and I can't recall it having issues on Linux but I will test it there also and report back.

@mikeoliphant
Copy link
Contributor Author

Finally got a chance to look into this a bit. I'm getting silence when trying to play LSTM models - all models I've tried so far.

Interestingly, though, I have the same problem with the non-clang build.

I also have the same problem with the latest github action build:

https://github.com/sdatkinson/NeuralAmpModelerPlugin/actions/runs/6781704843

So, while it does seem that there is an LSTM issue, it does not seem to be related specifically to the clang build.

@mikeoliphant
Copy link
Contributor Author

It is strange that the LV2 plugin is working fine. It is currently synced to the same Core code as the NeuralAmpModelerPlugin, so if an LSTM bug had been introduced I would expect to see it there as well.

@mikeoliphant
Copy link
Contributor Author

I've opened an issue to track the LSTM problem here:

#388

@DomMcsweeney
Copy link

I've not had any issues using LSTMs with the main release version, just with the build that you provided.
Also it's odd that you are getting silence, however I'm getting a bleep error noise as soon as I load it up.

@mikeoliphant
Copy link
Contributor Author

mikeoliphant commented Nov 11, 2023

I've not had any issues using LSTMs with the main release version

If by "main release" you mean the last official release, it is almost four months old. The problem seems to have been introduced since then.

You need to build a later release for yourself, or download it from a github build - for example from here:

https://github.com/sdatkinson/NeuralAmpModelerPlugin/actions/runs/6781704843

@DomMcsweeney
Copy link

Yeah sorry - my bad, I've just built locally and I can confirm your correct.
Basically, same issue on both your recent build, and locally building from the current main branch.

@mikeoliphant
Copy link
Contributor Author

The LSTM issue is fixed now in the Core (thanks, @sdatkinson!). It will be fixed here when the Core submodule is next sync'd.

The issue wasn't a problem for the LV2 plugin because of some usage details, but I've updated the LV2 plugin with the LSTM fix and have verified that it still works fine.

@highway11
Copy link

I run multiple instances of NAM for amp and pedal simulation on multiple tracks in my DAW, and 20-25% is more than a significant benefit in this case. This really adds up across multiple instances. Is there a reason this isn't being pushed to the main build? Is it too much work to have multiple releases, this one plus another for legacy CPU support?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants