Tanimoto Score Calculation Index out of Bound Error #177
Labels
bug
linked to dev
I have a problem calculating similarity matrices, specifically the Tanimoto score.
I am currently working with SIRIUS 5.8.6 and I have a working CLI command for the annotation step.
I encountered an out-of-bounds error in my similarity calculation a few days ago and don't seem to be able to fix it.
I use this command to get the annotations I want, and it works just fine: all my files end up in the SIRIUS directory I chose as the output directory.
"C:/Program Files/sirius/sirius.exe" -i "//test_directory/data/feature-data.mgf" -o "//test_directory/data/SIRIUS" config --IsotopeSettings.filter=true --FormulaSearchDB= --Timeout.secondsPerTree=0 --FormulaSettings.enforced=HCNOP --Timeout.secondsPerInstance=0 --AdductSettings.detectable=[[M-H2O+H]+,[M+K]+,[M-H]-,[M+Cl]-,[M+Na]+,[M+H3N+H]+,[M+H]+,[M+Br]-,[M-H2O-H]-,[M-H4O2+H]+] --UseHeuristic.mzToUseHeuristicOnly=650 --AlgorithmProfile=orbitrap --IsotopeMs2Settings=IGNORE --MS2MassDeviation.allowedMassDeviation=5.0ppm --NumberOfCandidatesPerIon=1 --UseHeuristic.mzToUseHeuristic=300 --FormulaSettings.detectable=B,Cl,Br,Se,S --NumberOfCandidates=10 --AdductSettings.enforced=, --AdductSettings.fallback=[[M+K]+,[M+Cl]-,[M-H]-,[M+Na]+,[M+H]+,[M+Br]-] --FormulaResultThreshold=true --InjectElGordoCompounds=true --StructureSearchDB=BIO --RecomputeResults=false formula fingerprint structure canopus write-summaries
For the similarity calculation I use this command:
"C:/Program Files/sirius/sirius.exe" -i "//test_directory/data/SIRIUS" similarity --numpy --tanimoto --tanimoto-canopus -d "//test_directory/data/similarity"
I then encounter this error, which repeats several times with different job numbers, and I get no output (as expected after these errors):
Jul 22, 2024 5:47:13 PM de.unijena.bioinf.jjobs.JJob lambda$logError$2
SEVERE: <27>[JJob-27] Failed!
java.lang.ArrayIndexOutOfBoundsException: Index 3878 out of bounds for length 3878
at de.unijena.bioinf.ChemistryBase.fp.ProbabilityFingerprint$PairwiseIterator.getRightProbability(ProbabilityFingerprint.java:295)
at de.unijena.bioinf.ms.frontend.subtools.similarity.SimilarityMatrixWorkflow.fpcos(SimilarityMatrixWorkflow.java:359)
at de.unijena.bioinf.ms.frontend.subtools.similarity.SimilarityMatrixWorkflow.lambda$tanimoto$9(SimilarityMatrixWorkflow.java:147)
at de.unijena.bioinf.ChemistryBase.math.MatrixUtils$1$1.compute(MatrixUtils.java:537)
at de.unijena.bioinf.jjobs.BasicJJob.call(BasicJJob.java:117)
at de.unijena.bioinf.jjobs.BasicMasterJJob$1.compute(BasicMasterJJob.java:101)
at java.base/java.util.concurrent.RecursiveTask.exec(Unknown Source)
at java.base/java.util.concurrent.ForkJoinTask.doExec(Unknown Source)
at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(Unknown Source)
at java.base/java.util.concurrent.ForkJoinPool.scan(Unknown Source)
at java.base/java.util.concurrent.ForkJoinPool.runWorker(Unknown Source)
at java.base/java.util.concurrent.ForkJoinWorkerThread.run(Unknown Source)
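To illustrate how I read the exception message: an index of 3878 on an array of length 3878 means the code reads one position past the last entry. One way this can happen (an assumption on my side, I have not checked the SIRIUS source) is a pairwise loop over two fingerprints that do not have the same length. The sketch below is my own Python, not SIRIUS code; the function name `soft_tanimoto` and the product/probabilistic-sum formula are hypothetical stand-ins for whatever SIRIUS actually computes.

```python
from typing import Sequence


def soft_tanimoto(left: Sequence[float], right: Sequence[float]) -> float:
    """Tanimoto-style similarity of two probability fingerprints.

    Hypothetical helper, not SIRIUS code. The intersection is taken as
    sum(l * r) and the union as sum(l + r - l * r) over all positions.
    """
    if len(left) != len(right):
        # Without this check, an index-based pairwise loop over the longer
        # fingerprint would read past the end of the shorter one
        # (index == length), which is the pattern in the stack trace.
        raise ValueError(
            f"fingerprint lengths differ: {len(left)} vs {len(right)}"
        )
    intersection = sum(l * r for l, r in zip(left, right))
    union = sum(l + r - l * r for l, r in zip(left, right))
    # Two all-zero fingerprints have an empty union; report 0.0 for them.
    return intersection / union if union > 0 else 0.0
```

With this check, a fingerprint of 3878 entries compared against one of 3879 entries fails with a readable message instead of an index error, which would at least tell me which inputs disagree.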
I have already tested various things, such as regenerating my .mgf file (it contains non-merged MS/MS data from mzMine, which processes my raw data) and recomputing my SIRIUS files. I also tried changing the command in different ways and even tried version 6.0.0 (which I could not get running with my configuration as I wanted, so I switched back to the previous version).
Do you have any suggestions as to where the problem might be?