Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix issue #5612 - boolean param to optionally log skipped fastqs #190

Conversation

glichtenstein
Copy link
Contributor

@glichtenstein glichtenstein commented May 21, 2024

Closes nf-core/modules#5612
nf-core/modules PR counterpart: nf-core/modules#5638

PR checklist

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool - have you followed the pipeline conventions in the contribution docs
  • If necessary, also make a PR on the nf-core/demultiplex branch on the nf-core/test-datasets repository.
  • Make sure your code lints (nf-core lint).
  • Ensure the test suite passes (nextflow run . -profile test,docker --outdir <OUTDIR>).
  • Check for unexpected warnings in debug mode (nextflow run . -profile debug,test,docker --outdir <OUTDIR>).
  • Usage Documentation in docs/usage.md is updated.
  • Output Documentation in docs/output.md is updated.
  • CHANGELOG.md is updated.
  • README.md is updated (including new tool citations and authors/contributors).

Description

This pull request adds the optional parameter --log_skipped_fastqs to the demultiplex pipeline. This change was necessary to address the issue with logging skipped FASTQ files and preventing the AWS S3 issue.

Patch Details

The changes in this PR were applied using a patch from a specific commit in the nf-core/modules repository. The commit details are as follows:

The patch was applied to ensure that the demultiplex repository includes the latest changes from nf-core/modules necessary for this feature.

Testing

true
nextflow run glichtenstein/demultiplex -r 7d9538e --input nf-core-samplesheet.csv --outdir /data/scratch/iseq-DI/output -profile docker -work-dir /data/scratch/iseq-DI/workdir -resume --skip_tools fastp --log_empty_fastqs=true
false
nextflow run glichtenstein/demultiplex -r 7d9538e --input nf-core-samplesheet.csv --outdir /data/scratch/iseq-DI/output -profile docker -work-dir /data/scratch/iseq-DI/workdir -resume --skip_tools fastp --log_empty_fastqs=false

executor > local (5)
[1a/e1ae22] process > NFCORE_DEMULTIPLEX:DEMULTIPLEX:BCL_DEMULTIPLEX:BCLCONVERT [100%] 1 of 1 ✔
[5f/05036f] process > NFCORE_DEMULTIPLEX:DEMULTIPLEX:FALCO (iseq-DI_S1_L001) [100%] 1 of 1 ✔
[84/64923f] process > NFCORE_DEMULTIPLEX:DEMULTIPLEX:MD5SUM (iseq-DI_S1_L001) [100%] 2 of 2 ✔
[11/351698] process > NFCORE_DEMULTIPLEX:DEMULTIPLEX:MULTIQC [100%] 1 of 1 ✔
-[nf-core/demultiplex] Pipeline completed successfully-

cat output/empty_fastqs.log
Empty or invalid FASTQ file: /data/scratch/iseq-DI/workdir/1a/e1ae22.../Bad-iseq-DI_S2_L001_R1_001.fastq.gz
Empty or invalid FASTQ file: /data/scratch/iseq-DI/workdir/1a/e1ae22.../Bad-iseq-DI_S2_L001_R2_001.fastq.gz

Dataset used

  • A public BCL from 10X Genomics. I will add it to nf-core tests datasets when possible.
mkdir -p iseq-DI; cd iseq-DI
wget https://cf.10xgenomics.com/supp/spatial-exp/demultiplexing/iseq-DI.tar.gz
wget https://cf.10xgenomics.com/supp/spatial-exp/demultiplexing/bcl_convert_samplesheet.csv

SampleSheet was modified to include bad barcodes.

nf-core_samplesheet.csv

id,samplesheet,lane,flowcell,per_flowcell_manifest
AA0BCDFGH,/data/scratch/iseq-DI/bcl_convert_samplesheet.csv,1,/data/scratch/iseq-DI/iseq-DI.tar.gz

bcl_convert_samplesheet.csv

[Header],,,
FileFormatVersion,2,,
,,,
[BCLConvert_Settings],,,
CreateFastqForIndexReads,0,,
,,,
[BCLConvert_Data],,,
Lane,Sample_ID,index,index2
1,iseq-DI,GTAACATGCG,AGGTAACACT
1,Bad-iseq-DI,TATACAGATA,GATACAGATA

@glichtenstein glichtenstein requested review from blajoie and a team as code owners May 21, 2024 05:54

This comment was marked as resolved.

Copy link

github-actions bot commented May 21, 2024

nf-core lint overall result: Passed ✅ ⚠️

Posted for pipeline commit c12458d

+| ✅ 182 tests passed       |+
#| ❔   3 tests were ignored |#
!| ❗   6 tests had warnings |!

❗ Test warnings:

  • pipeline_todos - TODO string in README.md: Describe the minimum required steps to execute the pipeline, e.g. how to prepare samplesheets.
  • pipeline_todos - TODO string in main.nf: Optionally add in-text citation tools to this list.
  • pipeline_todos - TODO string in main.nf: Optionally add bibliographic entries to this list.
  • pipeline_todos - TODO string in main.nf: Only uncomment below if logic in toolCitationText/toolBibliographyText has been filled!
  • pipeline_todos - TODO string in methods_description_template.yml: #Update the HTML below to your preferred methods description, e.g. add publication citation for this pipeline
  • pipeline_todos - TODO string in awsfulltest.yml: You can customise AWS full pipeline tests as required

❔ Tests ignored:

✅ Tests passed:

Run details

  • nf-core/tools version 2.14.1
  • Run at 2024-05-31 21:27:21

@glichtenstein glichtenstein changed the base branch from master to dev May 21, 2024 14:08
@glichtenstein glichtenstein changed the title Boolean parameter to --log_skipped_fastqs fix issue #5612 - boolean param to optionally log skipped fastqs May 21, 2024
@glichtenstein
Copy link
Contributor Author

It will need the the nf-core/modules PR counterpart to pass the tests.
nf-core/modules#5638

@glichtenstein glichtenstein force-pushed the boolean-parameter-to--log_skipped_fastqs branch from 4fc95ef to fffb21d Compare May 23, 2024 23:44
@glichtenstein glichtenstein force-pushed the boolean-parameter-to--log_skipped_fastqs branch from 0d6ca1d to 6531e13 Compare May 24, 2024 16:31
@glichtenstein glichtenstein force-pushed the boolean-parameter-to--log_skipped_fastqs branch 2 times, most recently from 34478cd to 3a89595 Compare May 31, 2024 17:58
@glichtenstein
Copy link
Contributor Author

Will close it in favor of nf-core/modules#5734 with help of @k1sauce refactoring of demultiplex.nf

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Ready to review PR ready for review
Projects
None yet
Development

Successfully merging this pull request may close these issues.

new File breaks cloud execution in bcl_demultiplex
1 participant