Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds fq/lint for early validation of FASTQs #67

Open
wants to merge 9 commits into
base: dev
Choose a base branch
from

Conversation

adamrtalbot
Copy link

@adamrtalbot adamrtalbot commented Nov 2, 2024

Validation of FASTQS early prevents running the pipeline on invalid FASTQ files which will make the pipeline more efficient at achieving it's ultimate objective of checking FASTQ validity.

It adds 3 more parameters:

  • --skip_linting which enables the linting of FASTQs
  • --fq_lint_args which is a string of arguments to pass to the linting tool
  • --continue_with_lint_fail which is a boolean to determine whether to continue if the linting fails

Between these three options the user has a high degree of control over how the pipeline lints which should handle most use cases.

Implements tests for all cases using the rnaseq minimal test dataset which has invalid sequencing names 🙄 .

Closes #31

PR checklist

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool - have you followed the pipeline conventions in the contribution docs
  • If necessary, also make a PR on the nf-core/seqinspector branch on the nf-core/test-datasets repository.
  • Make sure your code lints (nf-core lint).
  • Ensure the test suite passes (nf-test test main.nf.test -profile test,docker).
  • Check for unexpected warnings in debug mode (nextflow run . -profile debug,test,docker --outdir <OUTDIR>).
  • Usage Documentation in docs/usage.md is updated.
  • Output Documentation in docs/output.md is updated.
  • CHANGELOG.md is updated.
  • README.md is updated (including new tool citations and authors/contributors).

Validation of FASTQS early prevents running the pipeline on invalid FASTQ files which will make the pipeline more efficient at achieving it's ultimate objective of checking FASTQ validity.

It adds 3 more parameters:
 - `--skip_linting` which enables the linting of FASTQs
 - `--fq_lint_args` which is a string of arguments to pass to the linting tool
 - `--continue_with_lint_fail` which is a boolean to determine whether to continue if the linting fails

Between these three options the user has a high degree of control over how the pipeline lints which should handle most use cases.

Closes nf-core#31
Copy link

github-actions bot commented Nov 2, 2024

nf-core pipelines lint overall result: Passed ✅ ⚠️

Posted for pipeline commit ad5fd9b

+| ✅ 193 tests passed       |+
#| ❔   1 tests were ignored |#
!| ❗  21 tests had warnings |!

❗ Test warnings:

  • readme - README contains the placeholder zenodo.XXXXXXX. This should be replaced with the zenodo doi (after the first release).
  • pipeline_todos - TODO string in main.nf: Remove this line if you don't need a FASTA file
  • pipeline_todos - TODO string in nextflow.config: Specify your pipeline's command line flags
  • pipeline_todos - TODO string in nextflow.config: Optionally, you can add a pipeline-specific nf-core config at https://github.com/nf-core/configs
  • pipeline_todos - TODO string in README.md: TODO nf-core:
  • pipeline_todos - TODO string in README.md: Include a figure that guides the user through the major workflow steps. Many nf-core
  • pipeline_todos - TODO string in README.md: Fill in short bullet-pointed list of the default steps in the pipeline
  • pipeline_todos - TODO string in README.md: Add citation for pipeline after first release. Uncomment lines below and update Zenodo doi and badge at the top of this file.
  • pipeline_todos - TODO string in README.md: Add bibliography of tools and data used in your pipeline
  • pipeline_todos - TODO string in usage.md: Add documentation about anything specific to running your pipeline. For general topics, please point to (and add to) the main nf-core website.
  • pipeline_todos - TODO string in main.nf: Optionally add in-text citation tools to this list.
  • pipeline_todos - TODO string in main.nf: Optionally add bibliographic entries to this list.
  • pipeline_todos - TODO string in main.nf: Only uncomment below if logic in toolCitationText/toolBibliographyText has been filled!
  • pipeline_todos - TODO string in test_full.config: Specify the paths to your full test data ( on nf-core/test-datasets or directly in repositories, e.g. SRA)
  • pipeline_todos - TODO string in test_full.config: Give any required params for the test so that command line flags are not needed
  • pipeline_todos - TODO string in test.config: Specify the paths to your test data on nf-core/test-datasets
  • pipeline_todos - TODO string in test.config: Give any required params for the test so that command line flags are not needed
  • pipeline_todos - TODO string in base.config: Check the defaults for all processes
  • pipeline_todos - TODO string in base.config: Customise requirements for specific processes.
  • pipeline_todos - TODO string in methods_description_template.yml: #Update the HTML below to your preferred methods description, e.g. add publication citation for this pipeline
  • pipeline_todos - TODO string in awsfulltest.yml: You can customise AWS full pipeline tests as required

❔ Tests ignored:

  • files_unchanged - File ignored due to lint config: .github/CONTRIBUTING.md

✅ Tests passed:

Run details

  • nf-core/tools version 3.0.2
  • Run at 2024-11-18 11:13:47

@kedhammar
Copy link

Duplicate of #57 ?

@adamrtalbot
Copy link
Author

It is! Sorry I wasn't aware of that one although I thought I checked 🤔 .

This one has some additional features to handle different use cases for FQ lint such as continuing without the failures.

@erkutilaslan
Copy link

Got scooped! :D
Adam, as you mentioned your PR seems to be a better implementation of fq lint into seqinspector. Should I close PR #57 ?

@kedhammar
Copy link

Got scooped! :D Adam, as you mentioned your PR seems to be a better implementation of fq lint into seqinspector. Should I close PR #57 ?

If you'd consider closing it and reviewing this one we can maybe get the best of both worlds 🌻

@erkutilaslan erkutilaslan mentioned this pull request Nov 12, 2024
11 tasks
Copy link

@erkutilaslan erkutilaslan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there is a small typo in README.md line 34.
Rest seems good to me.

README.md Outdated Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants