Migrate to Universc #15

Closed
4 changes: 4 additions & 0 deletions CHANGELOG.md
@@ -3,6 +3,10 @@
The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/)
and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.html).

+## v1.1alpha
+
+Migrates the counting step from Cell Ranger to UniverSC, which supports additional UMI-based single-cell technologies.

## v1.0dev - [date]

Initial release of nf-core/demultiplex, created with the [nf-core](http://nf-co.re/) template.
3 changes: 2 additions & 1 deletion README.md
@@ -30,7 +30,8 @@ The pipeline is built using [Nextflow](https://www.nextflow.io), a workflow tool
4. Single cell 10X sample processes (CONDITIONAL):
NOTE: Must create CONFIG to point to CellRanger genome References
1. Cell Ranger mkfastq runs only when 10X samples exist. This will run the process with [`CellRanger`](https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/what-is-cell-ranger), [`CellRanger ATAC`](https://support.10xgenomics.com/single-cell-atac/software/pipelines/latest/what-is-cell-ranger-atac), and [`Cell Ranger DNA`](https://support.10xgenomics.com/single-cell-dna/software/pipelines/latest/what-is-cell-ranger-dna) depending on which sample sheet has been created.
-2. Cell Ranger Count runs only when 10X samples exist. This will run the process with [`Cell Ranger Count`](https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/using/count), [`Cell Ranger ATAC Count`](https://support.10xgenomics.com/single-cell-atac/software/pipelines/latest/using/count), and [`Cell Ranger DNA CNV`](https://support.10xgenomics.com/single-cell-dna/software/pipelines/latest/using/cnv)depending on the output from Cell Ranger mkfastq. 10X reference genomes can be downloaded from the 10X site, a new config would have to be created to point to the location of these. Must add config to point Cell Ranger to genome references if used outside the Crick profile.
+2a. Cell Ranger Count runs only when 10X samples exist. This will run the process with [`Cell Ranger Count`](https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/using/count), [`Cell Ranger ATAC Count`](https://support.10xgenomics.com/single-cell-atac/software/pipelines/latest/using/count), or [`Cell Ranger DNA CNV`](https://support.10xgenomics.com/single-cell-dna/software/pipelines/latest/using/cnv), depending on the output from Cell Ranger mkfastq. 10X reference genomes can be downloaded from the 10X site; a new config must then be created to point to their location. A config pointing Cell Ranger to the genome references is required when running outside the Crick profile.
+2b. [UniverSC](https://github.com/minoda-lab/universc) runs for all other supported single-cell technologies (e.g., DropSeq, ICELL8, SmartSeq3, SureCell) when samples of these types are given.
5. [`bcl2fastq`](http://emea.support.illumina.com/sequencing/sequencing_software/bcl2fastq-conversion-software.html) (CONDITIONAL):
1. Runs on either the original sample sheet that had no error prone samples or on the newly created sample sheet created from the extra steps.
2. This is only run when there are samples left on the sample sheet after removing the single cell samples.
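
The conditional steps above amount to a routing decision over the sample sheet: single-cell samples go to the single-cell processes, and `bcl2fastq` runs only if anything is left. A minimal Python sketch of that logic follows. The `technology` field, the label set, and the step names are hypothetical and are not taken from the pipeline's actual sample sheet format.

```python
from typing import Dict, List, Tuple

Sample = Dict[str, str]

# Hypothetical technology labels; the real sample sheet values are not shown in this PR.
SINGLE_CELL_TECHNOLOGIES = {"10x", "dropseq", "icell8", "smartseq3", "surecell"}


def technology_of(sample: Sample) -> str:
    """Return the lower-cased technology label, or '' when the field is absent."""
    return sample.get("technology", "").lower()


def partition_samples(samples: List[Sample]) -> Tuple[List[Sample], List[Sample]]:
    """Split sample rows into single-cell samples and everything else."""
    single_cell: List[Sample] = []
    remaining: List[Sample] = []
    for sample in samples:
        if technology_of(sample) in SINGLE_CELL_TECHNOLOGIES:
            single_cell.append(sample)
        else:
            remaining.append(sample)
    return single_cell, remaining


def planned_steps(samples: List[Sample]) -> List[str]:
    """List the conditional steps that would run for these samples."""
    single_cell, remaining = partition_samples(samples)
    steps = []
    if any(technology_of(s) == "10x" for s in single_cell):
        steps.append("cellranger mkfastq")   # only when 10X samples exist
    if single_cell:
        steps.append("single-cell counting")  # Cell Ranger Count or UniverSC
    if remaining:
        steps.append("bcl2fastq")             # only when non-single-cell samples remain
    return steps
```

For a sheet with one 10X sample, one DropSeq sample, and one bulk sample, `planned_steps` would list all three steps; a sheet with only bulk samples would list `bcl2fastq` alone.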
1 change: 1 addition & 0 deletions bin/scrape_software_versions.py
@@ -12,6 +12,7 @@
'MultiQC': ['v_multiqc.txt', r"multiqc, version (\S+)"],
'bcl2fastq': ['v_bcl2fastq.txt', r"bcl2fastq v(\S+)"],
'CellRanger': ['v_cellranger.txt', r"cellranger mkfastq (\S+)"],
+'UniverSC': ['v_universc.txt', r"UniverSC v(\S+)"],
#'CellRangerATAC': ['v_cellrangeratac.txt', r"CellRangerATAC, version (\S+)"],
#'CellRangerDNA': ['v_cellrangerdna.txt', r"CellRangerDNA, version (\S+)"],
}
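
Each entry in the dictionary above pairs a version file with a regular expression whose first capture group is the version string. The sketch below shows how such a pattern behaves on sample text. The helper name `extract_version`, the `"N/A"` fallback, and the sample strings are assumptions for illustration; the real contents of `v_universc.txt` are not shown in this PR.

```python
import re

# The two (file name, pattern) pairs relevant to this change, copied from the dictionary above.
regexes = {
    "CellRanger": ["v_cellranger.txt", r"cellranger mkfastq (\S+)"],
    "UniverSC": ["v_universc.txt", r"UniverSC v(\S+)"],
}


def extract_version(pattern: str, text: str) -> str:
    """Return the first captured group, or 'N/A' (assumed fallback) when nothing matches."""
    match = re.search(pattern, text)
    return match.group(1) if match else "N/A"


universc_pattern = regexes["UniverSC"][1]

# Hypothetical file contents: the pattern needs the literal "UniverSC v" prefix.
print(extract_version(universc_pattern, "UniverSC v1.0.3"))
# A file holding only a bare version number would not match this pattern.
print(extract_version(universc_pattern, "1.0.3"))
```

The second call illustrates why the text written to the version file and the pattern have to agree on the prefix.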
6 changes: 3 additions & 3 deletions conf/base.config
@@ -54,11 +54,11 @@ process {
withName: bcl2fastq_problem_SS {
container = 'nfcore/demultiplex:bcl2fastq-2.20.0'
}
-withName: cellRangerCount {
-container = 'nfcore/demultiplex:cellranger-3.0.2.9001'
+withName: UniverSC {
+container = 'nfcore/demultiplex:universc-1.0.3'
}
withName: cellRangerMkFastQ {
-container = 'nfcore/demultiplex:cellranger-3.0.2.9001'
+container = 'nfcore/demultiplex:universc-1.0.3'
}

}
1 change: 0 additions & 1 deletion docker/cellranger/Dockerfile

This file was deleted.

1 change: 1 addition & 0 deletions docker/universc/Dockerfile
@@ -0,0 +1 @@
+FROM tomkellygenetics/universc:1.0.3
9 changes: 7 additions & 2 deletions docs/output.md
@@ -18,8 +18,9 @@ and processes data using the following steps:
* Recheck newly made sample sheet for any errors or problem samples that did not match any indexes in the Stats.json file. If there is still an issue the pipeline will exit at this stage.
* [bcl2fastq](#bcl2fastq) - converting bcl files to fastq, and demultiplexing (CONDITIONAL)
* Processes that only run if there are 10X samples on the sample sheet input (CONDITIONAL):
-* [CellRanger](#cellranger) - demultiplexes raw base call (BCL) files generated by Illumina sequencers into FASTQ files and is a wrapper around Illumina's bcl2fastq
-* [CellRangerCount](#cellrangercount) - performs alignment, filtering, barcode counting, and UMI counting
+* [CellRanger](#cellranger) - demultiplexes raw base call (BCL) files generated by Illumina sequencers into FASTQ files; it is a wrapper around Illumina's bcl2fastq that adds support for 10x Genomics sample index sets
+* Processes that only run if there are single-cell samples on the sample sheet input (CONDITIONAL):
+* [UniverSC](#universc) - performs alignment, filtering, barcode counting, and UMI counting
* [FastQC](#fastqc) - read quality control
* [MultiQC](#multiqc) - aggregate report, describing results of the whole pipeline

@@ -54,6 +55,10 @@ and processes data using the following steps:
* `outs/metrics_summary.csv`
* Run summary metrics in CSV format

+## UniverSC
+
+[UniverSC](https://github.com/minoda-lab/universc) is a flexible cross-platform single-cell data processing pipeline that enables demultiplexing data from any UMI-based technology. `launch_universc.sh` automatically renames FASTQ files and converts their format for compatibility with `cellranger count`, which it then calls. Presets, including barcode whitelists, are provided for various technologies. The output contains the same summary information as the 10x output above, but the Loupe browser is not supported.
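
As an illustration of how the launcher is driven, the sketch below assembles an argument list using only the flags that appear in the `launch_universc.sh` call this pull request adds to `main.nf` (`--id`, `--technology`, `--reference`, `--file`). The function name, the default launcher path, and the example values are hypothetical. The sketch only builds and prints the command; it does not run it.

```python
import shlex
from typing import List


def build_universc_command(
    sample_id: str,
    technology: str,
    reference: str,
    fastq_prefix: str,
    launcher: str = "universc/launch_universc.sh",  # path as used in main.nf
) -> List[str]:
    """Assemble the launcher call as an argument list (no shell quoting needed)."""
    return [
        "bash", launcher,
        "--id", sample_id,
        "--technology", technology,
        "--reference", reference,
        "--file", fastq_prefix,
    ]


# Example values are placeholders, not real paths.
command = build_universc_command("sampleA", "10x", "/refs/example_genome", "fastqs/sampleA")
print(" ".join(shlex.quote(part) for part in command))
```

Keeping the command as a list rather than a single string avoids quoting problems if a sample ID or path ever contains spaces.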

## FastQC

[FastQC](http://www.bioinformatics.babraham.ac.uk/projects/fastqc/) gives general quality metrics about your reads. It provides information about the quality score distribution across your reads, the per base sequence content (%T/A/G/C). You get information about adapter contamination and other overrepresented sequences.
5 changes: 3 additions & 2 deletions main.nf
@@ -436,7 +436,7 @@ cr_fqname_fqfile_ch
tuple(sampleID, projectName, refGenome, dataType, fastqDir) }
.set { cr_grouped_fastq_dir_sample_ch }

-process cellRangerCount {
+process UniverSC {
tag "${projectName}/${sampleID}"
publishDir "${params.outdir}/${runName}", mode: 'copy',
saveAs: { filename ->
@@ -459,7 +459,7 @@ process cellRangerCount {
script:
genome_ref_conf_filepath = params.cellranger_genomes.get(refGenome, false)
"""
-cellranger count --id=$sampleID --transcriptome=${genome_ref_conf_filepath.tenx_transcriptomes} --fastqs=$fastqDir --sample=$sampleID
+bash universc/launch_universc.sh --id $sampleID --technology "10x" --reference ${genome_ref_conf_filepath.tenx_transcriptomes} --file ${fastqDir}/${sampleID}
"""
}
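
The script block above looks up the reference with `params.cellranger_genomes.get(refGenome, false)` and then reads `tenx_transcriptomes` from the result, so an unconfigured genome yields `false` rather than an entry. The Python sketch below shows the same lookup with an explicit check that reports the missing genome by name. The dictionary layout and the function name are assumptions made for illustration, not the pipeline's actual config structure.

```python
from typing import Dict


def resolve_transcriptome(genomes: Dict[str, Dict[str, str]], ref_genome: str) -> str:
    """Return the transcriptome path for ref_genome, or raise a descriptive error."""
    entry = genomes.get(ref_genome)
    if entry is None or "tenx_transcriptomes" not in entry:
        raise KeyError(f"No 10x transcriptome configured for genome '{ref_genome}'")
    return entry["tenx_transcriptomes"]


# Placeholder config; real genome names and paths come from the user's profile.
example_genomes = {"example_genome": {"tenx_transcriptomes": "/refs/example_genome"}}
print(resolve_transcriptome(example_genomes, "example_genome"))
```

Failing early with the genome name in the message makes a missing config entry easier to diagnose than an error raised later inside the command string.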

@@ -804,6 +804,7 @@ process get_software_versions {
multiqc --version > v_multiqc.txt
echo \$(bcl2fastq --version 2>&1) > v_bcl2fastq.txt
cellranger mkfastq --version > v_cellranger.txt
+bash universc/launch_universc.sh --version | tail -2 | head -n 1 | cut -d" " -f3 > v_universc.txt
#cellranger-atac --version > v_cellrangeratac.txt
#cellranger-dna --version > v_cellrangerdna.txt
scrape_software_versions.py &> software_versions_mqc.yaml
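
The UniverSC line in the script above selects one field from the launcher's version output: `tail -2 | head -n 1` keeps the second-to-last line, and `cut -d" " -f3` keeps its third space-separated field. The Python sketch below mirrors that selection so the behaviour can be checked on sample text. The sample output is invented for illustration, since the real `--version` output is not shown in this PR.

```python
def pick_version_field(output: str) -> str:
    """Mimic `tail -2 | head -n 1 | cut -d" " -f3` on a block of text."""
    lines = output.splitlines()
    if not lines:
        return ""
    line = lines[-2:][0]        # second-to-last line (or the only line)
    fields = line.split(" ")    # cut splits on every single space
    if len(fields) == 1:
        return line             # cut passes through lines that lack the delimiter
    return fields[2] if len(fields) >= 3 else ""


# Hypothetical version output; the real text may be laid out differently.
sample_output = "banner line\nUniverSC version 1.0.3\nfooter line\n"
print(pick_version_field(sample_output))
```

Because the selection depends on line position and field position, any change in the launcher's output layout would change what ends up in `v_universc.txt`.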