nf-core/scscape is a bioinformatics pipeline for multi-sample single-cell analysis downstream of count-matrix generation. The pipeline is built from many functional components of the Seurat R package. Input data are expected as barcodes, features, and matrix files. Output includes Seurat objects that contain QC metrics, identified cell clusters, and dimensionally reduced projections that capture the experiment's gene expression variability.
- Gzip all raw input files for consistency
- Initialize a Seurat object for each sample
- Normalize gene expression counts & perform mitochondrial / cell-cycle scoring
- Detect and remove suspected doublets from each sample
- Merge - normalize - find variable features - scale data (SCTransform)
- Run principal component analysis
- Perform integration to remove technical confounding variables
- Find k nearest-neighbors & cluster (Louvain)
- Dimensionally reduce expression variance and plot
The nf-core/scscape pipeline comes with documentation about the pipeline usage, parameters, and output.
Note: If you are new to Nextflow and nf-core, please refer to this page on how to set up Nextflow. Make sure to test your setup with -profile test before running the workflow on actual data.
First, prepare a sample sheet with your input data that looks as follows:
Samples.csv:
id,data_directory,mt_cc_rm_genes
00_dpa_1,/filtered_feature_bc_matrix/,AuxillaryGeneList.csv
Each row represents one sample's matrix files (barcodes.tsv, features.tsv, matrix.mtx) and the associated gene list used in the analysis.
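Before launching, it can help to confirm that every data_directory in the sample sheet actually holds the three matrix files. The following is a minimal sketch, not part of the pipeline: `check_samples` is a hypothetical helper name, and treating a gzipped copy (`<name>.gz`) as present is an assumption based on the pipeline gzipping raw input files in its first step. Relative paths are resolved against the current directory.

```shell
# check_samples SHEET
# Hypothetical helper (not shipped with nf-core/scscape): reports any
# data_directory that lacks barcodes.tsv, features.tsv, or matrix.mtx.
# Assumption: a gzipped copy (<name>.gz) also counts as present.
check_samples() {
    sheet=$1
    status=0
    {
        read -r header || true    # discard the header row
        while IFS=, read -r id data_dir genes || [ -n "$id" ]; do
            [ -n "$id" ] || continue    # skip blank lines
            for f in barcodes.tsv features.tsv matrix.mtx; do
                if [ ! -e "$data_dir/$f" ] && [ ! -e "$data_dir/$f.gz" ]; then
                    echo "MISSING: $id $data_dir/$f"
                    status=1
                fi
            done
        done
    } < "$sheet"
    return "$status"
}
```

A possible call is `check_samples Samples.csv && echo "all matrix files found"`; the function returns a non-zero status if any file is missing.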
Second, provide the mitochondrial, S phase, G2/M phase, and removal genes in a gene list file:
AuxillaryGeneList.csv:
MTgenes,G2Mgenes,Sgenes,RMgenes
mt-nd1,hmgb2a,mcm5,
mt-nd2,cdk1,pcna,
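Because the columns of the gene list can have different lengths (the RMgenes column above is empty), a quick count per column is a cheap sanity check. This is a rough sketch only; `count_genes` is a hypothetical helper, not a pipeline command.

```shell
# count_genes FILE
# Hypothetical helper: prints the number of non-empty entries in each
# column of a comma-separated gene list with a header row.
count_genes() {
    awk -F, '
        NR == 1 { for (i = 1; i <= NF; i++) name[i] = $i; ncol = NF; next }
        { for (i = 1; i <= ncol; i++) if ($i != "") n[i]++ }
        END { for (i = 1; i <= ncol; i++) print name[i] ": " (n[i] + 0) }
    ' "$1"
}
```

For the two-row example above, `count_genes AuxillaryGeneList.csv` would report 2 genes each for MTgenes, G2Mgenes, and Sgenes, and 0 for RMgenes.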
Finally, construct a segmentation file defining the analysis groups for the experiment (e.g., treatment, replicate, age, sex).
segmentation.csv:
id,00_dpa,04_dpa,all
00_dpa_1,true,false,true
00_dpa_2,true,false,true
04_dpa_1,false,true,true
04_dpa_2,false,true,true
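To see which samples each analysis group would contain, the segmentation file can be summarized per column. The sketch below assumes that `true` marks a sample as a member of the group named in the column header, which is how the example reads but is not confirmed here; `list_groups` is a hypothetical helper name.

```shell
# list_groups FILE
# Hypothetical helper: for every group column (all columns after "id"),
# prints the group name followed by the ids whose value is "true".
list_groups() {
    awk -F, '
        NR == 1 { for (i = 2; i <= NF; i++) group[i] = $i; ncol = NF; next }
        { for (i = 2; i <= ncol; i++) if ($i == "true") members[i] = members[i] " " $1 }
        END { for (i = 2; i <= ncol; i++) print group[i] ":" members[i] }
    ' "$1"
}
```

Run on the example above, `list_groups segmentation.csv` prints one line per group, for instance `00_dpa: 00_dpa_1 00_dpa_2`.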
Make sure the id columns match between segmentation.csv and Samples.csv.
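One way to check that the id columns agree is to compare the first column of both files. This is a minimal sketch using standard tools (`cut`, `sort`, `comm`); `compare_ids` is a hypothetical helper and not part of the pipeline.

```shell
# compare_ids SAMPLES_CSV SEGMENTATION_CSV
# Hypothetical helper: prints ids that appear in only one of the two
# files and returns non-zero if any mismatch is found.
compare_ids() {
    samples_ids=$(mktemp)
    seg_ids=$(mktemp)
    # First column of each file, header row dropped, sorted for comm.
    tail -n +2 "$1" | cut -d, -f1 | sort -u > "$samples_ids"
    tail -n +2 "$2" | cut -d, -f1 | sort -u > "$seg_ids"
    report=$(
        comm -23 "$samples_ids" "$seg_ids" | sed 's/^/only in sample sheet: /'
        comm -13 "$samples_ids" "$seg_ids" | sed 's/^/only in segmentation file: /'
    )
    rm -f "$samples_ids" "$seg_ids"
    if [ -z "$report" ]; then
        return 0
    fi
    printf '%s\n' "$report"
    return 1
}
```

For example, `compare_ids Samples.csv segmentation.csv` prints nothing and returns 0 when every id appears in both files.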
Now, you can run the pipeline using:
nextflow run nf-core/scscape \
-profile docker \
-params-file parameters.json \
-c custom.config
Warning: Please provide pipeline parameters via the CLI or the Nextflow -params-file option. Custom config files, including those provided by the -c Nextflow option, can be used to provide any configuration except for parameters; see docs.
Note: The pipeline can optionally create a .loupe file, which can be used with the 10x Loupe Browser to interactively explore your single cell experiment. To generate the file, 10x requires you to read the 10x End User License Agreement and accept its terms by setting the eula_agreement parameter to Agree (in addition to setting makeLoupe to true).
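As an illustration, the two Loupe-related parameters could be placed in a params file like the one written below. This is a sketch under assumptions: the file name `loupe_params.json` is hypothetical, and the value types (a JSON boolean for makeLoupe, the string "Agree" for eula_agreement) are inferred from the wording above rather than taken from the parameter documentation. Only set eula_agreement after actually reading and accepting the 10x End User License Agreement.

```shell
# Write a params file that enables .loupe generation.
# Assumptions: file name is illustrative; value types are inferred.
cat > loupe_params.json <<'EOF'
{
  "makeLoupe": true,
  "eula_agreement": "Agree"
}
EOF
```

These keys would normally sit alongside your other pipeline parameters in the file passed via -params-file.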
For more details and further functionality, please refer to the usage documentation and the parameter documentation.
To see the results of an example test run with a full-size dataset, refer to the results tab on the nf-core website pipeline page. For more details about the output files and reports, please refer to the output documentation.
nf-core/scscape was originally written by Ryan Seaman, Riley Grindle, Joel Graber.
We thank the following people for their extensive assistance in the development of this pipeline:
If you would like to contribute to this pipeline, please see the contributing guidelines.
For further information or help, don't hesitate to get in touch on the Slack #scscape channel (you can join with this invite).
An extensive list of references for the tools used by the pipeline can be found in the CITATIONS.md file.
You can cite the nf-core publication as follows:
The nf-core framework for community-curated bioinformatics pipelines.
Philip Ewels, Alexander Peltzer, Sven Fillinger, Harshil Patel, Johannes Alneberg, Andreas Wilm, Maxime Ulysse Garcia, Paolo Di Tommaso & Sven Nahnsen.
Nat Biotechnol. 2020 Feb 13. doi: 10.1038/s41587-020-0439-x.