This project provides a robust and flexible Snakemake pipeline for genomic analysis, specifically tailored for analyzing data from chromosome 21 (chr21) in the human genome (hg38). The pipeline automates various steps including data preprocessing, alignment, variant calling, annotation, and reporting.
Key Features:
Automated Workflow: The pipeline automates the entire process from data acquisition to final analysis, reducing manual intervention and ensuring reproducibility.
Modular Design: Each step of the analysis is encapsulated into separate rules, making it easy to understand, modify, and extend the pipeline.
Quality Control: Quality control steps, such as FastQC analysis, are integrated to ensure data integrity and reliability of results.
Variant Calling: The pipeline performs variant calling using BCFtools, providing insights into genetic variations within the chr21 dataset.
Annotation: Variants are annotated using SnpEff, providing functional annotations and predicting the effects of variants on genes.
Customization: Users can easily customize the pipeline by modifying parameters, adding new rules, or integrating additional tools to suit specific analysis requirements.
Reporting: The pipeline generates informative reports, including summary statistics and visualizations, to aid in interpretation and decision-making.