genoFreq is designed to process very large VCF files (terabyte scale) at high speed and with very low memory consumption.
git clone --recursive https://github.com/mr-eyes/genoFreq.git
mkdir -p genoFreq/build && cd genoFreq/build/
cmake ..
make
python -m pip install -r requirements.txt
Run
./genoFreq <file.vcf/file.vcf.gz> <output_dir> <optional: max_haplotype_number, default=6>
Sample output (TSV)
sample | ./. | 0/0 | 0/1 | 0/2 | 0/3 | 0/4 | 0/5 | 0/6 | 1/0 | 1/1 | 1/2 | 1/3 | 1/4 | 1/5 | 1/6 | 2/0 | 2/1 | 2/2 | 2/3 | 2/4 | 2/5 | 2/6 | 3/0 | 3/1 | 3/2 | 3/3 | 3/4 | 3/5 | 3/6 | 4/0 | 4/1 | 4/2 | 4/3 | 4/4 | 4/5 | 4/6 | 5/0 | 5/1 | 5/2 | 5/3 | 5/4 | 5/5 | 5/6 | 6/0 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
SAMPLE1 | 1443 | 6168 | 98 | 3 | 0 | 0 | 0 | 0 | 0 | 18 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
SAMPLE2 | 3361 | 4259 | 58 | 2 | 0 | 0 | 0 | 0 | 0 | 49 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
SAMPLE3 | 2856 | 4740 | 98 | 1 | 0 | 0 | 0 | 0 | 0 | 35 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
Run
python scripts/combine_freqs.py <file1.tsv> <file2.tsv> ...
Or
python scripts/combine_freqs.py ./*tsv
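
This README does not say how `combine_freqs.py` combines its inputs. The sketch below shows one plausible reading, assuming each input file holds counts for the same samples (for example, one file per chromosome) and that combining means summing counts per sample and genotype. The function name `combine` and the toy numbers are hypothetical and are not taken from the script.

```python
from collections import Counter, defaultdict

def combine(tables):
    """Sum genotype counts per sample across {sample: {genotype: count}} tables.

    Assumption: combining means adding counts; the real script may differ.
    """
    totals = defaultdict(Counter)
    for table in tables:
        for sample, counts in table.items():
            totals[sample].update(counts)  # Counter.update adds counts
    return {sample: dict(counts) for sample, counts in totals.items()}

# Toy inputs (made-up numbers), e.g. one table per chromosome
chr1 = {"SAMPLE1": {"0/0": 10, "0/1": 2}}
chr2 = {"SAMPLE1": {"0/0": 5, "1/1": 1}, "SAMPLE2": {"0/0": 7}}
combined = combine([chr1, chr2])
print(combined)
```
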
Run
python scripts/simplify.py <genoFreq.tsv>
Sample output (TSV)
sample | ungenotyped | homo_WT | homo_MT | hetero | comp_het |
---|---|---|---|---|---|
SAMPLE1 | 1443 | 6168 | 18 | 101 | 0 |
SAMPLE2 | 3361 | 4259 | 50 | 60 | 0 |
SAMPLE3 | 2856 | 4740 | 35 | 99 | 0 |
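
The simplified columns appear to follow from the genotype columns as sketched below. The mapping is inferred from the sample tables in this README (for SAMPLE2, `homo_MT` = 49 + 1 and `hetero` = 58 + 2), not from the source of `simplify.py`, so treat it as an assumption. The names `classify` and `simplify_row` are hypothetical.

```python
def classify(genotype):
    """Map a genotype label such as '0/1' to a simplified category (inferred rule)."""
    if genotype == "./.":
        return "ungenotyped"
    a, b = genotype.split("/")
    if a == b:
        # Both alleles identical: reference (0) or a non-reference allele
        return "homo_WT" if a == "0" else "homo_MT"
    if a == "0" or b == "0":
        # One reference allele and one alternate allele
        return "hetero"
    # Two different alternate alleles
    return "comp_het"

def simplify_row(counts):
    """Collapse {genotype: count} into the five simplified categories."""
    out = {"ungenotyped": 0, "homo_WT": 0, "homo_MT": 0, "hetero": 0, "comp_het": 0}
    for genotype, count in counts.items():
        out[classify(genotype)] += count
    return out

# Non-zero counts for SAMPLE2 from the genotype table above
sample2 = {"./.": 3361, "0/0": 4259, "0/1": 58, "0/2": 2, "1/1": 49, "2/2": 1}
print(simplify_row(sample2))
# {'ungenotyped': 3361, 'homo_WT': 4259, 'homo_MT': 50, 'hetero': 60, 'comp_het': 0}
```
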
Sample meta information file (columns are flexible)
SampleID | Breed | Sex | Coverage |
---|---|---|---|
SAMPLE1 | Breed1 | M | 9.62 |
SAMPLE2 | Breed2 | M | 13.69 |
SAMPLE3 | Breed3 | M | 13.69 |
Run
python scripts/merge.py <metaInfo.tsv> <genoFreq.tsv>
Sample output
SampleID | Breed | Sex | Coverage | ungenotyped | homo_WT | homo_MT | hetero | comp_het |
---|---|---|---|---|---|---|---|---|
SAMPLE1 | Breed1 | M | 9.62 | 1443 | 6168 | 18 | 101 | 0 |
SAMPLE2 | Breed2 | M | 13.69 | 3361 | 4259 | 50 | 60 | 0 |
SAMPLE3 | Breed3 | M | 13.69 | 2856 | 4740 | 35 | 99 | 0 |
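
The merged table looks like a join of the meta information and the simplified frequencies on the sample identifier. A minimal sketch of such a join with the standard `csv` module follows. It assumes tab-separated inputs keyed by `SampleID` and `sample`, and it drops samples without a frequency row; how `merge.py` handles that case is not stated here. `merge_rows` is a hypothetical name.

```python
import csv
import io

def merge_rows(meta_rows, freq_rows, meta_key="SampleID", freq_key="sample"):
    """Inner-join metadata rows with frequency rows on the sample identifier."""
    freq_by_sample = {row[freq_key]: row for row in freq_rows}
    merged = []
    for meta in meta_rows:
        freq = freq_by_sample.get(meta[meta_key])
        if freq is None:
            continue  # no frequency row for this sample; skipped in this sketch
        extra = {k: v for k, v in freq.items() if k != freq_key}
        merged.append({**meta, **extra})
    return merged

# First rows of the two sample tables above, as TSV text
meta_tsv = "SampleID\tBreed\tSex\tCoverage\nSAMPLE1\tBreed1\tM\t9.62\n"
freq_tsv = (
    "sample\tungenotyped\thomo_WT\thomo_MT\thetero\tcomp_het\n"
    "SAMPLE1\t1443\t6168\t18\t101\t0\n"
)

meta_rows = list(csv.DictReader(io.StringIO(meta_tsv), delimiter="\t"))
freq_rows = list(csv.DictReader(io.StringIO(freq_tsv), delimiter="\t"))
merged = merge_rows(meta_rows, freq_rows)
print(merged[0])
```

Values stay as strings here; convert `Coverage` and the counts to numbers before plotting if needed.
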
Run
python scripts/plot.py <merged_file.tsv> <x_axis> <y_axis>