The goal of ezASCAT is to simplify running ASCAT on tumor-normal pairs from WGS.
ascatNgs already exists, but it requires installing Perl and C modules. ezASCAT avoids these requirements by running entirely within R, with the C code bundled in the package.
remotes::install_github(repo = "CompEpigen/ezASCAT")
The command below generates two TSV files, tumor_nucleotide_counts.tsv and normal_nucleotide_counts.tsv, that can be used for downstream analysis. Note that the function processes ~900K SNPs from the Affymetrix Genome-Wide Human SNP 6.0 Array. The process can be sped up by increasing nthreads, which launches each chromosome on a separate thread.
Currently, hg19 and hg38 are supported.
library("ezASCAT")
#Matched normal BAM files are strongly recommended
counts = ezASCAT::get_counts(t_bam = "tumor.bam", n_bam = "normal.bam", build = "hg19")
The command below filters SNPs with low coverage (default <30), estimates BAF and logR, and generates the input files for ASCAT.
In addition, it runs ASCAT::ascat.loadData() and ASCAT::ascat.plotRawData() for you and returns the ASCAT object, which can be further processed with ASCAT functions.
ascat.bc = prep_ascat(t_counts = "tumor_nucleotide_counts.tsv", n_counts = "normal_nucleotide_counts.tsv", sample_name = "tumor")
# Markers: 901235
# Removed 3072 duplicated loci
# Markers > 30: 25246
# ------
# Counts file: normal_nucleotide_counts.tsv
# Markers: 901235
# Removed 3072 duplicated loci
# Markers > 30: 31387
# ------
# Final number SNPs: 23765
# Generated following files:
# tumor.tumour.BAF.txt
# tumor.tumour.logR.txt
# tumor.normal.BAF.txt
# tumor.normal.logR.txt
# ------
# Running ASCAT::ascat.loadData:
# [1] Reading Tumor LogR data...
# [1] Reading Tumor BAF data...
# [1] Reading Germline LogR data...
# [1] Reading Germline BAF data...
# [1] Registering SNP locations...
# [1] Splitting genome in distinct chunks...
# Running ASCAT::ascat.plotRawData:
# [1] Plotting tumor data
# [1] Plotting germline data
# Returned ASCAT object
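To make the coverage filter and the BAF/logR estimation more concrete, here is a minimal base R sketch on toy data. It is not ezASCAT's implementation: the column names, the use of summed depth as a library size, and the median centering are assumptions made for illustration only.

```r
# Toy allele counts at five SNP loci.
# Column names are hypothetical and do not reflect ezASCAT's file format.
counts <- data.frame(
  chr   = c("1", "1", "2", "2", "3"),
  pos   = c(1000, 2000, 1500, 2500, 3000),
  t_ref = c(30, 10, 45, 5, 40),
  t_alt = c(30, 50, 15, 3, 40),
  n_ref = c(25, 30, 28, 6, 35),
  n_alt = c(25, 30, 32, 4, 35)
)

min_depth <- 30  # mirrors the default coverage cutoff mentioned above

# Total depth per locus in tumor and normal
counts$t_depth <- counts$t_ref + counts$t_alt
counts$n_depth <- counts$n_ref + counts$n_alt

# Keep loci covered above the cutoff in both samples
kept <- counts[counts$t_depth > min_depth & counts$n_depth > min_depth, ]

# BAF: fraction of reads supporting the alternate allele
kept$t_baf <- kept$t_alt / kept$t_depth
kept$n_baf <- kept$n_alt / kept$n_depth

# logR: log2 ratio of library-size-normalized depths, centered on the median
# (assumption: library size approximated by summed depth over kept loci)
ratio <- (kept$t_depth / sum(kept$t_depth)) / (kept$n_depth / sum(kept$n_depth))
kept$t_logR <- log2(ratio) - median(log2(ratio))

print(kept[, c("chr", "pos", "t_baf", "n_baf", "t_logR")])
```

The fourth locus is dropped because its depth falls below the cutoff in both samples; the remaining loci carry a tumor BAF, a normal BAF, and a median-centered tumor logR.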
The returned ASCAT object can be passed to downstream ASCAT functions:
ascat.bc = ASCAT::ascat.aspcf(ascat.bc)
ASCAT::ascat.plotSegmentedData(ascat.bc)
ascat.output = ASCAT::ascat.runAscat(ascat.bc)
When no matched normal is available, prep_ascat_t() prepares the ASCAT input from the tumor counts file alone:
> ascat.bc = ezASCAT::prep_ascat_t(t_counts = "tumor_nucleotide_counts.tsv", sample_name = "tumoronly")
# Library sizes:
# Tumor: 1239964831
# Counts file: tumor_nucleotide_counts.tsv
# Markers: 930104
# Removed 15 duplicated loci
# Markers > 30: 829579
# ------
# Median depth of coverage: 59
# Generated following files:
# tumoronly.tumour.BAF.txt
# tumoronly.tumour.logR.txt
# ------
# Running ASCAT::ascat.loadData:
# [1] Reading Tumor LogR data...
# [1] Reading Tumor BAF data...
# [1] Registering SNP locations...
# [1] Splitting genome in distinct chunks...
# Running ASCAT::ascat.plotRawData()
# [1] Plotting tumor data
# Returned ASCAT object!
The returned ASCAT object can be processed following the ASCAT protocol for samples without matched normal data:
ascat.gg = ASCAT::ascat.predictGermlineGenotypes(ascat.bc)
ascat.bc = ASCAT::ascat.aspcf(ascat.bc, ascat.gg = ascat.gg)
ASCAT::ascat.plotSegmentedData(ascat.bc)
ascat.output = ASCAT::ascat.runAscat(ascat.bc)
Alternatively, tumor logR files generated by prep_ascat() can be processed with the segment_logR() function, which performs circular binary segmentation using DNAcopy and plots the results:
> ezASCAT::segment_logR(tumor_logR = "tumor.tumour.logR.txt", sample_name = "tumor")
# Analyzing: tumor
# current chromosome: 1
# current chromosome: 2
# current chromosome: 3
# current chromosome: 4
# current chromosome: 5
# current chromosome: 6
# current chromosome: 7
# current chromosome: 8
# current chromosome: 9
# current chromosome: 10
# current chromosome: 11
# current chromosome: 12
# current chromosome: 13
# current chromosome: 14
# current chromosome: 15
# current chromosome: 16
# current chromosome: 17
# current chromosome: 18
# current chromosome: 19
# current chromosome: 20
# current chromosome: 21
# current chromosome: 22
# current chromosome: MT
# current chromosome: X
# current chromosome: Y
# Segments are written to: tumor_cbs.seg
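For readers unfamiliar with circular binary segmentation, the sketch below runs a typical DNAcopy workflow on simulated logR values. It illustrates the technique only and is not the code inside segment_logR(); the simulated data, the smoothing step, and the sample ID "toy" are assumptions. The segmentation step runs only if the Bioconductor package DNAcopy is installed.

```r
set.seed(1)

# Simulated logR for one chromosome: 100 copy-neutral markers
# followed by 100 markers in a gained region
logr  <- c(rnorm(100, mean = 0, sd = 0.1), rnorm(100, mean = 0.5, sd = 0.1))
pos   <- seq(1e6, by = 1e4, length.out = 200)
chrom <- rep("1", 200)

segs <- NULL
if (requireNamespace("DNAcopy", quietly = TRUE)) {
  # Build a copy number array object from the logR values
  cna <- DNAcopy::CNA(genomdat = logr, chrom = chrom, maploc = pos,
                      data.type = "logratio", sampleid = "toy")
  # Smooth single-point outliers, then segment with CBS
  cna <- DNAcopy::smooth.CNA(cna)
  fit <- DNAcopy::segment(cna, verbose = 0)
  # One row per segment: ID, chrom, loc.start, loc.end, num.mark, seg.mean
  segs <- fit$output
  print(segs)
}
```

Each row of the resulting table describes one segment with its start, end, number of markers, and mean logR, which is the kind of information a .seg file carries.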