This repo includes a number of scripts used to perform a mixed model regression GWAS, with alternative model masking, as published in Ravenhall et al. 2018 https://doi.org/10.1371/journal.pgen.1007172. It is primarily meant as a personal archive of those scripts for later use.
- Core pipeline for performing EMMAX mixed model regression with allele masking.
- Input:
  - CHROMOSOME - chromosome, first command-line argument
  - SUBTYPE - subtype, second command-line argument
  - EMMAXdir - location of emmax
  - TPED - TPED file in 12 format
  - PHENO - phenotype file: <CASE/CONTROL>, no header
  - COVARS - covariates file: <CASE/CONTROL>, no header
  - OUTDIR - output directory
  - OUTPREFIX - output prefix
  - KINF - kinship matrix
  - SCRIPTDIR - directory with supporting scripts
- Manhattan plot based on EMMAX output
- Supporting script for EMMAX_Pipeline.sh
- Plot SNP intensity for a given SNP, useful for validating classification for key candidates.
- Input:
  - out.XY.FORMAT: Rows as SNPs, columns as 'CHR:Chromosome, POS:Basepair, ID_0:Sample0, ..., ID_N:SampleN', calls as 'intensityX,intensityY'.
  - out.GT.FORMAT: Rows as SNPs, columns as 'CHR:Chromosome, POS:Basepair, ID_0:Sample0, ..., ID_N:SampleN', calls as ./., 0/1, 1/1 etc.
  - PLACEHOLDER.sample: Case/control sample file, requires 'scanID' (IDs) and 'caseorcontrol' ('CASE'/'CONTROL') columns.
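To illustrate how the two per-SNP inputs line up, here is a minimal Python sketch (not the repo's own plotting code) that pairs each sample's 'intensityX,intensityY' value with its genotype call for one SNP. The function name `pair_calls`, the assumption that rows are already split into fields, and the choice to skip unparseable cells are all hypothetical.

```python
def pair_calls(xy_fields, gt_fields, n_meta=2):
    """Group (x, y) intensities by genotype call for a single SNP.

    xy_fields / gt_fields: the already-split columns of the matching rows
    from the XY and GT files. The first n_meta columns (CHR, POS) are skipped.
    """
    groups = {}
    for xy, gt in zip(xy_fields[n_meta:], gt_fields[n_meta:]):
        try:
            x_str, y_str = xy.split(",")
            point = (float(x_str), float(y_str))
        except ValueError:
            # Assumption: cells without two numeric intensities are ignored.
            continue
        groups.setdefault(gt, []).append(point)
    return groups


if __name__ == "__main__":
    xy_row = ["1", "12345", "0.9,0.1", "0.5,0.5", "0.1,0.8"]
    gt_row = ["1", "12345", "0/0", "0/1", "1/1"]
    for call, points in pair_calls(xy_row, gt_row).items():
        print(call, points)
```

Each group can then be drawn in its own colour on an X versus Y scatter plot, which makes misclassified samples easy to spot.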
- Scrapbook of functions for calculating genomic inflation factor (lambda).
- Requires a .ps file as input.
- Functionality is present within mergePlotEMMAX.r
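For reference, the genomic inflation factor is commonly computed as the median of the 1-df chi-square statistics implied by the p-values, divided by the expected median under the null (about 0.455). The sketch below shows that calculation in Python using only the standard library; it is not the repo's R code, the function name `genomic_inflation` is hypothetical, and reading p-values out of the .ps file is left out because its column layout is not described here.

```python
from statistics import NormalDist, median


def genomic_inflation(p_values):
    """Return lambda: median observed chi-square (1 df) over its null median."""
    std_normal = NormalDist()
    # For a two-sided p-value p, the 1-df chi-square statistic is z**2,
    # where z is the standard normal quantile at p / 2.
    chi2 = [std_normal.inv_cdf(p / 2.0) ** 2 for p in p_values if 0.0 < p <= 1.0]
    if not chi2:
        raise ValueError("no usable p-values in (0, 1]")
    # Median of a 1-df chi-square distribution, about 0.455.
    expected_median = std_normal.inv_cdf(0.25) ** 2
    return median(chi2) / expected_median


if __name__ == "__main__":
    # Evenly spaced p-values mimic a well-calibrated (null) test.
    null_like = [(i + 0.5) / 1001 for i in range(1001)]
    print(round(genomic_inflation(null_like), 3))
```

Values close to 1 suggest little inflation; values well above 1 point to residual population structure or other confounding.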
- Create a refined subset of the most significant SNPs from a combined EMMAX pipeline output.
- Run impute2 in parallel
- Input:
  - impute2dir - location of impute2
  - refDir - reference file directory, containing genetic map, .hap and .legend files
  - refName - prefix of reference files
  - mapNamePre - pre-chromosome component of genetic map file name
  - mapNamePost - post-chromosome component of genetic map file name
  - toImpFile - .gen file for imputation
  - outprefix - output prefix
  - psLimit - number of sub-processes to spawn
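The role of psLimit can be sketched as a bounded worker pool: every job is launched as its own sub-process, but no more than psLimit run at once. The Python example below is an illustration under that assumption, not the repo's script. The helper `run_limited` is hypothetical, and the commands are placeholders rather than real impute2 invocations.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_limited(commands, ps_limit):
    """Run each command as a sub-process, at most ps_limit at a time.

    commands: list of argument lists, e.g. one per genomic interval.
    Returns the exit codes in the same order as the input commands.
    """
    def run_one(cmd):
        return subprocess.run(cmd, check=False).returncode

    # Each worker thread blocks on one sub-process, so max_workers caps
    # the number of concurrently running sub-processes.
    with ThreadPoolExecutor(max_workers=ps_limit) as pool:
        return list(pool.map(run_one, commands))


if __name__ == "__main__":
    # Placeholder jobs; a real run would build one impute2 command per chunk.
    jobs = [[sys.executable, "-c", "print('chunk %d done')" % i] for i in range(4)]
    print(run_limited(jobs, ps_limit=2))
```

Checking the returned exit codes afterwards is a simple way to spot chunks that need to be rerun.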