High-throughput analysis pipeline for RNA modifications
Post-transcriptional mRNA modifications play substantial roles in regulating biological processes in plants. This workflow classifies, characterizes, and compares the RNA modifications that the HAMR workflow identifies from RNA-sequencing data.
Each raw dataset is processed with the same parameters to characterize the distribution patterns and other genomic features of the different modification types. Five general analyses are performed for each dataset, as follows:
- Genomic annotation of modifications based on gene annotation from respective species genomes.
- Calculation of the number of modifications and the proportion of modified reads over mapped reads at both the gene and single-locus level
- Comparisons of modification numbers from syntenic genes
- Identification of enriched motif(s) for each type of modification
- Comparison of gene density, modification numbers, and frequency of adjacent transposable elements within the same dataset
Input files required for each dataset:
- Reference genome sequences (FASTA)
- Reference genome gene annotation (gff/gff3/gtf)
- Positions of known modifications identified by HAMR (BED)
- Mapped read counts and modified read counts for known modifications identified by HAMR (txt)
- Syntenic gene list
- Transposable elements annotation (gff)
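The annotation and read-proportion steps listed above can be sketched in Python as follows. This is a minimal illustration, not the pipeline's own code. It assumes GFF3-style attributes (`ID=...`), a BED file whose fourth column holds the modification type, and a read-count table already loaded as a dictionary keyed by `(chrom, start)`. The function names and the gene-level definition (pooled modified reads over pooled mapped reads) are assumptions made for the example; the actual HAMR output columns may differ.

```python
from collections import defaultdict


def parse_gff_genes(gff_lines):
    """Collect gene intervals per chromosome from GFF3 lines.

    GFF3 coordinates are 1-based and inclusive; they are converted to
    0-based, half-open intervals so they can be compared with BED.
    """
    genes = defaultdict(list)
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != "gene":
            continue
        # Assumes GFF3 "key=value;key=value" attributes (not GTF syntax).
        attrs = dict(kv.split("=", 1) for kv in cols[8].split(";") if "=" in kv)
        genes[cols[0]].append(
            (int(cols[3]) - 1, int(cols[4]), attrs.get("ID", "unknown"))
        )
    return genes


def annotate_sites(bed_lines, genes):
    """Return (chrom, start, mod_type, gene_id or None) for each BED site."""
    annotated = []
    for line in bed_lines:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        cols = line.rstrip("\n").split("\t")
        chrom, start, end = cols[0], int(cols[1]), int(cols[2])
        mod_type = cols[3] if len(cols) > 3 else "NA"
        # Linear scan; strand is ignored and only the first overlapping gene is kept.
        hits = [gid for s, e, gid in genes.get(chrom, []) if s < end and start < e]
        annotated.append((chrom, start, mod_type, hits[0] if hits else None))
    return annotated


def modified_fractions(annotated, counts):
    """Compute modified/mapped read proportions per locus and per gene.

    counts maps (chrom, start) -> (mapped_reads, modified_reads).
    """
    per_locus = {}
    gene_totals = defaultdict(lambda: [0, 0])
    for chrom, start, _mod_type, gene_id in annotated:
        mapped, modified = counts[(chrom, start)]
        per_locus[(chrom, start)] = modified / mapped if mapped else 0.0
        if gene_id is not None:
            gene_totals[gene_id][0] += mapped
            gene_totals[gene_id][1] += modified
    per_gene = {g: (mod / mp if mp else 0.0) for g, (mp, mod) in gene_totals.items()}
    return per_locus, per_gene
```

For genome-scale data, an interval tree or a dedicated tool would likely replace the linear scan; the sketch favors readability over speed.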
The general analyses of each dataset are then integrated to address specific biological questions about RNA modifications. The capabilities of the pipeline can be summarized as:
- Compare differentially modified genes (DMGs) from multiple experiments
- Identify enriched gene ontology terms and pathways for selected gene lists
- Compare modifications of syntenic genes across species
The pipeline produces the following outputs:
- List of modified genes
- Statistics of modifications derived from the general analysis
- Comparative genomics results (gene synteny) from inter-species comparisons
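The cross-experiment and cross-species comparisons can be sketched in Python as below. This is a hedged illustration only. It assumes each experiment has already been reduced to a set of modified gene IDs, and that syntenic genes are supplied as `(gene_in_species_A, gene_in_species_B)` pairs alongside per-gene modification counts. The function names are hypothetical, and the set comparison reports membership only; it is not a statistical test for differential modification.

```python
def compare_gene_sets(experiments):
    """Split modified genes into those shared by all experiments and
    those unique to a single experiment.

    experiments maps experiment name -> iterable of modified gene IDs.
    """
    sets = {name: set(ids) for name, ids in experiments.items()}
    shared = set.intersection(*sets.values()) if sets else set()
    unique = {}
    for name, ids in sets.items():
        others = set().union(*(s for other, s in sets.items() if other != name))
        unique[name] = ids - others
    return shared, unique


def compare_syntenic(pairs, counts_a, counts_b):
    """Tabulate modification counts for each syntenic gene pair.

    Returns rows of (gene_a, gene_b, n_a, n_b, n_a - n_b); genes without
    any recorded modification are counted as 0.
    """
    rows = []
    for gene_a, gene_b in pairs:
        n_a = counts_a.get(gene_a, 0)
        n_b = counts_b.get(gene_b, 0)
        rows.append((gene_a, gene_b, n_a, n_b, n_a - n_b))
    return rows
```

The resulting gene sets could then be passed to an enrichment tool of choice, and the syntenic table gives a per-pair view of how modification numbers differ between species.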