Next-Generation Sequencing (NGS), also known as high-throughput sequencing, makes it possible to comprehensively characterize genomic and epigenomic landscapes, and so to address fundamental questions in biological and clinical research. In this context, our research has focused on discovering how heterogeneous DNA regions jointly determine particular biological processes or phenotypes.
Di4 (1D intervals incremental inverted index) is a multi-resolution, single-dimension data structure for interval-based data queries. It implements the operations characteristic of region data: identifying co-occurrences of regions from different biological tests and/or of distinct semantic types, possibly within a certain distance of each other and/or of DNA regions with known structural or functional properties.
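To make the idea of an inverted index over 1D intervals more concrete, here is a toy sketch in Python. It is not Di4's implementation or API: the function names `build_index` and `co_occurrences`, the sample names, and the half-open `[start, end)` coordinate convention are all assumptions made for illustration. The sketch maps every interval endpoint to the set of intervals covering it, then scans the endpoints in order to report regions where at least a given number of samples co-occur.

```python
def build_index(samples):
    """Map each interval endpoint to the intervals that cover it.

    `samples` is {sample_name: [(start, end), ...]}, using half-open
    intervals [start, end) with start < end (an assumption of this sketch).
    """
    endpoints = sorted({p for ivs in samples.values() for s, e in ivs for p in (s, e)})
    index = {p: set() for p in endpoints}
    # Naive quadratic fill; a real index would be built incrementally.
    for name, ivs in samples.items():
        for s, e in ivs:
            for p in endpoints:
                if s <= p < e:
                    index[p].add((name, s, e))
    return index


def co_occurrences(index, min_samples):
    """Return maximal regions [start, end) covered by >= min_samples distinct samples."""
    regions = []
    run_start = None
    for p in sorted(index):
        n = len({name for name, _, _ in index[p]})
        if n >= min_samples and run_start is None:
            run_start = p                      # a qualifying region opens here
        elif n < min_samples and run_start is not None:
            regions.append((run_start, p))     # coverage dropped: close the region
            run_start = None
    return regions


# Hypothetical region data from three experiments on one chromosome.
samples = {
    "chipseq_a": [(10, 50), (80, 120)],
    "chipseq_b": [(30, 90)],
    "dnase": [(40, 100)],
}
index = build_index(samples)
print(co_occurrences(index, 3))  # [(40, 50), (80, 90)]
print(co_occurrences(index, 2))  # [(30, 100)]
```

Because coverage only changes at interval endpoints, scanning the sorted endpoints is enough to find every co-occurrence region. In this toy version, a distance tolerance could be approximated by widening each interval before indexing.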
Read documentation on:
- how to install Di4;
- how to index data using Di4;
- how to perform high-level queries/operations on the indexed data.
Related publications:
- Jalili, V., Matteucci, M., Goecks, J., Deldjoo, Y., & Ceri, S. (2018). Next generation indexing for genomic intervals. IEEE Transactions on Knowledge and Data Engineering.
- Jalili, V., Matteucci, M., Masseroli, M., & Ceri, S. (2017). Indexing next-generation sequencing data. Information Sciences, 384, 90-109.