Genome Assembly

The link to my team's video presentation is here

The modern DNA revolution was brought about in 2001 by the completion of the human reference genome, which consisted of piecing together small fragments of DNA sequence into 24 full chromosomes and cost about $1 billion. Since 2001, computer scientists have made considerable headway on new algorithms for assembling genomes, and we will need to continue to improve our methods as we strive to complete reference genomes for all extant and extinct (i.e., Neanderthals) organisms.

In this project, my team designed and programmed a computing algorithm (using Python3) to sequence the target genome and produce an assembly via contigs and scaffolds based on paired-read data of DNA sequence. We also illustrated the algorithm and next steps for further improvement in a video presentation.

Genome sequencing is the process of decoding the sequence of nucleotides from some sample’s genome. Genome sequencing produces sequences or reads. Due to technical limitations, these reads are short. The most widely used sequencing platform produces reads that are 200 nucleotides long. Newer technologies are producing longer reads, but they are currently expensive and error-prone.

Genome assembly is the process of taking a large number of sequences and putting them back together to create a representation of the original genome.

Name		Name	Last commit message	Last commit date
Latest commit History 10 Commits
1.png		1.png
2.png		2.png
3.png		3.png
4.png		4.png
5.png		5.png
GenomeAssemblyCodes.ipynb		GenomeAssemblyCodes.ipynb
README.md		README.md
team.txt		team.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Genome Assembly

About

Releases

Packages

Languages

Tigerybr/Genome

Folders and files

Latest commit

History

Repository files navigation

Genome Assembly

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages