Mapping to repeats leads to deletions with low allele frequency #76
Comments
Dear Phil, can you tell me whether you have tried the newer version of Sniffles and still get a low frequency in such a region? I tried to improve this recently.
Hi Fritz, I thought you might have come across this issue; it seems quite common in my data. Perhaps these repeat regions are susceptible to indels? I'm using v1.0.11, which I think is the most up-to-date version? Best, Phil
Hi Phil,
Oh, my bad, 1.11 is the newest. Sorry, being jetlagged in Brussels at the moment...
Ah, no worries. Any thoughts on alternative filtering criteria, other than AF, that might help us include some of these?
I will need to think about it. I have been up since yesterday...
Hello,
I'm analysing some Cryptococcus neoformans (a haploid fungus) PacBio genome data. I noticed something strange when I was looking at some deletions which had low allele frequency. When only part of a repeated region was deleted, NGMLR was sometimes not consistent in how it split the read. Here is a clear example.
There is a TTCTTCCCCC motif repeated four times in the reference genome. Most of the reads which map there only support there being one TTCTT part of the motif left (probably CCCCCTTCTTCCCCC), but the reads are mapped to different 'ends' of the 4-fold repeat in the reference genome. This means that the allele frequency is not as high as it should be, because each end of the deletion is only supported by around half the reads.
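To make the ambiguity concrete, here is a minimal Python sketch. The flanking sequences are made up for illustration (only the TTCTTCCCCC motif and the 4-fold repeat come from the example above), and the 30 bp deletion size assumes three of the four copies are lost. It lists every start position at which a deletion of that length produces exactly the same sequence, which is why an aligner can legitimately place the gap at either end of the repeat.

```python
# Sketch under stated assumptions: LEFT and RIGHT are hypothetical flanks,
# not the real Cryptococcus neoformans sequence.
MOTIF = "TTCTTCCCCC"
LEFT = "ACGTACGTAG"    # hypothetical left flank
RIGHT = "GATCGATCGA"   # hypothetical right flank
REF = LEFT + MOTIF * 4 + RIGHT


def apply_deletion(ref, start, length):
    """Return ref with `length` bases removed, beginning at 0-based `start`."""
    return ref[:start] + ref[start + length:]


def equivalent_starts(ref, start, length):
    """List all start positions whose deletion yields the same sequence."""
    target = apply_deletion(ref, start, length)
    return [
        s
        for s in range(len(ref) - length + 1)
        if apply_deletion(ref, s, length) == target
    ]


if __name__ == "__main__":
    # Deleting three motif copies (30 bp) starting at the first copy.
    starts = equivalent_starts(REF, len(LEFT), 3 * len(MOTIF))
    print(starts)
```

With these made-up flanks, every start from the first base of the first copy to the first base of the second copy gives an identical read sequence, so reads split at either "end" describe the same deletion.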
When I looked at the variants Sniffles called, quite a lot of my deletions with low allele frequencies were in repeat regions.
I just wondered if there was a way to place these reads in repeat regions more consistently, as this would lead to more variants passing an allele frequency threshold of 80%.
Best,
Phil Ashton
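One possible workaround, sketched here as a post-processing step rather than anything NGMLR or Sniffles is known to do, is to left-align each deletion call within the repeat and then pool the read support of calls that end up at the same position. The reference flanks, the call tuples, the read counts and the depth below are all hypothetical values chosen for illustration, and the pooling assumes each read supports only one of the calls.

```python
from collections import defaultdict

# Hypothetical reference: made-up flanks around the 4-fold TTCTTCCCCC repeat.
MOTIF = "TTCTTCCCCC"
REF = "ACGTACGTAG" + MOTIF * 4 + "GATCGATCGA"


def left_align(ref, start, length):
    """Shift a deletion left while the resulting sequence stays identical."""
    # Moving the deleted window one base left changes nothing when the base
    # entering the window equals the base leaving it.
    while start > 0 and ref[start - 1] == ref[start + length - 1]:
        start -= 1
    return start


def merge_deletions(ref, calls):
    """Pool read support of calls that left-align to the same deletion.

    `calls` is a list of (start, length, supporting_reads) tuples.
    """
    merged = defaultdict(int)
    for start, length, support in calls:
        merged[(left_align(ref, start, length), length)] += support
    return dict(merged)


if __name__ == "__main__":
    depth = 22                               # hypothetical read depth
    calls = [(10, 30, 10), (20, 30, 11)]     # hypothetical split support
    for (start, length), support in merge_deletions(REF, calls).items():
        print(start, length, support, round(support / depth, 2))
```

In this made-up case the two calls, each supported by roughly half the reads, collapse into one deletion whose pooled support would clear an 80% allele frequency threshold.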