Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

low sequencing read depth on Y chromosome (ploidy=1)

Answered

3 comments

  • Genevieve Brandt (she/her)

    Hi Karen Schwander,

    I'm not exactly sure why you are seeing this difference. Could you provide some examples so I can better understand what you are seeing?

    My first thought is that the Phred-scaled likelihoods could differ between the two ploidies, so there would be a discrepancy between the informative and uninformative reads. Take a look at this article and let me know what you think: https://gatk.broadinstitute.org/hc/en-us/articles/360035532252-Allele-Depth-AD-is-lower-than-expected

    Best,

    Genevieve

  • Karen Schwander

    Hi Genevieve - I'm sorry for the delay in responding. A discrepancy between informative and uninformative reads does not seem to explain what we see here. In fact, I am seeing the same issue with chromosome X.

    In variants where the reference allele is called, the AD values differ between the haploid and diploid calls. Why should this be the case? Reads are mapped to X BEFORE we call the variants, so the depth should be the same.

    In variants where the alternate allele is called, we do NOT see this same issue.

    In addition, the AD column for haploid calls seems to have "groups" of consecutive variants that all have identical depth. For example, in these variants, we see a "group" with AD of 14,0, then 8,0, then 10,0, while the diploid AD values actually differ by variant. The haploid depth seems to be erroneous.

    Please see the attached file as an example, and let me know what you think. Any comments or thoughts are greatly appreciated!

  • Genevieve Brandt (she/her)

    Thanks Karen for the update here! We definitely want to find out why you might be seeing these results.

    First, I want to point you to this article on HaplotypeCaller's algorithm: Local re-assembly and haplotype determination. HaplotypeCaller locally reassembles the reads, so the depths in the output can differ from what you expect.

    We have a troubleshooting article for determining why certain variants are called. Even though it doesn't address your exact question, it would be very helpful if you provided some of the troubleshooting details referenced in that article, for example screenshots of the bamout file in this region and the full VCF lines. It's hard to know exactly what is happening here without all of this information.

    Could you follow up with these details, along with your GATK command? 

    Thank you,

    Genevieve

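For anyone who wants to reproduce the comparison discussed in this thread before posting full VCF lines, here is a minimal Python sketch. It is not a GATK tool, and it rests on several assumptions: both call sets are plain-text, single-sample VCF records with GT and AD in the FORMAT column, and the records below are made-up placeholders (they are not taken from the attached file). The function names `parse_ad`, `compare_ad`, and `ad_runs` are hypothetical helpers. The sketch lists sites where the haploid and diploid AD values disagree and summarizes runs of consecutive sites that share an identical AD.

```python
from itertools import groupby


def parse_ad(vcf_text, sample_index=0):
    """Return {(chrom, pos): (GT, AD tuple)} for one sample in VCF text."""
    calls = {}
    for line in vcf_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.split("\t")
        # Pair the FORMAT keys (column 9) with this sample's values.
        fmt = dict(zip(fields[8].split(":"), fields[9 + sample_index].split(":")))
        ad = fmt.get("AD", ".").split(",")
        if "." in ad:
            continue  # AD missing for this record
        site = (fields[0], int(fields[1]))
        calls[site] = (fmt.get("GT", "."), tuple(int(x) for x in ad))
    return calls


def compare_ad(haploid, diploid):
    """List sites present in both call sets whose AD values differ."""
    rows = []
    for site in sorted(set(haploid) & set(diploid)):
        h_gt, h_ad = haploid[site]
        d_gt, d_ad = diploid[site]
        if h_ad != d_ad:
            rows.append((site, h_gt, h_ad, d_gt, d_ad))
    return rows


def ad_runs(calls):
    """Summarize consecutive sites sharing one AD as (AD, run length)."""
    sites = sorted(calls)
    return [(ad, len(list(group)))
            for ad, group in groupby(sites, key=lambda s: calls[s][1])]


def _record(pos, gt, ad):
    # Placeholder record on chrX; REF/ALT values are arbitrary.
    return "\t".join(["chrX", str(pos), ".", "A", "G", ".", ".", ".",
                      "GT:AD", f"{gt}:{ad}"])


# Made-up records for illustration only.
HAPLOID_VCF = "\n".join([
    _record(100, "0", "14,0"), _record(150, "0", "14,0"),
    _record(200, "0", "8,0"), _record(250, "1", "0,9"),
])
DIPLOID_VCF = "\n".join([
    _record(100, "0/0", "12,0"), _record(150, "0/0", "15,0"),
    _record(200, "0/0", "8,0"), _record(250, "1/1", "0,9"),
])

if __name__ == "__main__":
    hap, dip = parse_ad(HAPLOID_VCF), parse_ad(DIPLOID_VCF)
    for site, h_gt, h_ad, d_gt, d_ad in compare_ad(hap, dip):
        print(site, "haploid", h_gt, h_ad, "diploid", d_gt, d_ad)
    print(ad_runs(hap))
```

A table like this, together with the bamout screenshots and the exact GATK command, would give the details requested above. Sites are sorted by chromosome name as a string, which is adequate when a single chromosome is compared at a time.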
