Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Find rate of homozygosity

Answered

4 comments

  • Genevieve Brandt (she/her)

    Thank you for your post, Joyce Anon! I want to let you know we have received your question. We'll get back to you if we have any updates or follow-up questions.

    Unfortunately, our team is not able to provide support within one day. Please see our Support Policy for more details about how we prioritize responding to questions. 

     

  • Joyce Anon

    I figured out a way, thank you. About 40% of the 4.4 million variants in the first 22 chromosomes (HaplotypeCaller with default settings) are homozygous. Is this typical? I can't seem to find typical numbers anywhere.

  • Joyce Anon

    After more searching, and a lucky period of better-than-usual brain function, I found my answer: these numbers are typical. The longest run of homozygosity is about 500 kbp (from AutoMap), which is also typical.

    I got the numbers from a slightly modified script from another forum that produces a per-site table, then counted the rows with a regex search. This was the script:

    awk -v OFS="\t" '
      $0 !~ "^#" {                    # skip header lines
        hom_ref = 0; hom_alt = 0; het = 0
        for (i = 10; i <= NF; i++) {  # sample columns start at field 10
          if ($i ~ /0\|0/ || $i ~ /0\/0/) hom_ref++
          else if ($i ~ /1\|1/ || $i ~ /1\/1/) hom_alt++
          else het++                  # everything else, including ./.
        }
        print $1, $2, hom_ref, hom_alt, het
      }' ./output/nebulaTranchFiltered.vcf > homo.tsv

    Thank you, my questions are answered now.

  • Avatar
    Genevieve Brandt (she/her)

    Thank you for updating this post Joyce Anon! I'm sure this will be helpful for other GATK users in the future. 
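
The awk script quoted in the thread writes one row per site and leaves the final tally to a separate search. A minimal sketch of doing the tally in one pass is shown here, under several assumptions that are not stated in the thread: the VCF is uncompressed and tab-delimited, genotypes are diploid, and the "homozygous fraction" means hom-alt calls divided by all non-reference calls. The function name `count_zygosity` and the output labels are hypothetical. Unlike the original script, it inspects only the GT subfield and keeps missing calls (`./.`) in their own bucket rather than counting them as heterozygous.

```shell
# Sketch only: tally genotype classes across all sample columns of a VCF.
# count_zygosity is an illustrative name, not part of GATK or any other tool.
count_zygosity() {
  awk -F'\t' -v OFS='\t' '
    /^#/ { next }                          # skip header lines
    {
      for (i = 10; i <= NF; i++) {         # sample columns start at field 10
        split($i, parts, ":")              # GT is the first ":"-separated subfield
        n = split(parts[1], allele, /[\/|]/)
        if (n != 2 || allele[1] == "." || allele[2] == ".")
          no_call++                        # missing or non-diploid genotype
        else if (allele[1] != allele[2])
          het++                            # e.g. 0/1, 1|0, 1/2
        else if (allele[1] == "0")
          hom_ref++                        # 0/0 or 0|0
        else
          hom_alt++                        # 1/1, 2|2, ...
      }
    }
    END {
      non_ref = hom_alt + het
      frac = non_ref ? hom_alt / non_ref : 0   # assumed definition of the fraction
      print "hom_ref", hom_ref + 0
      print "hom_alt", hom_alt + 0
      print "het", het + 0
      print "no_call", no_call + 0
      printf "hom_alt_fraction%s%.3f\n", OFS, frac
    }' "$1"
}
```

With the file from the thread, the call would be `count_zygosity ./output/nebulaTranchFiltered.vcf`. Because missing genotypes are separated out, the heterozygous count can come out lower than the one the original script reports.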

