Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

How to choose interval_list?

6 comments

  • Genevieve Brandt (she/her)

    Thank you for your post, Yi Ren! I want to let you know we have received your question and will be moving it to the Community Discussions -> General Discussion topic, as the Germline topic is for reporting bugs and issues with GATK.

    We'll get back to you if we have any updates or follow up questions. Please see our Support Policy for more details about how we prioritize responding to questions. 

  • Sheryl

    Hi Genevieve Brandt,

    Do you have an answer for this, please?

    I can't find any documentation on the difference between wgs_coverage_regions.hg38.interval_list and wgs_evaluation_regions.hg38.interval_list.

    Do you recommend one of them for GenomicsDBImport?

  • Laura Gauthier

    Hi Sheryl,

    The WGS evaluation regions are typically used internally to assess coverage and a variety of other sample quality metrics, and they are chosen to minimize variability between samples. For example, I believe that chrX may be entirely excluded from the evaluation regions so that we don't see systematic biases in mean coverage between males and females (i.e. chrX ploidy 1 versus ploidy 2). The exome version of the evaluation regions probably excludes target padding. For variant calling, both single-sample and joint, you should use wgs_calling_regions.hg38.interval_list.

  • Sheryl

    Thanks, Laura Gauthier.

    Could you confirm which one I should use to get the median coverage over the autosomes for the mitochondrial workflow?

    I assume it's just a case of using the correct interval list with the CollectWgsMetrics tool?

  • Sheryl

    Sorry, I meant using either wgs_coverage_regions.hg38.interval_list or wgs_evaluation_regions.hg38.interval_list.

  • Megan Shand

    I don't think the choice between those two interval lists will make a substantial difference for the mitochondria workflow, but we currently use wgs_coverage_regions.hg38.interval_list with CollectWgsMetrics to get the autosomal median coverage.

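The two recommendations in this thread (the calling regions for GenomicsDBImport, the coverage regions with CollectWgsMetrics for autosomal median coverage) could be wired up roughly as follows. This is only a minimal sketch: the helper function names and every file path except the interval list names are hypothetical, the interval lists are assumed to be in the working directory, and `gatk` is assumed to be on the PATH. The parser assumes the usual Picard-style metrics layout, in which `#` comment lines are followed by a tab-separated header row and one data row containing a MEDIAN_COVERAGE column.

```python
import csv
import io

# Interval list names taken from the thread; their location on disk is an assumption.
CALLING_INTERVALS = "wgs_calling_regions.hg38.interval_list"
COVERAGE_INTERVALS = "wgs_coverage_regions.hg38.interval_list"


def genomicsdb_import_cmd(gvcfs, workspace, intervals=CALLING_INTERVALS):
    """Build a GenomicsDBImport command restricted to the calling regions."""
    cmd = [
        "gatk", "GenomicsDBImport",
        "--genomicsdb-workspace-path", workspace,
        "-L", intervals,
    ]
    for gvcf in gvcfs:
        cmd += ["-V", gvcf]  # one -V per input GVCF
    return cmd


def collect_wgs_metrics_cmd(bam, reference, output, intervals=COVERAGE_INTERVALS):
    """Build a CollectWgsMetrics command limited to the coverage regions."""
    return [
        "gatk", "CollectWgsMetrics",
        "-I", bam,
        "-R", reference,
        "-O", output,
        "--INTERVALS", intervals,
    ]


def median_coverage(metrics_text):
    """Return MEDIAN_COVERAGE from the text of a Picard-style metrics file."""
    # Drop blank lines and '#' comment lines; the first two remaining lines
    # are the metrics header row and its single data row.
    rows = [ln for ln in metrics_text.splitlines()
            if ln.strip() and not ln.startswith("#")]
    reader = csv.DictReader(io.StringIO("\n".join(rows[:2])), delimiter="\t")
    return float(next(reader)["MEDIAN_COVERAGE"])


if __name__ == "__main__":
    # Hypothetical inputs, printed rather than executed.
    print(" ".join(genomicsdb_import_cmd(["s1.g.vcf.gz", "s2.g.vcf.gz"], "my_database")))
    print(" ".join(collect_wgs_metrics_cmd("sample.bam", "ref.fasta", "wgs_metrics.txt")))
```

To actually run a command, the returned list can be passed to `subprocess.run(cmd, check=True)`, and the contents of the resulting metrics file can then be handed to `median_coverage`.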
