Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data

If there is no VCF file from a genome project, is it still possible to use GetPileupSummaries?

4 comments

  • Anthony Dias-Ciarla

    Hi Arye Harel,

    Thank you for writing to the GATK forum! Before I bring this to our developers, can you confirm that what you posted is your entire program log? If not, please include it in its entirety.

    I look forward to hearing back from you!

    Best,
    Anthony

  • Arye Harel

    Dear Anthony

    The entire log is very long, so I am presenting only the important details.

    More importantly, I am not sure it makes sense to use the VCF produced by Mutect2.
    So the important questions are:
    * Is there a way to run GetPileupSummaries without this file?
    * What is the best way to generate such a VCF?
    * Considering we are using tumor-only mode without a VCF of common germline variant sites (or a PoN), is it still more appropriate to use Mutect2 rather than HaplotypeCaller to identify somatic mutations?

     

    Thank you for your help,

    Arik
  • Anthony Dias-Ciarla

    Hi Arye Harel,

    Thank you for your much-appreciated patience while I reviewed your inquiry with our developers. I received some feedback to share with you.

    The short answer to your question about running GetPileupSummaries without the VCF is no. The VCF is not optional; the tool requires it.
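For readers unfamiliar with the tool, here is a minimal sketch of how the required VCF enters a GetPileupSummaries call. It only assembles and prints the command line; it does not run GATK. The file names are hypothetical placeholders, and passing the same VCF to both `-V` and `-L` follows the usage commonly shown in the GATK documentation rather than anything stated in this thread.

```python
import shlex


def pileup_summaries_cmd(bam, sites_vcf, out_table):
    """Build a GetPileupSummaries command line as a list of arguments.

    sites_vcf is the required VCF of variant sites; here it is also used
    as the interval list (-L) so only those sites are examined.
    """
    return [
        "gatk", "GetPileupSummaries",
        "-I", bam,          # input BAM (hypothetical file name)
        "-V", sites_vcf,    # required variant-sites VCF
        "-L", sites_vcf,    # restrict the pileup to the same sites
        "-O", out_table,    # output pileup summary table
    ]


cmd = pileup_summaries_cmd("tumor.bam", "common_sites.vcf.gz", "tumor.pileups.table")
print(shlex.join(cmd))
```

There is no form of this command that omits `-V`, which is why a sites VCF has to be obtained or built first.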

    Unfortunately, your Mutect2 output will not help identify somatic mutations in your case. Running in tumor-only mode will output a ton of germline mutations, not somatic ones.

    The workaround our developers suggest is to use HaplotypeCaller instead.

    1. You'll need to run the tool with multiple samples (between 20 and 50).
    2. Set your allele frequency cutoff near 1/n or 2/n. These settings will let some rare variants through, but that shouldn't matter.
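The allele frequency cutoff in step 2 can be sketched as a post-processing filter on the cohort VCF. This is one possible reading, not the developers' method. It assumes that n is the number of samples, that "cutoff" means keeping sites whose AF is at least 1/n or 2/n, and that the joint-genotyped VCF carries an `AF` value in its INFO column. The helper names and the example records are hypothetical.

```python
def af_cutoff(n_samples, k=1):
    """Return k/n, read here as the cutoff from step 2 (k = 1 or 2)."""
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")
    return k / n_samples


def parse_af(info):
    """Return the first AF value in a VCF INFO string, or None if absent."""
    for field in info.split(";"):
        if field.startswith("AF="):
            try:
                return float(field[3:].split(",")[0])
            except ValueError:
                return None  # e.g. "AF=."
    return None


def filter_by_af(vcf_lines, cutoff):
    """Yield header lines unchanged, plus biallelic records with AF >= cutoff."""
    for line in vcf_lines:
        if line.startswith("#"):
            yield line
            continue
        cols = line.rstrip("\n").split("\t")
        # Skip malformed records and multiallelic sites (comma in ALT).
        if len(cols) < 8 or "," in cols[4]:
            continue
        af = parse_af(cols[7])
        if af is not None and af >= cutoff:
            yield line


# Hypothetical records from a 20-sample cohort (AN = 40 chromosomes).
example = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "chr1\t100\t.\tA\tG\t50\tPASS\tAC=1;AF=0.025;AN=40\n",
    "chr1\t200\t.\tC\tT\t50\tPASS\tAC=4;AF=0.1;AN=40\n",
    "chr1\t300\t.\tG\tA,T\t50\tPASS\tAC=3,2;AF=0.075,0.05;AN=40\n",
]

cutoff = af_cutoff(20)  # 1/n for n = 20 samples
kept = [rec for rec in filter_by_af(example, cutoff) if not rec.startswith("#")]
for rec in kept:
    print(rec, end="")
```

With these example records only the site at position 200 survives: the singleton falls below 1/n and the multiallelic site is skipped. Retaining the `AF` annotation matters because GetPileupSummaries expects allele frequencies in the INFO field of the sites VCF.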

    Here are some additional resources that may prove helpful:

    - BQSR Bootstrapping Forum Post
    - GATK Known Variants - Bootstrapping Article

    I hope this helps! Please let me know if this leads you to success. If any other questions come up in the meantime, please do not hesitate to reach out.

    Best,
    Anthony

     

  • Arye Harel

    Hi Anthony, 

    Thank you very much.

    >You'll need to run the tool with multiple samples (between 20-50).

    Will it work well for a smaller experiment of 12 samples?

     

    >Set your allele frequency cut off near 1/n or 2/n. 

    I could not find "allele frequency" in the HaplotypeCaller manual.
    Considering that you also recommended running multiple samples together, which is not typical for HaplotypeCaller, is it possible you meant that I should use UnifiedGenotyper (an older predecessor of HaplotypeCaller)?

     

     

     

     

    Thank you,

    Arik
