Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Variant called in reads from ArtificialHaplotypeRG

4 comments

  • Genevieve Brandt (she/her)

    Hi yingchen69,

    Could you give more information about your workflow and the filtering steps after Mutect2? Do you see this after FilterMutectCalls?

    Here is a list of details that will help us as we look into this issue.

    Best,

    Genevieve

  • Genevieve Brandt (she/her)

    Hello,

    For this issue specifically, it would help to see the reads that support these insertions in both the bamout and the input BAM for this region. You can find details about the bamout in this troubleshooting document: When HaplotypeCaller and Mutect2 do not call an expected variant.

    Genevieve

  • ISmolicz

    Hi Genevieve Brandt (she/her),

    I am also observing reads in the read group 'ArtificialHaplotypeRG' in the BAM that is generated when running Mutect2 with --bamout. I am using GATK 4.2.0.0. Following Mutect2, I processed the data through LearnReadOrientationModel, GetPileupSummaries, CalculateContamination and FilterMutectCalls.

    After excluding variants that did not pass filtering (i.e. do not have the PASS flag assigned by FilterMutectCalls) and applying some additional filters following variant annotation, variants identified in reads in the ArtificialHaplotypeRG group remain. The variants are present in either the ArtificialHaplotypeRG reads alone or both the ArtificialHaplotypeRG reads and real reads. The data is from targeted sequencing with a capture panel.

    Please could you advise on how to approach this situation? Is it possible to filter out the ArtificialHaplotypeRG reads entirely or, alternatively, to prevent the variants within these reads from being counted after FilterMutectCalls? Or should another approach be taken?

    Thank you for your time and help. I look forward to your reply.

  • Genevieve Brandt (she/her)

    Hi ISmolicz,

    Thanks for writing in about this issue so that we can clear up any confusion or fix any problems.

    The first thing to clarify is that the bamout is a debugging tool and is not meant for any further analysis. The ArtificialHaplotypeRG reads are created in the bamout so that you can view the haplotypes that Mutect2 considered at each location. For each chosen haplotype, you should see a new read tagged ArtificialHaplotypeRG in addition to the real reads that support that haplotype. The bamout will also include reads tagged ArtificialHaplotypeRG for all the other haplotypes considered.

    Is your question about why the haplotypes you are seeing were chosen? Or is it that there are variants in your final VCF that have no real reads supporting them, only reads marked with ArtificialHaplotypeRG? If the latter, could you provide more information and examples? How many of these variants are you seeing, and what does the bamout look like for one of these examples?

    Best,

    Genevieve
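
To tell whether a site in the bamout is supported by real reads or only by the ArtificialHaplotypeRG haplotype reads described above, one option is to separate the records by their RG tag. The following is a minimal sketch, not an official GATK utility: it assumes the bamout has been converted to SAM text, and the function names and example records are hypothetical.

```python
ARTIFICIAL_RG = "ArtificialHaplotypeRG"  # read group name reported in this thread


def read_group(sam_line):
    """Return the RG tag value of one SAM alignment line, or None if absent."""
    fields = sam_line.rstrip("\n").split("\t")
    # The 11 mandatory SAM columns come first; optional tags start at index 11.
    for tag in fields[11:]:
        if tag.startswith("RG:Z:"):
            return tag[len("RG:Z:"):]
    return None


def split_reads(sam_lines):
    """Split alignment lines into (real, artificial); header lines are skipped."""
    real, artificial = [], []
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # blank line or SAM header line
        if read_group(line) == ARTIFICIAL_RG:
            artificial.append(line)
        else:
            real.append(line)
    return real, artificial


# Hypothetical records, made up for illustration only.
example = [
    "@RG\tID:ArtificialHaplotypeRG\tSM:sample1",
    "HC_1\t0\tchr1\t100\t60\t10M\t*\t0\t0\tACGTACGTAC\t*\tRG:Z:ArtificialHaplotypeRG",
    "read_7\t99\tchr1\t98\t60\t10M\t=\t150\t62\tTTACGTACGT\tFFFFFFFFFF\tRG:Z:tumor_rg\tNM:i:0",
]
real, artificial = split_reads(example)
print(len(real), "real read(s);", len(artificial), "artificial haplotype read(s)")
```

In practice the input lines could come from something like `samtools view -h` run on the bamout for the region of interest. This only helps with inspecting the bamout; it does not change which variants Mutect2 or FilterMutectCalls report.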
