Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data

GATK process banner

Need Help?

Search our documentation

Community Forum

Hi, How can we help?

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size. Learn more

GGA-mode gives PASS tag while the site failed in the normal workflow

0

13 comments

  • Avatar
    Genevieve Brandt (she/her)

    Hi xiucz,

    Take a look at this troubleshooting article we put together for questions like yours: https://gatk.broadinstitute.org/hc/en-us/articles/360043491652-When-HaplotypeCaller-and-Mutect2-do-not-call-an-expected-variant

    It should be able to give some solutions for the issue. Please let me know if you have further questions.

    Best,

    Genevieve

    0
    Comment actions Permalink
  • Avatar
    xiucz

    Hi, Genevieve-Brandt-she-her

    After reading this article  and try the parameters, I still can't find the way to solve my problem. I think my dataset is not so particular, can you take a look at it?

    If the GGA-mode have the filter tags  of `map_qual` or `strand_bias`, I think it is ok.However, it gives `PASS` tag. It seems  that there is no reason to call nothing here. And only the GGA-mode can emit bamout file.

    Thank you,

    xiucz

    #run with the --debug parameter

    9:28:00.058 INFO IntelPairHmm - Available threads: 128
    19:28:00.058 INFO IntelPairHmm - Requested threads: 4
    19:28:00.058 INFO PairHMM - Using the OpenMP multi-threaded AVX-accelerated native PairHMM implementation
    19:28:00.170 INFO ProgressMeter - Starting traversal
    19:28:00.170 INFO ProgressMeter - Current Locus Elapsed Minutes Regions Processed Regions/Minute
    19:28:00.701 INFO Mutect2Engine - Assembling chr3:178935576-178935650 with 203 reads: (with overlap region = chr3:178935476-178935750)
    19:28:00.771 INFO ReadThreadingAssembler - Using kmer size of 10 in read threading assembler
    19:28:00.787 INFO ReadThreadingAssembler - Using kmer size of 25 in read threading assembler
    19:28:00.798 INFO ReadThreadingAssembler - Adding haplotype 275M from graph with kmer 10
    19:28:00.798 INFO ReadThreadingAssembler - Found only the reference haplotype in the assembly graph.
    19:28:00.798 INFO ReadThreadingAssembler - GATACAGTGTTTTAATGGGGGTGTTTCTTCATTCTTTTTTTCTTACTGGTTTTTACTTTTTAAATTTGAAAGCTTTGCAGGGATCATAAGGATCTGTTCAGGCAAAGAACATGAAAGGGTTTACATTTTTATCATTTTAGTGTTTCTTATTCTCTATATCAAAAACATTCACAGATAAGTTAACAAGATCCTCATCAGGAGGAAAAGTAAATTGTTCACTACCATCCTCTAGTATCCTAATCTGGTCTTGTTGTTGGCTAACTTCAGCAGTTACT
    19:28:00.798 INFO ReadThreadingAssembler - > Cigar = 275M : 275 score NaN ref false
    19:28:00.799 INFO EventMap - === Best Haplotypes ===
    19:28:00.800 INFO EventMap - GATACAGTGTTTTAATGGGGGTGTTTCTTCATTCTTTTTTTCTTACTGGTTTTTACTTTTTAAATTTGAAAGCTTTGCAGGGATCATAAGGATCTGTTCAGGCAAAGAACATGAAAGGGTTTACATTTTTATCATTTTAGTGTTTCTTATTCTCTATATCAAAAACATTCACAGATAAGTTAACAAGATCCTCATCAGGAGGAAAAGTAAATTGTTCACTACCATCCTCTAGTATCCTAATCTGGTCTTGTTGTTGGCTAACTTCAGCAGTTACT
    19:28:00.800 INFO EventMap - > Cigar = 275M
    19:28:00.800 INFO EventMap - >> Events = EventMap{}
    19:28:01.162 INFO Mutect2Engine - Assembling chr3:178936122-178936196 with 3715 reads: (with overlap region = chr3:178936022-178936296)
    19:28:01.380 INFO ReadThreadingAssembler - Using kmer size of 10 in read threading assembler
    19:28:01.552 INFO ReadThreadingAssembler - Using kmer size of 25 in read threading assembler
    19:28:01.555 INFO ReadThreadingAssembler - Adding haplotype 275M from graph with kmer 10
    19:28:01.555 INFO ReadThreadingAssembler - Adding haplotype 275M from graph with kmer 10
    19:28:01.556 INFO ReadThreadingAssembler - Found 2 candidate haplotypes of 2 possible combinations to evaluate every read against.
    19:28:01.556 INFO ReadThreadingAssembler - GAATTAAGGGAAAATGACAAAGAACAGCTCAAAGCAATTTCTACACGAGATCCTCTCTCTGAAATCACTGAGCAGGAGAAAGATTTTCTATGGAGTCACAGGTAAGTGCTAAAATGGAGATTCTCTGTTTCTTTTTCTTTATTACAGAAAAAATAACTGAATTTGGCTGATCTCAGCATGTTTTTACCATACCTATTGGAATAAATAAAGCAGAATTTACATGATTTTTAAACTATAAACATTGCCTTTTTAAAAACAATGGTTGTAAATTGATA
    19:28:01.556 INFO ReadThreadingAssembler - > Cigar = 275M : 275 score -0.003932270664306348 ref false
    19:28:01.556 INFO ReadThreadingAssembler - GAATTAAGGGAAAATGACAAAGAACAGCTCAAAGCAATTTCTACACGAGATCCTCTCTCTGAAATCACTAAGCAGGAGAAAGATTTTCTATGGAGTCACAGGTAAGTGCTAAAATGGAGATTCTCTGTTTCTTTTTCTTTATTACAGAAAAAATAACTGAATTTGGCTGATCTCAGCATGTTTTTACCATACCTATTGGAATAAATAAAGCAGAATTTACATGATTTTTAAACTATAAACATTGCCTTTTTAAAAACAATGGTTGTAAATTGATA
    19:28:01.556 INFO ReadThreadingAssembler - > Cigar = 275M : 275 score -2.0451055597673964 ref false
    19:28:01.556 INFO EventMap - === Best Haplotypes ===
    19:28:01.557 INFO EventMap - GAATTAAGGGAAAATGACAAAGAACAGCTCAAAGCAATTTCTACACGAGATCCTCTCTCTGAAATCACTGAGCAGGAGAAAGATTTTCTATGGAGTCACAGGTAAGTGCTAAAATGGAGATTCTCTGTTTCTTTTTCTTTATTACAGAAAAAATAACTGAATTTGGCTGATCTCAGCATGTTTTTACCATACCTATTGGAATAAATAAAGCAGAATTTACATGATTTTTAAACTATAAACATTGCCTTTTTAAAAACAATGGTTGTAAATTGATA
    19:28:01.557 INFO EventMap - > Cigar = 275M
    19:28:01.557 INFO EventMap - >> Events = EventMap{}
    19:28:01.558 INFO EventMap - GAATTAAGGGAAAATGACAAAGAACAGCTCAAAGCAATTTCTACACGAGATCCTCTCTCTGAAATCACTAAGCAGGAGAAAGATTTTCTATGGAGTCACAGGTAAGTGCTAAAATGGAGATTCTCTGTTTCTTTTTCTTTATTACAGAAAAAATAACTGAATTTGGCTGATCTCAGCATGTTTTTACCATACCTATTGGAATAAATAAAGCAGAATTTACATGATTTTTAAACTATAAACATTGCCTTTTTAAAAACAATGGTTGTAAATTGATA
    19:28:01.558 INFO EventMap - > Cigar = 275M
    19:28:01.558 INFO EventMap - >> Events = EventMap{chr3:178936091-178936091 [G*, A],}
    19:28:01.599 INFO Mutect2Engine - Assembling chr3:178936274-178936348 with 1852 reads: (with overlap region = chr3:178936174-178936448)
    19:28:01.661 INFO ReadThreadingAssembler - Using kmer size of 10 in read threading assembler
    19:28:01.718 INFO ReadThreadingAssembler - Using kmer size of 25 in read threading assembler
    19:28:01.720 INFO ReadThreadingAssembler - Adding haplotype 275M from graph with kmer 10
    19:28:01.720 INFO ReadThreadingAssembler - Found only the reference haplotype in the assembly graph.
    19:28:01.720 INFO ReadThreadingAssembler - ATAACTGAATTTGGCTGATCTCAGCATGTTTTTACCATACCTATTGGAATAAATAAAGCAGAATTTACATGATTTTTAAACTATAAACATTGCCTTTTTAAAAACAATGGTTGTAAATTGATATTTGTGGAAAATCATACTACATTGGTAGTTGGCACATTAAATGCTTTTTCTTACTCTGAATTCCTGATATGACTTTCTTTAGGATTGTTTAAAATATTCTAGTAGTTTTAGGTCAATTTAGATGTGATTTAGTTGGTCTAGATATTATAATT
    19:28:01.720 INFO ReadThreadingAssembler - > Cigar = 275M : 275 score NaN ref false
    19:28:01.720 INFO EventMap - === Best Haplotypes ===
    19:28:01.720 INFO EventMap - ATAACTGAATTTGGCTGATCTCAGCATGTTTTTACCATACCTATTGGAATAAATAAAGCAGAATTTACATGATTTTTAAACTATAAACATTGCCTTTTTAAAAACAATGGTTGTAAATTGATATTTGTGGAAAATCATACTACATTGGTAGTTGGCACATTAAATGCTTTTTCTTACTCTGAATTCCTGATATGACTTTCTTTAGGATTGTTTAAAATATTCTAGTAGTTTTAGGTCAATTTAGATGTGATTTAGTTGGTCTAGATATTATAATT
    19:28:01.720 INFO EventMap - > Cigar = 275M
    19:28:01.720 INFO EventMap - >> Events = EventMap{}
    19:28:01.808 INFO Mutect2 - 1292 read(s) filtered by: MappingQualityReadFilter
    0 read(s) filtered by: MappingQualityAvailableReadFilter
    0 read(s) filtered by: MappingQualityNotZeroReadFilter
    0 read(s) filtered by: MappedReadFilter
    667 read(s) filtered by: NotSecondaryAlignmentReadFilter
    34894 read(s) filtered by: NotDuplicateReadFilter
    0 read(s) filtered by: PassesVendorQualityCheckReadFilter
    0 read(s) filtered by: NonChimericOriginalAlignmentReadFilter
    0 read(s) filtered by: NonZeroReferenceLengthAlignmentReadFilter
    0 read(s) filtered by: ReadLengthReadFilter
    0 read(s) filtered by: GoodCigarReadFilter
    0 read(s) filtered by: WellformedReadFilter
    36853 total reads filtered
    19:28:01.809 INFO ProgressMeter - unmapped 0.0 9 329.7
    19:28:01.809 INFO ProgressMeter - Traversal complete. Processed 9 total regions in 0.0 minutes.
    19:28:01.914 INFO VectorLoglessPairHMM - Time spent in setup for JNI call : 0.0
    19:28:01.914 INFO PairHMM - Total compute time in PairHMM computeLogLikelihoods() : 0.0
    19:28:01.915 INFO SmithWatermanAligner - Total compute time in java Smith-Waterman : 0.00 sec
    19:28:01.969 INFO Mutect2 - Shutting down engine
    [November 17, 2021 7:28:01 PM CST] org.broadinstitute.hellbender.tools.walkers.mutect.Mutect2 done. Elapsed time: 0.04 minutes.
    Runtime.totalMemory()=2496659456
    Tool returned:
    SUCCESS
    Using GATK jar ~/GATK/gatk-4.2.2.0/gatk-package-4.2.2.0-local.jar
    Running:
    java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar ~/GATK/gatk-4.2.2.0/gatk-package-4.2.2.0-local.jar Mutect2 -R ~/MED/database/hg19/gatk_bundle/ucsc.hg19.fasta -I sample.recalibrated.bam -tumor XHF -L tmp.bed --interval-padding 250 --germline-resource /MED/database/hg19/dbmore/af-only-gnomad.raw.sites.hg19.vcf.gz --f1r2-tar-gz XHF.f1r2.tar.gz -O XHF.vcf -bamout XHF.recalibrated.mt2.bam --debug

     

     

    0
    Comment actions Permalink
  • Avatar
    Genevieve Brandt (she/her)

    Could you clarify what you mean by GGA-mode?

    0
    Comment actions Permalink
  • Avatar
    xiucz

    Sorry , maybe GGA(genotype given alleles)  is an old saying for `force-calling mode`. 

    In this section  (iv) Force-calling mode of https://gatk.broadinstitute.org/hc/en-us/articles/360037593851-Mutect2 .

     

    0
    Comment actions Permalink
  • Avatar
    Genevieve Brandt (she/her)

    Hi xiucz,

    First, I would recommend trying this out with the same arguments for force calling mode and normal mode. Keep the germline resource and interval padding parameters consistent to get consistent results. 

    Next, could you try the arguments --linked-de-bruijn-graph then --recover-all-dangling-branches? These parameters can lead to better results.

    Could you share a higher quality screenshot of the bamout file from the force calling mode? I can't read which location is the one of interest. Please also highlight the location so I can see how many reads support each allele.

    Then finally, could you use the --debug-assembly-variants-out (contained in the newest 4.2.3.0 release) with the normal mode to see if the variant is detected by the assembly engine.

    Let me know what you find!

    Best,

    Genevieve

    0
    Comment actions Permalink
  • Avatar
    xiucz

    Hi , Genevieve-Brandt-she-her.

    Following your advice,

    1. Keeping the germline resource and interval padding parameters still give the result( force calling mode can call the site, while normal mode fail).

    2. Try the arguments --linked-de-bruijn-graph and --recover-all-dangling-branches , the site is not in the output vcf , however, it is called in the debug-assembly-variants-out vcf file.

    ##fileformat=VCFv4.2
    #CHROM POS ID REF ALT QUAL FILTER INFO
    chr3 178936091 . G A . . .

     

    Verison: gatk-4.2.3.0

    ~/gatk-4.2.3.0/gatk Mutect2 \
    -R ~/hg19/gatk_bundle/ucsc.hg19.fasta \
    -I XHF.recalibrated.bam -tumor XHF \
    -L tmp.bed --interval-padding 250 \
    --germline-resource ~/af-only-gnomad.raw.sites.hg19.vcf.gz \
    --f1r2-tar-gz XHF.f1r2.tar.gz \
    -O XHF.vcf -bamout XHF.recalibrated.mt2.bam \
    --debug-assembly-variants-out xxx.vcf \
    --linked-de-bruijn-graph true

    #log:
    23:44:00.264 INFO ProgressMeter - Starting traversal
    23:44:00.265 INFO ProgressMeter - Current Locus Elapsed Minutes Regions
    Processed Regions/Minute
    23:44:01.487 INFO Mutect2 - 1292 read(s) filtered by: MappingQualityReadFilter
    0 read(s) filtered by: MappingQualityAvailableReadFilter
    0 read(s) filtered by: MappingQualityNotZeroReadFilter
    0 read(s) filtered by: MappedReadFilter
    667 read(s) filtered by: NotSecondaryAlignmentReadFilter
    34894 read(s) filtered by: NotDuplicateReadFilter
    0 read(s) filtered by: PassesVendorQualityCheckReadFilter
    0 read(s) filtered by: NonChimericOriginalAlignmentReadFilter
    0 read(s) filtered by: NonZeroReferenceLengthAlignmentReadFilter
    0 read(s) filtered by: ReadLengthReadFilter
    0 read(s) filtered by: GoodCigarReadFilter
    0 read(s) filtered by: WellformedReadFilter
    36853 total reads filtered
    23:44:01.488 INFO ProgressMeter - unmapped 0.0 9 441.5
    23:44:01.488 INFO ProgressMeter - Traversal complete. Processed 9 total regions in 0.0 minutes.
    23:44:01.603 INFO VectorLoglessPairHMM - Time spent in setup for JNI call : 0.0
    23:44:01.604 INFO PairHMM - Total compute time in PairHMM computeLogLikelihoods() : 0.0
    23:44:01.604 INFO SmithWatermanAligner - Total compute time in java Smith-Waterman : 0.02 sec
    23:44:01.718 INFO Mutect2 - Shutting down engine
    [November 20, 2021 11:44:01 PM CST] org.broadinstitute.hellbender.tools.walkers.mutect.Mutect2 done. Elapsed time: 0.04 minutes.
    Runtime.totalMemory()=2501902336
    Tool returned:
    SUCCESS
    ~/gatk-4.2.3.0/gatk Mutect2 \
    -R ~/hg19/gatk_bundle/ucsc.hg19.fasta \
    -I XHF.recalibrated.bam -tumor XHF \
    -L tmp.bed --interval-padding 250 \
    --germline-resource ~/af-only-gnomad.raw.sites.hg19.vcf.gz \
    --f1r2-tar-gz XHF.f1r2.tar.gz \
    -O XHF.vcf -bamout XHF.recalibrated.mt2.bam \
    --debug-assembly-variants-out xxx.vcf \
    --recover-all-dangling-branches true


    #log:
    23:51:03.029 INFO ProgressMeter - Starting traversal [80/1890]
    23:51:03.030 INFO ProgressMeter - Current Locus Elapsed Minutes Regions
    Processed Regions/Minute
    23:51:04.585 INFO Mutect2 - 1292 read(s) filtered by: MappingQualityReadFilter
    0 read(s) filtered by: MappingQualityAvailableReadFilter
    0 read(s) filtered by: MappingQualityNotZeroReadFilter
    0 read(s) filtered by: MappedReadFilter
    667 read(s) filtered by: NotSecondaryAlignmentReadFilter
    34894 read(s) filtered by: NotDuplicateReadFilter
    0 read(s) filtered by: PassesVendorQualityCheckReadFilter
    0 read(s) filtered by: NonChimericOriginalAlignmentReadFilter
    0 read(s) filtered by: NonZeroReferenceLengthAlignmentReadFilter
    0 read(s) filtered by: ReadLengthReadFilter
    0 read(s) filtered by: GoodCigarReadFilter
    0 read(s) filtered by: WellformedReadFilter
    36853 total reads filtered
    23:51:04.586 INFO ProgressMeter - unmapped 0.0
    9 347.0
    23:51:04.586 INFO ProgressMeter - Traversal complete. Processed 9 total regions in
    0.0 minutes.
    23:51:04.710 INFO VectorLoglessPairHMM - Time spent in setup for JNI call : 0.0
    23:51:04.710 INFO PairHMM - Total compute time in PairHMM computeLogLikelihoods() :
    0.0
    23:51:04.710 INFO SmithWatermanAligner - Total compute time in java Smith-Waterman
    : 0.02 sec
    23:51:04.830 INFO Mutect2 - Shutting down engine
    [November 20, 2021 11:51:04 PM CST] org.broadinstitute.hellbender.tools.walkers.mute
    ct.Mutect2 done. Elapsed time: 0.04 minutes.
    Runtime.totalMemory()=2506620928
    Tool returned:
    SUCCESS

    0
    Comment actions Permalink
  • Avatar
    xiucz

    And  a raw subset but not downsampling recalibrate.bam was here https://drive.google.com/drive/folders/1VaHbD2_SJokCbnXNAp-UId-4bNIWGtf-?usp=sharing

    Best,

    xiucz

    0
    Comment actions Permalink
  • Avatar
    Genevieve Brandt (she/her)

    Hi xiucz,

    Thanks for that extra information. We are looking into the reasons why this site would appear in the debug file and not be called with the normal Mutect2 mode, while with force calling it passed filtering.

    1. First, could you decrease the tumor lod score cutoff in the normal Mutect2 command (set -emit-lod to 0) and share if the site is called, and if so, share the VCF line?
    2. Then next, could you use --genotype-germline-sites, just in case it is being tagged as a germline site.
    3. Third, could you share with us the force calling VCF file you used?

    Thank you,

    Genevieve

    0
    Comment actions Permalink
  • Avatar
    xiucz

    Hi , Genevieve-Brandt-she-her

    1. First, I set  `-emit-lod ` to 0, the site still can't be called.

    2. Using --genotype-germline-sites true, and my command line as below, the site still can't be called.

    ~/gatk/gatk-4.2.3.0/gatk Mutect2 \
    -R ~/database/hg19/gatk_bundle/ucsc.hg19.fasta \
    -I XHF.recalibrated.bam \
    -tumor XHF -L tmp.bed --interval-padding 250 \
    --germline-resource ~/hg19/dbmore/af-only-gnomad.raw.sites.hg19.vcf.gz \
    --f1r2-tar-gz XHF.f1r2.tar.gz \
    -O XHF.vcf -bamout XHF.recalibrated.mt2.bam \
    --linked-de-bruijn-graph true \
    --debug-assembly-variants-out xxx.vcf -emit-lod 0 \
    --genotype-germline-sites true


    3.  The force calling VCF is uploaded here, https://drive.google.com/file/d/1mbQJk4WeBiiMNmn-Dvbe8sB1UwC6_M6d/view?usp=sharing . And the force calling VCF is imitative,  not from the same patient(I  thought that, only the coordinate of site from the force calling VCF was used). 

    Please focus on the `chr3:178936091`, the other two sites in the force calling VCF are just for testing.

    0
    Comment actions Permalink
  • Avatar
    Genevieve Brandt (she/her)

    We are still looking into this issue. Many of our team members are out of the office because of the holidays, we will get back to you with more information as soon as we can.

    0
    Comment actions Permalink
  • Avatar
    Genevieve Brandt (she/her)

    Hi xiucz,

    We have not been able to isolate what is the cause of this problem and we would like to take a closer look so that we can give you some answers and hopefully a solution. Would you be able to upload a small bam file that recreates this issue, along with the other files needed to run Mutect2? Please follow the instructions in this article to upload your files to our FTP site: https://gatk.broadinstitute.org/hc/en-us/articles/360035889671. Let me know when you are done and the name of the file folder to look for.

    Thank you,

    Genevieve

    0
    Comment actions Permalink
  • Avatar
    xiucz

    Genevieve-Brandt-she-her

    I have uploaded the files, and the name is /xiucz .

    0
    Comment actions Permalink
  • Avatar
    Genevieve Brandt (she/her)

    Thank you! I'll let you know when I have an update.

    0
    Comment actions Permalink

Please sign in to leave a comment.

Powered by Zendesk