Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Why do I see a variant called by Mutect2 in my matched-normal sample when checking the results?

Answered

10 comments

  • Official comment
    Genevieve Brandt (she/her)

    Hi Jenifer,

    Thanks for sending that in. I also asked our developers about the --tumor-sample argument and they said it should not cause any issues.

    We took a closer look at this site to figure out why it is being called. One thing to note is that it is an STR (short tandem repeat) site. In the germline, the call is being filtered out as a polymerase slippage error. In the somatic samples, the call passed filtering as a very borderline call. For your case, we recommend simply removing this site from your analysis; we agree that it doesn't seem to be a true variant.

    This was a failure of the normal artifact filter. In cases where the evidence is bad in the germline and good in the tumor, then yes, the tumor call should be kept. However, if the evidence is bad in both, the call should not pass filtering. Thanks for writing in to the forum about this site. We'll keep this issue in mind as we continue to improve Mutect2 and FilterMutectCalls.

    Best,

    Genevieve

  • Genevieve Brandt (she/her)

    Hi Jenifer,

    I recommend that you walk through this troubleshooting document for those sites: "When HaplotypeCaller and Mutect2 do not call an expected variant". It is a great resource for finding out why certain variants are or are not called.

    Let me know if you have any follow-up questions after taking a look.

    Best,

    Genevieve

  • Jenifer

    Hi Genevieve Brandt (she/her)!

    I've read the troubleshooting document, but I haven't found an answer to my question.

    I thought Mutect2 was designed to make only somatic calls and, therefore, not to call variants that are also present in the matched normal. In this case, the variant in the normal sample doesn't seem to be an artifact, but its VAF is low. So I was wondering whether Mutect2 ignores variants present in the matched normal at low VAFs and, thus, doesn't consider them germline variants.

    Thanks!

  • Genevieve Brandt (she/her)

    Jenifer, could you share the command line and the VCF lines from Mutect2 and the normal sample?

    You shouldn't be getting a call that is present in the matched normal.

  • Jenifer

    Sure Genevieve Brandt (she/her):

    This is an example:

    The first two genotype fields refer to two different tumour samples and the third refers to the matched normal. As you can see, there are six reads supporting the variant in the matched normal. I can also see them in IGV, and they don't seem to be artifacts. Shouldn't this be considered germline?

    19      9047457 .       GA      G       .       PASS    AS_FilterStatus=SITE;AS_SB_TABLE=892,1000|10,15;DP=1984;ECNT=2;GERMQ=93;MBQ=20,20;MFRL=121,119;MMQ=60,60;MPOS=7;NALOD=1.09;NLOD=71.71;POPAF=6.00;ROQ=93;RPA=4,3;RU=A;STR;STRQ=93;TLOD=4.95   GT:AD:AF:DP:F1R2:F2R1:SB        0/1:394,6:0.010:400:2,0:381,6:184,210,3,3       0/1:1159,13:0.011:1172:24,0:1095,13:544,615,4,9 0/0:339,6:0.011:345:7,0:309,6:164,175,3,3

    The command line used is this:

    gatk Mutect2 -R "$reference" \
        --tumor-sample "$tumour1ID" --normal-sample "$normalID" \
        --input "$tumour1Bam" --input "$tumour2Bam" --input "$normalBam" \
        -germline-resource "$germlineResource" \
        --f1r2-tar-gz "$outputf1r2" --output "$out"

    gatk LearnReadOrientationModel -I "$outDir"/mutect2.f1r2.tar.gz -O "$out1"

    gatk FilterMutectCalls -V "$outDir"/mutect2.raw.vcf -R "$reference" \
        --contamination-table "$contTable1" --contamination-table "$contTable2" \
        --ob-priors "$orientModel" -O "$out2"

    Thanks!
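
For readers following along, the per-sample read support in a record like the one quoted in this post can be tabulated directly from the AD field. The following is a minimal sketch and not part of GATK: the function name `ad_summary` is hypothetical, it assumes tab-delimited VCF text with biallelic records, and the demo record is the one posted here with INFO shortened to the STR flag and FORMAT trimmed to GT:AD:AF:DP.

```shell
#!/usr/bin/env bash
# Summarise per-sample allele depths for each VCF record read from stdin.
# Assumptions: tab-delimited VCF text, biallelic records, AD present in FORMAT.
ad_summary() {
  awk -F'\t' '
    /^#/ { next }                                  # skip header lines
    {
      n = split($9, fmt, ":")                      # FORMAT keys, e.g. GT:AD:AF:DP
      ad_idx = 0
      for (i = 1; i <= n; i++) if (fmt[i] == "AD") ad_idx = i
      if (ad_idx == 0) next                        # no AD field, nothing to report
      for (s = 10; s <= NF; s++) {                 # genotype columns start at field 10
        split($s, val, ":")
        split(val[ad_idx], ad, ",")                # ad[1] = ref reads, ad[2] = alt reads
        depth = ad[1] + ad[2]
        frac = (depth > 0) ? ad[2] / depth : 0
        printf "%s:%s\tsample%d\tref=%d\talt=%d\talt_fraction=%.4f\n", $1, $2, s - 9, ad[1], ad[2], frac
      }
    }'
}

# Demo: the record from this post, rebuilt with tabs (INFO and FORMAT shortened).
printf '19\t9047457\t.\tGA\tG\t.\tPASS\tSTR\tGT:AD:AF:DP\t0/1:394,6:0.010:400\t0/1:1159,13:0.011:1172\t0/0:339,6:0.011:345\n' | ad_summary
```

For this record the alt-read fraction computed from AD is about 1.5% in the first tumour, 1.1% in the second, and 1.7% in the normal, so the normal carries as much raw support for the deletion as the tumours do. The AF values in the record (0.010, 0.011, 0.011) are reported by the tool and need not equal this raw ratio.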

  • Genevieve Brandt (she/her)

    Jenifer, what version of Mutect2 is this?

  • Jenifer

    It is GATK version 4.1.7.0.

  • Genevieve Brandt (she/her)

    Could you share the program log? You are using a deprecated argument, --tumor-sample, so I'm wondering if there was an issue with sample identity.
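
One way to rule out a sample-identity problem is to confirm that the name passed to --normal-sample matches the SM tag in the normal BAM's read groups. The following is a minimal sketch, assuming the header is supplied as text on stdin; `rg_samples` is a hypothetical helper name, not a GATK or samtools command.

```shell
#!/usr/bin/env bash
# Print the unique SM (sample) values found on @RG lines of a SAM/BAM header.
# The header text is read from stdin, one header line per input line.
rg_samples() {
  awk -F'\t' '
    $1 == "@RG" {
      for (i = 2; i <= NF; i++)
        if ($i ~ /^SM:/) print substr($i, 4)       # strip the leading "SM:"
    }' | sort -u
}
```

Assuming samtools is installed, `samtools view -H "$normalBam" | rg_samples` should print exactly the name given to --normal-sample. More than one name, or a different name, would point to a sample-identity issue.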

  • Jenifer

    This is the log (I'm not showing most progressMeter lines):

    jdk/8u181 loaded
    gatk/4.1.7.0 loaded
    Using GATK jar /mnt/gatk/4.1.7.0/gatk-package-4.1.7.0-local.jar
    Running:
    java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /mnt/gatk/4.1.7.0/gatk-package-4.1.7.0-local.jar
    Mutect2 -R /mnt/hs37d5.fa --tumor-sample tumour9A --normal-sample linfo9 --input /mnt/tumour9A/aligned.sorted.markedduplicates.bam --input /mnt/tumour9B/aligned.sorted.markedduplicates.bam --input /mnt/linfo9/aligned.sorted.markedduplicates.bam -germline-resource /mnt/somatic-b37_af-only-gnomad.raw.sites.vcf --f1r2-tar-gz /mnt/outDir/mutect2.f1r2.tar.gz --output /mnt/outDir/mutect2.raw.vcf
    10:37:38.761 INFO  NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/mnt/gatk/4.1.7.0/gatk-package-4.1.7.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
    Oct 16, 2021 10:37:39 AM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
    INFO: Failed to detect whether we are running on Google Compute Engine.
    10:37:39.051 INFO  Mutect2 - ------------------------------------------------------------
    10:37:39.051 INFO  Mutect2 - The Genome Analysis Toolkit (GATK) v4.1.7.0
    10:37:39.051 INFO  Mutect2 - For support and documentation go to https://software.broadinstitute.org/gatk/
    10:37:39.052 INFO  Mutect2 - Executing as uscmgjbi@c7113 on Linux v3.10.0-862.14.4.el7.x86_64 amd64
    10:37:39.052 INFO  Mutect2 - Java runtime: Java HotSpot(TM) 64-Bit Server VM v1.8.0_181-b13
    10:37:39.052 INFO  Mutect2 - Start Date/Time: October 16, 2021 10:37:38 AM CEST
    10:37:39.052 INFO  Mutect2 - ------------------------------------------------------------
    10:37:39.052 INFO  Mutect2 - ------------------------------------------------------------
    10:37:39.053 INFO  Mutect2 - HTSJDK Version: 2.21.2
    10:37:39.053 INFO  Mutect2 - Picard Version: 2.21.9
    10:37:39.053 INFO  Mutect2 - HTSJDK Defaults.COMPRESSION_LEVEL : 2
    10:37:39.053 INFO  Mutect2 - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
    10:37:39.053 INFO  Mutect2 - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
    10:37:39.053 INFO  Mutect2 - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
    10:37:39.053 INFO  Mutect2 - Deflater: IntelDeflater
    10:37:39.053 INFO  Mutect2 - Inflater: IntelInflater
    10:37:39.053 INFO  Mutect2 - GCS max retries/reopens: 20
    10:37:39.053 INFO  Mutect2 - Requester pays: disabled
    10:37:39.053 INFO  Mutect2 - Initializing engine
    10:37:39.527 INFO  FeatureManager - Using codec VCFCodec to read file file:///mnt/somatic-b37_af-only-gnomad.raw.sites.vcf
    10:37:41.506 WARN  IndexUtils - Feature file "/mnt/somatic-b37_af-only-gnomad.raw.sites.vcf" appears to contain no sequence dictionary. Attempting to retrieve a sequence dictionary from the associated index file
    10:37:42.707 WARN  IndexUtils - Index file /mnt/somatic-b37_af-only-gnomad.raw.sites.vcf.idx is out of date (index older than input file). Use IndexFeatureFile to make a new index.
    10:37:42.729 INFO  Mutect2 - Done initializing engine
    10:37:42.748 INFO  NativeLibraryLoader - Loading libgkl_utils.so from jar:file:/mnt/gatk/4.1.7.0/gatk-package-4.1.7.0-local.jar!/com/intel/gkl/native/libgkl_utils.so
    10:37:42.758 INFO  NativeLibraryLoader - Loading libgkl_pairhmm_omp.so from jar:file:/mnt/gatk/4.1.7.0/gatk-package-4.1.7.0-local.jar!/com/intel/gkl/native/libgkl_pairhmm_omp.so
    10:37:42.815 INFO  IntelPairHmm - Flush-to-zero (FTZ) is enabled when running PairHMM
    10:37:42.815 INFO  IntelPairHmm - Available threads: 24
    10:37:42.815 INFO  IntelPairHmm - Requested threads: 4
    10:37:42.815 INFO  PairHMM - Using the OpenMP multi-threaded AVX-accelerated native PairHMM implementation
    10:37:42.896 INFO  ProgressMeter - Starting traversal
    10:37:42.897 INFO  ProgressMeter -        Current Locus  Elapsed Minutes     Regions Processed   Regions/Minute
    10:37:52.900 INFO  ProgressMeter -           1:11751569              0.2                 40240         241415.9
    10:38:16.614 INFO  ProgressMeter -           1:27023801              0.6                 92480         164579.6
    10:38:27.352 INFO  ProgressMeter -           1:27057605              0.7                 92600         124983.1
    10:38:38.606 INFO  ProgressMeter -           1:27061115              0.9                 92620          99755.9
    10:38:50.150 INFO  ProgressMeter -           1:27088588              1.1                 92720          82721.7
    10:39:02.223 INFO  ProgressMeter -           1:27090176              1.3                 92730          70139.3
    .
    .
    .
    .
    .
    .
    13:06:13.342 INFO  Mutect2 - 422103 read(s) filtered by: MappingQualityReadFilter
    0 read(s) filtered by: MappingQualityAvailableReadFilter
    0 read(s) filtered by: MappingQualityNotZeroReadFilter
    0 read(s) filtered by: MappedReadFilter
    0 read(s) filtered by: NotSecondaryAlignmentReadFilter
    0 read(s) filtered by: NotDuplicateReadFilter
    0 read(s) filtered by: PassesVendorQualityCheckReadFilter
    0 read(s) filtered by: NonChimericOriginalAlignmentReadFilter
    0 read(s) filtered by: NonZeroReferenceLengthAlignmentReadFilter
    0 read(s) filtered by: ReadLengthReadFilter
    0 read(s) filtered by: GoodCigarReadFilter
    0 read(s) filtered by: WellformedReadFilter
    422103 total reads filtered
    13:06:13.342 INFO  ProgressMeter -      hs37d5:35475803            148.5              10614927          71477.4
    13:06:13.342 INFO  ProgressMeter - Traversal complete. Processed 10614927 total regions in 148.5 minutes.
    13:06:13.918 INFO  VectorLoglessPairHMM - Time spent in setup for JNI call : 5.061114558
    13:06:13.918 INFO  PairHMM - Total compute time in PairHMM computeLogLikelihoods() : 6159.711480608001
    13:06:13.918 INFO  SmithWatermanAligner - Total compute time in java Smith-Waterman : 400.05 sec
    13:06:13.919 INFO  Mutect2 - Shutting down engine
    [October 16, 2021 1:06:13 PM CEST] org.broadinstitute.hellbender.tools.walkers.mutect.Mutect2 done. Elapsed time: 148.59 minutes.
    Runtime.totalMemory()=8131706880

     

  • Jenifer

    Genevieve Brandt (she/her) Thanks for the clarification, I understand. I'll remove variants of this kind, then.
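
Removing such sites can be done by hand or with a small post-hoc filter. The sketch below is one possible approach and not a GATK feature: `drop_str_with_normal_alt` is a hypothetical helper that reads an uncompressed, filtered VCF on stdin and drops PASS records that carry the STR flag in INFO and have at least a chosen number of alt reads in the matched normal. The genotype column index of the normal and the read threshold are assumptions to adjust for your data.

```shell
#!/usr/bin/env bash
# Drop PASS records flagged STR in INFO when the matched normal has
# at least MIN_ALT alt-supporting reads. The default threshold is illustrative.
# usage: drop_str_with_normal_alt NORMAL_INDEX [MIN_ALT] < filtered.vcf
drop_str_with_normal_alt() {
  awk -F'\t' -v normal="$1" -v min_alt="${2:-2}" '
    /^#/ { print; next }                           # keep header lines untouched
    {
      is_str = ($8 ~ /(^|;)STR(;|$)/)              # STR flag present in INFO
      n = split($9, fmt, ":")
      ad_idx = 0
      for (i = 1; i <= n; i++) if (fmt[i] == "AD") ad_idx = i
      alt = 0
      if (ad_idx > 0) {
        split($(9 + normal), val, ":")             # genotype column of the normal
        split(val[ad_idx], ad, ",")
        alt = ad[2] + 0                            # alt reads in the normal
      }
      if ($7 == "PASS" && is_str && alt >= min_alt) next   # drop the suspicious call
      print
    }'
}
```

With the sample order from this thread (normal as the third genotype column), a call such as `drop_str_with_normal_alt 3 6 < filtered.vcf > cleaned.vcf` would drop the 19:9047457 record, which has six alt reads in the normal. The file names here are placeholders.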
