Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data

Empty VCF file from Mutect2 v. GATK 4.1.6.0 in tumor only mode

7 comments

  • Genevieve Brandt (she/her)

    Hello P M, is there any chance that your input BAM or reference files are empty?

    Could you send the error log of just running Mutect2?

    We do not provide support for minimap2 so I am not able to determine if there are issues with that part of your pipeline. Here is our support policy.

  • P M

    Thanks Genevieve Brandt (she/her) for your reply.

    Neither my BAM nor my reference file is empty.

    From the log, the problem line appears to be

    265 read(s) filtered by: WellformedReadFilter

    265 total reads filtered

    However, I can't work out why my reads are failing this filter. Do you have any further insights?

    Here's my log from Mutect2. 

        java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /sysapps/cluster/software/GATK/4.1.8.1-Java-1.8.0_92/gatk-package-4.1.8.1-local.jar Mutect2 -R reference.fa -I sample1_rg.bam -O sample1_rg.vcf.gz

    17:14:03.165 INFO  NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/sysapps/cluster/software/GATK/4.1.8.1-Java-1.8.0_92/gatk-package-4.1.8.1-local.jar!/com/intel/gkl/native/libgkl_compression.so

    Sep 29, 2020 5:14:03 PM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine

    INFO: Failed to detect whether we are running on Google Compute Engine.

    17:14:03.572 INFO  Mutect2 - ------------------------------------------------------------

    17:14:03.572 INFO  Mutect2 - The Genome Analysis Toolkit (GATK) v4.1.8.1

    17:14:03.573 INFO  Mutect2 - For support and documentation go to https://software.broadinstitute.org/gatk/

    17:14:03.588 INFO  Mutect2 - Executing as mudvarip2@ai-hpcn106.cm.cluster on Linux v3.10.0-327.36.1.el7.x86_64 amd64

    17:14:03.588 INFO  Mutect2 - Java runtime: Java HotSpot(TM) 64-Bit Server VM v1.8.0_92-b14

    17:14:03.588 INFO  Mutect2 - Start Date/Time: September 29, 2020 5:14:03 PM EDT

    17:14:03.588 INFO  Mutect2 - ------------------------------------------------------------

    17:14:03.588 INFO  Mutect2 - ------------------------------------------------------------

    17:14:03.589 INFO  Mutect2 - HTSJDK Version: 2.23.0

    17:14:03.589 INFO  Mutect2 - Picard Version: 2.22.8

    17:14:03.589 INFO  Mutect2 - HTSJDK Defaults.COMPRESSION_LEVEL : 2

    17:14:03.589 INFO  Mutect2 - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false

    17:14:03.589 INFO  Mutect2 - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true

    17:14:03.589 INFO  Mutect2 - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false

    17:14:03.589 INFO  Mutect2 - Deflater: IntelDeflater

    17:14:03.589 INFO  Mutect2 - Inflater: IntelInflater

    17:14:03.590 INFO  Mutect2 - GCS max retries/reopens: 20

    17:14:03.590 INFO  Mutect2 - Requester pays: disabled

    17:14:03.590 INFO  Mutect2 - Initializing engine

    17:14:04.265 INFO  Mutect2 - Done initializing engine

    17:14:04.311 INFO  NativeLibraryLoader - Loading libgkl_utils.so from jar:file:/sysapps/cluster/software/GATK/4.1.8.1-Java-1.8.0_92/gatk-package-4.1.8.1-local.jar!/com/intel/gkl/native/libgkl_utils.so

    17:14:04.316 INFO  NativeLibraryLoader - Loading libgkl_pairhmm_omp.so from jar:file:/sysapps/cluster/software/GATK/4.1.8.1-Java-1.8.0_92/gatk-package-4.1.8.1-local.jar!/com/intel/gkl/native/libgkl_pairhmm_omp.so

    17:14:04.356 WARN  NativeLibraryLoader - Unable to load libgkl_pairhmm_omp.so from native/libgkl_pairhmm_omp.so (/tmp/libgkl_pairhmm_omp90150183072861089.so: /sysapps/cluster/software/GCC/4.8.4/lib64/libgomp.so.1: version `GOMP_4.0' not found (required by /tmp/libgkl_pairhmm_omp90150183072861089.so))

    17:14:04.356 INFO  PairHMM - OpenMP multi-threaded AVX-accelerated native PairHMM implementation is not supported

    17:14:04.356 INFO  NativeLibraryLoader - Loading libgkl_pairhmm.so from jar:file:/sysapps/cluster/software/GATK/4.1.8.1-Java-1.8.0_92/gatk-package-4.1.8.1-local.jar!/com/intel/gkl/native/libgkl_pairhmm.so

    17:14:04.472 INFO  IntelPairHmm - Flush-to-zero (FTZ) is enabled when running PairHMM

    17:14:04.473 WARN  IntelPairHmm - Ignoring request for 4 threads; not using OpenMP implementation

    17:14:04.473 INFO  PairHMM - Using the AVX-accelerated native PairHMM implementation

    17:14:04.546 INFO  ProgressMeter - Starting traversal

    17:14:04.546 INFO  ProgressMeter -        Current Locus  Elapsed Minutes     Regions Processed   Regions/Minute

    17:14:04.812 INFO  Mutect2 - 0 read(s) filtered by: MappingQualityReadFilter

    0 read(s) filtered by: MappingQualityAvailableReadFilter

    0 read(s) filtered by: MappingQualityNotZeroReadFilter

    0 read(s) filtered by: MappedReadFilter

    0 read(s) filtered by: NotSecondaryAlignmentReadFilter

    0 read(s) filtered by: NotDuplicateReadFilter

    0 read(s) filtered by: PassesVendorQualityCheckReadFilter

    0 read(s) filtered by: NonChimericOriginalAlignmentReadFilter

    0 read(s) filtered by: NonZeroReferenceLengthAlignmentReadFilter

    0 read(s) filtered by: ReadLengthReadFilter

    0 read(s) filtered by: GoodCigarReadFilter

    265 read(s) filtered by: WellformedReadFilter

    265 total reads filtered

    17:14:04.813 INFO  ProgressMeter -       Reference:2701              0.0                    19           4285.7

    17:14:04.813 INFO  ProgressMeter - Traversal complete. Processed 19 total regions in 0.0 minutes.

    17:14:04.841 INFO  VectorLoglessPairHMM - Time spent in setup for JNI call : 0.0

    17:14:04.841 INFO  PairHMM - Total compute time in PairHMM computeLogLikelihoods() : 0.0

    17:14:04.841 INFO  SmithWatermanAligner - Total compute time in java Smith-Waterman : 0.00 sec

    17:14:04.841 INFO  Mutect2 - Shutting down engine

    [September 29, 2020 5:14:04 PM EDT] org.broadinstitute.hellbender.tools.walkers.mutect.Mutect2 done. Elapsed time: 0.03 minutes.

    Runtime.totalMemory()=2491940864

  • Genevieve Brandt (she/her)

    P M, here is the documentation on the WellformedReadFilter. You can check whether your BAM file has any issues with ValidateSamFile.

  • P M

    Genevieve Brandt (she/her) Yes I have looked at the description of the WellformedReadFilter. I don't know which criteria my reads are not fulfilling. I have also used ValidateSamFile to check my BAM. There were no issues there. 

    I ran Mutect2 with the WellformedReadFilter disabled and got a java.lang.ArrayIndexOutOfBoundsException: 251 error. Here's the log from the run with the filter disabled.

    Running:

        java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /sysapps/cluster/software/GATK/4.1.8.1-Java-1.8.0_92/gatk-package-4.1.8.1-local.jar Mutect2 -DF WellformedReadFilter -R reference.fa -I sample1_rg.bam -O sample1_rg.vcf.gz

    22:12:32.393 INFO  NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/sysapps/cluster/software/GATK/4.1.8.1-Java-1.8.0_92/gatk-package-4.1.8.1-local.jar!/com/intel/gkl/native/libgkl_compression.so

    Sep 29, 2020 10:12:32 PM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine

    INFO: Failed to detect whether we are running on Google Compute Engine.

    22:12:32.660 INFO  Mutect2 - ------------------------------------------------------------

    22:12:32.660 INFO  Mutect2 - The Genome Analysis Toolkit (GATK) v4.1.8.1

    22:12:32.660 INFO  Mutect2 - For support and documentation go to https://software.broadinstitute.org/gatk/

    22:12:32.663 INFO  Mutect2 - Executing as mudvarip2@ai-hpcn157.cm.cluster on Linux v3.10.0-327.36.1.el7.x86_64 amd64

    22:12:32.663 INFO  Mutect2 - Java runtime: Java HotSpot(TM) 64-Bit Server VM v1.8.0_92-b14

    22:12:32.663 INFO  Mutect2 - Start Date/Time: September 29, 2020 10:12:32 PM EDT

    22:12:32.663 INFO  Mutect2 - ------------------------------------------------------------

    22:12:32.663 INFO  Mutect2 - ------------------------------------------------------------

    22:12:32.663 INFO  Mutect2 - HTSJDK Version: 2.23.0

    22:12:32.664 INFO  Mutect2 - Picard Version: 2.22.8

    22:12:32.664 INFO  Mutect2 - HTSJDK Defaults.COMPRESSION_LEVEL : 2

    22:12:32.664 INFO  Mutect2 - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false

    22:12:32.664 INFO  Mutect2 - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true

    22:12:32.664 INFO  Mutect2 - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false

    22:12:32.664 INFO  Mutect2 - Deflater: IntelDeflater

    22:12:32.664 INFO  Mutect2 - Inflater: IntelInflater

    22:12:32.664 INFO  Mutect2 - GCS max retries/reopens: 20

    22:12:32.664 INFO  Mutect2 - Requester pays: disabled

    22:12:32.664 INFO  Mutect2 - Initializing engine

    22:12:33.171 INFO  Mutect2 - Done initializing engine

    22:12:33.188 INFO  NativeLibraryLoader - Loading libgkl_utils.so from jar:file:/sysapps/cluster/software/GATK/4.1.8.1-Java-1.8.0_92/gatk-package-4.1.8.1-local.jar!/com/intel/gkl/native/libgkl_utils.so

    22:12:33.194 INFO  NativeLibraryLoader - Loading libgkl_pairhmm_omp.so from jar:file:/sysapps/cluster/software/GATK/4.1.8.1-Java-1.8.0_92/gatk-package-4.1.8.1-local.jar!/com/intel/gkl/native/libgkl_pairhmm_omp.so

    22:12:33.226 WARN  NativeLibraryLoader - Unable to load libgkl_pairhmm_omp.so from native/libgkl_pairhmm_omp.so (/tmp/libgkl_pairhmm_omp1727963993330396650.so: /sysapps/cluster/software/GCC/4.8.4/lib64/libgomp.so.1: version `GOMP_4.0' not found (required by /tmp/libgkl_pairhmm_omp1727963993330396650.so))

    22:12:33.226 INFO  PairHMM - OpenMP multi-threaded AVX-accelerated native PairHMM implementation is not supported

    22:12:33.227 INFO  NativeLibraryLoader - Loading libgkl_pairhmm.so from jar:file:/sysapps/cluster/software/GATK/4.1.8.1-Java-1.8.0_92/gatk-package-4.1.8.1-local.jar!/com/intel/gkl/native/libgkl_pairhmm.so

    22:12:33.305 INFO  IntelPairHmm - Flush-to-zero (FTZ) is enabled when running PairHMM

    22:12:33.306 WARN  IntelPairHmm - Ignoring request for 4 threads; not using OpenMP implementation

    22:12:33.306 INFO  PairHMM - Using the AVX-accelerated native PairHMM implementation

    22:12:33.387 INFO  ProgressMeter - Starting traversal

    22:12:33.389 INFO  ProgressMeter -        Current Locus  Elapsed Minutes     Regions Processed   Regions/Minute

    22:12:33.498 INFO  VectorLoglessPairHMM - Time spent in setup for JNI call : 0.0

    22:12:33.499 INFO  PairHMM - Total compute time in PairHMM computeLogLikelihoods() : 0.0

    22:12:33.499 INFO  SmithWatermanAligner - Total compute time in java Smith-Waterman : 0.00 sec

    22:12:33.500 INFO  Mutect2 - Shutting down engine

    [September 29, 2020 10:12:33 PM EDT] org.broadinstitute.hellbender.tools.walkers.mutect.Mutect2 done. Elapsed time: 0.02 minutes.

    Runtime.totalMemory()=2138046464

    java.lang.ArrayIndexOutOfBoundsException: 251

    at org.broadinstitute.hellbender.utils.read.SAMRecordToGATKReadAdapter.getBaseQuality(SAMRecordToGATKReadAdapter.java:304)

    at org.broadinstitute.hellbender.utils.pileup.PileupElement.getQual(PileupElement.java:161)

    at org.broadinstitute.hellbender.tools.walkers.mutect.Mutect2Engine.isNextToUsefulSoftClip(Mutect2Engine.java:537)

    at org.broadinstitute.hellbender.tools.walkers.mutect.Mutect2Engine.altQuals(Mutect2Engine.java:465)

    at org.broadinstitute.hellbender.tools.walkers.mutect.Mutect2Engine.isActive(Mutect2Engine.java:375)

    at org.broadinstitute.hellbender.engine.AssemblyRegionIterator.loadNextAssemblyRegion(AssemblyRegionIterator.java:136)

    at org.broadinstitute.hellbender.engine.AssemblyRegionIterator.<init>(AssemblyRegionIterator.java:96)

    at org.broadinstitute.hellbender.engine.AssemblyRegionWalker.processReadShard(AssemblyRegionWalker.java:188)

    at org.broadinstitute.hellbender.engine.AssemblyRegionWalker.traverse(AssemblyRegionWalker.java:173)

    at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:1049)

    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:140)

    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:192)

    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:211)

    at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)

    at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)

    at org.broadinstitute.hellbender.Main.main(Main.java:289)

  • Genevieve Brandt (she/her)

    P M, I don't think disabling that filter is the best solution. According to our documentation, a read that fails that filter has "major internal inconsistencies and issues that could lead to errors downstream".

    How many reads are in your input file for Mutect2?

  • P M

    Genevieve Brandt (she/her) I have 265 reads for this sample. It is amplicon sequencing. 

  • Genevieve Brandt (she/her)

    P M, do you have reads that pass the criteria listed in the WellformedReadFilter documentation but are still being filtered?

    • Alignment coordinates: start larger than 0 and end after the start position.
    • Alignment agrees with header: contig exists and start is within its range.
    • Read Group and Sequence are present.
    • Consistent read length: bases match in length with the qualities and the CIGAR string.
    • Do not contain skipped regions: represented by the 'N' operator in the CIGAR string.
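
The criteria listed above can be checked read by read outside of GATK. What follows is a minimal Python sketch, not GATK's own implementation: it works on SAM text (for example the output of `samtools view -h`), and the names `contig_lengths`, `wellformed_problems` and `report` are hypothetical helpers made up for this illustration. Two interpretations are assumptions: "end after the start" is read as the alignment covering at least one reference base, and a missing quality string (`*`) is reported as its own problem because it cannot match the bases in length.

```python
import re
from collections import Counter
from typing import Dict, Iterable, List

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
READ_CONSUMING = set("MIS=X")  # CIGAR operators that consume read bases
REF_CONSUMING = set("MDN=X")   # CIGAR operators that consume reference bases


def contig_lengths(header_lines: Iterable[str]) -> Dict[str, int]:
    """Collect contig name -> length from @SQ header lines."""
    contigs = {}
    for line in header_lines:
        if line.startswith("@SQ"):
            tags = dict(f.split(":", 1) for f in line.rstrip("\n").split("\t")[1:])
            contigs[tags["SN"]] = int(tags["LN"])
    return contigs


def wellformed_problems(sam_line: str, contigs: Dict[str, int]) -> List[str]:
    """Return the criteria (as short labels) that one SAM record fails."""
    f = sam_line.rstrip("\n").split("\t")
    flag, rname, pos, cigar = int(f[1]), f[2], int(f[3]), f[5]
    seq, qual, tags = f[9], f[10], f[11:]
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    problems = []

    if not flag & 0x4:  # alignment checks only make sense for mapped reads
        ref_span = sum(n for n, op in ops if op in REF_CONSUMING)
        if pos < 1 or ref_span < 1:  # assumption: end must not precede start
            problems.append("bad alignment coordinates")
        if rname not in contigs:
            problems.append("contig not in header")
        elif pos > contigs[rname]:
            problems.append("start beyond contig end")

    if not any(t.startswith("RG:Z:") for t in tags):
        problems.append("no read group")
    if seq == "*":
        problems.append("no sequence")
    else:
        if qual == "*":  # assumption: absent qualities count as inconsistent
            problems.append("no base qualities")
        elif len(qual) != len(seq):
            problems.append("bases/qualities length mismatch")
        if ops and sum(n for n, op in ops if op in READ_CONSUMING) != len(seq):
            problems.append("bases/CIGAR length mismatch")

    if any(op == "N" for _, op in ops):
        problems.append("CIGAR contains N")
    return problems


def report(sam_lines: Iterable[str]) -> Counter:
    """Tally failed criteria over SAM text that includes its header."""
    contigs: Dict[str, int] = {}
    counts: Counter = Counter()
    for line in sam_lines:
        if line.startswith("@"):
            contigs.update(contig_lengths([line]))
        elif line.strip():
            counts.update(wellformed_problems(line, contigs))
    return counts
```

Calling `report(sys.stdin)` on the piped output of `samtools view -h sample1_rg.bam` would give a tally per failed criterion, which may help narrow down which check the 265 reads are tripping.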