Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data


Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Empty final PON vcf file from 7 samples

  • Genevieve Brandt (she/her)

    Hi Yun Yu,

    Could you get more specific and let us know in which step the problem is occurring? Here are the details regarding information we need to solve problems.

    Genevieve

  • Yun Yu

    Hi Genevieve,

    I followed the three recommended steps for creating a PoN file. First, I ran Mutect2 in tumour-only mode on the normal samples; this step succeeded. Then I used the GenomicsDBImport tool to create a GenomicsDB workspace; I think this step also succeeded. Finally, I ran CreateSomaticPanelOfNormals and got a final VCF file, but it contains only the header and no variants.

    gatk --java-options "-Xmx16G -XX:+UseParallelGC -XX:ParallelGCThreads=4" Mutect2 -R $reference -I $normal_bam -max-mnp-distance 0 -O ${sample}.mutect2.vcf.gz

    gatk --java-options "-Xmx16G -XX:+UseParallelGC -XX:ParallelGCThreads=4" GenomicsDBImport -R $reference -L canfam3.1.chr1.intervals --genomicsdb-workspace-path glp_tumour_genome/mutect2/pondb_chr1 -V DogWUR115.mutect2.vcf.gz -V DogWUR116.mutect2.vcf.gz -V DogWUR117.mutect2.vcf.gz -V DogWUR118.mutect2.vcf.gz -V DogWUR119.mutect2.vcf.gz -V DogWUR120.mutect2.vcf.gz -V DogWUR91.mutect2.vcf.gz

    gatk --java-options "-Xmx16G -XX:+UseParallelGC -XX:ParallelGCThreads=4" CreateSomaticPanelOfNormals -R $reference --min-sample-count 2 --germline-resource ${known_sites} -V gendb://pondb_chr1 -O Final_pon_chr1.vcf.gz

    The interval file covers only the first chromosome. By inspecting the VCF files from each normal sample, I confirmed that there are SNPs shared across samples. However, the final PoN VCF is empty, which confuses me.
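
The manual inspection described above can also be done with a short script. The following is a minimal sketch, not part of the original post: the helper name `shared_sites` is hypothetical, and it assumes the inputs are gzip-compatible (bgzipped) VCFs whose sites can be compared by CHROM, POS, REF and ALT.

```shell
# shared_sites MIN_COUNT FILE...
# Prints the number of distinct CHROM:POS:REF:ALT sites that occur in at
# least MIN_COUNT of the given gzipped VCF files.
# (Hypothetical helper, for illustration only.)
shared_sites() {
    min_count=$1
    shift
    for vcf in "$@"; do
        # One key per distinct site per file; header lines are skipped.
        gzip -dc "$vcf" |
            awk -F '\t' '!/^#/ { print $1 ":" $2 ":" $4 ":" $5 }' |
            sort -u
    done |
        sort |
        uniq -c |
        awk -v min="$min_count" '$1 >= min { n++ } END { print n + 0 }'
}

# Example, using file names from the GenomicsDBImport command:
# shared_sites 2 DogWUR115.mutect2.vcf.gz DogWUR116.mutect2.vcf.gz DogWUR117.mutect2.vcf.gz
```

A non-zero result with a threshold of 2 would indicate that at least some sites should satisfy `--min-sample-count 2`, before any other filtering the tool applies.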

    The log info is as follows.

    Dec 17, 2020 2:01:27 PM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
    INFO: Failed to detect whether we are running on Google Compute Engine.
    14:01:27.566 INFO GenomicsDBImport - ------------------------------------------------------------
    14:01:27.566 INFO GenomicsDBImport - The Genome Analysis Toolkit (GATK) v4.1.8.1
    14:01:27.566 INFO GenomicsDBImport - For support and documentation go to https://software.broadinstitute.org/gatk/
    14:01:27.567 INFO GenomicsDBImport - Executing as yu052@node105 on Linux v3.10.0-957.el7.x86_64 amd64
    14:01:27.567 INFO GenomicsDBImport - Java runtime: Java HotSpot(TM) 64-Bit Server VM v1.8.0_201-b09
    14:01:27.567 INFO GenomicsDBImport - Start Date/Time: December 17, 2020 2:01:27 PM CET
    14:01:27.567 INFO GenomicsDBImport - ------------------------------------------------------------
    14:01:27.567 INFO GenomicsDBImport - ------------------------------------------------------------
    14:01:27.568 INFO GenomicsDBImport - HTSJDK Version: 2.23.0
    14:01:27.568 INFO GenomicsDBImport - Picard Version: 2.22.8
    14:01:27.568 INFO GenomicsDBImport - HTSJDK Defaults.COMPRESSION_LEVEL : 2
    14:01:27.568 INFO GenomicsDBImport - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
    14:01:27.568 INFO GenomicsDBImport - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
    14:01:27.568 INFO GenomicsDBImport - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
    14:01:27.568 INFO GenomicsDBImport - Deflater: IntelDeflater
    14:01:27.568 INFO GenomicsDBImport - Inflater: IntelInflater
    14:01:27.568 INFO GenomicsDBImport - GCS max retries/reopens: 20
    14:01:27.568 INFO GenomicsDBImport - Requester pays: disabled
    14:01:27.568 INFO GenomicsDBImport - Initializing engine
    14:01:28.845 INFO IntervalArgumentCollection - Processing 122678785 bp from intervals
    14:01:28.942 INFO GenomicsDBImport - Done initializing engine
    14:01:29.231 INFO GenomicsDBLibLoader - GenomicsDB native library version : 1.3.0-e701905
    14:01:29.233 INFO GenomicsDBImport - Vid Map JSON file will be written to /lustre/nobackup/WUR/ABGC/yu052/wgs/glp_tumour_genome/mutect2/pondb_chr1/vidmap.json
    14:01:29.233 INFO GenomicsDBImport - Callset Map JSON file will be written to /lustre/nobackup/WUR/ABGC/yu052/wgs/glp_tumour_genome/mutect2/pondb_chr1/callset.json
    14:01:29.233 INFO GenomicsDBImport - Complete VCF Header will be written to /lustre/nobackup/WUR/ABGC/yu052/wgs/glp_tumour_genome/mutect2/pondb_chr1/vcfheader.vcf
    14:01:29.234 INFO GenomicsDBImport - Importing to workspace - /lustre/nobackup/WUR/ABGC/yu052/wgs/glp_tumour_genome/mutect2/pondb_chr1
    14:01:29.234 INFO ProgressMeter - Starting traversal
    14:01:29.234 INFO ProgressMeter - Current Locus Elapsed Minutes Batches Processed Batches/Minute
    14:01:29.807 INFO GenomicsDBImport - Importing batch 1 with 7 samples
    [E::vcf_parse_format] Invalid character '.' in 'AF' FORMAT field at 1:1862
    [E::vcf_parse_format] Invalid character '.' in 'AF' FORMAT field at 1:2094
    [E::vcf_parse_format] Invalid character '.' in 'AF' FORMAT field at 1:2877
    [E::vcf_parse_format] Invalid character '.' in 'AF' FORMAT field at 1:1221
    [E::vcf_parse_format] Invalid character '.' in 'AF' FORMAT field at 1:1862
    [E::vcf_parse_format] Invalid character '.' in 'AF' FORMAT field at 1:1862
    [E::vcf_parse_format] Invalid character '.' in 'AF' FORMAT field at 1:1862
    14:01:30.991 INFO GenomicsDBImport - Done importing batch 1/1
    14:01:30.992 INFO ProgressMeter - 1:1 0.0 1 34.1
    14:01:30.993 INFO ProgressMeter - Traversal complete. Processed 1 total batches in 0.0 minutes.
    14:01:30.993 INFO GenomicsDBImport - Import completed!
    14:01:30.993 INFO GenomicsDBImport - Shutting down engine
    [December 17, 2020 2:01:30 PM CET] org.broadinstitute.hellbender.tools.genomicsdb.GenomicsDBImport done. Elapsed time: 0.06 minutes.
    Runtime.totalMemory()=758120448
    Using GATK jar /lustre/nobackup/WUR/ABGC/yu052/program/gatk/gatk-4.1.8.1/gatk-package-4.1.8.1-local.jar
    Running:
    java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -Xmx16G -XX:+UseParallelGC -XX:ParallelGCThreads=4 -jar /lustre/nobackup/WUR/ABGC/yu052/program/gatk/gatk-4.1.8.1/gatk-package-4.1.8.1-local.jar CreateSomaticPanelOfNormals -R /lustre/backup/WUR/ABGC/yu052/CanFam3.1/ensembl/Canis_familiaris.CanFam3.1.dna.toplevel.fa --germline-resource /lustre/nobackup/WUR/ABGC/yu052/wgs/722g.990.SNP.INDEL.chrAll.vcf.chr.change.af.gz -V gendb://pondb_chr1 -O Final_pon_chr1.vcf.gz
    14:01:33.167 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/lustre/nobackup/WUR/ABGC/yu052/program/gatk/gatk-4.1.8.1/gatk-package-4.1.8.1-local.jar!/com/intel/gkl/native/libgkl_compression.so
    Dec 17, 2020 2:01:33 PM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
    INFO: Failed to detect whether we are running on Google Compute Engine.
    14:01:33.355 INFO CreateSomaticPanelOfNormals - ------------------------------------------------------------
    14:01:33.356 INFO CreateSomaticPanelOfNormals - The Genome Analysis Toolkit (GATK) v4.1.8.1
    14:01:33.356 INFO CreateSomaticPanelOfNormals - For support and documentation go to https://software.broadinstitute.org/gatk/
    14:01:33.356 INFO CreateSomaticPanelOfNormals - Executing as yu052@node105 on Linux v3.10.0-957.el7.x86_64 amd64
    14:01:33.356 INFO CreateSomaticPanelOfNormals - Java runtime: Java HotSpot(TM) 64-Bit Server VM v1.8.0_201-b09
    14:01:33.356 INFO CreateSomaticPanelOfNormals - Start Date/Time: December 17, 2020 2:01:33 PM CET
    14:01:33.356 INFO CreateSomaticPanelOfNormals - ------------------------------------------------------------
    14:01:33.356 INFO CreateSomaticPanelOfNormals - ------------------------------------------------------------
    14:01:33.357 INFO CreateSomaticPanelOfNormals - HTSJDK Version: 2.23.0
    14:01:33.357 INFO CreateSomaticPanelOfNormals - Picard Version: 2.22.8
    14:01:33.357 INFO CreateSomaticPanelOfNormals - HTSJDK Defaults.COMPRESSION_LEVEL : 2
    14:01:33.357 INFO CreateSomaticPanelOfNormals - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
    14:01:33.357 INFO CreateSomaticPanelOfNormals - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
    14:01:33.357 INFO CreateSomaticPanelOfNormals - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
    14:01:33.357 INFO CreateSomaticPanelOfNormals - Deflater: IntelDeflater
    14:01:33.357 INFO CreateSomaticPanelOfNormals - Inflater: IntelInflater
    14:01:33.357 INFO CreateSomaticPanelOfNormals - GCS max retries/reopens: 20
    14:01:33.357 INFO CreateSomaticPanelOfNormals - Requester pays: disabled
    14:01:33.357 WARN CreateSomaticPanelOfNormals -

    !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

    Warning: CreateSomaticPanelOfNormals is a BETA tool and is not yet ready for use in production

    !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!


    14:01:33.357 INFO CreateSomaticPanelOfNormals - Initializing engine
    14:01:33.745 INFO FeatureManager - Using codec VCFCodec to read file file:///lustre/nobackup/WUR/ABGC/yu052/wgs/722g.990.SNP.INDEL.chrAll.vcf.chr.change.af.gz
    14:01:34.131 INFO GenomicsDBLibLoader - GenomicsDB native library version : 1.3.0-e701905
    14:01:34.223 info NativeGenomicsDB - pid=93228 tid=93229 No valid combination operation found for INFO field AS_UNIQ_ALT_READ_COUNT - the field will NOT be part of INFO fields in the generated VCF records
    14:01:34.223 info NativeGenomicsDB - pid=93228 tid=93229 No valid combination operation found for INFO field CONTQ - the field will NOT be part of INFO fields in the generated VCF records
    14:01:34.223 info NativeGenomicsDB - pid=93228 tid=93229 No valid combination operation found for INFO field ECNT - the field will NOT be part of INFO fields in the generated VCF records
    14:01:34.223 info NativeGenomicsDB - pid=93228 tid=93229 No valid combination operation found for INFO field GERMQ - the field will NOT be part of INFO fields in the generated VCF records
    14:01:34.223 info NativeGenomicsDB - pid=93228 tid=93229 No valid combination operation found for INFO field MBQ - the field will NOT be part of INFO fields in the generated VCF records
    14:01:34.223 info NativeGenomicsDB - pid=93228 tid=93229 No valid combination operation found for INFO field MFRL - the field will NOT be part of INFO fields in the generated VCF records
    14:01:34.223 info NativeGenomicsDB - pid=93228 tid=93229 No valid combination operation found for INFO field MMQ - the field will NOT be part of INFO fields in the generated VCF records
    14:01:34.223 info NativeGenomicsDB - pid=93228 tid=93229 No valid combination operation found for INFO field MPOS - the field will NOT be part of INFO fields in the generated VCF records
    14:01:34.223 info NativeGenomicsDB - pid=93228 tid=93229 No valid combination operation found for INFO field NALOD - the field will NOT be part of INFO fields in the generated VCF records
    14:01:34.223 info NativeGenomicsDB - pid=93228 tid=93229 No valid combination operation found for INFO field NCount - the field will NOT be part of INFO fields in the generated VCF records
    14:01:34.223 info NativeGenomicsDB - pid=93228 tid=93229 No valid combination operation found for INFO field NLOD - the field will NOT be part of INFO fields in the generated VCF records
    14:01:34.223 info NativeGenomicsDB - pid=93228 tid=93229 No valid combination operation found for INFO field OCM - the field will NOT be part of INFO fields in the generated VCF records
    14:01:34.223 info NativeGenomicsDB - pid=93228 tid=93229 No valid combination operation found for INFO field PON - the field will NOT be part of INFO fields in the generated VCF records
    14:01:34.223 info NativeGenomicsDB - pid=93228 tid=93229 No valid combination operation found for INFO field POPAF - the field will NOT be part of INFO fields in the generated VCF records
    14:01:34.223 info NativeGenomicsDB - pid=93228 tid=93229 No valid combination operation found for INFO field ROQ - the field will NOT be part of INFO fields in the generated VCF records
    14:01:34.223 info NativeGenomicsDB - pid=93228 tid=93229 No valid combination operation found for INFO field RPA - the field will NOT be part of INFO fields in the generated VCF records
    14:01:34.223 info NativeGenomicsDB - pid=93228 tid=93229 No valid combination operation found for INFO field RU - the field will NOT be part of INFO fields in the generated VCF records
    14:01:34.223 info NativeGenomicsDB - pid=93228 tid=93229 No valid combination operation found for INFO field SEQQ - the field will NOT be part of INFO fields in the generated VCF records
    14:01:34.223 info NativeGenomicsDB - pid=93228 tid=93229 No valid combination operation found for INFO field STR - the field will NOT be part of INFO fields in the generated VCF records
    14:01:34.223 info NativeGenomicsDB - pid=93228 tid=93229 No valid combination operation found for INFO field STRANDQ - the field will NOT be part of INFO fields in the generated VCF records
    14:01:34.224 info NativeGenomicsDB - pid=93228 tid=93229 No valid combination operation found for INFO field STRQ - the field will NOT be part of INFO fields in the generated VCF records
    14:01:34.224 info NativeGenomicsDB - pid=93228 tid=93229 No valid combination operation found for INFO field TLOD - the field will NOT be part of INFO fields in the generated VCF records
    14:01:34.350 INFO CreateSomaticPanelOfNormals - Done initializing engine
    14:01:34.405 INFO ProgressMeter - Starting traversal
    14:01:34.405 INFO ProgressMeter - Current Locus Elapsed Minutes Variants Processed Variants/Minute
    GENOMICSDB_TIMER,GenomicsDB iterator next() timer,Wall-clock time(s),0.0,Cpu time(s),0.0
    14:01:34.690 INFO ProgressMeter - unmapped 0.0 0 0.0
    14:01:34.690 INFO ProgressMeter - Traversal complete. Processed 0 total variants in 0.0 minutes.
    14:01:34.703 INFO CreateSomaticPanelOfNormals - Shutting down engine
    [December 17, 2020 2:01:34 PM CET] org.broadinstitute.hellbender.tools.walkers.mutect.CreateSomaticPanelOfNormals done. Elapsed time: 0.03 minutes.
    Runtime.totalMemory()=479199232
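
The GenomicsDBImport log above contains htslib messages of the form `[E::vcf_parse_format] Invalid character '.' in 'AF' FORMAT field`. The thread does not establish whether these messages are related to the empty output. As a purely diagnostic sketch, the hypothetical helper below prints how a VCF header declares the AF FORMAT field, so that the declarations in the seven inputs can be compared with each other.

```shell
# af_declaration FILE
# Prints the header line(s) that declare the FORMAT field AF in a gzipped VCF.
# Reading stops at the first record, so large files are not scanned in full.
# (Hypothetical helper, for illustration only.)
af_declaration() {
    gzip -dc "$1" | awk '/^##FORMAT=<ID=AF,/ { print } !/^#/ { exit }'
}

# Example, looping over the per-sample Mutect2 outputs:
# for f in DogWUR*.mutect2.vcf.gz; do
#     printf '%s\t' "$f"
#     af_declaration "$f"
# done
```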

  • Genevieve Brandt (she/her)

    Hi Yun Yu,

    Thanks for all of the information, it is very helpful! Could you check if the variants are present after the GenomicsDBImport step? Use SelectVariants to confirm it was successful and see the combined VCF.
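
A check along these lines could be sketched as follows. The function names and the output file name are hypothetical; the `gendb://` prefix and the `-R`, `-V` and `-O` arguments are the same ones used in the commands earlier in this thread.

```shell
# export_workspace REFERENCE WORKSPACE OUTPUT
# Writes the contents of a GenomicsDB workspace to a VCF with SelectVariants.
# (Hypothetical wrapper, for illustration only.)
export_workspace() {
    gatk SelectVariants -R "$1" -V "gendb://$2" -O "$3"
}

# count_records FILE
# Prints the number of non-header lines in a gzipped VCF.
count_records() {
    gzip -dc "$1" | grep -vc '^#'
}

# Example (the output name pondb_chr1.combined.vcf.gz is made up):
# export_workspace "$reference" glp_tumour_genome/mutect2/pondb_chr1 pondb_chr1.combined.vcf.gz
# count_records pondb_chr1.combined.vcf.gz
```

A count of 0 would mean the workspace holds no variant records for the imported interval, which points at the import step rather than at CreateSomaticPanelOfNormals.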

  • Yun Yu

    Hi Genevieve,

    Thanks for your reply!

    I got an empty combined VCF when running SelectVariants on the GenomicsDB workspace. So it seems that the GenomicsDBImport step failed, even though some files are present in the GenomicsDB workspace path. How can I resolve this problem?

  • Genevieve Brandt (she/her)

    No problem, glad I can help!

    Could you also check that the first Mutect2 step was successful? Check the ${sample}.mutect2.vcf.gz files. You may also want to run one file at a time so you can examine the stack trace.

     

  • Yun Yu

    I checked the ${sample}.mutect2.vcf.gz files, and I think the first Mutect2 step was successful. The VCF files are a few Mb in size and contain SNPs.

  • Genevieve Brandt (she/her)

    Could you share the stack trace to verify? I don't see any issues from the GenomicsDBImport step from what you have shared so far.

  • Yun Yu

    load bwa 0.7.15 (gcc) library and binaries.
    load htslib 1.9 (GCC) library and binaries.
    load samtools 1.9 (GCC) library and binaries.
    load java jdk 1.8.0_201 library and binaries.
    Using GATK jar /lustre/nobackup/WUR/ABGC/yu052/program/gatk/gatk-4.1.8.1/gatk-package-4.1.8.1-local.jar
    Running:
    java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -Xmx16G -XX:+UseParallelGC -XX:ParallelGCThreads=4 -jar /lustre/nobackup/WUR/ABGC/yu052/program/gatk/gatk-4.1.8.1/gatk-package-4.1.8.1-local.jar Mutect2 -R /lustre/backup/WUR/ABGC/yu052/CanFam3.1/ensembl/Canis_familiaris.CanFam3.1.dna.toplevel.fa -I /lustre/nobackup/WUR/ABGC/yu052/wgs/glp_tumour_genome/DogWUR91/DogWUR91.sorted.Mardup.bqsr.bam -max-mnp-distance 0 -O DogWUR91.mutect2.vcf.gz
    08:46:55.467 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/lustre/nobackup/WUR/ABGC/yu052/program/gatk/gatk-4.1.8.1/gatk-package-4.1.8.1-local.jar!/com/intel/gkl/native/libgkl_compression.so
    Dec 15, 2020 8:46:55 AM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
    INFO: Failed to detect whether we are running on Google Compute Engine.
    08:46:55.672 INFO Mutect2 - ------------------------------------------------------------
    08:46:55.672 INFO Mutect2 - The Genome Analysis Toolkit (GATK) v4.1.8.1
    08:46:55.672 INFO Mutect2 - For support and documentation go to https://software.broadinstitute.org/gatk/
    08:46:55.672 INFO Mutect2 - Executing as yu052@node100 on Linux v3.10.0-957.el7.x86_64 amd64
    08:46:55.672 INFO Mutect2 - Java runtime: Java HotSpot(TM) 64-Bit Server VM v1.8.0_201-b09
    08:46:55.672 INFO Mutect2 - Start Date/Time: December 15, 2020 8:46:55 AM CET
    08:46:55.672 INFO Mutect2 - ------------------------------------------------------------
    08:46:55.673 INFO Mutect2 - ------------------------------------------------------------
    08:46:55.673 INFO Mutect2 - HTSJDK Version: 2.23.0
    08:46:55.673 INFO Mutect2 - Picard Version: 2.22.8
    08:46:55.673 INFO Mutect2 - HTSJDK Defaults.COMPRESSION_LEVEL : 2
    08:46:55.673 INFO Mutect2 - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
    08:46:55.673 INFO Mutect2 - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
    08:46:55.673 INFO Mutect2 - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
    08:46:55.673 INFO Mutect2 - Deflater: IntelDeflater
    08:46:55.673 INFO Mutect2 - Inflater: IntelInflater
    08:46:55.673 INFO Mutect2 - GCS max retries/reopens: 20
    08:46:55.673 INFO Mutect2 - Requester pays: disabled
    08:46:55.673 INFO Mutect2 - Initializing engine
    08:46:56.215 INFO Mutect2 - Done initializing engine
    08:46:56.266 INFO NativeLibraryLoader - Loading libgkl_utils.so from jar:file:/lustre/nobackup/WUR/ABGC/yu052/program/gatk/gatk-4.1.8.1/gatk-package-4.1.8.1-local.jar!/com/intel/gkl/native/libgkl_utils.so
    08:46:56.267 INFO NativeLibraryLoader - Loading libgkl_pairhmm_omp.so from jar:file:/lustre/nobackup/WUR/ABGC/yu052/program/gatk/gatk-4.1.8.1/gatk-package-4.1.8.1-local.jar!/com/intel/gkl/native/libgkl_pairhmm_omp.so
    08:46:56.319 INFO IntelPairHmm - Using CPU-supported AVX-512 instructions
    08:46:56.319 INFO IntelPairHmm - Flush-to-zero (FTZ) is enabled when running PairHMM
    08:46:56.319 INFO IntelPairHmm - Available threads: 4
    08:46:56.319 INFO IntelPairHmm - Requested threads: 4
    08:46:56.319 INFO PairHMM - Using the OpenMP multi-threaded AVX-accelerated native PairHMM implementation
    08:46:57.170 INFO ProgressMeter - Starting traversal
    08:46:57.170 INFO ProgressMeter - Current Locus Elapsed Minutes Regions Processed Regions/Minute
    08:47:07.264 INFO ProgressMeter - 1:34912 0.2 220 1307.7
    08:47:19.699 INFO ProgressMeter - 1:77275 0.4 480 1278.4
    08:47:29.833 INFO ProgressMeter - 1:170481 0.5 860 1579.8
    08:47:39.835 INFO ProgressMeter - 1:377990 0.7 2000 2812.6
    08:47:49.852 INFO ProgressMeter - 1:762309 0.9 3910 4453.1
    08:47:59.855 INFO ProgressMeter - 1:1048975 1.0 5600 5360.1
    08:48:09.858 INFO ProgressMeter - 1:1653442 1.2 8230 6793.4
    08:48:20.024 INFO ProgressMeter - 1:2228589 1.4 10830 7842.8
    08:48:30.024 INFO ProgressMeter - 1:2811313 1.5 13490 8716.9
    08:48:40.038 INFO ProgressMeter - 1:3433598 1.7 16120 9402.3
    08:48:50.427 INFO ProgressMeter - 1:3968545 1.9 18610 9859.0

    ......

    13:58:48.287 INFO ProgressMeter - AAEX03025658.1:4043 1751.9 13347640 7619.2
    13:58:59.189 INFO ProgressMeter - AAEX03025670.1:2484 1752.0 13348010 7618.6
    13:59:10.608 INFO ProgressMeter - AAEX03025684.1:1319 1752.2 13348390 7618.0
    13:59:20.632 INFO ProgressMeter - AAEX03025691.1:770 1752.4 13348610 7617.4
    13:59:30.748 INFO ProgressMeter - AAEX03025705.1:901 1752.6 13349110 7616.9
    13:59:40.932 INFO ProgressMeter - AAEX03025718.1:1562 1752.7 13349430 7616.4
    13:59:51.187 INFO ProgressMeter - AAEX03025726.1:1434 1752.9 13349710 7615.8
    14:00:10.661 INFO ProgressMeter - AAEX03025726.1:4995 1753.2 13349730 7614.4
    14:00:20.663 INFO ProgressMeter - AAEX03025749.1:1 1753.4 13350350 7614.0
    14:00:30.720 INFO ProgressMeter - AAEX03025768.1:1053 1753.6 13350930 7613.6
    14:00:40.806 INFO ProgressMeter - AAEX03025782.1:2541 1753.7 13351370 7613.1
    14:00:52.133 INFO ProgressMeter - AAEX03025792.1:4310 1753.9 13351680 7612.5
    14:01:02.251 INFO ProgressMeter - AAEX03025805.1:222 1754.1 13351990 7611.9
    14:01:12.463 INFO ProgressMeter - AAEX03025819.1:4742 1754.3 13352440 7611.5
    14:01:22.482 INFO ProgressMeter - AAEX03025840.1:3827 1754.4 13352990 7611.0
    14:01:38.643 INFO ProgressMeter - AAEX03025857.1:1234 1754.7 13353420 7610.1
    14:01:51.566 INFO ProgressMeter - AAEX03025866.1:3846 1754.9 13353670 7609.3
    14:02:02.843 INFO ProgressMeter - AAEX03025878.1:1151 1755.1 13353980 7608.7
    14:02:14.942 INFO ProgressMeter - AAEX03025891.1:867 1755.3 13354360 7608.0
    14:02:25.285 INFO ProgressMeter - AAEX03025904.1:1738 1755.5 13354710 7607.5
    14:02:35.983 INFO ProgressMeter - AAEX03025918.1:1 1755.6 13355080 7606.9
    14:02:46.319 INFO ProgressMeter - AAEX03025942.1:3381 1755.8 13355670 7606.5
    14:02:56.584 INFO ProgressMeter - AAEX03025957.1:3273 1756.0 13356050 7606.0
    14:03:06.597 INFO ProgressMeter - AAEX03025972.1:1107 1756.2 13356410 7605.5
    14:03:16.760 INFO ProgressMeter - AAEX03025987.1:596 1756.3 13356790 7605.0
    14:03:26.777 INFO ProgressMeter - AAEX03026000.1:292 1756.5 13357100 7604.4
    14:03:36.827 INFO ProgressMeter - AAEX03026021.1:301 1756.7 13357540 7603.9
    14:03:47.121 INFO ProgressMeter - AAEX03026034.1:2521 1756.8 13357870 7603.4
    14:03:57.336 INFO ProgressMeter - AAEX03026053.1:2623 1757.0 13358240 7602.9
    14:04:01.466 INFO Mutect2 - 25054126 read(s) filtered by: MappingQualityReadFilter
    0 read(s) filtered by: MappingQualityAvailableReadFilter
    0 read(s) filtered by: MappingQualityNotZeroReadFilter
    0 read(s) filtered by: MappedReadFilter
    663399 read(s) filtered by: NotSecondaryAlignmentReadFilter
    69313706 read(s) filtered by: NotDuplicateReadFilter
    0 read(s) filtered by: PassesVendorQualityCheckReadFilter
    0 read(s) filtered by: NonChimericOriginalAlignmentReadFilter
    0 read(s) filtered by: NonZeroReferenceLengthAlignmentReadFilter
    415279 read(s) filtered by: ReadLengthReadFilter
    0 read(s) filtered by: GoodCigarReadFilter
    0 read(s) filtered by: WellformedReadFilter
    95446510 total reads filtered
    14:04:01.466 INFO ProgressMeter - AAEX03026072.1:601 1757.1 13358441 7602.7
    14:04:01.467 INFO ProgressMeter - Traversal complete. Processed 13358441 total regions in 1757.1 minutes.
    14:04:01.822 INFO VectorLoglessPairHMM - Time spent in setup for JNI call : 99.70680602600001
    14:04:01.822 INFO PairHMM - Total compute time in PairHMM computeLogLikelihoods() : 10004.074472662001
    14:04:01.822 INFO SmithWatermanAligner - Total compute time in java Smith-Waterman : 46637.97 sec
    14:04:01.823 INFO Mutect2 - Shutting down engine
    [December 16, 2020 2:04:01 PM CET] org.broadinstitute.hellbender.tools.walkers.mutect.Mutect2 done. Elapsed time: 1,757.11 minutes.
    Runtime.totalMemory()=15318646784

    Is the info above what you asked for?

  • Genevieve Brandt (she/her)

    Yun Yu yes, that is helpful! 

    How did you run these steps? Is there a chance that the GenomicsDBImport step started before the Mutect2 step finished?

    You may want to just re-run the GenomicsDBImport step to try it again.
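
Before re-running the import, it may help to rule out truncated or still-incomplete Mutect2 outputs. The sketch below is one way to do that, under two assumptions that are not stated in this thread: that each output is gzip-compatible (bgzipped), and that a finished run leaves a `.tbi` index next to the `.vcf.gz` file. The helper name is hypothetical.

```shell
# vcf_looks_complete FILE
# Returns 0 if FILE is non-empty, passes a gzip integrity test, and has a
# non-empty .tbi index next to it. Otherwise prints the reason and returns 1.
# (Hypothetical helper, for illustration only.)
vcf_looks_complete() {
    vcf=$1
    [ -s "$vcf" ] || { echo "$vcf: missing or empty"; return 1; }
    gzip -t "$vcf" 2>/dev/null || { echo "$vcf: gzip integrity check failed"; return 1; }
    [ -s "$vcf.tbi" ] || { echo "$vcf: no .tbi index found"; return 1; }
    echo "$vcf: ok"
}

# Example:
# for f in DogWUR*.mutect2.vcf.gz; do vcf_looks_complete "$f"; done
```

This only catches gross problems such as a file cut off mid-write; it says nothing about whether the records themselves are valid.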

  • Yun Yu

    I started the GenomicsDBImport step only after the Mutect2 step had finished for all samples, so I don't think that is the reason.

  • Genevieve Brandt (she/her)

    Great! Could you run ValidateVariants on the seven samples to check for VCF issues?
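
Running the validation over all seven files could be scripted roughly as below. This is a sketch: the function name is hypothetical, and it assumes that `gatk ValidateVariants` exits with a non-zero status when a file fails validation.

```shell
# validate_all REFERENCE FILE...
# Runs ValidateVariants on each VCF, prints PASS or FAIL per file, and
# returns 1 if any file failed.
# (Hypothetical wrapper, for illustration only.)
validate_all() {
    ref=$1
    shift
    status=0
    for vcf in "$@"; do
        if gatk ValidateVariants -R "$ref" -V "$vcf"; then
            echo "PASS $vcf"
        else
            echo "FAIL $vcf"
            status=1
        fi
    done
    return "$status"
}

# Example:
# validate_all "$reference" DogWUR*.mutect2.vcf.gz
```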

  • Yun Yu

    Here is the output of ValidateVariants for one sample. The outputs for the remaining samples are similar.

    load bwa 0.7.15 (gcc) library and binaries.
    load htslib 1.9 (GCC) library and binaries.
    load samtools 1.9 (GCC) library and binaries.
    load java jdk 1.8.0_201 library and binaries.
    Using GATK jar /lustre/nobackup/WUR/ABGC/yu052/program/gatk/gatk-4.1.8.1/gatk-package-4.1.8.1-local.jar
    Running:
    java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -Xmx4G -XX:+UseParallelGC -XX:ParallelGCThreads=1 -jar /lustre/nobackup/WUR/ABGC/yu052/program/gatk/gatk-4.1.8.1/gatk-package-4.1.8.1-local.jar ValidateVariants -R /lustre/backup/WUR/ABGC/yu052/CanFam3.1/ensembl/Canis_familiaris.CanFam3.1.dna.toplevel.fa --dbsnp /lustre/nobackup/WUR/ABGC/yu052/wgs/722g.990.SNP.INDEL.chrAll.vcf.chr.change.af.gz -V DogWUR115.mutect2.vcf.gz
    08:57:45.944 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/lustre/nobackup/WUR/ABGC/yu052/program/gatk/gatk-4.1.8.1/gatk-package-4.1.8.1-local.jar!/com/intel/gkl/native/libgkl_compression.so
    Jan 07, 2021 8:57:46 AM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
    INFO: Failed to detect whether we are running on Google Compute Engine.
    08:57:46.505 INFO ValidateVariants - ------------------------------------------------------------
    08:57:46.505 INFO ValidateVariants - The Genome Analysis Toolkit (GATK) v4.1.8.1
    08:57:46.505 INFO ValidateVariants - For support and documentation go to https://software.broadinstitute.org/gatk/
    08:57:46.505 INFO ValidateVariants - Executing as yu052@node027 on Linux v3.10.0-957.el7.x86_64 amd64
    08:57:46.505 INFO ValidateVariants - Java runtime: Java HotSpot(TM) 64-Bit Server VM v1.8.0_201-b09
    08:57:46.506 INFO ValidateVariants - Start Date/Time: January 7, 2021 8:57:45 AM CET
    08:57:46.506 INFO ValidateVariants - ------------------------------------------------------------
    08:57:46.506 INFO ValidateVariants - ------------------------------------------------------------
    08:57:46.506 INFO ValidateVariants - HTSJDK Version: 2.23.0
    08:57:46.506 INFO ValidateVariants - Picard Version: 2.22.8
    08:57:46.506 INFO ValidateVariants - HTSJDK Defaults.COMPRESSION_LEVEL : 2
    08:57:46.506 INFO ValidateVariants - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
    08:57:46.506 INFO ValidateVariants - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
    08:57:46.507 INFO ValidateVariants - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
    08:57:46.507 INFO ValidateVariants - Deflater: IntelDeflater
    08:57:46.507 INFO ValidateVariants - Inflater: IntelInflater
    08:57:46.507 INFO ValidateVariants - GCS max retries/reopens: 20
    08:57:46.507 INFO ValidateVariants - Requester pays: disabled
    08:57:46.507 INFO ValidateVariants - Initializing engine
    08:57:47.236 INFO FeatureManager - Using codec VCFCodec to read file file:///lustre/nobackup/WUR/ABGC/yu052/wgs/722g.990.SNP.INDEL.chrAll.vcf.chr.change.af.gz
    08:57:47.696 INFO FeatureManager - Using codec VCFCodec to read file file:///lustre/nobackup/WUR/ABGC/yu052/wgs/glp_tumour_genome/mutect2/DogWUR115.mutect2.vcf.gz
    08:57:48.125 INFO ValidateVariants - Done initializing engine
    08:57:48.125 INFO ProgressMeter - Starting traversal
    08:57:48.125 INFO ProgressMeter - Current Locus Elapsed Minutes Variants Processed Variants/Minute
    08:57:58.132 INFO ProgressMeter - 1:25594139 0.2 58000 347791.3
    08:58:08.241 INFO ProgressMeter - 1:69683115 0.3 176000 524955.3
    08:58:18.245 INFO ProgressMeter - 1:113052017 0.5 289000 575716.3
    08:58:28.266 INFO ProgressMeter - 2:29720309 0.7 390000 582945.1
    08:58:38.348 INFO ProgressMeter - 2:75241376 0.8 505000 603309.2
    08:58:48.429 INFO ProgressMeter - 3:32988351 1.0 630000 626824.1
    08:58:58.471 INFO ProgressMeter - 3:77427655 1.2 756000 644812.8
    08:59:08.505 INFO ProgressMeter - 4:28873061 1.3 884000 659873.8
    08:59:18.530 INFO ProgressMeter - 4:73871483 1.5 1013000 672307.9

    ......

    08:59:48.648 INFO ProgressMeter - 6:29899981 2.0 1397000 695468.9
    08:59:58.654 INFO ProgressMeter - 6:75766885 2.2 1518000 697776.0
    09:00:08.695 INFO ProgressMeter - 7:47465306 2.3 1641000 700438.9
    09:00:18.710 INFO ProgressMeter - 8:10029825 2.5 1774000 706843.3
    09:00:28.767 INFO ProgressMeter - 8:57902079 2.7 1899000 709279.0
    09:00:38.807 INFO ProgressMeter - 9:23417162 2.8 2051000 720989.9
    09:00:48.840 INFO ProgressMeter - 10:10419774 3.0 2180000 723791.6
    09:00:58.919 INFO ProgressMeter - 10:55382041 3.2 2306000 725183.8
    09:01:08.937 INFO ProgressMeter - 11:33029145 3.3 2424000 724259.5
    09:01:18.997 INFO ProgressMeter - 12:3325193 3.5 2558000 727834.9
    09:01:29.039 INFO ProgressMeter - 12:50827176 3.7 2688000 730057.9
    09:01:39.043 INFO ProgressMeter - 13:23628070 3.8 2823000 733507.1
    09:01:49.130 INFO ProgressMeter - 14:1227646 4.0 2957000 736167.3
    09:01:59.131 INFO ProgressMeter - 14:48914324 4.2 3090000 738627.8
    09:02:09.170 INFO ProgressMeter - 15:33597804 4.4 3219000 739872.4
    09:02:19.238 INFO ProgressMeter - 16:11799187 4.5 3343000 739841.8
    09:02:29.265 INFO ProgressMeter - 16:54048539 4.7 3467000 739916.1
    09:02:39.266 INFO ProgressMeter - 17:36104445 4.9 3599000 741702.5
    09:02:49.301 INFO ProgressMeter - 18:15339993 5.0 3730000 743087.1
    09:02:59.340 INFO ProgressMeter - 18:55451292 5.2 3871000 746300.8
    09:03:09.348 INFO ProgressMeter - 19:39685515 5.4 4010000 749012.4
    09:03:19.348 INFO ProgressMeter - 20:31156499 5.5 4142000 750310.2
    09:03:29.348 INFO ProgressMeter - 21:19363669 5.7 4252000 747663.6
    09:03:39.447 INFO ProgressMeter - 22:10555836 5.9 4391000 749912.5
    09:03:49.513 INFO ProgressMeter - 22:55794979 6.0 4524000 751104.1
    09:03:59.541 INFO ProgressMeter - 23:39969240 6.2 4661000 752956.3
    09:04:09.559 INFO ProgressMeter - 24:33140656 6.4 4791000 753629.7
    09:04:19.581 INFO ProgressMeter - 25:28344457 6.5 4926000 755027.4
    09:04:29.581 INFO ProgressMeter - 26:18458894 6.7 5071000 757891.3
    09:04:39.620 INFO ProgressMeter - 27:14989969 6.9 5209000 759525.0
    09:04:49.644 INFO ProgressMeter - 28:15218068 7.0 5330000 758695.5
    09:04:59.710 INFO ProgressMeter - 29:15920981 7.2 5473000 760871.6
    09:05:09.781 INFO ProgressMeter - 30:18842671 7.4 5612000 762405.0
    09:05:19.882 INFO ProgressMeter - 31:20973434 7.5 5755000 764349.0
    09:05:29.915 INFO ProgressMeter - 32:22344951 7.7 5895000 765932.6
    09:05:39.982 INFO ProgressMeter - 33:27229899 7.9 6030000 766757.7
    09:05:50.046 INFO ProgressMeter - 34:38560293 8.0 6158000 766681.7
    09:06:00.060 INFO ProgressMeter - 36:11017513 8.2 6297000 768028.3
    09:06:10.104 INFO ProgressMeter - 37:23661377 8.4 6435000 769155.7
    09:06:20.108 INFO ProgressMeter - JH373233.1:546640 8.5 6743000 790221.6
    09:06:24.610 INFO ProgressMeter - AAEX03026068.1:485 8.6 7247010 841884.3
    09:06:24.610 INFO ProgressMeter - Traversal complete. Processed 7247010 total variants in 8.6 minutes.
    09:06:24.610 INFO ValidateVariants - Shutting down engine
    [January 7, 2021 9:06:24 AM CET] org.broadinstitute.hellbender.tools.walkers.variantutils.ValidateVariants done. Elapsed time: 8.65 minutes.
    Runtime.totalMemory()=1485832192

  • Avatar
    Genevieve Brandt (she/her)

    Hi Yun Yu,

    Great, thank you for running that check. I do not see any issues in the output. Could you confirm that the VCF files from Mutect2 contain variants that overlap the intervals given in canfam3.1.chr1.intervals?

    Genevieve

  • Avatar
    Yun Yu

    I specified only chromosome 1 in canfam3.1.chr1.intervals, written as "1". The variants in the VCF from Mutect2 look like this:

    #CHROM POS ID REF ALT QUAL FILTER INFO FORMAT DogWUR120
    1 1862 . C T . . AS_SB_TABLE=13,19|0,6;DP=38;ECNT=1;MBQ=35,34;MFRL=313,385;MMQ=27,37;MPOS=31;POPAF=7.30;TLOD=14.30 GT:AD:AF:DP:F1R2:F2R1:SB 0/1:32,6:0.189:38:13,4:19,2:13,19,0,6
    1 2219 . T G . . AS_SB_TABLE=0,0|0,0;DP=2;ECNT=1;MBQ=0,34;MFRL=0,437;MMQ=60,34;MPOS=32;POPAF=7.30;TLOD=6.68 GT:AD:AF:DP:F1R2:F2R1:SB 0/1:0,2:0.750:2:0,2:0,0:0,0,1,1
    1 2849 . G C . . AS_SB_TABLE=10,10|3,0;DP=23;ECNT=2;MBQ=35,35;MFRL=309,357;MMQ=40,23;MPOS=62;POPAF=7.30;TLOD=3.61 GT:AD:AF:DP:F1R2:F2R1:SB 0/1:20,3:0.159:23:8,1:12,2:10,10,3,0
    1 2879 . C A . . AS_SB_TABLE=12,6|0,2;DP=20;ECNT=2;MBQ=34,33;MFRL=307,309;MMQ=40,40;MPOS=9;POPAF=7.30;TLOD=4.81 GT:AD:AF:DP:F1R2:F2R1:SB 0/1:18,2:0.150:20:8,1:10,1:12,6,0,2
    1 4132 . C T . . AS_SB_TABLE=2,3|11,3;DP=21;ECNT=1;MBQ=35,35;MFRL=324,301;MMQ=40,44;MPOS=36;POPAF=7.30;TLOD=46.00 GT:AD:AF:DP:F1R2:F2R1:SB 0/1:5,14:0.715:19:1,5:4,9:2,3,11,3
    1 4265 . G A . . AS_SB_TABLE=3,4|3,8;DP=19;ECNT=1;MBQ=34,34;MFRL=337,263;MMQ=40,40;MPOS=26;POPAF=7.30;TLOD=29.95 GT:AD:AF:DP:F1R2:F2R1:SB 0/1:7,11:0.579:18:5,8:2,3:3,4,3,8
    1 4476 . G T . . AS_SB_TABLE=2,0|1,3;DP=7;ECNT=1;MBQ=27,33;MFRL=311,347;MMQ=30,32;MPOS=22;POPAF=7.30;TLOD=11.46 GT:AD:AF:DP:F1R2:F2R1:SB 0/1:2,4:0.628:6:1,3:1,1:2,0,1,3
    1 4626 . T C . . AS_SB_TABLE=1,2|0,1;DP=4;ECNT=6;MBQ=34,35;MFRL=282,357;MMQ=20,40;MPOS=64;POPAF=7.30;TLOD=3.16 GT:AD:AF:DP:F1R2:F2R1:PGT:PID:PS:SB 0|1:3,1:0.329:4:3,1:0,0:0|1:4626_T_C:4626:1,2,0,1
    1 4630 . G C . . AS_SB_TABLE=1,2|0,1;DP=4;ECNT=6;MBQ=33,35;MFRL=282,357;MMQ=20,40;MPOS=68;POPAF=7.30;TLOD=3.16 GT:AD:AF:DP:F1R2:F2R1:PGT:PID:PS:SB 0|1:3,1:0.329:4:3,1:0,0:0|1:4626_T_C:4626:1,2,0,1
    1 4647 . A T . . AS_SB_TABLE=1,2|0,1;DP=4;ECNT=6;MBQ=24,33;MFRL=282,357;MMQ=20,40;MPOS=64;POPAF=7.30;TLOD=3.02 GT:AD:AF:DP:F1R2:F2R1:PGT:PID:PS:SB 0|1:3,1:0.286:4:2,1:0,0:0|1:4647_A_T:4647:1,2,0,1
    1 4649 . G T . . AS_SB_TABLE=1,3|0,1;DP=5;ECNT=6;MBQ=30,25;MFRL=311,357;MMQ=21,40;MPOS=62;POPAF=7.30;TLOD=3.02 GT:AD:AF:DP:F1R2:F2R1:PGT:PID:PS:SB 0|1:4,1:0.286:5:2,1:0,0:0|1:4647_A_T:4647:1,3,0,1
    1 4653 . G T . . AS_SB_TABLE=1,2|0,1;DP=4;ECNT=6;MBQ=36,33;MFRL=282,357;MMQ=22,40;MPOS=58;POPAF=7.30;TLOD=3.20 GT:AD:AF:DP:F1R2:F2R1:PGT:PID:PS:SB 0|1:3,1:0.333:4:2,1:1,0:0|1:4647_A_T:4647:1,2,0,1
    1 4782 . A G . . AS_SB_TABLE=6,1|0,2;DP=9;ECNT=6;MBQ=33,34;MFRL=295,301;MMQ=60,31;MPOS=32;POPAF=7.30;TLOD=3.90 GT:AD:AF:DP:F1R2:F2R1:SB 0/1:7,2:0.272:9:4,1:3,1:6,1,0,2
    1 5474 . T C . . AS_SB_TABLE=3,1|3,3;DP=10;ECNT=1;MBQ=27,35;MFRL=343,405;MMQ=40,40;MPOS=33;POPAF=7.30;TLOD=18.60 GT:AD:AF:DP:F1R2:F2R1:SB 0/1:4,6:0.584:10:2,3:1,3:3,1,3,3
    1 5607 . G T . . AS_SB_TABLE=12,1|5,0;DP=19;ECNT=1;MBQ=34,33;MFRL=321,283;MMQ=40,40;MPOS=38;POPAF=7.30;TLOD=13.35 GT:AD:AF:DP:F1R2:F2R1:SB 0/1:13,5:0.318:18:5,0:7,5:12,1,5,0
    1 5958 . A G . . AS_SB_TABLE=0,16|0,2;DP=19;ECNT=1;MBQ=36,34;MFRL=331,336;MMQ=28,35;MPOS=35;POPAF=7.30;TLOD=4.18 GT:AD:AF:DP:F1R2:F2R1:SB 0/1:16,2:0.149:18:11,1:5,1:0,16,0,2

    So there should be overlaps.
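One way to confirm the overlap is to count records per contig in each Mutect2 VCF and compare against the contig named in the interval list. A minimal Python sketch, assuming the interval list holds a bare contig name such as "1" as described above; the function names `count_records_by_contig` and `open_vcf` are illustrative and not part of GATK:

```python
import gzip
from collections import Counter


def count_records_by_contig(lines):
    """Count VCF data records per contig, skipping header and blank lines."""
    counts = Counter()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        # The first whitespace-delimited field of a VCF record is CHROM.
        contig = line.split(None, 1)[0]
        counts[contig] += 1
    return counts


def open_vcf(path):
    """Open a plain-text or gzip/bgzip-compressed VCF in text mode."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "rt")
```

For example, `count_records_by_contig(open_vcf("DogWUR120.mutect2.vcf.gz"))["1"]` should be greater than zero if that file has calls on chromosome 1.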

  • Avatar
    Yun Yu

    I suspect that the GenomicsDBImport step failed. I checked the gendb directory. Taking the subdirectory for chromosome 1 as an example, it contains one file, __array_schema.tdb, and one subdirectory, genomicsdb_meta_dir. The genomicsdb_meta_dir folder contains two files: genomicsdb_column_bounds.json and genomicsdb_meta_1235912c-4d97-4d60-aa89-ea9091889dd0.json.

    The content of the first file is as follows:

    {
    "min_column": 0,
    "max_column": 122678784
    }

    The content of the second file is as follows:

    {
    "lb_row_idx": 0,
    "max_valid_row_idx_in_array": 6
    }
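The two metadata files can be summarized with a few lines of Python. This sketch rests on an assumption that is not confirmed in this thread: that row indices in the workspace correspond to imported samples, so rows 0 to 6 would be consistent with the 7 input VCFs. The function name `summarize_genomicsdb_meta` is illustrative.

```python
import json


def summarize_genomicsdb_meta(column_bounds_text, row_meta_text):
    """Summarize the bounds recorded in the two GenomicsDB metadata files.

    Assumption: both index ranges are inclusive, so the size of each
    range is (max - min + 1).
    """
    bounds = json.loads(column_bounds_text)
    rows = json.loads(row_meta_text)
    return {
        "column_span": bounds["max_column"] - bounds["min_column"] + 1,
        "row_count": rows["max_valid_row_idx_in_array"] - rows["lb_row_idx"] + 1,
    }
```

Applied to the values shown above, this reports 7 rows. It only summarizes the bounds recorded in the metadata and does not inspect the variant data itself.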

     

  • Avatar
    Genevieve Brandt (she/her)

    Hi Yun Yu,

    I brought this issue up with my team and they notified me that there was a large bug fix in CreateSomaticPanelOfNormals in GATK 4.1.9.0 (more information here). The bug was causing small or empty PONs.

    Could you re-run Mutect2, GenomicsDBImport, and CreateSomaticPanelOfNormals with GATK 4.1.9.0 to see if it fixes your issue?
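Before re-running, it may help to confirm which GATK version is actually on the path. A minimal Python sketch, assuming `gatk --version` prints a line containing a four-part version string such as `v4.1.9.0`; the helper names are illustrative:

```python
import re
import subprocess

# Version named in this thread as containing the CreateSomaticPanelOfNormals fix.
FIXED_IN = (4, 1, 9, 0)


def parse_gatk_version(text):
    """Return the first four-part version found in text as a tuple, or None."""
    match = re.search(r"v?(\d+)\.(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def has_pon_fix(version_text):
    """True if the reported version is at least FIXED_IN."""
    version = parse_gatk_version(version_text)
    return version is not None and version >= FIXED_IN


def installed_gatk_version_text():
    """Run `gatk --version` and return its combined output (requires gatk on PATH)."""
    result = subprocess.run(
        ["gatk", "--version"], capture_output=True, text=True, check=True
    )
    return result.stdout + result.stderr
```

Calling `has_pon_fix(installed_gatk_version_text())` on the machine that ran the pipeline indicates whether the three steps were run with a release that includes the fix.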

    Genevieve

  • This issue is caused by poor documentation. The original command was:

    gatk --java-options "-Xmx16G -XX:+UseParallelGC -XX:ParallelGCThreads=4" Mutect2 -R $reference -I $normal_bam -max-mnp-distance 0 -O ${sample}.mutect2.vcf.gz

    Corrected, with -tumor set to the normal sample's name:

    gatk --java-options "-Xmx16G -XX:+UseParallelGC -XX:ParallelGCThreads=4" Mutect2 -R $reference  -tumor normal1_sample_name -I $normal_bam -max-mnp-distance 0 -O ${sample}.mutect2.vcf.gz
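The value given to -tumor is a sample name, which is expected to match the SM tag in the BAM's @RG header lines. A small Python sketch for pulling those names out of header text, for instance the output of `samtools view -H`; the function name and the example header are illustrative:

```python
def sample_names_from_header(header_text):
    """Return the distinct SM values from @RG lines, in order of appearance."""
    names = []
    for line in header_text.splitlines():
        if not line.startswith("@RG"):
            continue
        # @RG fields are tab-separated TAG:VALUE pairs after the record type.
        for field in line.split("\t")[1:]:
            if field.startswith("SM:") and field[3:] not in names:
                names.append(field[3:])
    return names


example_header = (
    "@HD\tVN:1.6\tSO:coordinate\n"
    "@RG\tID:lane1\tSM:DogWUR115\tPL:ILLUMINA\n"
    "@RG\tID:lane2\tSM:DogWUR115\tPL:ILLUMINA\n"
)
print(sample_names_from_header(example_header))
```

If more than one name comes back for a single normal BAM, the file holds reads from several samples and the choice of name needs care.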

