Receiving zero variants processed after HaplotypeCaller and Select Variants
Hi GATK Community,
I am running HaplotypeCaller in reference confidence mode (-ERC GVCF) on non-model invertebrate data with GATK 4.1.9.0.
gatk --java-options "-Xmx4G" HaplotypeCaller -R $ref -I $file -ERC GVCF --sample-ploidy 2 -O $file\.raw_variants.g.vcf
15:04:42.203 INFO HaplotypeCallerEngine - Tool is in reference confidence mode and the annotation, the following changes will be made to any specified annotations: 'StrandBiasBySample' will be enabled. 'ChromosomeCounts', 'FisherStrand', 'StrandOddsRatio' and 'QualByDepth' annotations have been disabled
15:04:42.520 INFO HaplotypeCallerEngine - Standard Emitting and Calling confidence set to 0.0 for reference-model confidence output
15:04:42.520 INFO HaplotypeCallerEngine - All sites annotated with PLs forced to true for reference-model confidence output
15:04:42.530 INFO NativeLibraryLoader - Loading libgkl_utils.so from jar:file:/mmfs1/tools/gatk-4.1.9.0/gatk-package-4.1.9.0-local.jar!/com/intel/gkl/native/libgkl_utils.so
15:04:42.532 INFO NativeLibraryLoader - Loading libgkl_pairhmm_omp.so from jar:file:/mmfs1/tools/gatk-4.1.9.0/gatk-package-4.1.9.0-local.jar!/com/intel/gkl/native/libgkl_pairhmm_omp.so
15:04:42.560 INFO IntelPairHmm - Using CPU-supported AVX-512 instructions
15:04:42.560 INFO IntelPairHmm - Flush-to-zero (FTZ) is enabled when running PairHMM
15:04:42.560 INFO IntelPairHmm - Available threads: 32
15:04:42.560 INFO IntelPairHmm - Requested threads: 4
15:04:42.560 INFO PairHMM - Using the OpenMP multi-threaded AVX-accelerated native PairHMM implementation
15:04:42.711 INFO ProgressMeter - Starting traversal
15:04:42.711 INFO ProgressMeter - Current Locus Elapsed Minutes Regions Processed Regions/Minute
15:04:43.194 WARN InbreedingCoeff - InbreedingCoeff will not be calculated; at least 10 samples must have called genotypes
15:04:44.444 WARN DepthPerSampleHC - Annotation will not be calculated, genotype is not called or alleleLikelihoodMap is null
15:04:44.444 WARN StrandBiasBySample - Annotation will not be calculated, genotype is not called or alleleLikelihoodMap is null
15:04:44.445 WARN StrandBiasBySample - Annotation will not be calculated, genotype is not called or alleleLikelihoodMap is null
15:04:45.548 WARN StrandBiasBySample - Annotation will not be calculated, genotype is not called or alleleLikelihoodMap is null
15:04:45.582 WARN StrandBiasBySample - Annotation will not be calculated, genotype is not called or alleleLikelihoodMap is null
15:04:45.583 WARN StrandBiasBySample - Annotation will not be calculated, genotype is not called or alleleLikelihoodMap is null
15:04:45.583 WARN StrandBiasBySample - Annotation will not be calculated, genotype is not called or alleleLikelihoodMap is null
15:04:45.584 WARN StrandBiasBySample - Annotation will not be calculated, genotype is not called or alleleLikelihoodMap is null
15:04:45.584 WARN StrandBiasBySample - Annotation will not be calculated, genotype is not called or alleleLikelihoodMap is null
15:04:46.636 WARN StrandBiasBySample - Annotation will not be calculated, genotype is not called or alleleLikelihoodMap is null
15:04:49.489 WARN StrandBiasBySample - Annotation will not be calculated, genotype is not called or alleleLikelihoodMap is null
15:04:49.582 WARN StrandBiasBySample - Annotation will not be calculated, genotype is not called or alleleLikelihoodMap is null
15:04:49.582 WARN StrandBiasBySample - Annotation will not be calculated, genotype is not called or alleleLikelihoodMap is null
15:04:49.583 WARN StrandBiasBySample - Annotation will not be calculated, genotype is not called or alleleLikelihoodMap is null
15:04:49.861 WARN StrandBiasBySample - Annotation will not be calculated, genotype is not called or alleleLikelihoodMap is null
15:04:49.879 WARN StrandBiasBySample - Annotation will not be calculated, genotype is not called or alleleLikelihoodMap is null
15:04:50.325 WARN StrandBiasBySample - Annotation will not be calculated, genotype is not called or alleleLikelihoodMap is null
15:04:51.658 WARN StrandBiasBySample - Annotation will not be calculated, genotype is not called or alleleLikelihoodMap is null
15:04:51.930 WARN StrandBiasBySample - Annotation will not be calculated, genotype is not called or alleleLikelihoodMap is null
15:04:52.713 INFO ProgressMeter - ptg000002l:577443 0.2 4940 29634.1
15:04:56.473 WARN StrandBiasBySample - Annotation will not be calculated, genotype is not called or alleleLikelihoodMap:
07:51:23.095 INFO ProgressMeter - ptg034584l:11469 6766.7 28343230 4188.7
07:51:23.401 WARN StrandBiasBySample - Annotation will not be calculated, genotype is not called or alleleLikelihoodMap is null
07:51:25.441 WARN StrandBiasBySample - Annotation will not be calculated, genotype is not called or alleleLikelihoodMap is null
07:51:26.583 WARN StrandBiasBySample - Annotation will not be calculated, genotype is not called or alleleLikelihoodMap is null
07:51:28.670 WARN StrandBiasBySample - Annotation will not be calculated, genotype is not called or alleleLikelihoodMap is null
07:51:28.877 WARN StrandBiasBySample - Annotation will not be calculated, genotype is not called or alleleLikelihoodMap is null
07:51:32.600 WARN StrandBiasBySample - Annotation will not be calculated, genotype is not called or alleleLikelihoodMap is null
07:51:33.487 INFO ProgressMeter - ptg034593l:39338 6766.8 28345000 4188.8
07:51:35.753 WARN StrandBiasBySample - Annotation will not be calculated, genotype is not called or alleleLikelihoodMap is null
07:51:35.754 WARN StrandBiasBySample - Annotation will not be calculated, genotype is not called or alleleLikelihoodMap is null
07:51:36.078 WARN StrandBiasBySample - Annotation will not be calculated, genotype is not called or alleleLikelihoodMap is null
07:51:36.079 WARN StrandBiasBySample - Annotation will not be calculated, genotype is not called or alleleLikelihoodMap is null
07:51:36.079 WARN StrandBiasBySample - Annotation will not be calculated, genotype is not called or alleleLikelihoodMap is null
07:51:36.310 INFO HaplotypeCaller - 383024808 read(s) filtered by: MappingQualityReadFilter
0 read(s) filtered by: MappingQualityAvailableReadFilter
0 read(s) filtered by: MappedReadFilter
3712570 read(s) filtered by: NotSecondaryAlignmentReadFilter
50558477 read(s) filtered by: NotDuplicateReadFilter
0 read(s) filtered by: PassesVendorQualityCheckReadFilter
0 read(s) filtered by: NonZeroReferenceLengthAlignmentReadFilter
0 read(s) filtered by: GoodCigarReadFilter
0 read(s) filtered by: WellformedReadFilter
437295855 total reads filtered
07:51:36.310 INFO ProgressMeter - ptg034602l:16801 6766.9 28346132 4188.9
07:51:36.310 INFO ProgressMeter - Traversal complete. Processed 28346132 total regions in 6766.9 minutes.
07:51:38.912 INFO VectorLoglessPairHMM - Time spent in setup for JNI call : 92.230655119
07:51:38.912 INFO PairHMM - Total compute time in PairHMM computeLogLikelihoods() : 17340.198847879
VCF examples:
##INFO=<ID=BaseQRankSum,Number=1,Type=Float,Description="Z-score from Wilcoxon rank sum test of Alt Vs. Ref base qualities">
##INFO=<ID=DP,Number=1,Type=Integer,Description="Approximate read depth; some reads may have been filtered">
##INFO=<ID=END,Number=1,Type=Integer,Description="Stop position of the interval">
##INFO=<ID=ExcessHet,Number=1,Type=Float,Description="Phred-scaled p-value for exact test of excess heterozygosity">
##INFO=<ID=InbreedingCoeff,Number=1,Type=Float,Description="Inbreeding coefficient as estimated from the genotype likelihoods per-sample when compared against the Hardy-Weinberg expectation">
##INFO=<ID=MLEAC,Number=A,Type=Integer,Description="Maximum likelihood expectation (MLE) for the allele counts (not necessarily the same as the AC), for each ALT allele, in the same order as listed">
##INFO=<ID=MLEAF,Number=A,Type=Float,Description="Maximum likelihood expectation (MLE) for the allele frequency (not necessarily the same as the AF), for each ALT allele, in the same order as listed">
##INFO=<ID=MQRankSum,Number=1,Type=Float,Description="Z-score From Wilcoxon rank sum test of Alt vs. Ref read mapping qualities">
##INFO=<ID=RAW_MQandDP,Number=2,Type=Integer,Description="Raw data (sum of squared MQ and total depth) for improved RMS Mapping Quality calculation. Incompatible with deprecated RAW_MQ formulation.">
##INFO=<ID=ReadPosRankSum,Number=1,Type=Float,Description="Z-score from Wilcoxon rank sum test of Alt vs. Ref read position bias">
##contig=<ID=ptg000001l,length=337966>
##contig=<ID=ptg000002l,length=922462>
##contig=<ID=ptg000003l,length=485413>
##contig=<ID=ptg000004l,length=2201902>
##contig=<ID=ptg000005l,length=73260>
##contig=<ID=ptg000006l,length=490383>
##contig=<ID=ptg000007l,length=422981>
##contig=<ID=ptg000008l,length=1409091>
##contig=<ID=ptg000009l,length=376753>
##contig=<ID=ptg000010l,length=720825>
##contig=<ID=ptg000011l,length=588086>
##contig=<ID=ptg000012l,length=72583>
##contig=<ID=ptg000013l,length=229458>
Later in the VCF:
ptg034601l 25899 . T <NON_REF> . . END=25899 GT:DP:GQ:MIN_DP:PL 0/0:15:6:15:0,6,90
ptg034601l 25900 . C <NON_REF> . . END=25904 GT:DP:GQ:MIN_DP:PL 0/0:14:0:12:0,0,155
ptg034601l 25905 . T <NON_REF> . . END=25905 GT:DP:GQ:MIN_DP:PL 0/0:8:6:8:0,6,90
ptg034601l 25906 . A <NON_REF> . . END=25906 GT:DP:GQ:MIN_DP:PL 0/0:8:0:8:0,0,0
ptg034601l 25907 . T <NON_REF> . . END=25907 GT:DP:GQ:MIN_DP:PL 0/0:8:6:8:0,6,90
ptg034601l 25908 . C <NON_REF> . . END=25909 GT:DP:GQ:MIN_DP:PL 0/0:8:0:8:0,0,182
ptg034601l 25910 . C <NON_REF> . . END=25911 GT:DP:GQ:MIN_DP:PL 0/0:7:6:7:0,6,90
ptg034601l 25912 . C <NON_REF> . . END=25912 GT:DP:GQ:MIN_DP:PL 0/0:7:5:7:0,5,234
ptg034601l 25913 . T <NON_REF> . . END=25922 GT:DP:GQ:MIN_DP:PL 0/0:7:0:7:0,0,10
ptg034601l 25923 . T <NON_REF> . . END=25925 GT:DP:GQ:MIN_DP:PL 0/0:7:6:6:0,6,90
ptg034601l 25926 . C <NON_REF> . . END=25926 GT:DP:GQ:MIN_DP:PL 0/0:6:0:6:0,0,182
ptg034601l 25927 . T <NON_REF> . . END=25927 GT:DP:GQ:MIN_DP:PL 0/0:6:6:6:0,6,90
ptg034601l 25928 . C <NON_REF> . . END=25932 GT:DP:GQ:MIN_DP:PL 0/0:5:0:4:0,0,2
ptg034601l 25933 . T <NON_REF> . . END=25933 GT:DP:GQ:MIN_DP:PL 0/0:4:6:4:0,6,90
ptg034601l 25934 . C <NON_REF> . . END=25934 GT:DP:GQ:MIN_DP:PL 0/0:4:0:4:0,0,84
ptg000002l 368020 . A G,<NON_REF> 65.83 . DP=3;ExcessHet=3.0103;MLEAC=1,0;MLEAF=0.500,0.00;RAW_MQandDP=2937,3 GT:AD:DP:GQ:PL:SB 1/1:0,3,0:3:9:79,9,0,79,9,79:0,0,1,2
ptg000002l 368021 . T <NON_REF> . . END=368052 GT:DP:GQ:MIN_DP:PL 0/0:3:9:3:0,9,56
ptg000002l 368053 . G <NON_REF> . . END=368053 GT:DP:GQ:MIN_DP:PL 0/0:1:3:1:0,3,37
ptg000002l 377744 . T TA,<NON_REF> 63.27 . DP=2;ExcessHet=3.0103;MLEAC=1,0;MLEAF=0.500,0.00;RAW_MQandDP=1313,2 GT:AD:DP:GQ:PL:SB 1/1:0,2,0:2:6:75,6,0,75,6,75:0,0,0,2
ptg000002l 377745 . A <NON_REF> . . END=377745 GT:DP:GQ:MIN_DP:PL 0/0:6:12:6:0,12,180
ptg000002l 377746 . G <NON_REF> . . END=377746 GT:DP:GQ:MIN_DP:PL 0/0:6:0:6:0,0,156
Because it seemed like variants were called, I ran SelectVariants on the output:
gatk SelectVariants -R $ref --variant $file --select-type-to-include SNP --output $file\.raw_snps.vcf
gatk SelectVariants -R $ref --variant $file --select-type-to-include INDEL --output $file\.raw_indels.vcf
Output:
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /mmfs1/tools/gatk-4.1.9.0/gatk-package-4.1.9.0-local.jar SelectVariants -R reference.fa --variant org262.raw_variants.g.vcf --select-type-to-include SNP --output org262.raw_variants.g.vcf.raw_snps.vcf
13:19:54.570 INFO SelectVariants - ------------------------------------------------------------
13:19:54.570 INFO SelectVariants - The Genome Analysis Toolkit (GATK) v4.1.9.0
13:19:54.570 INFO SelectVariants - For support and documentation go to https://software.broadinstitute.org/gatk/
13:19:54.571 INFO SelectVariants - Executing as cjg0067@node061 on Linux v3.10.0-1062.12.1.el7.x86_64 amd64
13:19:54.571 INFO SelectVariants - Java runtime: Java HotSpot(TM) 64-Bit Server VM v1.8.0_281-b09
13:19:54.571 INFO SelectVariants - Start Date/Time: April 13, 2021 1:19:52 PM CDT
13:19:54.571 INFO SelectVariants - ------------------------------------------------------------
13:19:54.571 INFO SelectVariants - ------------------------------------------------------------
13:19:54.572 INFO SelectVariants - HTSJDK Version: 2.23.0
13:19:54.572 INFO SelectVariants - Picard Version: 2.23.3
13:19:54.572 INFO SelectVariants - HTSJDK Defaults.COMPRESSION_LEVEL : 2
13:19:54.572 INFO SelectVariants - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
13:19:54.572 INFO SelectVariants - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
13:19:54.572 INFO SelectVariants - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
13:19:54.572 INFO SelectVariants - Deflater: IntelDeflater
13:19:54.572 INFO SelectVariants - Inflater: IntelInflater
13:19:54.572 INFO SelectVariants - GCS max retries/reopens: 20
13:19:54.572 INFO SelectVariants - Requester pays: disabled
13:19:54.572 INFO SelectVariants - Initializing engine
13:19:56.636 INFO FeatureManager - Using codec VCFCodec to read file file:///mmfs1/org262.raw_variants.g.vcf
13:20:09.603 INFO SelectVariants - Done initializing engine
13:20:10.584 INFO ProgressMeter - Starting traversal
13:20:10.584 INFO ProgressMeter - Current Locus Elapsed Minutes Variants Processed Variants/Minute
13:31:42.740 INFO ProgressMeter - unmapped 11.5 0 0.0
13:31:42.741 INFO ProgressMeter - Traversal complete. Processed 0 total variants in 11.5 minutes.
13:31:43.527 INFO SelectVariants - Shutting down engine
[April 13, 2021 1:31:43 PM CDT] org.broadinstitute.hellbender.tools.walkers.variantutils.SelectVariants done. Elapsed time: 11.86 minutes.
Runtime.totalMemory()=5041553408
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /mmfs1/tools/gatk-4.1.9.0/gatk-package-4.1.9.0-local.jar SelectVariants -R reference.fa --variant org262.raw_variants.g.vcf --select-type-to-include INDEL --output org262.raw_variants.g.vcf.raw_indels.vcf
13:31:45.163 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/mmfs1/tools/gatk-4.1.9.0/gatk-package-4.1.9.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
Apr 13, 2021 1:31:45 PM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
INFO: Failed to detect whether we are running on Google Compute Engine.
13:31:45.262 INFO SelectVariants - ------------------------------------------------------------
13:31:45.262 INFO SelectVariants - The Genome Analysis Toolkit (GATK) v4.1.9.0
13:31:45.262 INFO SelectVariants - For support and documentation go to https://software.broadinstitute.org/gatk/
13:31:45.262 INFO SelectVariants - Executing as cjg0067@node061 on Linux v3.10.0-1062.12.1.el7.x86_64 amd64
13:31:45.263 INFO SelectVariants - Java runtime: Java HotSpot(TM) 64-Bit Server VM v1.8.0_281-b09
13:31:45.263 INFO SelectVariants - Start Date/Time: April 13, 2021 1:31:45 PM CDT
13:31:45.263 INFO SelectVariants - ------------------------------------------------------------
13:31:45.263 INFO SelectVariants - ------------------------------------------------------------
13:31:45.263 INFO SelectVariants - HTSJDK Version: 2.23.0
13:31:45.263 INFO SelectVariants - Picard Version: 2.23.3
13:31:45.263 INFO SelectVariants - HTSJDK Defaults.COMPRESSION_LEVEL : 2
13:31:45.263 INFO SelectVariants - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
13:31:45.263 INFO SelectVariants - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
13:31:45.263 INFO SelectVariants - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
13:31:45.263 INFO SelectVariants - Deflater: IntelDeflater
13:31:45.263 INFO SelectVariants - Inflater: IntelInflater
13:31:45.263 INFO SelectVariants - GCS max retries/reopens: 20
13:31:45.263 INFO SelectVariants - Requester pays: disabled
13:31:45.263 INFO SelectVariants - Initializing engine
13:31:46.758 INFO FeatureManager - Using codec VCFCodec to read file file:///mmfs1/org262.raw_variants.g.vcf
13:31:51.219 INFO SelectVariants - Done initializing engine
13:31:51.712 INFO ProgressMeter - Starting traversal
13:31:51.712 INFO ProgressMeter - Current Locus Elapsed Minutes Variants Processed Variants/Minute
13:43:21.819 INFO ProgressMeter - unmapped 11.5 0 0.0
13:43:21.819 INFO ProgressMeter - Traversal complete. Processed 0 total variants in 11.5 minutes.
13:43:21.910 INFO SelectVariants - Shutting down engine
[April 13, 2021 1:43:21 PM CDT] org.broadinstitute.hellbender.tools.walkers.variantutils.SelectVariants done. Elapsed time: 11.61 minutes.
Runtime.totalMemory()=5363466240
I am not sure what the problem is or what my next step should be, so I appreciate any and all help. Thank you for your time, comments, and suggestions!
Cheers,
Candace
-
Hi Candace Grimes,
It looks like you do have some sites in your GVCF that are variant sites (not reference blocks), for example:
ptg000002l 368020 . A G,<NON_REF> 65.83 . DP=3;ExcessHet=3.0103;MLEAC=1,0;MLEAF=0.500,0.00;RAW_MQandDP=2937,3 GT:AD:DP:GQ:PL:SB 1/1:0,3,0:3:9:79,9,0,79,9,79:0,0,1,2
As of now, I don't think there are issues with the HaplotypeCaller step.
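One quick way to confirm this across the whole file is to count the records whose ALT column holds something other than the lone <NON_REF> allele. This is only a rough sketch: the function name is made up for illustration, and it assumes an uncompressed, tab-delimited GVCF (the records pasted above show spaces only because of how the forum renders them).

```shell
# Hypothetical helper: count GVCF records that are variant sites.
# Reference blocks have ALT (column 5) equal to exactly "<NON_REF>";
# variant sites list at least one real allele, e.g. "G,<NON_REF>".
count_gvcf_variant_sites() {
    # Skip header lines (starting with '#'); "n + 0" prints 0 when nothing matched.
    awk -F '\t' '!/^#/ && $5 != "<NON_REF>" { n++ } END { print n + 0 }' "$1"
}
```

For example, `count_gvcf_variant_sites org262.raw_variants.g.vcf` prints a single number; anything above zero means HaplotypeCaller did emit variant sites.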
Have you run GenotypeGVCFs yet? I would recommend running that tool and then checking the output to determine whether your variant sites are what you expect. I'm thinking that your SelectVariants commands didn't pick up the variant sites because you restricted them to SNPs and INDELs and you haven't genotyped the GVCF yet. A GVCF record like the one above still lists the symbolic <NON_REF> allele alongside G, so it likely isn't classified as a plain SNP or INDEL.
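A minimal sketch of that order of operations, assuming a single-sample GVCF. The function name, the GATK variable, and the output file names are placeholders for illustration and not part of GATK; the tool names and flags are the standard GenotypeGVCFs and SelectVariants ones.

```shell
# Launcher to use; set GATK="echo gatk" to preview the commands without running them.
GATK="${GATK:-gatk}"

# Hypothetical helper: genotype one GVCF, then split SNPs and indels.
genotype_and_select() {
    ref="$1"      # reference FASTA
    gvcf="$2"     # HaplotypeCaller -ERC GVCF output
    prefix="$3"   # prefix for the output files (placeholder naming)

    # Step 1: turn the GVCF into a regular VCF with final genotypes.
    $GATK GenotypeGVCFs -R "$ref" -V "$gvcf" -O "$prefix.genotyped.vcf" || return 1

    # Step 2: select by variant type from the genotyped VCF, not from the GVCF.
    $GATK SelectVariants -R "$ref" -V "$prefix.genotyped.vcf" \
        --select-type-to-include SNP -O "$prefix.raw_snps.vcf" || return 1
    $GATK SelectVariants -R "$ref" -V "$prefix.genotyped.vcf" \
        --select-type-to-include INDEL -O "$prefix.raw_indels.vcf"
}
```

With the variables from the original post, a call would look like `genotype_and_select "$ref" "$file.raw_variants.g.vcf" "$file"`.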
Let me know what you find.
Best,
Genevieve
-
Thank you Genevieve!
-
Yes, thank you Genevieve! I believe that worked. I am now getting variants processed.
I do have another question: I am trying to analyze shorter sequences with this same genome. They have a low mapping rate (~10%), and when I run them through the HaplotypeCaller tool, I receive the USER ERROR: Input files reference and reads have incompatible contigs: No overlapping contigs found. I checked the duplication metrics and the percent duplication seems to be above 80%.
## METRICS CLASS picard.sam.DuplicationMetrics
LIBRARY UNPAIRED_READS_EXAMINED READ_PAIRS_EXAMINED SECONDARY_OR_SUPPLEMENTARY_RDS UNMAPPED_READS UNPAIRED_READ_DUPLICATES READ_PAIR_DUPLICATES READ_PAIR_OPTICAL_DUPLICATES PERCENT_DUPLICATION ESTIMATED_LIBRARY_SIZE
Unknown Library 73608 0 0 544979 64370 0 0 0.874497
Please let me know if you have any suggestions and thank you again!
All the best,
Candace
-
Hi Candace Grimes,
We have an article about how to address these issues, see this link: https://gatk.broadinstitute.org/hc/en-us/articles/360035891131-Errors-about-input-files-having-missing-or-incompatible-contigs
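As a first check before working through the article, it can help to see which contig names the reference and the reads actually share. The sketch below is an illustration only: the function name and both input files are assumptions (a `.fai` index as written by `samtools faidx`, and a header dumped with `samtools view -H reads.bam > header.sam`).

```shell
# Hypothetical helper: print contig names present in both a reference
# index (.fai) and a SAM/BAM header saved as text.
shared_contigs() {
    fai="$1"       # e.g. reference.fa.fai
    header="$2"    # e.g. header.sam
    tmp_ref="$(mktemp)"
    tmp_bam="$(mktemp)"

    # Column 1 of the .fai holds the contig name.
    cut -f 1 "$fai" | sort -u > "$tmp_ref"

    # @SQ header lines carry the contig name in the SN: tag.
    awk -F '\t' '$1 == "@SQ" { for (i = 2; i <= NF; i++) if ($i ~ /^SN:/) print substr($i, 4) }' \
        "$header" | sort -u > "$tmp_bam"

    # Names common to both sorted lists.
    comm -12 "$tmp_ref" "$tmp_bam"
    rm -f "$tmp_ref" "$tmp_bam"
}
```

If `shared_contigs reference.fa.fai header.sam` prints nothing, the reads were probably aligned to a different reference, or to one with different contig names, than the one passed to HaplotypeCaller with -R.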
Best,
Genevieve