HaplotypeCaller with additional annotation?
AnsweredHI all,
I am trying to run HaplotypeCaller with the --annotation/-A parameter, but it doesn't work, so I'm obviously doing something wrong. I am running it on RNA-seq data.
a) GATK version used: 4.1.3 (from Docker image: broadinstitute/gatk:4.1.3.0)
b) Exact command used:
gatk HaplotypeCaller -R $FASTA -I $bam -O test.vcf.gz -ERC GVCF -stand-call-conf 20 --annotation QualByDepth --annotation FisherStrand
I also tried:
gatk HaplotypeCaller -R $FASTA -I $bam -O ${pref}.vcf.gz -ERC GVCF -stand-call-conf 20 --annotation QualByDepth,FisherStrandand:
Without the "--annotation" parameter, the tool works properly and I get a VCF file.
c) Entire error log:
12:55:12.806 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/gatk/gatk-package-4.1.3.0-local.jar!/com/intel/gkl/native/libgkl_co
mpression.so
Jun 30, 2021 12:55:14 PM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
INFO: Failed to detect whether we are running on Google Compute Engine.
12:55:14.509 INFO HaplotypeCaller - ------------------------------------------------------------
12:55:14.510 INFO HaplotypeCaller - The Genome Analysis Toolkit (GATK) v4.1.3.0
12:55:14.510 INFO HaplotypeCaller - For support and documentation go to https://software.broadinstitute.org/gatk/
12:55:14.510 INFO HaplotypeCaller - Executing as sbonnin@node-hp0001.linux.crg.es on Linux v3.10.0-693.5.2.el7.x86_64 amd64
12:55:14.511 INFO HaplotypeCaller - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_191-8u191-b12-0ubuntu0.16.04.1-b12
12:55:14.511 INFO HaplotypeCaller - Start Date/Time: June 30, 2021 12:55:12 PM UTC
12:55:14.511 INFO HaplotypeCaller - ------------------------------------------------------------
12:55:14.511 INFO HaplotypeCaller - ------------------------------------------------------------
12:55:14.512 INFO HaplotypeCaller - HTSJDK Version: 2.20.1
12:55:14.512 INFO HaplotypeCaller - Picard Version: 2.20.5
12:55:14.512 INFO HaplotypeCaller - HTSJDK Defaults.COMPRESSION_LEVEL : 2
12:55:14.512 INFO HaplotypeCaller - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
12:55:14.512 INFO HaplotypeCaller - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
12:55:14.512 INFO HaplotypeCaller - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
12:55:14.513 INFO HaplotypeCaller - Deflater: IntelDeflater
12:55:14.513 INFO HaplotypeCaller - Inflater: IntelInflater
12:55:14.513 INFO HaplotypeCaller - GCS max retries/reopens: 20
12:55:14.513 INFO HaplotypeCaller - Requester pays: disabled
12:55:14.513 INFO HaplotypeCaller - Initializing engine
12:55:15.057 INFO HaplotypeCaller - Done initializing engine
12:55:15.059 INFO HaplotypeCallerEngine - Tool is in reference confidence mode and the annotation, the following changes will be made to any specified a
nnotations: 'StrandBiasBySample' will be enabled. 'ChromosomeCounts', 'FisherStrand', 'StrandOddsRatio' and 'QualByDepth' annotations have been disabled
12:55:15.069 INFO HaplotypeCallerEngine - Standard Emitting and Calling confidence set to 0.0 for reference-model confidence output
12:55:15.069 INFO HaplotypeCallerEngine - All sites annotated with PLs forced to true for reference-model confidence output
12:55:15.069 INFO HaplotypeCaller - Shutting down engine
[June 30, 2021 12:55:15 PM UTC] org.broadinstitute.hellbender.tools.walkers.haplotypecaller.HaplotypeCaller done. Elapsed time: 0.04 minutes.
Runtime.totalMemory()=2290089984
USAGE: HaplotypeCaller [arguments]
Call germline SNPs and indels via local re-assembly of haplotypes
Version:4.1.3.0
Required Arguments:
(there comes all the help page...)
Thank you!
-
Hi Sarah Bonnin,
From what I can tell it looks like you are running the command correctly. Could you try running this on the most recent version of GATK to see if that works?
Kind regards,
Pamela
-
Hi Pamela,
Thank you for your reply. Today I have tried to run HaplotypeCaller with the newest version of GATK (4.2.0.0).
While I do get a VCF file, I unfortunately don't get the fields "QD" and "FS" in it, which I need later on for filtering the variants.
I ran HaplotypeCaller like this:
gatk HaplotypeCaller -R $FASTA -I $bam -O test.vcf.gz -ERC GVCF -stand-call-conf 20 -A QualByDepth -A FisherStrand
And here are the few first rows of the vcf file (skipping the header):
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT CCW
6_01543AAC_GCCAAT
chr1 1 . N <NON_REF> . . END=3058518
GT:DP:GQ:MIN_DP:PL 0/0:0:0:0:0,0,0
chr1 3058519 . T <NON_REF> . . END=3058558
GT:DP:GQ:MIN_DP:PL 0/0:1:3:1:0,3,34
chr1 3058559 . T <NON_REF> . . END=3096935
GT:DP:GQ:MIN_DP:PL 0/0:0:0:0:0,0,0
chr1 3096936 . A <NON_REF> . . END=3096959
GT:DP:GQ:MIN_DP:PL 0/0:1:3:1:0,3,31
chr1 3096960 . A <NON_REF> . . END=3121908
GT:DP:GQ:MIN_DP:PL 0/0:0:0:0:0,0,0
chr1 3121909 . A <NON_REF> . . END=3121948
GT:DP:GQ:MIN_DP:PL 0/0:1:3:1:0,3,20
chr1 3121949 . G <NON_REF> . . END=3153289 -
Hi Sarah Bonnin,
Thank you for providing this information. Given that you get an output from HaplotypeCaller just without your annotations, you might find this document helpful which lists reasons why annotations might be missing from your VCF file:
Please let me know if this does not answer your question.
Kind regards,
Pamela
-
Hi Pamela Bretscher,
Thank you. I've tried several things and, apparently, the `-ERC GVCF` option was the issue.
When I remove it, I get the annotation fields. Do you know why this is happening?
From the documentation: "the GVCF has records for all sites, whether there is a variant call there or not."
Is it because all sites,are reported that the annotations can't be added?
Thank you!
-
Hi Sarah Bonnin,
I'm glad you were able to get it to work and thank you for sharing the solution! This is because additional annotations are meant to apply to variant calls. Most likely, the annotations were not appearing because the non-variant sites in the GVCF were incompatible with the annotations.
Kind regards,
Pamela
Please sign in to leave a comment.
5 comments