Query regarding CombineGVCFs
Hi Genevieve Brandt (she/her) and GATK community,
I have a question about CombineGVCFs. When I combine all my GVCF files with CombineGVCFs (command and full log below), it prints the following warning. Is it okay to ignore this warning, or is it an error? If it is an error, could you kindly help me solve it?
17:20:58.478 WARN ReferenceConfidenceVariantContextMerger - Detected invalid annotations: When trying to merge variant contexts at location 10:1547 the annotation MLEAC=[1, 0] was not a numerical value and was ignored
# The command and its full log start here
/home/shazia/Software/gatk-4.1.9.0/gatk CombineGVCFs --java-options -Xmx30g -O combineGVCFs_10_Sample.g.vcf.gz -R /media/shazia/TargetSequencing/TargetSequencing_Example/Bos_taurus_UMD_3.1.1_genome/Bos_taurus_NCBI_UMD_3.1.1/Bos_taurus/NCBI/UMD_3.1.1/Sequence/WholeGenomeFasta/genome.fa --variant gvcfs_10.list
Using GATK jar /home/shazia/Software/gatk-4.1.9.0/gatk-package-4.1.9.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -Xmx30g -jar /home/shazia/Software/gatk-4.1.9.0/gatk-package-4.1.9.0-local.jar CombineGVCFs -O combineGVCFs_10_Sample.g.vcf.gz -R /media/shazia/TargetSequencing/TargetSequencing_Example/Bos_taurus_UMD_3.1.1_genome/Bos_taurus_NCBI_UMD_3.1.1/Bos_taurus/NCBI/UMD_3.1.1/Sequence/WholeGenomeFasta/genome.fa --variant gvcfs_10.list
17:20:56.943 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/home/shazia/Software/gatk-4.1.9.0/gatk-package-4.1.9.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
Dec 19, 2021 5:20:57 PM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
INFO: Failed to detect whether we are running on Google Compute Engine.
17:20:57.057 INFO CombineGVCFs - ------------------------------------------------------------
17:20:57.057 INFO CombineGVCFs - The Genome Analysis Toolkit (GATK) v4.1.9.0
17:20:57.057 INFO CombineGVCFs - For support and documentation go to https://software.broadinstitute.org/gatk/
17:20:57.057 INFO CombineGVCFs - Executing as shazia@shazia-Lin on Linux v4.15.0-142-generic amd64
17:20:57.057 INFO CombineGVCFs - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_152-release-1056-b12
17:20:57.057 INFO CombineGVCFs - Start Date/Time: December 19, 2021 5:20:56 PM CET
17:20:57.057 INFO CombineGVCFs - ------------------------------------------------------------
17:20:57.057 INFO CombineGVCFs - ------------------------------------------------------------
17:20:57.057 INFO CombineGVCFs - HTSJDK Version: 2.23.0
17:20:57.057 INFO CombineGVCFs - Picard Version: 2.23.3
17:20:57.057 INFO CombineGVCFs - HTSJDK Defaults.COMPRESSION_LEVEL : 2
17:20:57.057 INFO CombineGVCFs - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
17:20:57.058 INFO CombineGVCFs - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
17:20:57.058 INFO CombineGVCFs - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
17:20:57.058 INFO CombineGVCFs - Deflater: IntelDeflater
17:20:57.058 INFO CombineGVCFs - Inflater: IntelInflater
17:20:57.058 INFO CombineGVCFs - GCS max retries/reopens: 20
17:20:57.058 INFO CombineGVCFs - Requester pays: disabled
17:20:57.058 INFO CombineGVCFs - Initializing engine
17:20:57.288 INFO FeatureManager - Using codec VCFCodec to read file file:///media/shazia/TargetSequencing/TargetSequencing_Example/samples/9_Sort_Coordinate_DuplicateMarked_NmMdAndUqTag_ValidateSam_HaplotypeCaller/2K_FCH7L2KCCX2_L8_BISvveXAAFNAAA-73_marked_duplicates_NmMdAndUqTag_fixed.g.vcf.gz
17:20:57.397 INFO FeatureManager - Using codec VCFCodec to read file file:///media/shazia/TargetSequencing/TargetSequencing_Example/samples/9_Sort_Coordinate_DuplicateMarked_NmMdAndUqTag_ValidateSam_HaplotypeCaller/4K_FCH7L2KCCX2_L6_BISvveXAABTAAA-69_marked_duplicates_NmMdAndUqTag_fixed.g.vcf.gz
17:20:57.476 INFO FeatureManager - Using codec VCFCodec to read file file:///media/shazia/TargetSequencing/TargetSequencing_Example/samples/9_Sort_Coordinate_DuplicateMarked_NmMdAndUqTag_ValidateSam_HaplotypeCaller/5K_FCH7L2KCCX2_L7_BISvveXAACWAAA-168_marked_duplicates_NmMdAndUqTag_fixed.g.vcf.gz
17:20:57.550 INFO FeatureManager - Using codec VCFCodec to read file file:///media/shazia/TargetSequencing/TargetSequencing_Example/samples/9_Sort_Coordinate_DuplicateMarked_NmMdAndUqTag_ValidateSam_HaplotypeCaller/11K_FCH7L2KCCX2_L8_BISvveXAAFOAAA-74_marked_duplicates_NmMdAndUqTag_fixed.g.vcf.gz
17:20:57.678 INFO FeatureManager - Using codec VCFCodec to read file file:///media/shazia/TargetSequencing/TargetSequencing_Example/samples/9_Sort_Coordinate_DuplicateMarked_NmMdAndUqTag_ValidateSam_HaplotypeCaller/16K_FCH7L2KCCX2_L6_BISvveXAABJAAA-50_marked_duplicates_NmMdAndUqTag_fixed.g.vcf.gz
17:20:57.735 INFO FeatureManager - Using codec VCFCodec to read file file:///media/shazia/TargetSequencing/TargetSequencing_Example/samples/9_Sort_Coordinate_DuplicateMarked_NmMdAndUqTag_ValidateSam_HaplotypeCaller/17K_FCH7L2KCCX2_L5_BISvveXAAAZAAA-154_marked_duplicates_NmMdAndUqTag_fixed.g.vcf.gz
17:20:57.765 INFO FeatureManager - Using codec VCFCodec to read file file:///media/shazia/TargetSequencing/TargetSequencing_Example/samples/9_Sort_Coordinate_DuplicateMarked_NmMdAndUqTag_ValidateSam_HaplotypeCaller/21K_FCH7L2KCCX2_L8_BISvveXAAFLAAA-71_marked_duplicates_NmMdAndUqTag_fixed.g.vcf.gz
17:20:57.794 INFO FeatureManager - Using codec VCFCodec to read file file:///media/shazia/TargetSequencing/TargetSequencing_Example/samples/9_Sort_Coordinate_DuplicateMarked_NmMdAndUqTag_ValidateSam_HaplotypeCaller/22K_FCH7L2KCCX2_L6_BISvveXAABIAAA-46_marked_duplicates_NmMdAndUqTag_fixed.g.vcf.gz
17:20:57.830 INFO FeatureManager - Using codec VCFCodec to read file file:///media/shazia/TargetSequencing/TargetSequencing_Example/samples/9_Sort_Coordinate_DuplicateMarked_NmMdAndUqTag_ValidateSam_HaplotypeCaller/23K_FCH7L2KCCX2_L7_BISvveXAACSAAA-141_marked_duplicates_NmMdAndUqTag_fixed.g.vcf.gz
17:20:57.860 INFO FeatureManager - Using codec VCFCodec to read file file:///media/shazia/TargetSequencing/TargetSequencing_Example/samples/9_Sort_Coordinate_DuplicateMarked_NmMdAndUqTag_ValidateSam_HaplotypeCaller/24K_FCH7L2KCCX2_L6_BISvveXAABHAAA-43_marked_duplicates_NmMdAndUqTag_fixed.g.vcf.gz
17:20:58.350 INFO CombineGVCFs - Done initializing engine
17:20:58.362 INFO ProgressMeter - Starting traversal
17:20:58.362 INFO ProgressMeter - Current Locus Elapsed Minutes Variants Processed Variants/Minute
17:20:58.478 WARN ReferenceConfidenceVariantContextMerger - Detected invalid annotations: When trying to merge variant contexts at location 10:1547 the annotation MLEAC=[1, 0] was not a numerical value and was ignored
-
Hi Abrish,
Yes, this warning is totally fine. Here is more information about the MLEAC annotation (from the GVCF article):
##INFO=<ID=MLEAC,Number=A,Type=Integer,Description="Maximum likelihood expectation (MLE) for the allele counts (not necessarily the same as the AC), for each ALT allele, in the same order as listed">
It is only calculated when there are alt alleles at a certain position. In your GVCF, you'll have sites with no alt alleles and so this annotation will not be present. If you want to double check that this is true, you can take a look at the position 10:1547 to see what that site looks like.
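To make the `Number=A` part of that header line concrete, here is a minimal Python sketch that pairs each MLEAC value with its ALT allele for a single GVCF record. The record in `example` is made up for illustration (it is not taken from the files in this thread), and the helper names `parse_info` and `mleac_per_alt` are hypothetical.

```python
def parse_info(info_field):
    """Turn a VCF INFO string such as 'DP=12;MLEAC=1,0' into a dict."""
    info = {}
    if info_field == ".":
        return info
    for entry in info_field.split(";"):
        key, sep, value = entry.partition("=")
        # Flag-style entries have no '=' and are stored as True.
        info[key] = value if sep else True
    return info


def mleac_per_alt(vcf_line):
    """Map each ALT allele of one record to its MLEAC value.

    Returns an empty dict when the record carries no MLEAC annotation.
    """
    fields = vcf_line.rstrip("\n").split("\t")
    alts = fields[4].split(",")      # column 5 is ALT
    info = parse_info(fields[7])     # column 8 is INFO
    if "MLEAC" not in info:
        return {}
    counts = [int(value) for value in info["MLEAC"].split(",")]
    # Number=A means one value per ALT allele, in the same order.
    return dict(zip(alts, counts))


# Hypothetical record, invented for this example only.
example = (
    "10\t1547\t.\tA\tG,<NON_REF>\t50.6\t.\t"
    "DP=12;MLEAC=1,0;MLEAF=0.5,0.0\tGT:AD\t0/1:6,6,0"
)
print(mleac_per_alt(example))
```

For a record without the annotation, such as a reference block, the function simply returns an empty dict.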
Best,
Genevieve
-
Hi Genevieve Brandt (she/her),
Thank you so much.
Sorry for the silly question, but where can I check position 10:1547? Should I look in the VCF file?
-
No problem! Yes, it would be in your GVCF inputs.
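One possible way to do that check, sketched in Python with only the standard library: scan each input GVCF for the records that cover a given position. This assumes the `.list` file holds one GVCF path per line and that each GVCF is coordinate-sorted; the function names `records_at` and `scan_inputs` are hypothetical and not part of GATK.

```python
import gzip


def records_at(gvcf_path, chrom, pos):
    """Return the data lines of one GVCF that cover chrom:pos.

    Reference blocks span POS..END, so the END value in INFO is honoured.
    """
    opener = gzip.open if gvcf_path.endswith(".gz") else open
    hits = []
    with opener(gvcf_path, "rt") as handle:
        for line in handle:
            if line.startswith("#"):
                continue  # skip header lines
            fields = line.rstrip("\n").split("\t")
            if fields[0] != chrom:
                continue
            start = int(fields[1])
            if start > pos:
                break  # assumption: file is coordinate-sorted
            end = start
            for entry in fields[7].split(";"):
                if entry.startswith("END="):
                    end = int(entry[4:])
            if start <= pos <= end:
                hits.append(line.rstrip("\n"))
    return hits


def scan_inputs(list_path, chrom, pos):
    """Apply records_at to every path named in a one-path-per-line list file."""
    with open(list_path) as handle:
        paths = [line.strip() for line in handle if line.strip()]
    return {path: records_at(path, chrom, pos) for path in paths}
```

With the list file from the command above, a call such as `scan_inputs("gvcfs_10.list", "10", 1547)` would return, per input file, the record lines covering 10:1547, so the ALT column and the MLEAC values can be read side by side.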
-
Dear Genevieve Brandt (she/her),
Thank you so much.