Some stop positions in the .interval_list are smaller than the start positions
Hi GATK team,
GATK version: 4.1.9.0
Picard Version: 2.23.3
I am working on exome-seq data and used HaplotypeCaller to create a VCF file from my BAM files. I downloaded the interval list (hg19) from the bucket that was suggested in another post. However, I keep getting this error:
A USER ERROR has occurred: Badly formed genome unclippedLoc: Parameters to GenomeLocParser are incorrect:The stop position 14096821 is less than start 14096822 in contig 1
Here is my full command:
gatk HaplotypeCaller -R fasta/hg19_v0_Homo_sapiens_assembly19.fasta -I CD_1.bam -O variant.vcf -L ~/hg19_v0_HybSelOligos_whole_exome_illumina_coding_v1_whole_exome_illumina_coding_v1.Homo_sapiens_assembly19.targets.interval_list
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /Users/kazempour/gatk-4.1.9.0/gatk-package-4.1.9.0-local.jar HaplotypeCaller -R fasta/hg19_v0_Homo_sapiens_assembly19.fasta -I CD_1.bam -O /Users/kazempour/temp/variant.vcf -L /Users/kazempour/proj/dementia/data/white/sequences/hg19_v0_HybSelOligos_whole_exome_illumina_coding_v1_whole_exome_illumina_coding_v1.Homo_sapiens_assembly19.targets.interval_list -ip 100
09:02:47.590 INFO NativeLibraryLoader - Loading libgkl_compression.dylib from jar:file:/Users/kazempour/gatk-4.1.9.0/gatk-package-4.1.9.0-local.jar!/com/intel/gkl/native/libgkl_compression.dylib
Dec 02, 2020 9:02:47 AM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
INFO: Failed to detect whether we are running on Google Compute Engine.
09:02:47.728 INFO HaplotypeCaller - ------------------------------------------------------------
09:02:47.729 INFO HaplotypeCaller - The Genome Analysis Toolkit (GATK) v4.1.9.0
09:02:47.729 INFO HaplotypeCaller - For support and documentation go to https://software.broadinstitute.org/gatk/
09:02:47.729 INFO HaplotypeCaller - Executing as kazempour@g11mzarel152240.lan on Mac OS X v10.13.6 x86_64
09:02:47.729 INFO HaplotypeCaller - Java runtime: Java HotSpot(TM) 64-Bit Server VM v11.0.1+13-LTS
09:02:47.729 INFO HaplotypeCaller - Start Date/Time: December 2, 2020 at 9:02:47 AM CST
09:02:47.729 INFO HaplotypeCaller - ------------------------------------------------------------
09:02:47.729 INFO HaplotypeCaller - ------------------------------------------------------------
09:02:47.730 INFO HaplotypeCaller - HTSJDK Version: 2.23.0
09:02:47.730 INFO HaplotypeCaller - Picard Version: 2.23.3
09:02:47.730 INFO HaplotypeCaller - HTSJDK Defaults.COMPRESSION_LEVEL : 2
09:02:47.730 INFO HaplotypeCaller - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
09:02:47.730 INFO HaplotypeCaller - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
09:02:47.730 INFO HaplotypeCaller - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
09:02:47.730 INFO HaplotypeCaller - Deflater: IntelDeflater
09:02:47.730 INFO HaplotypeCaller - Inflater: IntelInflater
09:02:47.730 INFO HaplotypeCaller - GCS max retries/reopens: 20
09:02:47.730 INFO HaplotypeCaller - Requester pays: disabled
09:02:47.730 INFO HaplotypeCaller - Initializing engine
09:02:47.886 INFO FeatureManager - Using codec IntervalListCodec to read file file:///Users/kazempour/proj/dementia/data/white/sequences/hg19_v0_HybSelOligos_whole_exome_illumina_coding_v1_whole_exome_illumina_coding_v1.Homo_sapiens_assembly19.targets.interval_list
09:02:47.980 INFO HaplotypeCaller - Shutting down engine
[December 2, 2020 at 9:02:47 AM CST] org.broadinstitute.hellbender.tools.walkers.haplotypecaller.HaplotypeCaller done. Elapsed time: 0.01 minutes.
Runtime.totalMemory()=645922816
***********************************************************************
A USER ERROR has occurred: Badly formed genome unclippedLoc: Parameters to GenomeLocParser are incorrect:The stop position 14096821 is less than start 14096822 in contig 1
When I looked at the interval list, I noticed that on some lines the start position is larger than the end position, for example:
1 7202191 7202190 + CEX-chr1-7202190-7202190
1 14096822 14096821 + CEX-chr1-14096821-14096821
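Lines like these can be located with a short scan of the file. The following is only a minimal sketch, assuming the usual interval_list layout: header lines starting with `@`, followed by tab-separated records of contig, start, end, strand, and name. The function name `find_inverted_intervals` is made up for illustration and is not part of GATK or Picard.

```python
def find_inverted_intervals(lines):
    """Yield (line_number, contig, start, end, name) for records whose end < start.

    Assumes tab-separated interval_list records; header lines that start
    with '@' and blank lines are skipped.
    """
    for line_number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if not line or line.startswith("@"):
            continue
        fields = line.split("\t")
        if len(fields) < 3:
            continue  # not a well-formed record; ignored in this sketch
        contig, start, end = fields[0], int(fields[1]), int(fields[2])
        if end < start:
            name = fields[4] if len(fields) > 4 else ""
            yield line_number, contig, start, end, name


def scan_file(path):
    """Return all inverted records found in the interval_list at `path`."""
    with open(path) as handle:
        return list(find_inverted_intervals(handle))
```

Calling `scan_file` with the path to the downloaded interval list returns one tuple per affected record, which shows how many intervals are involved before deciding what to do with them.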
This list was provided by the GATK team. Is there any way to fix this issue?
Many thanks,
Shiva.
-
Hi Shiva, could you confirm whether you are using the ICE exome capture kit? It looks like that file is specific to an exome capture kit used for sequencing at the Broad.
-
Could you also confirm whether you used the reference version that is in the Google bucket? https://console.cloud.google.com/storage/browser/gcp-public-data--broad-references/hg19/v0;tab=objects?prefix=&forceOnObjectsSortingFiltering=false