Error running mutect2 on SAM files remapped to hg38 (original reads were aligned to hg19).
If you are seeing an error, please provide (REQUIRED):
a) GATK version used: latest version
b) Exact command used:
c) Entire error log:
Hello,
I am having a problem running Mutect2 and would be grateful for your help.
My reads are spliced sequences of aligned WGS reads downloaded from ICGC using score-client 5. The downloaded read files are in SAM format and, according to ICGC, were originally aligned to hg19 (GRCh37). To run Mutect2 with the PON and germline resources, which belong to the hg38 assembly, I first had to remap the reads to hg38, which I did using CrossMap and Galaxy. Below are the errors and the Mutect2 commands for the remapped files I obtained from CrossMap and from Galaxy:
CrossMap
"A USER ERROR has occurred: Unknown file is malformed: Could not read sequence dictionary from given fasta file references_hg38_v0_Homo_sapiens_assembly38.dict".
This is the command I used:
gatk Mutect2 \
-R references_hg38_v0_Homo_sapiens_assembly38.fasta \
-I test.hg38.sam \
-I ctrl.sam \
-normal UCR_1 \
-L resources_broad_hg38_v0_wgs_calling_regions.hg38.interval_list \
--sequence-dictionary references_hg38_v0_Homo_sapiens_assembly38.dict \
--germline-resource somatic-hg38_af-only-gnomad.hg38.vcf.gz \
--panel-of-normals somatic-hg38_1000g_pon.hg38.vcf.gz \
--output somatic.vcf.gz
Galaxy
Using Galaxy to remap my spliced reads to hg38, I now get this error from Mutect2:
" Badly formed genome unclippedLoc: Contig chr1 given as location, but this contig isn't present in the Fasta sequence dictionary"
Please help!!:(((
This is the command I ran:
gatk Mutect2 \
-R hg38_1.fasta \
-I Galaxy1_tumor.hg38.sam \
-I Galaxy1_ctrl.hg38.sam \
-normal UCR_1 \
-L resources_broad_hg38_v0_wgs_calling_regions.hg38.interval_list \
--sequence-dictionary hg38_1.dict \
--germline-resource somatic-hg38_af-only-gnomad.hg38.vcf.gz \
--panel-of-normals somatic-hg38_1000g_pon.hg38.vcf.gz \
--ignore-itr-artifacts \
--output somatic.vcf.gz
Using GATK jar /gatk/gatk-package-4.1.9.0-SNAPSHOT-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /gatk/gatk-package-4.1.9.0-SNAPSHOT-local.jar Mutect2 -R hg38_1.fasta -I Galaxy1_tumor.hg38.sam -I Galaxy1_ctrl.hg38.sam -normal UCR_1 -L resources_broad_hg38_v0_wgs_calling_regions.hg38.interval_list --sequence-dictionary hg38_1.dict --germline-resource somatic-hg38_af-only-gnomad.hg38.vcf.gz --panel-of-normals somatic-hg38_1000g_pon.hg38.vcf.gz --ignore-itr-artifacts --output somatic.vcf.gz
05:04:02.368 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/gatk/gatk-package-4.1.9.0-SNAPSHOT-local.jar!/com/intel/gkl/native/libgkl_compression.so
Jul 15, 2021 5:04:02 AM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
INFO: Failed to detect whether we are running on Google Compute Engine.
05:04:02.770 INFO Mutect2 - ------------------------------------------------------------
05:04:02.771 INFO Mutect2 - The Genome Analysis Toolkit (GATK) v4.1.9.0-SNAPSHOT
05:04:02.771 INFO Mutect2 - For support and documentation go to https://software.broadinstitute.org/gatk/
05:04:02.772 INFO Mutect2 - Executing as root@3535daec1e46 on Linux v5.4.39-linuxkit amd64
05:04:02.772 INFO Mutect2 - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_242-8u242-b08-0ubuntu3~18.04-b08
05:04:02.772 INFO Mutect2 - Start Date/Time: July 15, 2021 5:04:02 AM GMT
05:04:02.772 INFO Mutect2 - ------------------------------------------------------------
05:04:02.772 INFO Mutect2 - ------------------------------------------------------------
05:04:02.773 INFO Mutect2 - HTSJDK Version: 2.23.0
05:04:02.773 INFO Mutect2 - Picard Version: 2.23.3
05:04:02.773 INFO Mutect2 - HTSJDK Defaults.COMPRESSION_LEVEL : 2
05:04:02.773 INFO Mutect2 - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
05:04:02.774 INFO Mutect2 - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
05:04:02.774 INFO Mutect2 - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
05:04:02.774 INFO Mutect2 - Deflater: IntelDeflater
05:04:02.774 INFO Mutect2 - Inflater: IntelInflater
05:04:02.774 INFO Mutect2 - GCS max retries/reopens: 20
05:04:02.774 INFO Mutect2 - Requester pays: disabled
05:04:02.775 INFO Mutect2 - Initializing engine
05:04:03.437 INFO FeatureManager - Using codec VCFCodec to read file file:///gatk/my_data/somatic-hg38_1000g_pon.hg38.vcf.gz
05:04:03.647 INFO FeatureManager - Using codec VCFCodec to read file file:///gatk/my_data/somatic-hg38_af-only-gnomad.hg38.vcf.gz
05:04:03.780 INFO FeatureManager - Using codec IntervalListCodec to read file file:///gatk/my_data/resources_broad_hg38_v0_wgs_calling_regions.hg38.interval_list
05:04:04.261 INFO Mutect2 - Shutting down engine
[July 15, 2021 5:04:04 AM GMT] org.broadinstitute.hellbender.tools.walkers.mutect.Mutect2 done. Elapsed time: 0.03 minutes.
Runtime.totalMemory()=251133952
A USER ERROR has occurred: Unknown file is malformed: Could not read sequence dictionary from given fasta file references_hs37d5.dict
***********************************************************************
Set the system property GATK_STACKTRACE_ON_USER_EXCEPTION (--java-options '-DGATK_STACKTRACE_ON_USER_EXCEPTION=true') to print the stack trace.
I would be thankful for any help.
Best
-
Hello zdr j,
For your CrossMap error, it looks like the sequence dictionary is malformed for some reason. You can recreate it with the GATK tool CreateSequenceDictionary (Picard).
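As a rough sketch of that step, assuming the reference is the `references_hg38_v0_Homo_sapiens_assembly38.fasta` file from your command: the `check_dict` helper below is hypothetical (it is not part of GATK) and only tests whether a `.dict` file looks like a SAM-style header. The commented `gatk CreateSequenceDictionary` call is what would rebuild the dictionary from the FASTA.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of GATK): a quick sanity check that a .dict
# file looks like a SAM-style header, i.e. it starts with an @HD line and
# every @SQ line carries both an SN: (name) and an LN: (length) field.
check_dict() {
    dict="$1"
    # The file must exist and be non-empty.
    [ -s "$dict" ] || { echo "missing or empty: $dict"; return 1; }
    # The first line should be the @HD header line.
    head -n 1 "$dict" | grep -q '^@HD' || { echo "no @HD line: $dict"; return 1; }
    # Count all @SQ lines, then the @SQ lines that have both SN: and LN:.
    total=$(grep -c '^@SQ' "$dict" || true)
    good=$(awk -F'\t' '/^@SQ/ {
        sn = 0; ln = 0
        for (i = 2; i <= NF; i++) { if ($i ~ /^SN:/) sn = 1; if ($i ~ /^LN:/) ln = 1 }
        if (sn && ln) n++
    } END { print n + 0 }' "$dict")
    if [ "$total" -gt 0 ] && [ "$total" -eq "$good" ]; then
        echo "ok: $total contigs"
    else
        echo "malformed: $good of $total @SQ lines are complete"
        return 1
    fi
}

# If the check fails, rebuild the dictionary from the FASTA (moving the old
# file aside first, in case the tool declines to overwrite it):
#   mv references_hg38_v0_Homo_sapiens_assembly38.dict old.dict
#   gatk CreateSequenceDictionary \
#       -R references_hg38_v0_Homo_sapiens_assembly38.fasta \
#       -O references_hg38_v0_Homo_sapiens_assembly38.dict
```

Running `check_dict references_hg38_v0_Homo_sapiens_assembly38.dict` before and after regenerating the file gives a quick indication of whether the dictionary itself was the problem.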
It looks like you are getting a very similar error message from your Galaxy output. Try out the above solution and let me know if it helps.
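If the Galaxy run still reports that `chr1` is not in the sequence dictionary, it may help to compare contig names directly. The sketch below is only an illustration: `contig_names` and `missing_contigs` are hypothetical helpers, not GATK tools, and they assume plain-text inputs with `@SQ` header lines (a SAM file or a `.dict` file; a BAM would need converting first).

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of GATK): list the contig names declared in
# the @SQ header lines of a plain-text SAM or .dict file, then report names
# that the reads declare but the reference dictionary does not.
contig_names() {
    awk -F'\t' '/^@SQ/ { for (i = 2; i <= NF; i++) if ($i ~ /^SN:/) print substr($i, 4) }' "$1" \
        | LC_ALL=C sort -u
}

missing_contigs() {
    sam="$1"; dict="$2"
    sam_list=$(mktemp); dict_list=$(mktemp)
    contig_names "$sam" > "$sam_list"
    contig_names "$dict" > "$dict_list"
    # comm -23 prints the lines found only in the first (sorted) input.
    LC_ALL=C comm -23 "$sam_list" "$dict_list"
    rm -f "$sam_list" "$dict_list"
}

# Example (file names taken from the Galaxy command in the question):
#   missing_contigs Galaxy1_tumor.hg38.sam hg38_1.dict
# Output such as "chr1" would suggest that the dictionary uses a different
# naming scheme (for example "1" rather than "chr1") than the remapped reads.
```

Empty output means every contig declared in the SAM header is also present in the dictionary, in which case the mismatch would have to come from another input.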
Best,
Genevieve