Mutect2 panel of normals and germline source files with incompatible contigs
If you are seeing an error, please provide(REQUIRED) :
a) GATK version used: 4.1.8.1
b) Exact command used:
c) Entire error log:
A USER ERROR has occurred: Input files reference and features have incompatible contigs: No overlapping contigs found.
reference contigs = [NC_000001.11, NT_187361.1, NT_187362.1, NT_187363.1, NT_187364.1, NT_187365.1, NT_187366.1, NT_187367.1, NT_187368.1, NT_187369.1, NC_000002.12, NT_187370.1,.........(where I omitted many more items)
features contigs = [chr1, chr2, chr3, chr4, chr5, chr6, chr7, chr8, chr9, chr10, chr11, chr12, chr13, chr14, chr15, chr16, chr17, chr18, chr19, chr20, chr21, chr22, chrX, chrY, chrM, chr1_KI270706v1_random, chr1_KI270707v1_random, chr1_KI270708v1_random, chr1_KI270709v1_random, chr1_KI270710v1_random, chr1_KI270711v1_random, chr1_KI270712v1_random,........
If not an error, choose a category for your question (REQUIRED):
a) How do I get the VCF files (PoN and germline source) matching my BAM files and reference?
Note:
My BAM files were generated by mapping FASTA files to either NCBI GRCh38 or GRCh38 patch 13. This error occurred in both cases.
The previous handling process of data includes:
- Mapping the FASTA files to the reference. The index and dictionary were generated with samtools faidx and GATK CreateSequenceDictionary; for reasons I do not understand, another set of index files was generated with BWA index.
- SAM to BAM conversion and sorting with samtools
- Duplicate read removal with MarkDuplicatesSpark
I obtained the PoN and germline source files from the following link, which I found in a discussion thread.
https://console.cloud.google.com/storage/browser/gatk-best-practices/somatic-hg38/
The files I downloaded are
-
Hi Field -Ye Tian, you will not be able to use Mutect2 with HG38 resources and a GRCh38 reference. You can either find GRCh38 resources to match your data, use LiftOver to change your files to the same reference version, or change the reference version that you are using in your analysis. All files need to be using the same reference version to work with Mutect2 because they need the same contig names.
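A quick way to see the mismatch before running Mutect2 is to compare the contig names in the reference index with those declared in the VCF header. The following is a minimal sketch, not part of GATK: the function names are made up for illustration, the input is toy data in memory rather than real files, and it assumes the reference has a samtools .fai index and the VCF header carries ##contig lines.

```python
import re


def contigs_from_fai(text):
    """Contig names from a samtools .fai index (first tab-separated column)."""
    return [line.split("\t")[0] for line in text.splitlines() if line.strip()]


def contigs_from_vcf_header(text):
    """Contig names from the '##contig=<ID=...>' lines of a VCF header."""
    names = []
    for line in text.splitlines():
        if not line.startswith("##"):
            break  # the meta-information lines are over
        match = re.match(r"##contig=<ID=([^,>]+)", line)
        if match:
            names.append(match.group(1))
    return names


def shared_contigs(reference, features):
    """Contigs present in both lists, in reference order."""
    feature_set = set(features)
    return [name for name in reference if name in feature_set]


# Toy inputs (offsets and line widths are placeholders, not real values).
ncbi_fai = (
    "NC_000001.11\t248956422\t100\t80\t81\n"
    "NC_000002.12\t242193529\t200\t80\t81\n"
)
broad_fai = (
    "chr1\t248956422\t100\t80\t81\n"
    "chr2\t242193529\t200\t80\t81\n"
)
pon_header = (
    "##fileformat=VCFv4.2\n"
    "##contig=<ID=chr1,length=248956422>\n"
    "##contig=<ID=chr2,length=242193529>\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
)

pon_contigs = contigs_from_vcf_header(pon_header)
print(shared_contigs(contigs_from_fai(ncbi_fai), pon_contigs))   # []
print(shared_contigs(contigs_from_fai(broad_fai), pon_contigs))  # ['chr1', 'chr2']
```

An empty result corresponds to the "No overlapping contigs found" message in the error log above. For real files, the same functions could be fed the text of the .fai file and the header portion of a decompressed VCF.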
-
Hi Genevieve,
Thank you for the prompt reply. From your reply, it seems that GRCh38 and HG38 are slightly different. I have been searching the web, but every page I found treats the two as identical, so I would like to confirm that I understood you correctly.
I tried to find an HG38 reference but could not.
Could you please point me to a source?
Thank you so much.
-
-
Hi Genevieve,
Thank you for the reply.
I'd like to quickly update that the resources you provided are useful.
Just a quick note for anyone who encounters the same issue: the GATK germline resource and PoN files are not compatible with the GRCh38 assembly that I downloaded from NCBI.
If I align my FASTA files to a reference built from the file
resources_broad_hg38_v0_Homo_sapiens_assembly38.fasta (which can be found at the link below)
https://console.cloud.google.com/storage/browser/gatk-best-practices/somatic-hg38
then Mutect2 works fine.
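For readers who want to script the steps, here is a rough sketch of the command lines involved, built as Python lists so they can be inspected before being run. The helper names and the BAM, VCF, and output file names are placeholders of mine, not the files from this thread; only the reference file name comes from the post above. The sketch assumes tumor-only calling and that the reads have already been realigned to this reference.

```python
import shlex


def reference_prep_commands(reference):
    """Index and dictionary commands for a reference FASTA."""
    return [
        ["samtools", "faidx", reference],                       # writes <reference>.fai
        ["bwa", "index", reference],                            # index used by the BWA aligner
        ["gatk", "CreateSequenceDictionary", "-R", reference],  # writes the .dict file
    ]


def mutect2_command(reference, tumor_bam, germline_resource, panel_of_normals, output_vcf):
    """Tumor-only Mutect2 call; every input must use the same contig names."""
    return [
        "gatk", "Mutect2",
        "-R", reference,
        "-I", tumor_bam,
        "--germline-resource", germline_resource,
        "--panel-of-normals", panel_of_normals,
        "-O", output_vcf,
    ]


reference = "resources_broad_hg38_v0_Homo_sapiens_assembly38.fasta"
commands = reference_prep_commands(reference)
# Placeholder file names below; substitute your own.
commands.append(
    mutect2_command(
        reference,
        "tumor.sorted.dedup.bam",
        "germline_resource.hg38.vcf.gz",
        "pon.hg38.vcf.gz",
        "somatic.unfiltered.vcf.gz",
    )
)

for command in commands:
    print(shlex.join(command))
```

The script only prints the commands. Passing each list to subprocess.run(command, check=True) would execute them, provided samtools, bwa, and gatk are on the PATH.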
-
Field -Ye Tian Thank you for the update and for posting the solution for other GATK users!