Creating Panel of Normals Error
Hello GATK,
I am creating a panel of normals from whole-exome data from the 1000 Genomes Project. I used a custom interval list, which I created successfully using hg19.fa from the UCSC database (the one in the GATK Google bucket did not work). I then ran Mutect2 in tumor-only mode and created the GenomicsDB workspace as well.
I am left with the final stage, 'CreateSomaticPanelOfNormals', which requires the --germline-resource gnomAD file. However, the error message I got indicates that my data and gnomAD use different contig names. I tried the gnomAD files in the Google bucket as well as the ones from UCSC, but none of them work.
Files from GATK Bundle
af-only-gnomad.raw.sites.b37.vcf
af-only-gnomad.raw.sites.vcf
somatic-b37_af-only-gnomad.raw.sites.vcf
Files from UCSC browser
gnomad.exomes.r2.1.1.sites.vcf.bgz
gnomad.exomes.r2.0.2.sites.vcf.gz
Below is an extract of the error message
A USER ERROR has occurred: Input files reference and features have incompatible contigs: No overlapping contigs found.
reference contigs = [chr10, chr11, chr11_gl000202_random, chr12, chr13, chr14, chr15, chr16, chr17_ctg5_hap1, chr17, chr17_gl000203_random, chr17_gl000204_random, chr17_gl000205_random, chr17_gl000206_random, chr18, chr18_gl000207_random, chr19, chr19_gl000208_random, chr19_gl000209_random, chr1, chr1_gl000191_random, chr1_gl000192_random, chr20, chr21, chr21_gl000210_random, chr22, chr2, chr3, chr4_ctg9_hap1, chr4, chr4_gl000193_random, chr4_gl000194_random, chr5, chr6_apd_hap1, chr6_cox_hap2, chr6_dbb_hap3, chr6, chr6_mann_hap4, chr6_mcf_hap5, chr6_qbl_hap6, chr6_ssto_hap7, chr7, chr7_gl000195_random, chr8, chr8_gl000196_random, chr8_gl000197_random, chr9, chr9_gl000198_random, chr9_gl000199_random, chr9_gl000200_random, chr9_gl000201_random, chrM, chrUn_gl000211, chrUn_gl000212, chrUn_gl000213, chrUn_gl000214, chrUn_gl000215, chrUn_gl000216, chrUn_gl000217, chrUn_gl000218, chrUn_gl000219, chrUn_gl000220, chrUn_gl000221, chrUn_gl000222, chrUn_gl000223, chrUn_gl000224, chrUn_gl000225, chrUn_gl000226, chrUn_gl000227, chrUn_gl000228, chrUn_gl000229, chrUn_gl000230, chrUn_gl000231, chrUn_gl000232, chrUn_gl000233, chrUn_gl000234, chrUn_gl000235, chrUn_gl000236, chrUn_gl000237, chrUn_gl000238, chrUn_gl000239, chrUn_gl000240, chrUn_gl000241, chrUn_gl000242, chrUn_gl000243, chrUn_gl000244, chrUn_gl000245, chrUn_gl000246, chrUn_gl000247, chrUn_gl000248, chrUn_gl000249, chrX, chrY]
features contigs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, X, Y]
Please advise. The GATK version is 4.1.1.
-
Hi Vincent Appiah, the contig names in the gnomAD file and in your reference need to match. The reference contigs are named with a "chr" prefix, while the feature contigs are just the bare number or letter. You can use our tool LiftoverVcf (Picard) to change the names.
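To illustrate why the error reports "No overlapping contigs found", here is a minimal Python sketch. It is not part of GATK; the function names `overlapping_contigs` and `add_chr_prefix` are made up for this example, and the two lists are abbreviated from the error message above. It only shows the naming mismatch and does not rewrite a VCF or adjust any coordinates.

```python
def overlapping_contigs(reference, features):
    """Return the contig names present in both lists, in reference order."""
    feature_set = set(features)
    return [name for name in reference if name in feature_set]


def add_chr_prefix(name):
    """Map a bare name such as '1' or 'X' to a prefixed name such as 'chr1'."""
    return name if name.startswith("chr") else "chr" + name


# Abbreviated versions of the two lists from the error message (illustrative only).
reference_contigs = ["chr1", "chr2", "chr22", "chrX", "chrY", "chrM"]
feature_contigs = ["1", "2", "22", "X", "Y"]

# As given, no name appears in both lists, which is what the error reports.
print(overlapping_contigs(reference_contigs, feature_contigs))

# After prefixing, the autosomes and sex chromosomes line up by name.
renamed = [add_chr_prefix(c) for c in feature_contigs]
print(overlapping_contigs(reference_contigs, renamed))
```

The first call prints an empty list and the second prints the five prefixed names, which is the situation the resource file needs to be in before the tool will accept it alongside a "chr"-style reference.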
-
Thanks, Genevieve Brandt (she/her). I was able to do the liftover. However, it produced some log messages that I did not quite understand. Below is an extract:
INFO 2020-09-26 01:45:44 LiftOver Interval chr1:396626-396629 failed to match chain 381 because intersection length 3 < minMatchSize 4.0 (0.75 < 1.0)
INFO 2020-09-26 01:45:44 LiftOver Interval chr1:396627-396630 failed to match chain 381 because intersection length 2 < minMatchSize 4.0 (0.5 < 1.0)
INFO 2020-09-26 01:45:44 LiftOver Interval chr1:396628-396629 failed to match chain 381 because intersection length 1 < minMatchSize 2.0 (0.5 < 1.0)
INFO 2020-09-26 01:45:44 LiftOver Interval chr1:417963-417965 failed to match chain 381 because intersection length 2 < minMatchSize 3.0 (0.6666667 < 1.0)
INFO 2020-09-26 01:45:44 LiftOver Interval chr1:454973-455027 failed to match chain 381 because intersection length 54 < minMatchSize 55.0 (0.9818182 < 1.0)
INFO 2020-09-26 01:45:44 LiftOver Interval chr1:455118-455119 failed to match chain 381 because intersection length 1 < minMatchSize 2.0 (0.5 < 1.0)
INFO 2020-09-26 01:45:44 LiftOver Interval chr1:455443-455444 failed to match chain 381 because intersection length 1 < minMatchSize 2.0 (0.5 < 1.0)
INFO 2020-09-26 01:45:44 LiftOver Interval chr1:456431-456433 failed to match chain 381 because intersection length 2 < minMatchSize 3.0 (0.6666667 < 1.0)
INFO 2020-09-26 01:45:44 LiftOver Interval chr1:464465-464467 failed to match chain 381 because intersection length 1 < minMatchSize 3.0 (0.33333334 < 1.0)
-
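Vincent Appiah, these are examples of intervals that did not match your chain file. In each case the interval overlaps the chain by fewer bases than the required minimum, so it was not lifted over. These are not error messages, so you can still proceed, but it may be something you want to examine.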
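To make the numbers in those log lines easier to read, here is a small Python sketch that parses one line and recomputes the fraction in parentheses. The regular expression and the helper name `explain_liftover_line` are assumptions written for this example, based only on the log format shown above; this is not how Picard itself produces the message. The reading that minMatchSize equals the interval length times a minimum match fraction of 1.0 is inferred from the extract.

```python
import re

# Pattern inferred from the log extract above; other Picard versions may word it differently.
LOG_PATTERN = re.compile(
    r"Interval (?P<contig>\S+):(?P<start>\d+)-(?P<end>\d+) "
    r"failed to match chain \d+ because intersection length (?P<hit>\d+) "
    r"< minMatchSize (?P<min_size>[\d.]+)"
)


def explain_liftover_line(line):
    """Return the interval length and matched fraction for one log line, or None."""
    match = LOG_PATTERN.search(line)
    if match is None:
        return None
    start, end = int(match["start"]), int(match["end"])
    length = end - start + 1  # the coordinates are inclusive
    hit = int(match["hit"])
    return {
        "contig": match["contig"],
        "length": length,
        "intersection": hit,
        "fraction": hit / length,
        "min_match_size": float(match["min_size"]),
    }


line = (
    "INFO 2020-09-26 01:45:44 LiftOver Interval chr1:396626-396629 failed to match "
    "chain 381 because intersection length 3 < minMatchSize 4.0 (0.75 < 1.0)"
)
info = explain_liftover_line(line)
print(info)
```

For this line the interval is 4 bases long and only 3 of them fall within the chain, giving 3 / 4 = 0.75, which is below the required 1.0. Counting how many lines fall into this category is one way to judge whether the skipped intervals matter for the analysis.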