Funcotator and mutect2 coordinate system
Do Mutect2 and Funcotator use different numbering systems for multi-base deletions?
I discovered this subtle inconsistency only by accident.
Mutect2 detects a common PIK3R1 mutation at chr5 68293750, whereas
the same Mutect2 file sent to Funcotator returns the mutation at
g.chr5:68293751_68293765delAAATTACATGAATAT
i.e. what looks like a 0-based to 1-based shift.
-
Robert Bremel, could you supply the Mutect2 and Funcotator commands you used, as well as the version numbers?
-
I am using the Docker version 4.1.8.1.
VCF file:
##source=FilterMutectCalls
##source=Mutect2 -default command
Funcotator --output mydata/P58772/analysis/P58772_7_mutect2_funcotator_hg38.maf --ref-version hg38 --data-sources-path mydata/dataSourcesFolder/funcotator_dataSources.v1.6.20190124s/ --output-file-format MAF --variant mydata/P58772/analysis/P58772_7_mutect2_filtered_hg38.vcf --reference mydata/refs/Homo_sapiens_assembly38.fasta --verbosity ERROR --remove-filtered-variants false --five-prime-flank-size 5000 --three-prime-flank-size 0 --force-b37-to-hg19-reference-contig-conversion false --transcript-selection-mode CANONICAL --lookahead-cache-bp 100000 --min-num-bases-for-segment-funcotation 150 --interval-set-rule UNION --interval-padding 0 --interval-exclusion-padding 0 --interval-merging-rule ALL --read-validation-stringency SILENT --seconds-between-progress-updates 10.0 --disable-sequence-dictionary-validation false --create-output-bam-index true --create-output-bam-md5 false --create-output-variant-index true --create-output-variant-md5 false --lenient false --add-output-sam-program-record true --add-output-vcf-command-line true --cloud-prefetch-buffer 40 --cloud-index-prefetch-buffer -1 --disable-bam-index-caching false --sites-only-vcf-output false --help false --version false --showHidden false --QUIET false --use-jdk-deflater false --use-jdk-inflater false --gcs-max-retries 20 --gcs-project-for-requester-pays --disable-tool-default-read-filters false
-
I combined the funcotator.maf and mutect2.vcf files from 20 variant sets and tallied up the differences.
I have only been working on this for a couple of months, but the issue seems to arise in both the 4.1.7.1 and 4.1.8.1 Docker versions.
The issue seems to be confined to the deletion variant classes listed below.
For these, the 'POS' coordinate in the .vcf equals 'Start_Position' - 1 in the .maf.
For all other variant classes the coordinates match.
In_Frame_Del,DEL
5'Flank,DEL
Intron,DEL
RNA,DEL
3'UTR,DEL
IGR,DEL
Splice_Site,DEL
Frame_Shift_Del,DEL
-
Robert Bremel, is this issue present in older GATK versions?
-
Sorry, I really don't know about anything prior to 4.1.7.1 (Docker only). I am relatively new to this whole area, having spent my career downstream in the protein world! :-)
I only happened on it when one of my collaborators asked what had happened to a common mutation that had gone missing from a dataset. It showed up when we joined the .maf to the .vcf to cross-check a couple of things. After convincing myself that I hadn't screwed up, I set about trying to figure out what had happened.
Indels can really be a bear to 'proteinate' with only a single nucleotide coordinate, a couple of oligos and a half dozen protein variants.
For example, with deletions, the "Reference_Allele" and "Tumor_allele" oligos, although they are the same, are often difficult to assign unambiguously in the proper reading frame. So when the coordinate is different it can really make a mess of things. I guess loading the sequences into Lasergene helps to figure things out most of the time.
Actually, including some contextual upstream and downstream oligos in the Mutect2 output would really help. It would seem that Mutect2 could do that unambiguously while it is running.
-
Robert Bremel, thank you for the info. I'll look into this and keep you up to date when I have more information.
-
Robert Bremel, I have heard back from the team and confirmed that this is not a bug. The annotation you are referring to for this mutation has a different meaning from the one you describe; the Funcotator documentation goes over the annotations.
This annotation (g.chr5:68293751_68293765delAAATTACATGAATAT) is field 12, genomeChange, which shows which bases were changed, not where the variant record is positioned. Here you are seeing the bases that were deleted in this mutation, 68293751-68293765, because those are the bases that were specifically changed. If you have the VCF output from Funcotator, you will see that the position matches the Mutect2 position (chr5 68293750).
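To make the relationship concrete, here is a minimal Python sketch of how a VCF deletion record maps to the deleted-base range reported in the MAF. It assumes the usual VCF convention that a deletion's REF allele starts with one anchor base immediately before the deleted bases, so POS points at that anchor. The function name `vcf_deletion_to_deleted_range` is hypothetical, and the anchor base `N` is a placeholder because the actual REF allele was not posted in this thread.

```python
def vcf_deletion_to_deleted_range(pos, ref, alt):
    """Return (start, end, deleted_bases) for an anchored VCF deletion.

    pos is the 1-based VCF POS, which points at the anchor base(s)
    shared by REF and ALT. start/end are the 1-based, inclusive
    coordinates of the bases that are actually deleted.
    """
    if not (len(ref) > len(alt) and ref.startswith(alt)):
        raise ValueError("expected a deletion whose REF starts with ALT")
    deleted = ref[len(alt):]          # bases removed by the variant
    start = pos + len(alt)            # first deleted base (POS + 1 for a 1-base anchor)
    end = start + len(deleted) - 1    # last deleted base
    return start, end, deleted


# Placeholder anchor base "N": the real REF allele is not shown in the thread.
start, end, deleted = vcf_deletion_to_deleted_range(
    68293750, "N" + "AAATTACATGAATAT", "N"
)
print(f"g.chr5:{start}_{end}del{deleted}")
```

Under those assumptions the VCF POS (68293750) and the first deleted base (68293751) describe the same event, which would also explain why only the deletion classes show the POS = Start_Position - 1 pattern.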