Funcotator missing cDNA, Codon, and Protein Change for some chromosomes
Answered
I am using the :latest Docker image of GATK, running on Docker Desktop on a high-end Windows 10 workstation with 128 GB of RAM.
I had a strange occurrence that seems to be a glitch in the system somewhere. A week or so ago something cropped up with Funcotator that led me to install v1.7 of the data sources. I then ran a set of files through the workflow we have used for nearly a year and sent the results to a colleague to do her part of the process. She came back saying that a significant number (not all) of the 'Protein_Change' column entries were missing. Thinking that something might be amiss with v1.7, I re-funcotated with v1.6 and got the same results. So I went back to an unaligned .bam and re-ran everything, with exactly the same results. In perusing the output there seemed to be a pattern: the annotations for chr1 seemed to be totally absent.
This led me to check a bit further. I extracted the missense SNPs with an AS_FilterStatus of 'SITE' from the .maf and tallied the four columns that seemed to be problematic: Transcript_Position, cDNA_Change, Codon_Change, and Protein_Change.
Below are the tab-delimited tallies. Nrows is the number of SITE variants on that chromosome; the remaining columns give how many of those rows have the annotation populated.
Chr Nrows Transcript_Position cDNA_Change Codon_Change Protein_Change
chr1 44 0 0 0 0
chr2 14 0 9 0 0
chr3 20 0 20 0 0
chr4 5 0 5 0 0
chr5 13 0 13 0 0
chr6 12 0 12 0 0
chr7 17 0 17 0 0
chr8 7 0 7 0 0
chr9 16 0 16 0 0
chr10 10 0 10 0 0
chr11 15 0 15 0 0
chr12 12 0 12 0 0
chr13 9 0 9 0 0
chr14 5 0 5 0 0
chr15 14 9 14 10 8
chr16 15 15 15 15 15
chr17 12 12 12 12 12
chr18 2 2 2 2 2
chr19 22 22 22 22 22
chr20 4 4 4 4 4
chr21 5 5 5 5 5
chr22 13 13 13 13 13
chrX 14 14 14 14 14
Set 2:
chr1 54 0 0 0
chr2 19 9 0 0
chr3 13 13 0 0
chr4 18 18 0 0
chr5 9 9 0 0
chr6 15 15 0 0
chr7 26 26 0 0
chr8 20 20 0 0
chr9 13 13 0 0
chr10 10 10 0 0
chr11 30 30 0 0
chr12 19 19 0 0
chr13 19 19 0 0
chr14 4 4 1 0
chr15 12 12 12 12
chr16 17 17 17 17
chr17 16 16 16 16
chr18 10 10 10 10
chr19 29 29 29 29
chr20 6 6 6 6
chr21 3 3 3 3
chr22 14 14 14 14
chrX 12 12 12 12
As you can see, the annotations for chr1 are entirely missing, the results are variable up to chr15, and everything beyond that is fine.
I ran a second set of files from a different subject through the best practices workflow, Mutect2, and Funcotator, and saw the same pattern.
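The per-chromosome tally described above could be reproduced with something like the following Python sketch. It is not the original script. The column names `Chromosome`, `Variant_Classification`, `Variant_Type`, and `AS_FilterStatus`, the `Missense_Mutation` and `SNP` values, and the treatment of `__UNKNOWN__` as an empty value are all assumptions about the MAF layout and should be checked against the actual file header.

```python
import csv

# Annotation columns to tally (names assumed to match the MAF header).
ANNOTATION_COLUMNS = ["Transcript_Position", "cDNA_Change", "Codon_Change", "Protein_Change"]
# Values counted as "missing" (assumption: blank or a placeholder string).
EMPTY_VALUES = {"", "__UNKNOWN__"}


def tally_annotations(maf_lines, columns=ANNOTATION_COLUMNS):
    """Count, per chromosome, the selected rows and how many have each column populated.

    maf_lines is any iterable of text lines, e.g. an open file handle.
    """
    # Skip '#' comment lines that precede the MAF column header.
    data_lines = (line for line in maf_lines if not line.startswith("#"))
    reader = csv.DictReader(data_lines, delimiter="\t")
    counts = {}
    for row in reader:
        # Keep only missense SNPs whose filter status is 'SITE' (assumed column names).
        if row.get("Variant_Classification") != "Missense_Mutation":
            continue
        if row.get("Variant_Type") != "SNP":
            continue
        if row.get("AS_FilterStatus") != "SITE":
            continue
        per_chrom = counts.setdefault(
            row["Chromosome"], {"Nrows": 0, **{c: 0 for c in columns}}
        )
        per_chrom["Nrows"] += 1
        for c in columns:
            if (row.get(c) or "").strip() not in EMPTY_VALUES:
                per_chrom[c] += 1
    return counts
```

To use it, open the .maf with `open(path, newline="")`, pass the handle to `tally_annotations`, and print one line per chromosome from the returned dictionary. Chromosomes appear in the order they are first seen in the file.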
We use anonymal to anonymize the data (random adjective + random animal). This is not a goat sequence! :-)
## GATKCommandLine=<ID=Funcotator,CommandLine="Funcotator --output mydata/GBM_00067_NiceGoat/analysis/GBM_00067-92007_DT_NiceGoat_mutect2_funcotator_hg38_1.7.maf --ref-version hg38 --data-sources-path mydata/dataSourcesFolder/funcotator_dataSources.v1.7.20200521s/ --output-file-format MAF --variant mydata/GBM_00067_NiceGoat/analysis/GBM_00067-92007_DT_NiceGoat_mutect2_filtered_hg38.vcf --reference mydata/refs/Homo_sapiens_assembly38.fasta --verbosity ERROR --remove-filtered-variants false --five-prime-flank-size 5000 --three-prime-flank-size 0 --force-b37-to-hg19-reference-contig-conversion false --transcript-selection-mode CANONICAL --lookahead-cache-bp 100000 --min-num-bases-for-segment-funcotation 150 --interval-set-rule UNION --interval-padding 0 --interval-exclusion-padding 0 --interval-merging-rule ALL --read-validation-stringency SILENT --seconds-between-progress-updates 10.0 --disable-sequence-dictionary-validation false --create-output-bam-index true --create-output-bam-md5 false --create-output-variant-index true --create-output-variant-md5 false --lenient false --add-output-sam-program-record true --add-output-vcf-command-line true --cloud-prefetch-buffer 40 --cloud-index-prefetch-buffer -1 --disable-bam-index-caching false --sites-only-vcf-output false --help false --version false --showHidden false --QUIET false --use-jdk-deflater false --use-jdk-inflater false --gcs-max-retries 20 --gcs-project-for-requester-pays --disable-tool-default-read-filters false",Version="4.2.0.0",Date="June 2, 2021 7:17:41 PM GMT
-
Hi Robert Bremel,
Thanks for writing in! Could you provide sample VCF data with 3-4 variants that should have these annotations but don't, as well as 3-4 variants that are annotated correctly?
Please follow these instructions to upload data:
https://gatk.broadinstitute.org/hc/en-us/articles/360035889671
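One possible way to cut such a sample out of the filtered VCF is sketched below in Python. It is only an illustration, assuming an uncompressed, tab-delimited VCF; the function name `subset_vcf` and the choice of contigs are hypothetical. It keeps every header line and the first few records on each requested contig, so the selected records still need to be checked by hand to confirm they are the variants of interest.

```python
def subset_vcf(vcf_lines, wanted_per_contig):
    """Return all header lines plus up to N records for each requested contig.

    vcf_lines: iterable of text lines from an uncompressed VCF.
    wanted_per_contig: dict mapping contig name to the number of records to keep.
    """
    kept = []
    remaining = dict(wanted_per_contig)  # copy so the caller's dict is not modified
    for line in vcf_lines:
        if line.startswith("#"):
            # Header lines (## metadata and the #CHROM line) are always kept.
            kept.append(line)
            continue
        contig = line.split("\t", 1)[0]
        if remaining.get(contig, 0) > 0:
            kept.append(line)
            remaining[contig] -= 1
    return kept
```

For example, `subset_vcf(handle, {"chr1": 4, "chr16": 4})` would take four records from a contig where the annotations are missing and four from one where they look correct, going by the tallies in the question. The returned lines can be written to a new .vcf file with `writelines`.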
Best,
Genevieve