Funcotator misses many Cosmic fields - update possible?
(REQUIRED) Please provide:
a) GATK version used: 4.1.7.0
b) Exact command used: gatk Funcotator --variant $file --reference $ref --ref-version hg19 --data-sources-path $db --output $out --output-file-format VCF --transcript-list $transcript
I use funcotator_dataSources.v1.7.20200521s, which I downloaded from ftp://gsapubftp-anonymous@ftp.broadinstitute.org/bundle/funcotator/
I have zipped some of the data sources, but I did not alter cosmic.db.
Everything seems to work fine and I do not get any warnings or errors, but my output VCF does not contain Cosmic fields such as the Cosmic ID and the FATHMM prediction, both of which are in the cosmic.db file. I do see the field Cosmic_overlapping_mutations, so the Cosmic database is being used. Is it supposed to be like this?
I searched GitHub and found: https://github.com/broadinstitute/gatk/blob/a14917d7e64f0109fe7b053c256c81ae102f144b/src/test/resources/org/broadinstitute/hellbender/tools/funcotator/validationTestData/regressionTestHg19Large_expected.vcf
in which no other Cosmic fields are added to the VCF either, but that still doesn't seem logical.
Can this be changed in future updates?
Below are the informative header rows from the VCF file:
##Funcotator Version=4.1.7.0 | Gencode 19 CANONICAL | CGC full_2012_03-15 | ClinVar_VCF 20180401 | Cosmic v84 | HGNC Nov302017 | dbSNP 9606_b151
##GATKCommandLine=<ID=Funcotator,CommandLine="Funcotator --output ../funcotated.vcf --ref-version hg19 --data-sources-path funcotator_dataSources.v1.7.20200521s --output-file-format VCF --transcript-list transcriptlist.txt --variant annot.vcf --reference hg19.fa --remove-filtered-variants false --five-prime-flank-size 5000 --three-prime-flank-size 0 --force-b37-to-hg19-reference-contig-conversion false --transcript-selection-mode CANONICAL --lookahead-cache-bp 100000 --interval-set-rule UNION --interval-padding 0 --interval-exclusion-padding 0 --interval-merging-rule ALL --read-validation-stringency SILENT --seconds-between-progress-updates 10.0 --disable-sequence-dictionary-validation false --create-output-bam-index true --create-output-bam-md5 false --create-output-variant-index true --create-output-variant-md5 false --lenient false --add-output-sam-program-record true --add-output-vcf-command-line true --cloud-prefetch-buffer 40 --cloud-index-prefetch-buffer -1 --disable-bam-index-caching false --sites-only-vcf-output false --help false --version false --showHidden false --verbosity INFO --QUIET false --use-jdk-deflater false --use-jdk-inflater false --gcs-max-retries 20 --gcs-project-for-requester-pays -
##INFO=<ID=FUNCOTATION,Number=A,Type=String,Description="Functional annotation from the Funcotator tool. Funcotation fields are: Gencode_19_hugoSymbol|Gencode_19_ncbiBuild|Gencode_19_chromosome|Gencode_19_start|Gencode_19_end|Gencode_19_variantClassification|Gencode_19_secondaryVariantClassification|Gencode_19_variantType|Gencode_19_refAllele|Gencode_19_tumorSeqAllele1|Gencode_19_tumorSeqAllele2|Gencode_19_genomeChange|Gencode_19_annotationTranscript|Gencode_19_transcriptStrand|Gencode_19_transcriptExon|Gencode_19_transcriptPos|Gencode_19_cDnaChange|Gencode_19_codonChange|Gencode_19_proteinChange|Gencode_19_gcContent|Gencode_19_referenceContext|Gencode_19_otherTranscripts|CGC_Name|CGC_GeneID|CGC_Chr|CGC_Chr_Band|CGC_Cancer_Somatic_Mut|CGC_Cancer_Germline_Mut|CGC_Tumour_Types__(Somatic_Mutations)|CGC_Tumour_Types_(Germline_Mutations)|CGC_Cancer_Syndrome|CGC_Tissue_Type|CGC_Cancer_Molecular_Genetics|CGC_Mutation_Type|CGC_Translocation_Partner|CGC_Other_Germline_Mut|CGC_Other_Syndrome/Disease|ClinVar_VCF_AF_ESP|ClinVar_VCF_AF_EXAC|ClinVar_VCF_AF_TGP|ClinVar_VCF_ALLELEID|ClinVar_VCF_CLNDISDB|ClinVar_VCF_CLNDISDBINCL|ClinVar_VCF_CLNDN|ClinVar_VCF_CLNDNINCL|ClinVar_VCF_CLNHGVS|ClinVar_VCF_CLNREVSTAT|ClinVar_VCF_CLNSIG|ClinVar_VCF_CLNSIGCONF|ClinVar_VCF_CLNSIGINCL|ClinVar_VCF_CLNVC|ClinVar_VCF_CLNVCSO|ClinVar_VCF_CLNVI|ClinVar_VCF_DBVARID|ClinVar_VCF_GENEINFO|ClinVar_VCF_MC|ClinVar_VCF_ORIGIN|ClinVar_VCF_RS|ClinVar_VCF_SSR|ClinVar_VCF_ID|ClinVar_VCF_FILTER|Cosmic_overlapping_mutations|HGNC_HGNC_ID|HGNC_Approved_Name|HGNC_Status|HGNC_Locus_Type|HGNC_Locus_Group|HGNC_Previous_Symbols|HGNC_Previous_Name|HGNC_Synonyms|HGNC_Name_Synonyms|HGNC_Chromosome|HGNC_Date_Modified|HGNC_Date_Symbol_Changed|HGNC_Date_Name_Changed|HGNC_Accession_Numbers|HGNC_Enzyme_IDs|HGNC_Entrez_Gene_ID|HGNC_Ensembl_Gene_ID|HGNC_Pubmed_IDs|HGNC_RefSeq_IDs|HGNC_Gene_Family_ID|HGNC_Gene_Family_Name|HGNC_CCDS_IDs|HGNC_Vega_ID|HGNC_Entrez_Gene_ID(supplied_by_NCBI)|HGNC_OMIM_ID(supplied_by_OMIM)|HGNC_RefSeq(supplied_by_NCBI)|HGNC_UniProt_ID(supplied_by_UniProt)|HGNC_Ensembl_ID(supplied_by_Ensembl)|HGNC_UCSC_ID(supplied_by_UCSC)|dbSNP_ASP|dbSNP_ASS|dbSNP_CAF|dbSNP_CDA|dbSNP_CFL|dbSNP_COMMON|dbSNP_DSS|dbSNP_G5|dbSNP_G5A|dbSNP_GENEINFO|dbSNP_GNO|dbSNP_HD|dbSNP_INT|dbSNP_KGPhase1|dbSNP_KGPhase3|dbSNP_LSD|dbSNP_MTP|dbSNP_MUT|dbSNP_NOC|dbSNP_NOV|dbSNP_NSF|dbSNP_NSM|dbSNP_NSN|dbSNP_OM|dbSNP_OTH|dbSNP_PM|dbSNP_PMC|dbSNP_R3|dbSNP_R5|dbSNP_REF|dbSNP_RS|dbSNP_RSPOS|dbSNP_RV|dbSNP_S3D|dbSNP_SAO|dbSNP_SLO|dbSNP_SSR|dbSNP_SYN|dbSNP_TOPMED|dbSNP_TPA|dbSNP_U3|dbSNP_U5|dbSNP_VC|dbSNP_VLD|dbSNP_VP|dbSNP_WGT|dbSNP_WTD|dbSNP_dbSNPBuildID|dbSNP_ID|dbSNP_FILTER">
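For anyone who wants to check which fields a given data source contributes, the FUNCOTATION header line can be split on the pipe character and filtered by prefix. The following is only a rough Python sketch, not part of GATK: the function names funcotation_fields and fields_by_prefix are made up for illustration, and the example header is a shortened stand-in for the full line above. It assumes the header is a single line and that the field list follows the text "Funcotation fields are: ".

```python
import re


def funcotation_fields(header_line):
    """Return the field names listed in a FUNCOTATION ##INFO header line.

    Hypothetical helper: assumes the description contains
    'Funcotation fields are: ' followed by pipe-separated names.
    """
    match = re.search(r'Funcotation fields are: ([^"]*)"', header_line)
    if match is None:
        raise ValueError("not a FUNCOTATION header line")
    return [name.strip() for name in match.group(1).split("|")]


def fields_by_prefix(fields, prefix):
    """Keep only the field names that start with the given prefix."""
    return [name for name in fields if name.startswith(prefix)]


# Shortened stand-in for the real header line (only a few fields kept).
example_header = (
    '##INFO=<ID=FUNCOTATION,Number=A,Type=String,Description="Functional '
    "annotation from the Funcotator tool. Funcotation fields are: "
    "Gencode_19_hugoSymbol|ClinVar_VCF_CLNSIG|Cosmic_overlapping_mutations|"
    'HGNC_HGNC_ID|dbSNP_RS">'
)

fields = funcotation_fields(example_header)
cosmic_fields = fields_by_prefix(fields, "Cosmic_")
print(cosmic_fields)  # ['Cosmic_overlapping_mutations']
```

Run against the full header above, the same filter would show Cosmic_overlapping_mutations as the only field with the Cosmic_ prefix, which matches what I see in my output.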
-
Hi Tjitske de Vries, the GATK support team is focused on resolving questions about GATK tool-specific errors and abnormal results from the tools. For all other questions, such as this one, we are building a backlog to work through when we have the capacity.
Please continue to post your questions because we will be mining them for improvements to documentation, resources, and tools.
We cannot guarantee a reply; however, we ask other community members to help out if they know the answer.
For context, check out our support policy.