I am running a Somatic variant discovery protocol on plant samples.
Currently at the GetPileupSummaries stage.
This stage requires a common germline variant sites VCF file.
However, no such data exists for the genome I am studying.
1) Is there a way to run GetPileupSummaries without this file?
* It seems that the output of GetPileupSummaries is required by the next stage, CalculateContamination, and the results from the latter are used by FilterMutectCalls. This makes GetPileupSummaries essential in this pipeline.
2) What is the best way to generate such a VCF?
Would it make sense to use the VCF resulting from Mutect2 (although it is not germline-based)?
If so, note that I could not use the VCF produced by Mutect2 as input to GetPileupSummaries, because GetPileupSummaries exits with an ERROR.
a) GATK version used: gatk4-v188.8.131.52
b) Commands used:
c) Entire program log:
1. Running Mutect2 to find somatic mutations in a plant.
Used "Tumor-only mode":
gatk Mutect2 -R Ref_genome.fa -I PlantSample1.bam -O PlantSample1.vcf.gz
[This command also produced these 2 files: PlantSample1.vcf.gz.stats and PlantSample1.vcf.gz.tbi].
2. Running GetPileupSummaries
gatk GetPileupSummaries -I PlantSample1.bam -V PlantSample1.vcf.gz -L PlantSample1.vcf.gz -O PlantSample1_pileups.table
It exits with this error: "A USER ERROR has occurred: Bad input: Population vcf does not have an allele frequency (AF) info field in its header."
It looks like GetPileupSummaries requires AF input in this format: "AF=0.063" inside the info column of the vcf file.
[based on https://gatk.broadinstitute.org/hc/en-us/articles/360037593451-GetPileupSummaries]
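Since the error message refers to the header, one way to see what a given VCF declares is to scan its meta-information lines for an AF definition. This is only a rough sketch: vcf_af_declarations is a helper name I made up, and it assumes the header uses the standard "##INFO=<ID=AF," and "##FORMAT=<ID=AF," line prefixes and that gzip is available for .vcf.gz files.

```shell
# Hypothetical helper: report which header sections of a VCF declare an AF field.
# Prints "INFO" and/or "FORMAT", one per line; prints nothing if AF is not declared.
vcf_af_declarations() {
  vcf="$1"
  # Read either a gzip-compressed or a plain-text VCF, chosen by file extension.
  case "$vcf" in
    *.gz) gzip -cd "$vcf" ;;
    *)    cat "$vcf" ;;
  esac |
    awk '
      /^##INFO=<ID=AF,/   { print "INFO" }
      /^##FORMAT=<ID=AF,/ { print "FORMAT" }
      !/^##/              { exit }   # stop at the first line that is not a "##" line
    '
}

# Usage (file name taken from the commands above):
#   vcf_af_declarations PlantSample1.vcf.gz
```

If this prints only FORMAT for a file, then AF is declared as a per-sample field and not as an INFO field, which would be consistent with the error above.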
However, on inspecting the VCF file generated by Mutect2, I see that it contains AF in a different format:
GT:AD:AF:DP:F1R2:F2R1:SB in the "FORMAT" column
0/1:0,1:0.667:1:0,0:0,1:0,0,1,0 in the last column
Thus, in this vcf file generated by Mutect2 AF is 0.667.
However there is no explicit "AF=" in the file.
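To show how I read 0.667 out of that record: the keys in the FORMAT column and the values in the sample column are both colon-separated and matched by position. A small sketch of that lookup follows (format_value is a hypothetical helper name, not part of GATK):

```shell
# Hypothetical helper: look up one key (e.g. AF) in a VCF sample column,
# using the FORMAT column to find the position of that key.
format_value() {
  key="$1"; format="$2"; sample="$3"
  awk -v key="$key" -v fmt="$format" -v smp="$sample" 'BEGIN {
    n = split(fmt, keys, ":")    # FORMAT keys, e.g. GT, AD, AF, ...
    split(smp, vals, ":")        # sample values in the same order
    for (i = 1; i <= n; i++)
      if (keys[i] == key) { print vals[i]; exit }
  }'
}

# The two columns quoted above:
format_value AF "GT:AD:AF:DP:F1R2:F2R1:SB" "0/1:0,1:0.667:1:0,0:0,1:0,0,1,0"
# prints 0.667
```

So the AF here is a per-sample value in the genotype columns, not a key=value entry in the INFO column.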
I assume this is why GetPileupSummaries exits with the error "Population vcf does not have an allele frequency (AF) info field in its header."
Thank you for your help in advance,
Data from gatk protocols: