How to find or generate common germline variant sites VCF required by GetPileupSummaries
Hi,
I'm trying to follow "(How to) Call somatic mutations using GATK4 Mutect2" (https://gatk.broadinstitute.org/hc/en-us/articles/360035531132), and I'm using the latest GATK, 4.1.7.0. So far the tutorial has been easy to follow(!), but now I'm having a bit of a problem with the GetPileupSummaries step, specifically with which file to use for the -V and -L parameters.
The GetPileupSummaries instructions (https://gatk.broadinstitute.org/hc/en-us/articles/360042913771-GetPileupSummaries) say: "The tool requires a common germline variant sites VCF, e.g. derived from the gnomAD resource, with population allele frequencies (AF) in the INFO field. This resource must contain only biallelic SNPs and can be an eight-column sites-only VCF." I'm using the recommended af-only-gnomad.hg38.vcf.gz with the --germline-resource parameter in the Mutect2 step. My question is: does GATK provide instructions on how to derive the common germline variant sites VCF from the recommended gnomAD resource, or is there already a derived sites VCF file that GATK provides and recommends using? I tried to find either the instructions or a recommended resource file, with no luck, so any help getting past this step is appreciated!
-Antti
-
You can find it in our best practices google bucket. Depending on your reference, gs://gatk-best-practices/somatic-b37/small_exac_common_3.vcf or gs://gatk-best-practices/somatic-hg38/small_exac_common_3.hg38.vcf.gz (the accompanying VCF indices are also there).
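For readers who would rather derive such a file themselves, the requirements quoted in the question (biallelic SNPs only, population AF in the INFO field) can be applied as a simple filter over a sites-only VCF. The following is only a minimal sketch in plain Python, not an official GATK procedure: the function names and the 0.05 allele-frequency cutoff are assumptions chosen for illustration, and the input is assumed to be a gzipped VCF with a single AF value per record.

```python
import gzip

BASES = {"A", "C", "G", "T"}


def parse_af(info):
    """Return the AF value from a VCF INFO string, or None if absent or not a single number."""
    for entry in info.split(";"):
        key, sep, value = entry.partition("=")
        if key == "AF" and sep:
            try:
                return float(value)
            except ValueError:
                # e.g. "AF=0.2,0.3" on a multiallelic record, or "AF=."
                return None
    return None


def filter_common_biallelic_snps(lines, min_af=0.05):
    """Yield header lines plus records that are single-base REF/ALT SNPs with AF >= min_af.

    min_af=0.05 is an illustrative assumption, not a GATK recommendation.
    """
    for line in lines:
        if line.startswith("#"):
            yield line  # keep the header unchanged
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8:
            continue  # not a complete eight-column record
        ref, alt, info = fields[3].upper(), fields[4].upper(), fields[7]
        # A comma-separated ALT (multiallelic) or an indel fails this check.
        if ref not in BASES or alt not in BASES:
            continue
        af = parse_af(info)
        if af is not None and af >= min_af:
            yield line


def filter_file(in_path, out_path, min_af=0.05):
    """Hypothetical helper: read a gzipped VCF and write the filtered records uncompressed."""
    with gzip.open(in_path, "rt") as src, open(out_path, "w") as dst:
        dst.writelines(filter_common_biallelic_snps(src, min_af))
```

Something like `filter_file("af-only-gnomad.hg38.vcf.gz", "common_biallelic.hg38.vcf")` would then produce a candidate sites file, which would still need to be indexed before it can be passed to -V and -L. Note that the sketch does not detect multiallelic sites that have been split across several rows at the same position, so the result should be checked before use.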
-
David, thank you very much for confirming this!
-
Dear GATK Team,
Related to the above query, I can see that there are common germline variant sites VCFs for both b37 and hg38 in the GATK Best Practices Google bucket, but none for hg19.
I have lifted over the --germline-resource file from b37 (af-only-gnomad.raw.sites.b37.vcf.gz) to hg19 using LiftoverVcf. For the common germline variant sites VCF, would you recommend likewise lifting over the existing b37 VCF (small_exac_common_3.vcf) to hg19 using LiftoverVcf?
Is there a method to generate the common germline variant sites VCF directly from the hg19 --germline-resource?
Thank you for your time and help.
-
Hi ISmolicz,
The GATK support team is focused on resolving questions about GATK tool-specific errors and abnormal results from the tools. For all other questions, such as this one, we are building a backlog to work through when we have the capacity.
Please continue to post your questions because we will be mining them for improvements to documentation, resources, and tools.
We cannot guarantee a reply; however, we ask other community members to help out if they know the answer.
For context, check out our support policy.