I'm trying to follow "(How to) Call somatic mutations using GATK4 Mutect2" (https://gatk.broadinstitute.org/hc/en-us/articles/360035531132), and I'm using the latest GATK 184.108.40.206. So far the tutorial has been easy to follow(!), but now I'm having a bit of a problem with the GetPileupSummaries step, specifically with which file to use for the -V and -L parameters.
When I read the GetPileupSummaries instructions (https://gatk.broadinstitute.org/hc/en-us/articles/360042913771-GetPileupSummaries), they say: "The tool requires a common germline variant sites VCF, e.g. derived from the gnomAD resource, with population allele frequencies (AF) in the INFO field. This resource must contain only biallelic SNPs and can be an eight-column sites-only VCF." I'm using the recommended af-only-gnomad.hg38.vcf.gz with the --germline-resource parameter in the Mutect2 step. My question is: does GATK provide instructions on how to derive the common germline variant sites VCF from the recommended gnomAD resource, or is there already a derived sites VCF file that GATK provides and recommends using? I tried to find either of those, instructions or a recommended resource file, with no luck, so any help getting past this step is appreciated!
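To make the requirement concrete, here is a minimal Python sketch of the kind of filtering I imagine "deriving" such a file would involve. It keeps only records with a single-base REF, a single single-base ALT, and an AF value in the INFO field at or above a cutoff. This is not GATK's documented procedure: the function names are hypothetical, and the 0.05 cutoff for "common" is my own assumption, not something stated in the tool documentation.

```python
import gzip

BASES = {"A", "C", "G", "T"}


def is_common_biallelic_snp(vcf_line, min_af=0.05):
    """Return True if a VCF data line looks like a common biallelic SNP.

    min_af=0.05 is an assumed cutoff for "common", not a GATK-documented value.
    """
    fields = vcf_line.rstrip("\n").split("\t")
    if len(fields) < 8:  # a sites-only VCF has eight columns
        return False
    ref, alt, info = fields[3], fields[4], fields[7]
    # One single-base REF and one single-base ALT. A comma in ALT means the
    # record is multiallelic. Multiallelic sites that were split across
    # several lines are NOT detected by this simple check.
    if ref not in BASES or alt not in BASES:
        return False
    for entry in info.split(";"):
        if entry.startswith("AF="):
            try:
                return float(entry[3:]) >= min_af
            except ValueError:
                return False
    return False  # no AF annotation at all


def filter_vcf(in_path, out_path, min_af=0.05):
    """Copy header lines and qualifying records to an uncompressed VCF."""
    opener = gzip.open if in_path.endswith(".gz") else open
    with opener(in_path, "rt") as src, open(out_path, "w") as dst:
        for line in src:
            if line.startswith("#") or is_common_biallelic_snp(line, min_af):
                dst.write(line)
```

For example, `filter_vcf("af-only-gnomad.hg38.vcf.gz", "common_biallelic.vcf")` would write the filtered records to a plain-text VCF (the output file name is just a placeholder). The result would presumably still need to be compressed and indexed before GATK accepts it, and I don't know whether this matches how an official resource, if one exists, was built.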