Converting multisample VCF to FASTA
I have posted on Biostars about this but have had no replies. How can I use the GATK toolkit to convert a multisample VCF file to a FASTA file for phylogenetic analysis?
-
Hi,
The GATK support team is focused on resolving questions about GATK tool-specific errors and abnormal results from the tools. For all other questions, such as this one, we are building a backlog to work through when we have the capacity.
Please continue to post your questions because we will be mining them for improvements to documentation, resources, and tools.
We cannot guarantee a reply; however, we ask other community members to help out if they know the answer.
For context, check out our support policy.
-
First, it is incorrect to call this kind of data transformation a conversion: the data in VCF and FASTA are not equivalent. A VCF records only the differences from a reference, so producing a FASTA means applying those variants to the reference sequence. Second, a web search can help. The same question has been asked on Biostars (https://www.biostars.org/p/360900/) and has answers. I think the best option is BCFtools (see this page for the exact commands for your case: https://samtools.github.io/bcftools/howtos/consensus-sequence.html). You can also use GATK's FastaAlternateReferenceMaker: https://gatk.broadinstitute.org/hc/en-us/articles/360042914811-FastaAlternateReferenceMaker
You can handle multiple samples manually with a wrapper script that picks one sample at a time from the list of samples, creates a single-sample VCF for it, and then runs either of the tools mentioned above on that VCF.
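A wrapper along those lines could look roughly like the sketch below. It is only an illustration, not a tested pipeline: it assumes `bcftools` is on your PATH, that the input VCF is bgzip-compressed and indexed, and the function names, the output directory, and the file names are made up for the example. It passes `-s` to `bcftools consensus` as well, since as far as I understand that is what makes it apply the sample's genotypes rather than every ALT allele in the file.

```python
import subprocess
from pathlib import Path


def list_samples(vcf):
    """Return the sample names from the VCF header via `bcftools query -l`."""
    out = subprocess.run(
        ["bcftools", "query", "-l", vcf],
        check=True, capture_output=True, text=True,
    ).stdout
    return [line for line in out.splitlines() if line]


def commands_for_sample(vcf, reference, sample, outdir):
    """Build the bcftools commands for one sample (nothing is executed here)."""
    single_vcf = str(Path(outdir) / f"{sample}.vcf.gz")
    fasta = str(Path(outdir) / f"{sample}.fa")
    return [
        # 1. Extract a single-sample, bgzip-compressed VCF.
        ["bcftools", "view", "-s", sample, "-Oz", "-o", single_vcf, vcf],
        # 2. Index it, since consensus needs an indexed VCF.
        ["bcftools", "index", single_vcf],
        # 3. Apply this sample's genotype calls to the reference.
        ["bcftools", "consensus", "-f", reference, "-s", sample,
         "-o", fasta, single_vcf],
    ]


def run_all(vcf, reference, outdir="consensus"):
    """Run the three steps for every sample in the multisample VCF."""
    Path(outdir).mkdir(exist_ok=True)
    for sample in list_samples(vcf):
        for cmd in commands_for_sample(vcf, reference, sample, outdir):
            subprocess.run(cmd, check=True)
```

Calling `run_all("cohort.vcf.gz", "reference.fa")` (hypothetical file names) should leave one FASTA per sample in the output directory. The sequence headers in those files will most likely still carry the reference contig names, so you would probably need to rename them to the sample names and concatenate the files before building a phylogeny.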