Splitting a multi-sample VCF into single-sample VCFs before VariantFiltration
Do you recommend splitting a multi-sample VCF into single-sample VCFs before the VariantFiltration step, or is it better to split after filtering? I have a cohort of 100 patients from a small isolated community and I'm interested in the number of high-quality variants per sample.
gatk SelectVariants \
-R ${GENOME_FASTA} \
-V $WDIR/VCF/jointcalling_recal.vcf \
-O $WDIR/VCF/sample_jointcalling_recal.vcf \
-sn ${SAMPLE_ID} \
--exclude-non-variants \
--remove-unused-alternates
Hi Are M
Both approaches are valid for your purpose, but in general our team suggests filtering first and splitting afterwards. Filtering on the multi-sample VCF provides better evidence for rare variants, so it is the preferable approach.
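As a rough sketch of the filter-first, split-later order, the per-sample split could look something like the function below. It reuses the SelectVariants options from your command and adds `--exclude-filtered`, so that only sites that passed filtering in the cohort VCF are kept. The function name `split_passing_sample`, the `GATK` override variable, and the argument order are illustrative assumptions, not part of GATK. Note that `--exclude-filtered` acts on the site-level FILTER field only.

```shell
# Hypothetical helper: extract one sample from an already-filtered cohort VCF,
# keeping only sites whose FILTER field is PASS or unset.
# Arguments: <reference.fasta> <filtered_cohort.vcf> <sample_id> <output.vcf>
# GATK can be overridden (e.g. GATK=echo) to print the command instead of running it.
split_passing_sample() {
    ref="$1"
    cohort_vcf="$2"
    sample_id="$3"
    out_vcf="$4"

    "${GATK:-gatk}" SelectVariants \
        -R "$ref" \
        -V "$cohort_vcf" \
        -O "$out_vcf" \
        -sn "$sample_id" \
        --exclude-non-variants \
        --remove-unused-alternates \
        --exclude-filtered
}
```

With your variables this would be called as `split_passing_sample "${GENOME_FASTA}" "$WDIR/VCF/jointcalling_recal.vcf" "${SAMPLE_ID}" "$WDIR/VCF/sample_jointcalling_recal.vcf"`, assuming the input VCF already carries the filter annotations. The number of records left in each output file is then the per-sample count of passing variants.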
I hope this helps.