Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

GATK SelectVariants identified no SNPs in the VCF file, although it contains 63702926 SNPs

2 comments

  • Chris Kachulis

    Hi Vishal,

    Based on the file extensions, it looks like you are running on a GVCF. Currently, when SelectVariants is run on a GVCF to select SNPs, the output is always empty, because the <NON_REF> alleles confuse SelectVariants (see https://github.com/broadinstitute/gatk/issues/7111).

    There is a PR to fix this here, but it hasn't been merged yet: https://github.com/broadinstitute/gatk/pull/7193 

    If you need to do this with SelectVariants, you could either convert your GVCF to a VCF first or try running on the branch for that PR (mg_gvcf_aware_varianttypesvariantfilter). Given that it has not yet been merged, though, it is definitely "buyer beware". You could also try bcftools or vcftools; one of them might be able to do this subsetting for you currently.

  • Chris Kachulis

    Vishal Negi, the PR I referenced above was merged recently, so this should be fixed as of 4.4.0.0.

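To make the <NON_REF> issue from the first comment concrete, here is a toy Python sketch. It is not GATK's code and does not reproduce its type-classification logic; `classify` and `select_snps` are hypothetical helper names, and the rule "any symbolic ALT allele makes the record a non-SNP" is an assumption made purely for illustration. It shows why a GVCF record such as `A -> G,<NON_REF>` can fail a naive SNP test, and how ignoring the symbolic allele lets it through.

```python
# Illustrative only: NOT GATK's implementation. The helper names and the
# classification rule below are assumptions made for this sketch.

NON_REF = "<NON_REF>"
BASES = set("ACGT")


def classify(ref, alts, ignore_non_ref):
    """Classify one record as 'SNP', 'NO_VARIATION', or 'OTHER'."""
    if ignore_non_ref:
        # Drop the symbolic allele that GVCFs attach to every record.
        alts = [a for a in alts if a != NON_REF]
    # A lone "." in the ALT column means no alternate allele.
    alts = [a for a in alts if a != "."]
    if not alts:
        return "NO_VARIATION"
    # SNP here means: single-base REF and every ALT is a single base.
    if ref in BASES and all(a in BASES for a in alts):
        return "SNP"
    return "OTHER"


def select_snps(lines, ignore_non_ref=True):
    """Return header lines plus the records classified as SNPs."""
    kept = []
    for line in lines:
        if line.startswith("#"):
            kept.append(line)
            continue
        fields = line.rstrip("\n").split("\t")
        ref, alts = fields[3], fields[4].split(",")
        if classify(ref, alts, ignore_non_ref) == "SNP":
            kept.append(line)
    return kept


# Three made-up GVCF-style records: a SNP, a reference block, and a deletion.
example = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tG,<NON_REF>\t50\t.\t.",
    "chr1\t101\t.\tC\t<NON_REF>\t.\t.\tEND=150",
    "chr1\t200\t.\tAT\tA,<NON_REF>\t60\t.\t.",
]

naive = select_snps(example, ignore_non_ref=False)
aware = select_snps(example, ignore_non_ref=True)
```

With the naive rule, `naive` contains only the header line, which mirrors the empty output described above; with the symbolic allele ignored, `aware` also keeps the record at position 100. For real data, prefer GATK 4.4.0.0 or later, or one of the workarounds mentioned in the first comment.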
