Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

SelectVariants

1 comment

  • Gianfilippo Coppola

    Hi,

    I have a VCF produced by running Mutect2 (joint calling) on multiple tumor-only samples (using Sarek).

    The specific file has 2 samples and 1842870 calls. If I count all the PASS calls,

    cat test.vcf | grep PASS | wc -l

    I get 39389

    Now, if I run

    gatk SelectVariants -R $genFILE -V $dataFILE.vcf -O test.vcf

    gatk SelectVariants -R $genFILE -V $dataFILE.vcf -O test.vcf --exclude-filtered

    gatk SelectVariants -R $genFILE -V $dataFILE.vcf -O test.vcf --exclude-non-variants

    gatk SelectVariants -R $genFILE -V $dataFILE.vcf -O test.vcf --exclude-filtered --exclude-non-variants

    and count all the PASS calls, I get, respectively:

    3533, 3407, 1116, 2975

    I must be doing something stupid. Can someone comment?

    NOTE: I am running GATK v4.6.0.0

    Thanks
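
A note on the counting step used in the post: `grep PASS` matches every line that contains the string PASS anywhere, which may include header lines (for example a `##FILTER=<ID=PASS,...>` definition) and records where PASS appears only inside another field. The sketch below is one possible stricter count, assuming an uncompressed, tab-delimited VCF; the helper name `count_pass` is hypothetical and not part of GATK. It counts only data records whose FILTER column (the 7th field) is exactly PASS. It is offered as a way to make the counts comparable, not as an explanation of the numbers reported above.

```shell
# Hypothetical helper: count VCF records whose FILTER column is exactly PASS.
# Assumes an uncompressed, tab-delimited VCF passed as the first argument.
count_pass() {
    # Skip header lines (starting with '#'), then test field 7 (FILTER).
    # "n + 0" makes awk print 0 instead of an empty line when nothing matches.
    awk -F'\t' '!/^#/ && $7 == "PASS" { n++ } END { print n + 0 }' "$1"
}
```

Usage would look like `count_pass test.vcf`. Since all four SelectVariants commands in the post write to the same `test.vcf`, the count has to be taken after each run, or each run given its own output file, for the four numbers to refer to four different outputs.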
