Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

I am unable to use VQSR (recalibration) to filter variants

1 comment

  • Mike Keehan

    I'd like to suggest a tool that makes it easier to visualize the INFO and FORMAT fields that are computed in an unfiltered VCF. As far as the user can see, all that VQSR does is "magically" show the impact of thresholds on precision-recall curves. VQSR also uses its own terminology, which adds to the mystique when trying to explain it to other (i.e. normal) people. So the tool(s) should do the following, in order of importance.

    1) Rip out all the VCF INFO and FORMAT statistics into everyone's favourite format (or two), perhaps CSV. This should be a format commonly used by R/SAS/Excel/IGV as well, i.e. an easy and ergonomic export of the VCF annotations.

    This might be all people need so they can do their own plots.
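    As a rough sketch of what step 1 could look like, here is a small standard-library Python example that flattens the INFO column into CSV. The function names and the example records are made up for illustration. It ignores the per-sample FORMAT fields, which would need one row per sample, and it is not how any GATK tool actually does this.

```python
import csv
import io


def vcf_info_to_rows(vcf_lines):
    """Yield one dict per variant: CHROM, POS, REF, ALT plus every INFO key."""
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip meta-information, header, and blank lines
        chrom, pos, _id, ref, alt, _qual, _filter, info = line.split("\t")[:8]
        row = {"CHROM": chrom, "POS": pos, "REF": ref, "ALT": alt}
        for entry in info.split(";"):
            if entry in ("", "."):
                continue  # "." means the record has no INFO annotations
            key, sep, value = entry.partition("=")
            row[key] = value if sep else "1"  # flag-type keys carry no value
        yield row


def write_csv(rows, out):
    """Write rows to `out`, using the union of all INFO keys as columns."""
    rows = list(rows)
    fixed = ["CHROM", "POS", "REF", "ALT"]
    extra = sorted({key for row in rows for key in row} - set(fixed))
    writer = csv.DictWriter(out, fieldnames=fixed + extra, restval="")
    writer.writeheader()
    writer.writerows(rows)


# Made-up records for illustration only.
EXAMPLE = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "chr1\t100\t.\tA\tG\t50\t.\tQD=12.5;FS=0.0;MQ=60.0;DB\n",
    "chr1\t200\t.\tC\tT\t30\t.\tQD=2.1;FS=45.3;MQ=38.2\n",
]

if __name__ == "__main__":
    buffer = io.StringIO()
    write_csv(vcf_info_to_rows(EXAMPLE), buffer)
    print(buffer.getvalue())
```

    Annotations missing from a record are left as empty cells, so the file loads directly into R or Excel.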

     

    2) Make it easy to draw histograms and bivariate density plots of the data exported in step 1. This is purposefully close to how VQSR operates, but the visualization should be described in standard statistical terminology. People can then start to understand how their own data is distributed and where they would set their hard filters. Reuse an existing tool like rggobi or ggplot2? Or build your own with a domain-appropriate user interface.
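    To illustrate the histogram half of step 2 without any plotting library, here is a minimal text-histogram sketch in plain Python. The QD values are invented and the helper names are hypothetical. A real tool would more likely hand the exported CSV to a plotting package.

```python
def histogram(values, n_bins=4):
    """Return (bin_edges, counts) for equal-width bins spanning min..max."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 if all values are equal
    counts = [0] * n_bins
    for value in values:
        # The maximum value would land one past the end, so clamp it
        # into the last bin.
        index = min(int((value - lo) / width), n_bins - 1)
        counts[index] += 1
    edges = [lo + i * width for i in range(n_bins + 1)]
    return edges, counts


def render(edges, counts):
    """Draw one line per bin, with one '#' per observation."""
    return "\n".join(
        f"{left:6.1f} to {right:6.1f} | {'#' * count}"
        for left, right, count in zip(edges, edges[1:], counts)
    )


# Invented QD values: a low cluster and a high cluster.
QD_VALUES = [0.0, 1.5, 2.5, 3.1, 11.8, 12.5, 13.0, 14.2, 16.4, 20.0]

if __name__ == "__main__":
    edges, counts = histogram(QD_VALUES)
    print(render(edges, counts))
```

    With data shaped like this, the empty bin between the two clusters is the obvious place to start looking for a hard-filter cutoff.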

    3) For optional extra credit, add a slider-bar tool to plot precision/recall curves given a known-good dataset. Again, this is very close to VQSR in thinking, but "tilted" towards standard terminology to make it as approachable as any other data analysis.
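    The calculation behind the slider in step 3 can be sketched as follows, assuming each call has already been labelled true or false against the known-good dataset and that calls scoring at or above the threshold are kept. The scores, labels, and function names are invented for illustration, and this is ordinary precision/recall arithmetic, not VQSR's own model.

```python
def precision_recall(records, threshold):
    """Compute (precision, recall) when keeping calls with score >= threshold.

    `records` is a list of (score, is_true) pairs, where is_true says whether
    the call is present in the known-good dataset.
    """
    tp = sum(1 for score, is_true in records if score >= threshold and is_true)
    fp = sum(1 for score, is_true in records if score >= threshold and not is_true)
    fn = sum(1 for score, is_true in records if score < threshold and is_true)
    # Convention chosen here: precision is 1.0 when nothing passes the filter.
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def curve(records, thresholds):
    """One (threshold, precision, recall) tuple per slider position."""
    return [(t, *precision_recall(records, t)) for t in thresholds]


# Invented (score, is_true) pairs for illustration only.
CALLS = [
    (1.5, False), (2.5, False), (3.1, True), (11.8, True),
    (12.5, True), (13.0, False), (14.2, True), (16.4, True),
]

if __name__ == "__main__":
    for threshold, precision, recall in curve(CALLS, [2.0, 5.0, 14.0]):
        print(f"threshold={threshold:5.1f} precision={precision:.2f} recall={recall:.2f}")
```

    Moving the threshold up trades recall for precision, which is exactly the trade-off the slider would let people see on their own data.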

     
