Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

SelectVariants discordance gives no discordance

4 comments

  • Gökalp Çelik

    Hi Jordan Brungardt

    Have you removed HOMREF (homozygous-reference) calls from both sets before running SelectVariants with the --discordance option?

    If both sets contain the same sites, it may not be possible to pick discordant calls from one set when compared to the other, because there are no sites present in one set and absent from the other.
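
To illustrate the point above, here is a minimal sketch that assumes discordance is evaluated per site (a record present in one call set and absent from the comparison set), not per genotype. It is a conceptual model only, not GATK's implementation; the function name `discordant_sites` and the example coordinates are hypothetical.

```python
# Conceptual model only: each call set is reduced to a list of (chrom, pos) sites.
def discordant_sites(variant_sites, comparison_sites):
    """Return sites present in variant_sites but absent from comparison_sites."""
    return sorted(set(variant_sites) - set(comparison_sites))


# Hypothetical example data, not taken from the thread.
sample_a = [("chr1", 100), ("chr1", 250), ("chr2", 40)]
# Same sites as sample_a, e.g. because chr1:250 is kept as a HOMREF record.
sample_b = [("chr1", 100), ("chr1", 250), ("chr2", 40)]

same_sites = discordant_sites(sample_a, sample_b)

# The comparison set after its HOMREF record at chr1:250 has been dropped.
sample_b_no_homref = [("chr1", 100), ("chr2", 40)]
after_filtering = discordant_sites(sample_a, sample_b_no_homref)

print(same_sites)       # []
print(after_filtering)  # [('chr1', 250)]
```

Under this assumption, two sets with identical site lists can never yield a discordant record, which would explain an empty output.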

    I hope this helps. 

  • Jordan Brungardt

    Wishful thinking strikes again. It sounded like this option was for selecting alleles unique to an individual. If discordance filters for unique sites/loci, then it is working as intended. Is there a GATK function for what I'm looking for? It sounds like there's a third-party script for vcftools that selects unique alleles, so I might go that route.

  • Gökalp Çelik

    Hi Jordan Brungardt

    Instead of doing this in a single step, you may be able to split the work into multiple steps.

    1- Annotate your VCF files with VariantAnnotator and include the SampleList annotation, so that you can see which samples contain the alt allele.

    2- Convert your VCF files to a table and make sure that the SampleList annotation is present in the columns.

    3- Filter your variants with any tool you want, such as R, Python, or bash, to get the sites that are unique to each taxon.
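
Step 3 could be sketched in Python as follows. This is only an illustration under stated assumptions: the table is tab-delimited, the SampleList annotation appears in a column named `Samples` holding comma-separated sample names, and missing values are written as `NA`. Check the header of your own table before relying on these names. The function `private_sites` and the inline example data are hypothetical.

```python
import csv
import io
from collections import defaultdict

# Hypothetical table contents; a real table would be read from a file on disk.
TABLE = """CHROM\tPOS\tREF\tALT\tSamples
chr1\t100\tA\tG\ttaxonA,taxonB
chr1\t250\tC\tT\ttaxonA
chr2\t40\tG\tA\ttaxonC
chr2\t90\tT\tC\tNA
"""


def private_sites(handle, sample_column="Samples", missing=("", "NA")):
    """Map each sample to the sites where it is the only listed alt-allele carrier."""
    private = defaultdict(list)
    for row in csv.DictReader(handle, delimiter="\t"):
        value = row[sample_column]
        if value in missing:
            continue  # no carrier information for this site
        carriers = [name for name in value.split(",") if name]
        if len(carriers) == 1:
            private[carriers[0]].append((row["CHROM"], int(row["POS"])))
    return dict(private)


result = private_sites(io.StringIO(TABLE))
print(result)  # {'taxonA': [('chr1', 250)], 'taxonC': [('chr2', 40)]}
```

For a real file, pass an open file handle instead of the `io.StringIO` wrapper, for example `with open("variants.table") as fh: private_sites(fh)`, where the file name is a placeholder.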

    I hope this helps. 

  • Jordan Brungardt

    Thanks for the reply, Gökalp.

    The vcftools vcf-contrast script works nicely for one-to-many individual analyses once the VCF file is reduced to only the individuals used in the query. As far as I can tell, it seems to misbehave in several-to-many comparisons, but it still cuts down on a lot of work. This seems like a very relevant research question, so maybe a function for it could be added to GATK in the future.
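
For the several-to-many case, one possible way to express the comparison in Python is sketched below. It is not a reimplementation of vcf-contrast and makes no claim about how that script behaves. It assumes you already have, for each site, the set of samples carrying the alt allele. The function `contrast`, its `require_all` switch, and the sample names are all hypothetical.

```python
def contrast(alt_carriers, query, background, require_all=True):
    """Return sites whose alt allele is carried by the query group only.

    alt_carriers: dict mapping a site to the set of samples carrying the alt allele.
    require_all:  if True, every query sample must carry the allele;
                  if False, at least one query sample is enough.
    """
    query, background = set(query), set(background)
    if not query:
        raise ValueError("query group must not be empty")
    hits = []
    for site, carriers in alt_carriers.items():
        carriers = set(carriers)
        if carriers & background:
            continue  # allele also seen in a background sample
        in_query = carriers & query
        matched = (in_query == query) if require_all else bool(in_query)
        if matched:
            hits.append(site)
    return hits


# Hypothetical example data.
sites = {
    ("chr1", 100): {"s1", "s2"},
    ("chr1", 250): {"s1"},
    ("chr2", 40): {"s1", "s2", "s5"},
    ("chr2", 90): {"s4"},
}
query_group = ["s1", "s2"]
background_group = ["s3", "s4", "s5"]

strict = contrast(sites, query_group, background_group, require_all=True)
relaxed = contrast(sites, query_group, background_group, require_all=False)
print(strict)   # [('chr1', 100)]
print(relaxed)  # [('chr1', 100), ('chr1', 250)]
```

The two modes give different answers for the same data, so a several-to-many comparison needs the intended definition (shared by all query samples, or by any of them) to be fixed first.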
