Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

How do I call variants from somatic RNA data?

Answered
6 comments

  • Genevieve Brandt (she/her)

    Hi Peiwen Li,

    I am going to move your post into our Community Discussions -> Special GATK Use Cases topic, as the Non-Human topic is for reporting bugs and issues with GATK.

    You can read more about our forum guidelines and the topics here: Forum Guidelines.

    Best,

    Genevieve

  • Genevieve Brandt (she/her)

    Hi Peiwen Li, it depends on whether you are looking for differences between your samples and a reference, or between your samples and a normal plus a reference. You can read more about the distinction in this article: Somatic calling is NOT simply a difference between two callsets

  • Peiwen Li

    Hi Genevieve Brandt (she/her),

    Thank you for your replies, and I am sorry I posted my question in the wrong place.

    I am looking for differences between my samples and a reference. After reading the article you shared, I think my situation is a better fit for the HaplotypeCaller algorithm?

    Thank you,

    Peiwen

  • Genevieve Brandt (she/her)

    I think so! Glad it helped to clarify!

    Have a good weekend,

    Genevieve

  • Peiwen Li

    Hi Genevieve Brandt (she/her),

    I am confused about the "RNAseq short variant discovery (SNPs + Indels)" GATK Best Practices workflow. It suggests using per-sample variant calling for RNAseq data. What does that mean exactly? If I have multiple BAM inputs, should I run HaplotypeCaller on each of them one at a time and get multiple VCF outputs? If so, how can I get ONE single concordant multi-sample VCF from all of those VCF outputs?

    Thank you so much and I hope you have a great weekend!

    Peiwen

  • Genevieve Brandt (she/her)

    Hi Peiwen,

    Joint calling is not supported yet for this method. With this best practices pipeline you will get multiple VCF outputs if you have multiple BAM inputs.

    Genevieve
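
Editor's note: the per-sample approach described in this reply can be sketched as one HaplotypeCaller run per BAM, each writing its own VCF. The snippet below is a minimal illustration, not the Best Practices workflow itself. The helper name `haplotypecaller_commands`, the file names, and the output directory are hypothetical, and only the basic `-R`, `-I`, and `-O` arguments are shown; the RNAseq workflow includes preprocessing steps and additional caller options that are not covered here.

```python
from pathlib import Path
from typing import List


def haplotypecaller_commands(reference: str, bams: List[str], out_dir: str) -> List[List[str]]:
    """Build one `gatk HaplotypeCaller` command per input BAM (hypothetical helper)."""
    commands = []
    for bam in bams:
        # Derive the sample name from the BAM file name, e.g. "sampleA.bam" -> "sampleA".
        sample = Path(bam).stem
        # Each sample gets its own output VCF; nothing is merged or jointly called.
        vcf = str(Path(out_dir) / f"{sample}.vcf.gz")
        commands.append(
            ["gatk", "HaplotypeCaller", "-R", reference, "-I", bam, "-O", vcf]
        )
    return commands


if __name__ == "__main__":
    # Placeholder inputs; replace with real paths.
    for cmd in haplotypecaller_commands("ref.fasta", ["sampleA.bam", "sampleB.bam"], "calls"):
        # Print the command instead of running it.
        # To execute, one could pass `cmd` to subprocess.run(cmd, check=True).
        print(" ".join(cmd))
```

With two BAM inputs this produces two independent commands and therefore two VCF outputs, which matches the behavior described above.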

     

