Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Query regarding read group effect on HaplotypeCaller and pipeline steps

Answered

6 comments

  • Genevieve Brandt (she/her)

    Hi Abrish,

    I am going to move your post into our Community Discussions -> General Discussion topic, as the Non-Human topic is for reporting bugs and issues with GATK.

    You can read more about our forum guidelines and the topics here: Forum Guidelines.

    Best,

    Genevieve

  • Genevieve Brandt (she/her)

    Hi Abrish,

    HaplotypeCaller treats reads as the same sample if they share the same SM (Sample) tag. With HaplotypeCaller, you'll want different samples to have different SM tags. You can manually change the read group SM tag in your input BAM and then you should have no issues with HaplotypeCaller. 

    You can read more about read groups in this article: https://gatk.broadinstitute.org/hc/en-us/articles/360035890671-Read-groups

    Best,

    Genevieve
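
    To illustrate the SM point above, here is a minimal sketch in plain Python that edits the SM tag of a SAM `@RG` header line as text. The header lines, the sample names, and the helper names `sample_name` and `set_sample_name` are hypothetical and not part of GATK. On a real BAM you would more likely use a dedicated tool such as Picard's AddOrReplaceReadGroups rather than hand-editing the header.

```python
def sample_name(rg_line):
    """Return the SM value of a SAM @RG header line, or None if it has no SM tag."""
    for field in rg_line.rstrip("\n").split("\t")[1:]:
        if field.startswith("SM:"):
            return field[3:]
    return None


def set_sample_name(rg_line, new_sm):
    """Return a copy of an @RG header line with its SM tag replaced (or appended)."""
    fields = rg_line.rstrip("\n").split("\t")
    if fields[0] != "@RG":
        raise ValueError("expected an @RG header line")
    out = ["@RG"]
    replaced = False
    for field in fields[1:]:
        if field.startswith("SM:"):
            out.append("SM:" + new_sm)
            replaced = True
        else:
            out.append(field)
    if not replaced:
        out.append("SM:" + new_sm)
    return "\t".join(out)


# Hypothetical read groups from two different individuals, both labelled "sample".
rg_a = "@RG\tID:run1\tPL:ILLUMINA\tLB:libA\tSM:sample"
rg_b = "@RG\tID:run2\tPL:ILLUMINA\tLB:libB\tSM:sample"

# With identical SM tags, both read groups count as one sample.
before = {sample_name(rg_a), sample_name(rg_b)}

# Giving the second individual its own SM tag separates them.
fixed_b = set_sample_name(rg_b, "individual2")
after = {sample_name(rg_a), sample_name(fixed_b)}

print(len(before), len(after))
```

    In this sketch the number of distinct sample names goes from one to two once the second read group gets its own SM value, which mirrors how reads are assigned to samples by SM tag.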

  • Abrish

    Hi Genevieve Brandt (she/her),

    Thank you so much. I had read the article, but I find it somewhat confusing. I would like to confirm just one thing: are both ways of adding read groups correct for HaplotypeCaller? From your answer, it looks like both ways are correct, as far as I understand. I would appreciate it if you could clarify.

    Thank you so much in advance.

  • Genevieve Brandt (she/her)

    The way you assign read groups depends on how your sequencing was performed. Sometimes one sample contains multiple read groups. In that case, there would be multiple read group identifiers (ID) but the same sample name (SM).

    Does this answer your question?
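
    As a small illustration of one sample carrying several read groups, the sketch below groups read group IDs by their SM tag. It uses plain Python on a made-up SAM header; the IDs, library names, and sample names are assumptions for the example, and `read_groups_by_sample` is a hypothetical helper rather than a GATK function.

```python
from collections import defaultdict


def parse_rg(line):
    """Split a SAM @RG header line into a dict mapping tag to value."""
    fields = line.rstrip("\n").split("\t")
    return dict(field.split(":", 1) for field in fields[1:])


def read_groups_by_sample(header_lines):
    """Map each SM (sample) value to the list of read group IDs that carry it."""
    grouped = defaultdict(list)
    for line in header_lines:
        if line.startswith("@RG\t"):
            tags = parse_rg(line)
            grouped[tags["SM"]].append(tags["ID"])
    return dict(grouped)


# Hypothetical header: sampleA was sequenced on two lanes, sampleB on one.
header = [
    "@HD\tVN:1.6\tSO:coordinate",
    "@RG\tID:flowcell1.lane1\tPL:ILLUMINA\tLB:lib1\tSM:sampleA",
    "@RG\tID:flowcell1.lane2\tPL:ILLUMINA\tLB:lib1\tSM:sampleA",
    "@RG\tID:flowcell1.lane3\tPL:ILLUMINA\tLB:lib2\tSM:sampleB",
]

by_sample = read_groups_by_sample(header)
for sample, ids in sorted(by_sample.items()):
    print(sample, ids)
```

    Here sampleA ends up with two read group IDs under a single sample name, which is the "multiple IDs, same SM" situation described above.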

  • Abrish

    Dear Genevieve Brandt (she/her),

    Thank you so much.

  • Genevieve Brandt (she/her)

    You're welcome!
