Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data


Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Every exon junction in BAM files is soft-clipped after running SplitNCigarReads



  • Genevieve Brandt (she/her)


    Are you working with somatic data? We don't have any best practices resources for somatic RNA SNPs and indels, so I can't point you to resources on potential issues with your command line. Other community members might know more about possible issues there.

    If you want to start looking into why certain variants are called (for example, your indels), you can take a look at this resource about troubleshooting the output of Mutect2. I would recommend running Mutect2 with the -bamout option and viewing some of the sites in IGV to see what is happening there. You can also take a look at the Mutect2 algorithm descriptions in our documentation, and the Mutect2 white paper.

    Hope this helps to figure out what is going on!


  • Han Sai Lee

    Hi Genevieve Brandt (she/her),

    Thanks for your fast and kind answer.

    Yes, I'm working with somatic data, but not cancer.

    I just want to know how much somatic mutation burden accumulates in mouse thyroid tissue in a specific environment,

    so there is no matched normal sample.

    I therefore used Mutect2 in tumor-only mode with a mouse germline resource (built from the HaplotypeCaller output of all my mouse samples, since my mice are from the inbred B6 strain).


    Yes, I took a look at my false indel and SNV calls in IGV.

    Upper window: before the SplitNCigarReads run (only aligned with STAR and duplicate-marked)

    Lower window: after the SplitNCigarReads run

    As you can see, all exon junctions got soft-clipped, and those were called as indels in the Mutect2 output.

    They are not eliminated by BaseRecalibrator, ApplyBQSR, or FilterMutectCalls.

    So my question is: is there any way to remove those reads?

    Thanks again for your answer,

    and many thanks in advance for the next one.

    Have a great day!


    -Han sai lee
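
To make the observation in this post concrete, here is a minimal sketch of how soft clipping shows up in a read's CIGAR string. The CIGAR strings are hypothetical examples, not taken from the poster's BAM files, and the helper names (`parse_cigar`, `soft_clipped_bases`, `spans_junction`) are made up for illustration. The sketch assumes that a spliced read is split at its N gap and that the overhanging part of each piece is soft-clipped.

```python
import re

# One CIGAR operation: a length followed by an operation code (SAM format).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def parse_cigar(cigar):
    """Return the CIGAR string as a list of (length, op) tuples."""
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    # Reject strings that contain anything other than valid operations.
    if "".join(f"{n}{op}" for n, op in ops) != cigar:
        raise ValueError(f"malformed CIGAR: {cigar!r}")
    return ops


def soft_clipped_bases(cigar):
    """Total number of soft-clipped (S) bases in the read."""
    return sum(n for n, op in parse_cigar(cigar) if op == "S")


def spans_junction(cigar):
    """True if the alignment still contains a splice gap (N)."""
    return any(op == "N" for _, op in parse_cigar(cigar))


# Hypothetical reads, not taken from the poster's data.
before = "60M1500N40M"          # spliced alignment, e.g. from STAR
after = ["60M40S", "60S40M"]    # the same read split at the junction

print(spans_junction(before), soft_clipped_bases(before))
print([soft_clipped_bases(c) for c in after])
```

A check like this only shows where the soft-clipped bases are. It does not remove reads or change how a variant caller treats them.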

  • Genevieve Brandt (she/her)

    Hi Han Sai Lee,

    Thanks for the follow-up. I'll see if I can take a closer look in the future, though at the moment I have a lot of other issues to go through first. (Here is our support policy for more details.)

    I quickly looked around the forum, though, and found some examples of the Mutect2 devs working with users who are using Mutect2 for RNA data. See if the insight in these threads helps as well.


  • Genevieve Brandt (she/her)

    Hello Han Sai Lee,

    I know this is a late follow-up, but my team and I recently had some time to look at your screenshots and the code to see whether this is expected behavior from SplitNCigarReads. We found that SplitNCigarReads in GATK4 does soft clipping by default, and there is no way to change it to hard clipping. In GATK3, SplitNCigarReads did hard clipping, so posts or results from GATK3 can cause some confusion.

    A workaround for this issue would be to run Mutect2 with the argument --dont-use-soft-clipped-bases set to true. These soft-clipped regions are probably what is causing your spurious indels. We have added a ticket to the GATK repo so that we can add a feature to SplitNCigarReads to use hard clipping instead:

    We have also added a somatic RNA best practices pipeline to our feature request backlog, since this is important for many of our users.

    Thank you for writing in with this question so we can address this issue!
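
The workaround described in this reply can be sketched as a small Python helper that assembles a tumor-only Mutect2 command line with --dont-use-soft-clipped-bases set to true. This is a sketch under stated assumptions, not an official pipeline: the function name `mutect2_command` and all file names are placeholders invented for illustration, and the command is only built and printed, not executed.

```python
import shlex


def mutect2_command(reference, bam, output_vcf, germline_resource=None):
    """Build a tumor-only Mutect2 command that ignores soft-clipped bases."""
    cmd = [
        "gatk", "Mutect2",
        "-R", reference,
        "-I", bam,
        "-O", output_vcf,
        # Workaround from this thread: do not use soft-clipped bases.
        "--dont-use-soft-clipped-bases", "true",
    ]
    # Tumor-only mode: no matched normal sample is passed.
    if germline_resource is not None:
        cmd += ["--germline-resource", germline_resource]
    return cmd


# Placeholder file names (assumptions, not from the thread).
cmd = mutect2_command(
    "ref.fa",
    "split.bqsr.bam",
    "somatic.vcf.gz",
    germline_resource="b6_germline.vcf.gz",
)
print(shlex.join(cmd))
```

To actually run the command, the list could be passed to `subprocess.run(cmd, check=True)` on a machine where `gatk` is on the PATH.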

