Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Error in GATK CollectAllelicCounts

4 comments

  • Gökalp Çelik

    Hi Saloni Sinha

    Are you using any BAM-modifying tool during post-processing that may clip reads? CIGAR strings usually do not start with a deletion, so most GATK tools do not accept such reads, although this may theoretically be possible under the SAM spec.
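
A minimal sketch for checking whether a BAM contains such reads, assuming the records are available as SAM text lines (for example from `samtools view`). The function names are hypothetical and this is not part of GATK; it only flags CIGARs whose first or last non-clip operator is a deletion.

```python
import re

# One CIGAR element: a length followed by an operator defined in the SAM spec.
_CIGAR_ELEMENT = re.compile(r"(\d+)([MIDNSHP=X])")
_CLIPS = {"S", "H"}


def parse_cigar(cigar):
    """Split a CIGAR string into (length, operator) pairs."""
    if cigar == "*":  # CIGAR unavailable, e.g. unmapped read
        return []
    elements = _CIGAR_ELEMENT.findall(cigar)
    # Reject strings containing anything the pattern did not consume.
    if "".join(n + op for n, op in elements) != cigar:
        raise ValueError(f"malformed CIGAR: {cigar!r}")
    return [(int(n), op) for n, op in elements]


def has_edge_deletion(cigar):
    """True if the first or last non-clip operator is a deletion."""
    core = [op for _, op in parse_cigar(cigar) if op not in _CLIPS]
    return bool(core) and (core[0] == "D" or core[-1] == "D")


def offending_reads(sam_lines):
    """Yield (read name, CIGAR) for SAM records with an edge deletion."""
    for line in sam_lines:
        if line.startswith("@"):  # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        # Column 6 (index 5) of a SAM record holds the CIGAR string.
        if len(fields) > 5 and has_edge_deletion(fields[5]):
            yield fields[0], fields[5]
```

Passing `sys.stdin` to `offending_reads` while piping in the output of `samtools view` on one of the failing samples would show how many reads are affected and whether they all come from the realignment step.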

    You may also try running the tool with

    --read-validation-stringency SILENT

    to see whether it runs to completion. Additionally, you may wish to check the parameters of any BAM-modifying tool you use to see whether they produce an out-of-spec BAM.

    If you still observe issues, let us know.

  • Saloni Sinha

    I am told that we use ABRA for post-processing of the BAM files. Otherwise we use bwa and GATK.

  • Saloni Sinha

    I did try rerunning one of the failed samples with --read-validation-stringency SILENT, but it gives the same error.

  • Gökalp Çelik

    Hi Saloni Sinha

    It looks like ABRA could be producing out-of-spec BAM files. If ABRA is used solely to realign indels, we suggest using the GATK3 indel realignment tools instead, which would certainly produce in-spec BAM files for further processing. Although we no longer support GATK3 tools, using them alongside GATK4 tools for BAM processing could still be valid.

    Another suggestion from our team (Louis Bergelson) is to enable the read filter

    --read-filter GoodCigarReadFilter

    This may eliminate reads with bad CIGAR strings, but if all of your data have this kind of CIGAR string, you may need to use some kind of read transformer tool to fix the CIGAR strings before the analysis.
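
As a rough illustration of what such a fix could look like, the sketch below removes deletions at the edges of a CIGAR while keeping any soft or hard clips. It is not a GATK tool, and `strip_edge_deletions` is a hypothetical name. It assumes that a leading deletion is counted from POS like any other reference-consuming operator, so POS is shifted by the deleted length; a real transformer would also need to update dependent fields such as the mate position and the NM/MD tags.

```python
import re

# One CIGAR element: a length followed by an operator defined in the SAM spec.
_CIGAR_ELEMENT = re.compile(r"(\d+)([MIDNSHP=X])")


def strip_edge_deletions(pos, cigar):
    """Return (pos, cigar) with leading and trailing deletions removed.

    Soft/hard clips at either end are preserved. POS is advanced by the
    length of any leading deletion (assumption: the deletion was counted
    from POS). Trailing deletions are dropped without changing POS.
    """
    elements = [(int(n), op) for n, op in _CIGAR_ELEMENT.findall(cigar)]

    # Locate the clip runs at both ends so they can be kept untouched.
    start = 0
    while start < len(elements) and elements[start][1] in "SH":
        start += 1
    end = len(elements)
    while end > start and elements[end - 1][1] in "SH":
        end -= 1
    head, core, tail = elements[:start], elements[start:end], elements[end:]

    # Drop leading deletions, shifting the alignment start accordingly.
    while core and core[0][1] == "D":
        pos += core[0][0]
        core.pop(0)
    # Drop trailing deletions; the alignment start is unaffected.
    while core and core[-1][1] == "D":
        core.pop()

    fixed = head + core + tail
    return pos, "".join(f"{n}{op}" for n, op in fixed)
```

For example, a read at position 100 with CIGAR `2D98M` would become position 102 with CIGAR `98M` under these assumptions. Reads whose CIGAR contains nothing but clips and deletions are left with clips only and would still need to be filtered out.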

    We hope this helps. 
