I used the Mutect2 Terra workflow to successfully call mutations in 232 of 241 samples.
Mutect2 version: mutect2:18.104.22.168
The 9 samples that failed each produce one of two errors, both of which seem to be associated with the BAM files. Examples are below:
1) htsjdk.samtools.util.RuntimeIOException: example.bam has invalid uncompressedLength: -995523422
2) htsjdk.samtools.SAMFormatException: Invalid GZIP header example.bam
From searching around, I see that the errors in 1) and 2) may be related to the index (.bai) files. However, because the .bam and .bam.bai files were all produced by Broad's GP through standardized processes, I'd be surprised if something went wrong there, especially as 232 samples from the same study ran successfully. One option is to try re-indexing, but before I do, I would like to know:
1) Is it reasonable to assume that these are indexing problems, and if so, based on what?
2) If so, could the source of either one of these problems be the bam file itself?
3) How could I address the issues?
4) If I were to regenerate the index file, could you please let me know which GATK tools compatible with Mutect2 can do that? I found this recommendation (https://gatk.broadinstitute.org/hc/en-us/articles/360037057032-BamIndexStats-Picard-), but I do not know what picard.jar is or where to find it.
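Regarding question 2), one check I could run before re-indexing is to walk the BAM's compressed blocks directly, without touching the .bai at all. Below is a minimal sketch in plain Python, assuming the BGZF block layout described in the SAM/BAM specification (gzip magic bytes, a `BC` extra subfield holding the block size, and a trailing CRC32 plus uncompressed length of at most 65536). The function names `check_bgzf` and `_find_bsize` are my own and hypothetical; this is not a GATK or Picard tool. If it reports a bad header or length, that would point at the BAM itself; if it passes, the index or the file transfer would seem the more likely suspect, though I can't be certain of either.

```python
import struct
import zlib

# The 28-byte empty BGZF block that should terminate every BAM file.
BGZF_EOF = bytes.fromhex(
    "1f8b08040000000000ff0600424302001b0003000000000000000000"
)


def _find_bsize(extra):
    """Return BSIZE (total block size minus 1) from the gzip extra field, or None."""
    pos = 0
    while pos + 4 <= len(extra):
        si1, si2, slen = struct.unpack_from("<BBH", extra, pos)
        if (si1, si2, slen) == (66, 67, 2) and pos + 6 <= len(extra):
            return struct.unpack_from("<H", extra, pos + 4)[0]
        pos += 4 + slen
    return None


def check_bgzf(path):
    """Scan every BGZF block; return a list of (file_offset, problem) tuples.

    An empty list means no structural problem was found. Scanning stops at the
    first broken block because later block boundaries cannot be trusted.
    """
    problems = []
    last_block = b""
    offset = 0
    with open(path, "rb") as fh:
        while True:
            header = fh.read(12)
            if not header:
                break  # clean end of file
            if len(header) < 12:
                return problems + [(offset, "truncated block header")]
            id1, id2, cm, flg, _mtime, _xfl, _os, xlen = struct.unpack(
                "<BBBBIBBH", header
            )
            if (id1, id2, cm) != (31, 139, 8) or not flg & 4:
                return problems + [(offset, "invalid GZIP header")]
            extra = fh.read(xlen)
            bsize = _find_bsize(extra)
            if len(extra) < xlen or bsize is None:
                return problems + [(offset, "missing BGZF block-size subfield")]
            remaining = bsize + 1 - 12 - xlen  # compressed data + CRC32 + ISIZE
            if remaining < 8:
                return problems + [(offset, "block size too small")]
            body = fh.read(remaining)
            if len(body) < remaining:
                return problems + [(offset, "truncated block")]
            crc, isize = struct.unpack("<II", body[-8:])
            if isize > 65536:
                return problems + [(offset, f"invalid uncompressed length {isize}")]
            try:
                data = zlib.decompress(body[:-8], -15)  # raw DEFLATE stream
            except zlib.error as exc:
                return problems + [(offset, f"decompression failed: {exc}")]
            if len(data) != isize or zlib.crc32(data) != crc:
                return problems + [(offset, "length or CRC mismatch")]
            last_block = header + extra + body
            offset += bsize + 1
    if last_block != BGZF_EOF:
        problems.append((offset, "missing EOF marker"))
    return problems


# Example usage (the path is a placeholder):
# for offset, problem in check_bgzf("example.bam"):
#     print(offset, problem)
```

This reads and decompresses the whole file one block at a time, so it would be slow on large BAMs, but it needs nothing beyond the Python standard library. A length field that a Java tool prints as a negative number would show up here as an unsigned value far above 65536.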
I'd really appreciate help on this.