Can you please provide
a) GATK version used 126.96.36.199
b) Exact GATK commands (question about options)
c) The entire error log (n/a)
I've been running Mutect2 for a while now, currently 188.8.131.52. Just today, while looking at how to split the runs so that I could process multiple genomic regions in parallel, I realized that the documentation is a bit unclear on a few things, and I wanted to clarify.
The documentation reads thus:
--intervals,-L:String One or more genomic intervals over which to operate This argument may be specified 0 or more times. Default value: null.
- Can the "String" above be a file (e.g. a BED or Picard intervals file), or is it intended to be JUST textual genomic coordinates? I note that -XL has similar documentation. I think this is "file or coordinates", but I want to make sure.
- In the tutorial: https://gatk.broadinstitute.org/hc/en-us/articles/360035531132--How-to-Call-somatic-mutations-using-GATK4-Mutect2 it uses "1..22" as the intervals for the split. Will Mutect accept that if my genome labels the contigs "chr1", "chr2" etc? I note that the mitochondrial mode uses "chrM" there.
We're running on targeted and exome sequencing, so my hope is that I can do the following:
gatk Mutect2 --intervals $CAPTURE_INTERVALS_FILE --intervals $CONTIG_NAME --interval-set-rule INTERSECTION [...]
That is, our $CAPTURE_INTERVALS_FILE can be a .BED or a .picard.intervals file with genome-wide coordinates, and $CONTIG_NAME would be something like "chr1", "chr2", etc. (or "1", "2", etc., depending on the second question above). The "INTERSECTION" rule would then intersect the two as sets, so that all chr1 intervals in $CAPTURE_INTERVALS_FILE are included as a group.
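To make the intended per-contig split concrete, here is a rough dry-run sketch that only prints the command for each contig rather than running it. The contig list and the default capture.bed path are placeholders, the required arguments hidden behind [...] (reference, inputs, output) are left out, and the rule flag is written as --interval-set-rule, which is how I understand GATK4 spells it. Whether the intersection behaves as hoped is exactly what is being asked here.

```shell
#!/usr/bin/env bash
# Dry-run sketch: print one Mutect2 command per contig instead of executing it.
set -euo pipefail

# Placeholder path (assumption); may be a BED or a Picard-style interval list.
CAPTURE_INTERVALS_FILE="${CAPTURE_INTERVALS_FILE:-capture.bed}"

# Build the command line for a single contig.
# Reference, input and output arguments are deliberately omitted.
mutect2_cmd() {
  local contig="$1"
  printf 'gatk Mutect2 --intervals %s --intervals %s --interval-set-rule INTERSECTION\n' \
    "$CAPTURE_INTERVALS_FILE" "$contig"
}

# Placeholder contig names (assumption); use "1", "2", ... if the
# reference labels its contigs without the "chr" prefix.
for contig in chr1 chr2 chr3; do
  mutect2_cmd "$contig"
done
```

Each printed line could then be handed to a job scheduler or to xargs once the remaining arguments are filled in.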