Genome Analysis Toolkit

Different read statistics for a common control sample in two Mutect2 runs

2 comments

  • David Benjamin

    Roman Jaksik, you're right that active regions can theoretically affect this, but you shouldn't expect such a big difference. I think the bigger factor here is that the different tumors are yielding different assembled haplotypes. The annotations that look like 1:229654515_C_A in your VCF are phasing tags, and you can see that they differ from one run to another.

    If your pipeline demands consistency, you can generate a single assembly for the normal and both tumors in Mutect2's multi-sample mode, where in a single command you specify -I normal.bam -I tumor1.bam -I tumor2.bam -normal G4_C --f1r2-tar-gz joint-f1r2.tar.gz, etc. You can do this for an arbitrary number of tumor and normal samples from the same individual.
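
    As a rough sketch of that multi-sample invocation, the helper below just assembles the argument list for one joint Mutect2 call. The helper name build_mutect2_command, the reference path, and the output file name are placeholders invented for illustration; the BAM names, the G4_C normal sample name, and joint-f1r2.tar.gz come from the command above. It does not run GATK, and a real call would likely need further arguments (germline resource, panel of normals, intervals) that are not shown here.

```python
from typing import List, Optional


def build_mutect2_command(
    reference: str,
    bams: List[str],
    normal_sample: str,
    output_vcf: str,
    f1r2_tar_gz: Optional[str] = None,
) -> List[str]:
    """Assemble a single joint Mutect2 call (hypothetical helper, not part of GATK)."""
    cmd = ["gatk", "Mutect2", "-R", reference]
    # Every BAM from the same individual goes into the same command,
    # so the normal and all tumors share one local assembly.
    for bam in bams:
        cmd += ["-I", bam]
    # -normal takes the sample name from the normal BAM's read group, not a file path.
    cmd += ["-normal", normal_sample]
    if f1r2_tar_gz is not None:
        cmd += ["--f1r2-tar-gz", f1r2_tar_gz]
    cmd += ["-O", output_vcf]
    return cmd


# Placeholder paths: reference.fasta and joint.vcf.gz are assumptions.
cmd = build_mutect2_command(
    reference="reference.fasta",
    bams=["normal.bam", "tumor1.bam", "tumor2.bam"],
    normal_sample="G4_C",
    output_vcf="joint.vcf.gz",
    f1r2_tar_gz="joint-f1r2.tar.gz",
)
print(" ".join(cmd))
```

    The resulting list can be passed to subprocess.run(cmd, check=True) once the paths point at real files.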

  • Roman Jaksik

    Thank you David. I will try this out.
