Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Joint calling with Mutect2 with mitochondria mode

9 comments

  • Gökalp Çelik

    Hi Caroline Meguerditchian

    GVCF mode for Mutect2 is already implemented and works for this kind of analysis, but we do not have a definitive published way of running cohort or trio joint calling for mitochondria. We also have a couple of questions for you, because this kind of analysis may not be what you are actually looking for:

    1- Mitochondria are inherited from the mother, so we are not sure why joint genotyping the father would make sense in this kind of work. Can you elaborate on that?

    2- The child's mitochondria can be considered a sub-tissue or sub-sample of the mother's mitochondria. Feeding both sets of aligned data to Mutect2 and generating calls, with the intention of separating out events that differ significantly from the mother's heteroplasmy levels, could therefore be a better way of finding variants that might affect the child's health. Would you consider such an approach a better way of pursuing your goal?

    Regards

  • Caroline Meguerditchian

    Thank you very much for your answer.

    1- Yes, we are indeed interested in looking at the mother's data, sorry for not being clear about that.

    2- I'm sorry but I'm not sure I fully understand the methodology you're proposing. By feeding both aligned data, do you mean giving both the mother and child data to Mutect2 as if they were samples from the same individual, like it is done in the "Tumor with matched normal" mode?

    Thank you,
    Caroline

  • Gökalp Çelik

    Hi again. 

    Yes, you can think of it as tumor with matched normal, where the mother is the normal and the child is the tumor. Alternatively, think of it as pre- and post-treatment tumor samples, where the mother is the pre sample (acting as a pseudo-normal) and the child is the post sample. In either case, variants whose heteroplasmy levels differ significantly from the pre sample will be marked as PASS, and variants with similar heteroplasmy levels will be marked as germline (that is the name of the filter).

    You may be able to leverage this filtering to identify the variants that differ between mother and child and could explain the phenotype.

    I hope this helps.

    Regards. 
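
A minimal sketch of the matched-normal setup described above, using Mutect2 and FilterMutectCalls options that exist in GATK4. The file names, the `chrM` interval, and the sample name are placeholders, and combining `-normal` with `--mitochondria-mode` is an assumption based on the thread title rather than a documented recipe. The `run` helper is hypothetical and only prints each command unless `DRY_RUN=0` is set. Which filters are applied in mitochondria mode may depend on the GATK version, so it is worth checking the FILTER column of the output.

```shell
# Hypothetical helper: print the command by default, run it when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Call child and mother together, treating the mother as the matched "normal".
# Arguments: reference, mother BAM, child BAM, mother sample name, output VCF.
call_child_vs_mother() {
  local ref="$1" mother_bam="$2" child_bam="$3" mother_sample="$4" out="$5"
  run gatk Mutect2 \
    -R "$ref" \
    -L chrM \
    --mitochondria-mode \
    -I "$child_bam" \
    -I "$mother_bam" \
    -normal "$mother_sample" \
    -O "$out"
}

# Filter the raw calls; PASS records are the candidates that differ from the mother.
# Arguments: reference, unfiltered VCF, filtered VCF.
filter_calls() {
  local ref="$1" in_vcf="$2" out="$3"
  run gatk FilterMutectCalls \
    -R "$ref" \
    -V "$in_vcf" \
    --mitochondria-mode \
    -O "$out"
}
```

For example, `call_child_vs_mother ref.fasta mother.bam child.bam MOTHER calls.vcf.gz` prints the Mutect2 command. The value passed to `-normal` has to match the SM tag in the mother's read groups.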

  • Caroline Meguerditchian

    Hi again,
    Thank you very much for your answer, this strategy does seem like it would be useful for what we want to do.

    I'd also like to know if you have any advice for the situation where we do not have the mother's data but do have data from siblings that we want to compare. In that case, do you think joint calling would make sense, and if so, is there a way of doing it that you would recommend?
    Thank you,

    Caroline

  • Gökalp Çelik

    Hi Caroline Meguerditchian

    Joint calling would make sense for siblings if no maternal data are available. As I mentioned before, the tools are all available and ready for that purpose.

    I hope this helps. 

  • Caroline Meguerditchian

    Hi,

    Thank you again for your answer. Just to clarify, could you let me know which tools you are referring to? 

    I did see the pipeline below which mentions joint calling but I'm not quite sure which tools are used for this step.

    https://github.com/broadinstitute/gatk/blob/master/scripts/mitochondria_m2_wdl/MitochondriaPipeline.wdl 

    I've also looked at the --emit-ref-confidence option for Mutect2 which can produce gvcf, would that be an option?

    Thank you in advance,
    Caroline

  • Gökalp Çelik

    Hi Caroline Meguerditchian

    The GVCF mode of Mutect2 must be enabled for this purpose, and you will then need to combine or import the resulting GVCF files using CombineGVCFs or GenomicsDBImport (whichever you prefer).

    When using GenotypeGVCFs you may need to use our experimental parameter named

    --input-is-somatic <Boolean>  Finalize input GVCF according to somatic (i.e. Mutect2) TLODs (BETA feature)  Default
                                  value: false. Possible values: {true, false}

    to make sure that the final result is genotyped as somatic data.

    I hope this helps.

    Regards. 
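
The three steps above can be sketched roughly as follows. This is an illustration under assumptions, not a validated workflow: file names and the `chrM` interval are placeholders, the `run` helper is hypothetical and only prints each command unless `DRY_RUN=0` is set, and only `--emit-ref-confidence GVCF` and `--input-is-somatic` come from this thread. CombineGVCFs may need a corresponding somatic option in your GATK version, so check `gatk CombineGVCFs --help` before relying on this.

```shell
# Hypothetical helper: print the command by default, run it when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Step 1: one GVCF per sample from Mutect2.
# Arguments: reference, input BAM, output GVCF.
make_gvcf() {
  local ref="$1" bam="$2" out="$3"
  run gatk Mutect2 \
    -R "$ref" \
    -L chrM \
    --mitochondria-mode \
    -I "$bam" \
    --emit-ref-confidence GVCF \
    -O "$out"
}

# Step 2: merge the per-sample GVCFs.
# Arguments: reference, output GVCF, then one or more input GVCFs.
combine_gvcfs() {
  local ref="$1" out="$2"
  shift 2
  local n=$#
  # Append "-V <file>" for each input, then drop the original bare file names.
  for g in "$@"; do
    set -- "$@" -V "$g"
  done
  shift "$n"
  run gatk CombineGVCFs -R "$ref" "$@" -O "$out"
}

# Step 3: genotype the combined GVCF using the somatic (TLOD-based) mode.
# Arguments: reference, combined GVCF, output VCF.
genotype_somatic() {
  local ref="$1" combined="$2" out="$3"
  run gatk GenotypeGVCFs \
    -R "$ref" \
    -V "$combined" \
    --input-is-somatic \
    -O "$out"
}
```

For two siblings this would be `make_gvcf` once per BAM, then `combine_gvcfs ref.fasta siblings.g.vcf.gz sib1.g.vcf.gz sib2.g.vcf.gz`, then `genotype_somatic ref.fasta siblings.g.vcf.gz siblings.vcf.gz`.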

  • Le Qi

    Hi Gökalp Çelik,

    I'm trying to produce gVCF files using the wdl pipeline suggested here  https://github.com/broadinstitute/gatk/tree/master/scripts/mitochondria_m2_wdl

    There is no problem producing gVCFs after calling Mutect2, but it seems that LiftoverVcf does not work very well with gVCF files. Here's a brief error message:

    Badly formed variant context at location chrM:16069; getEnd() was 16069 but this VariantContext contains an END key with value 8069

    I managed to bypass this by using a newer version of Picard, but the step that splits multiallelic variants and removes variants that didn't pass doesn't handle the gVCF format. I'm not sure how to proceed from here. I think FilterMutectCalls only works on single-sample VCFs, so I can't joint call before the filtering step either.

    Thanks!

    Le

  • Gökalp Çelik

    Hi Le Qi

    Our mitochondria workflow does not include a joint calling step, so you would need to devise your own methodology. If you are only interested in generating a list of sites for future studies, the best method would be to combine all samples into a single sample and process it with Mutect2 to get a VCF that you can filter and perhaps convert to a sites-only VCF for future reference. You cannot calculate allele frequencies from Mutect2 VCF files, because they list only allele fractions. Each sample will have a different allele fraction at a given site, so joint calling is not a very useful strategy for this purpose.

    I hope this helps. 
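
One way the pooled approach above might look on the command line, as a sketch only. It assumes the input BAMs have already been re-tagged so that all read groups share one sample name (for example with Picard AddOrReplaceReadGroups); without that, Mutect2 reports each sample separately. File names and the `chrM` interval are placeholders, and the `run` helper is hypothetical and only prints each command unless `DRY_RUN=0` is set.

```shell
# Hypothetical helper: print the command by default, run it when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Call all BAMs together as one pooled sample.
# Arguments: reference, output VCF, then one or more input BAMs.
pooled_call() {
  local ref="$1" out="$2"
  shift 2
  local n=$#
  # Append "-I <bam>" for each input, then drop the original bare file names.
  for b in "$@"; do
    set -- "$@" -I "$b"
  done
  shift "$n"
  run gatk Mutect2 -R "$ref" -L chrM --mitochondria-mode "$@" -O "$out"
}

# Filter the pooled calls.
# Arguments: reference, unfiltered VCF, filtered VCF.
filter_calls() {
  local ref="$1" in_vcf="$2" out="$3"
  run gatk FilterMutectCalls -R "$ref" -V "$in_vcf" --mitochondria-mode -O "$out"
}

# Drop the genotype columns to keep a sites-only list for future reference.
# Arguments: input VCF, sites-only output VCF.
make_sites_only() {
  local in_vcf="$1" out="$2"
  run gatk MakeSitesOnlyVcf -I "$in_vcf" -O "$out"
}
```

MakeSitesOnlyVcf keeps every record regardless of its FILTER value, so select the PASS sites first if the list should contain only those.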
