Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data


Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

cnv_germline_case_workflow.wdl

3 comments

  • Gökalp Çelik

    Hi Sheryl

    The GATK gCNV workflows have two separate use cases.

    • COHORT mode: All samples are processed and modeled simultaneously without a prior model. Normal samples can be processed this way to produce models.
    • CASE mode: Samples are processed based on a previously generated model from another COHORT mode run. 

    You cannot use this workflow unless you have a set of normal samples from which to build the models that CASE mode needs. Ideally you need at least 30 normal samples (the exact number depends on the sample type) to generate a model, or that many samples to run together as a cohort for gCNV detection.
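The decision described above can be sketched in a few lines of Python. This is only an illustration: `choose_gcnv_mode` and its parameters are hypothetical names, not part of GATK or the WDL, and the default of 30 is the rule of thumb from this reply, not a hard limit enforced by the tools.

```python
from typing import Optional


def choose_gcnv_mode(n_samples: int,
                     has_cohort_model: bool,
                     min_cohort_size: int = 30) -> Optional[str]:
    """Suggest a gCNV run mode, or None if neither mode is advisable.

    Hypothetical helper. min_cohort_size=30 is a rule of thumb that
    depends on the sample type; it is an assumption, not a GATK setting.
    """
    if n_samples < 1:
        raise ValueError("need at least one sample")
    if has_cohort_model:
        # A model from an earlier COHORT run (built from comparable
        # samples) lets even a single sample be processed in CASE mode.
        return "CASE"
    if n_samples >= min_cohort_size:
        # Enough samples to model them together without a prior model.
        return "COHORT"
    # Too few samples and no prior model.
    return None
```

For example, a single sample with an existing model maps to `"CASE"`, while five samples with no model map to `None`.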

    You can find the documentation for the gCNV workflows at the following link:

    https://gatk.broadinstitute.org/hc/en-us/articles/360035531152 

    I hope this helps. 

     

  • Sheryl

    So are you saying that unless you have >30 samples you can't do CNV calling at all?

    If I have 30 samples, do I have to use DetermineGermlineContigPloidyCohortMode beforehand to generate the required input for the WDL?

     

  • Gökalp Çelik

    Hi Sheryl

    The GATK gCNV workflow consists of multiple steps and tools. DetermineGermlineContigPloidy and GermlineCNVCaller are its two main steps, in which you create the models used for normalization and calling.

    The DetermineGermlineContigPloidy tool normalizes read counts per chromosome and determines the number of copies of each chromosome per sample. It also generates a model that can be used in case mode to call per-contig ploidy values for single samples.

    The GermlineCNVCaller tool takes the baseline ploidy calculated by the tool above, then normalizes and calls segments per sample. It also generates a model of how to normalize segments per sample, which can be reused in this tool's case mode.
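As a rough sketch of how these two steps chain together in cohort mode, the Python helper below only assembles the two command lines; it does not run them. The function name, the file names, and the `ploidy` and `cohort` output prefixes are assumptions made for illustration, and the options should be checked against the tool documentation for your GATK version.

```python
from typing import List


def cohort_commands(counts: List[str], intervals: str,
                    ploidy_priors: str, out_dir: str) -> List[List[str]]:
    """Build argv lists for the two main cohort-mode steps (hypothetical helper)."""
    # One -I argument per sample's read-count file.
    inputs = [arg for path in counts for arg in ("-I", path)]

    # Step 1: per-contig ploidy for every sample; also writes a ploidy model.
    ploidy = ["gatk", "DetermineGermlineContigPloidy",
              "-L", intervals,
              "--interval-merging-rule", "OVERLAPPING_ONLY",
              *inputs,
              "--contig-ploidy-priors", ploidy_priors,
              "--output", out_dir,
              "--output-prefix", "ploidy"]

    # Step 2: CNV calling, fed with the ploidy calls from step 1.
    # Assumes step 1 wrote its calls to <out_dir>/ploidy-calls.
    caller = ["gatk", "GermlineCNVCaller",
              "--run-mode", "COHORT",
              "-L", intervals,
              "--interval-merging-rule", "OVERLAPPING_ONLY",
              *inputs,
              "--contig-ploidy-calls", f"{out_dir}/ploidy-calls",
              "--output", out_dir,
              "--output-prefix", "cohort"]
    return [ploidy, caller]
```

Each returned list can be printed with `shlex.join` for inspection or passed to `subprocess.run`.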

    If you wish to use the gCNV workflows for single samples, you need pre-generated models for both tools; otherwise you cannot use these tools directly. Useful pre-generated models require numerous samples prepared with the same wet-lab and sequencing methods. Samples from different wet-lab conditions will not be compatible and will cause many errors in the inference steps.
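For a single sample, both tools are pointed at models from an earlier cohort run. The Python sketch below assembles the two case-mode command lines under that assumption. The helper name, the model paths, and the output prefixes are hypothetical, and the options should be verified against the tool documentation for your GATK version.

```python
from typing import List


def case_commands(counts: List[str], ploidy_model: str,
                  gcnv_model: str, out_dir: str) -> List[List[str]]:
    """Build argv lists for the two main case-mode steps (hypothetical helper)."""
    # One -I argument per case sample's read-count file.
    inputs = [arg for path in counts for arg in ("-I", path)]

    # Step 1: ploidy calls using the ploidy model from a cohort run.
    ploidy = ["gatk", "DetermineGermlineContigPloidy",
              "--model", ploidy_model,
              *inputs,
              "--output", out_dir,
              "--output-prefix", "ploidy-case"]

    # Step 2: CNV calls using the gCNV model from the same cohort run.
    # Assumes step 1 wrote its calls to <out_dir>/ploidy-case-calls.
    caller = ["gatk", "GermlineCNVCaller",
              "--run-mode", "CASE",
              "--model", gcnv_model,
              *inputs,
              "--contig-ploidy-calls", f"{out_dir}/ploidy-case-calls",
              "--output", out_dir,
              "--output-prefix", "case"]
    return [ploidy, caller]
```

Both model directories must come from a cohort whose samples share the case sample's wet-lab and sequencing methods; the sketch cannot check that.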

    I hope this helps. 

