Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

How to set neutral-segment-copy-ratio for somatic CNV calling



  • Official comment
    Andrey Smirnov

    Bilyana Stoilova,

    Thank you for your kind words about our tools. I can try to answer your questions.

    1. You can definitely use the `--neutral-segment-copy-ratio-upper-bound` and `--neutral-segment-copy-ratio-lower-bound` arguments to tune the sensitivity of the caller. However, note that this caller is not ploidy/purity aware and, as you mentioned, does not take the noisiness of the copy ratio data into consideration. That means you would have to find the parameters that fit your analysis and your dataset manually. One way to get a background noise estimate for the sample is to use the posteriors output in the `.param` files. For example, the copy-ratio `.param` file contains a `VARIANCE` line with percentiles of the posterior of the global variance parameter of the log2 copy ratio points within each segment (all segments share this parameter). You can use this distribution to estimate the background noise of your tumor sample and adjust the calling arguments accordingly.

    2. The `NUM_POINTS_COPY_RATIO` field is the number of intervals that lie in a given segment (from the interval list you passed to the workflow). The more intervals there are in a particular segment, the more likely it is that the segment represents an underlying event. You can also observe a direct relationship between the number of intervals and the tightness of the posterior, which is also output in the `.seg` file. You can definitely filter by `NUM_POINTS_COPY_RATIO` to improve your specificity; however, the exact threshold would depend on your analysis.

    Let me know if you have any more questions!
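
The two suggestions in the reply above can be sketched in a few lines of Python. This is only an illustration under several assumptions: the `.param` and `.seg` files are taken to be tab-separated tables preceded by `@` header lines, the `.param` table is assumed to have `PARAMETER_NAME` and `POSTERIOR_50` columns, and the bounds are assumed to be linear (not log2) copy ratios. The helper names, the example data, and the "k standard errors of a segment mean" rule are hypothetical choices for this sketch, not a GATK recommendation.

```python
import csv
import io
import math


def read_gatk_table(text):
    """Parse a tab-separated table, skipping '@' header lines and '#' comments."""
    lines = [ln for ln in text.splitlines()
             if ln.strip() and not ln.startswith(("@", "#"))]
    return list(csv.DictReader(io.StringIO("\n".join(lines)), delimiter="\t"))


def median_variance(param_text, column="POSTERIOR_50"):
    """Return one percentile of the global log2 copy-ratio variance posterior.

    The column names here are assumptions about the .param layout.
    """
    for row in read_gatk_table(param_text):
        if row["PARAMETER_NAME"] == "VARIANCE":
            return float(row[column])
    raise ValueError("no VARIANCE row found")


def neutral_bounds(variance, typical_points, k=2.0):
    """Hypothetical heuristic: +/- k standard errors of a segment mean.

    The standard error is computed in log2 space for a segment with
    `typical_points` intervals, then converted to linear copy ratio.
    """
    se = math.sqrt(variance / typical_points)
    return 2.0 ** (-k * se), 2.0 ** (k * se)


def filter_segments(seg_text, min_points):
    """Keep segments supported by at least `min_points` copy-ratio intervals."""
    return [row for row in read_gatk_table(seg_text)
            if int(row["NUM_POINTS_COPY_RATIO"]) >= min_points]


# Made-up example data, for illustration only.
PARAM_EXAMPLE = (
    "@HD\tVN:1.6\n"
    "PARAMETER_NAME\tPOSTERIOR_10\tPOSTERIOR_50\tPOSTERIOR_90\n"
    "VARIANCE\t0.035\t0.040\t0.045\n"
)
SEG_EXAMPLE = (
    "@HD\tVN:1.6\n"
    "CONTIG\tSTART\tEND\tNUM_POINTS_COPY_RATIO\tLOG2_COPY_RATIO_POSTERIOR_50\n"
    "chr1\t1000\t90000\t250\t0.01\n"
    "chr1\t90001\t95000\t4\t-0.80\n"
)

if __name__ == "__main__":
    variance = median_variance(PARAM_EXAMPLE)
    lower, upper = neutral_bounds(variance, typical_points=100)
    print(f"--neutral-segment-copy-ratio-lower-bound {lower:.3f}")
    print(f"--neutral-segment-copy-ratio-upper-bound {upper:.3f}")
    for seg in filter_segments(SEG_EXAMPLE, min_points=10):
        print(seg["CONTIG"], seg["START"], seg["END"])
```

The resulting bounds would still need to be checked against your own data, since the caller is not ploidy/purity aware; both `k` and the `min_points` threshold are knobs to tune per analysis.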




  • Bhanu Gandham

    Hi,

    The GATK support team is focused on resolving questions about GATK tool-specific errors and abnormal results from the tools. For all other questions, such as this one, we are building a backlog to work through when we have the capacity.

    Please continue to post your questions because we will be mining them for improvements to documentation, resources, and tools.

    We cannot guarantee a reply; however, we ask other community members to help out if they know the answer.

    For context, check out our support policy.


  • steveb

    Hi Bilyana,

    I am NOT a GATK developer and am not answering your question, but am looking for info.

    I am returning after many months to investigate the CNV tools developed by GATK and am eager to see if changes in GATK v4.1.7 have fixed the problems I previously had in running the CNV pipelines.

    I see that you are somewhat satisfied with your results, so I am reaching out to ask:

    Can you point me to the two tutorials that you refer to in: "...thank you for putting the two CNV calling tutorials together - this is the best WES CNV pipeline I have used."



  • Bilyana Stoilova

    Hi steveb,

    The two tutorials I meant are:

    I think they are the updated version of an older tutorial.

    Good luck!

  • steveb

    Thanks for the prompt reply, Bilyana.

    These are the two tutorials I had been working with months ago, and they haven't changed since I last used them. Maybe some of the tools in GATK4.1 perform better now. I will carry on and try the Somatic CNV pipeline again.

