GATK CNV somatic panels of normal design (Answered)
Hi, I am trying to call somatic CNVs on a targeted panel of ~525 genes using the GATK somatic CNV pipeline. I do not have a matched normal for the tumor sample. To help with denoising, I am looking to create a panel of normals (PoN) for somatic CNV calling.
I have a few questions about PoN creation:
- According to the GATK team's documentation, the minimum recommended number of samples is 40 for WES. Would 40 samples be sufficient for creating a PoN for somatic CNV calling on a targeted panel of 525 genes?
- The type of sample recommended in the documentation is blood normals with no large CNV events. Our assay has a mixture of cytology and FFPE. Would mixing multiple sample types (cytology, FFPE) be okay for creation of PoNs?
- Is it all right to use tumor tissue samples that are silent for CNVs as normals for PoN creation?
The documentation recommends using a different PoN for each sequencing platform. Is there a recommendation for PoN creation when the normals vary in:
- Coverage, due to underfilling of flowcells resulting in high coverage in a few samples
- Flowcell type (S1, S2, Sprime), i.e., normals coming from different flowcell types
- Is there a resource bundle with a WES CNV PoN HDF5 file for the hg19 build?
- Does the GATK somatic CNV pipeline exclude normals from the PoN based on how well they match the test sample's profile?
- Is there a range of test sample purity at which the GATK somatic CNV pipeline performs best?
Please let me know. Thanks in advance for your help and looking forward to hearing from the GATK team.
Hi Mallika Gandham,
Thanks for writing in to the GATK forum with these questions! Here are the responses to each question:
- Yes, even though it's a small panel of genes, 40 samples is still enough. Reducing the number of genes will not reduce the quality because each gene is independent.
- We would recommend being careful with this approach, specifically with FFPE samples. FFPE can have many unmappable reads, which could start to affect what your CNV data looks like. It could potentially be okay if these FFPE samples are high quality, but it might be extra challenging if you are also trying to combine sample types.
- This is not recommended unless you are truly positive they have no CNVs. If they do have CNVs, the results will suffer.
- These two differences should not be a problem:
1) This would just result in a multiplier of the read count, so it wouldn't be an issue.
2) We're not really familiar with the different flowcell types, but even if they change something in the normals, that could be helpful for the model.
- For somatic CNV analysis, we don't store any publicly available PoNs. The PoNs for depth denoising of somatic samples should be generated using normal samples from similar sequencing technology and sample preparation. A generic PoN does not do a good job, especially since the target sets for WES need to match.
- No, we don't do this because the germline CNVs will be removed from the samples anyway.
- The GATK somatic CNV pipeline can perform well on a wide range of purities. The purer the sample, the better, but we don't have a specific cutoff for usage of the pipeline.
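To illustrate the point above about coverage differences between normals, here is a minimal Python sketch. It is not GATK's actual implementation: the read counts are made up, and `median_normalize` is a hypothetical helper that simply divides each interval's count by the sample's median count. Under that simplified normalization, a uniform depth multiplier drops out of the profile.

```python
from statistics import median


def median_normalize(counts):
    """Divide each interval's read count by the sample's median count.

    Hypothetical helper for illustration only; a simplified stand-in
    for per-sample depth normalization, not GATK code.
    """
    m = median(counts)
    if m == 0:
        raise ValueError("median count is zero; cannot normalize")
    return [c / m for c in counts]


# Made-up read counts for five target intervals in one normal sample.
normal = [100, 120, 80, 110, 90]

# The same sample at 3x the depth (e.g. from an underfilled flowcell).
deep = [3 * c for c in normal]

profile_normal = median_normalize(normal)
profile_deep = median_normalize(deep)

# The two profiles agree up to floating-point rounding.
same = all(abs(a - b) < 1e-12 for a, b in zip(profile_normal, profile_deep))
print("profiles match:", same)
```

The sketch ignores sampling noise, which does depend on depth, so it only shows why a pure multiplier on the read counts does not change the shape of the coverage profile.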
Please let us know if you have any other questions.
Hi Genevieve Brandt (she/her), thank you for the response.
I just need a bit of clarification:
Is there a way to exclude samples from a panel of normals using the CreatePanelofNormals module, based on the profile of the test samples being normalized? By profile I mean sample source, etc.
And to add to that:
Does GATK's CreatePanelofNormals exclude normals if they don't meet certain criteria?