I would like to use GermlineCNVCaller to identify CNVs in 10,000s of exome samples, but have a few questions before I get going:
1) The documentation states that the memory requirement for 10,000 intervals on 100 samples is 16GB and that it scales linearly with both the number of intervals and the number of samples. Am I right in thinking that the requirement for 10,000s of whole exomes will therefore be >=8TB? Do you have any recommendations for processing a dataset of this size?
2) I have noticed forum posts suggesting that this tool is not yet optimised for detecting common CNVs. Do you have any timescales for when this functionality will be available?
3) The documentation states that the tool is sensitive to the pre-set hyperparameters (e.g. p-alt, p-active, cnv-coherence-length, class-coherence-length, interval-psi-scale, and sample-psi-scale). How should a user work out which values to use for these parameters?
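For reference, the arithmetic behind my estimate in question 1 can be sketched as below. This is only a rough illustration: the function name and the exome interval count of 50,000 are my own assumptions, not figures from the documentation, and the only documented input is the 16GB reference point for 10,000 intervals on 100 samples with linear scaling. Under those assumptions, 10,000 samples come out at roughly 8TB.

```python
def estimate_memory_gb(n_intervals, n_samples,
                       ref_gb=16.0, ref_intervals=10_000, ref_samples=100):
    """Scale the documented reference point linearly in intervals and samples.

    The reference point (16 GB for 10,000 intervals on 100 samples) is taken
    from the documentation as quoted above; everything else is an assumption.
    """
    return ref_gb * (n_intervals / ref_intervals) * (n_samples / ref_samples)


# Hypothetical interval count for a whole exome; the real number depends on
# the capture kit and on how the intervals were preprocessed.
ASSUMED_EXOME_INTERVALS = 50_000

total_gb = estimate_memory_gb(ASSUMED_EXOME_INTERVALS, 10_000)
print(f"Estimated memory: {total_gb / 1000:.1f} TB")
```

A larger interval count or sample count increases the estimate proportionally, which is why I wrote ">=" above.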