I am working on a project looking at CNVs in African genomes. We have successfully run Genome STRiP on ~500 genomes and now want to run it on ~1000 genomes. When running it on this larger dataset, we started hitting memory limits. These were initially easy to overcome, but in CNVDiscovery we started to reach the limits of our cluster.
I read in other forum posts that large cohorts should be run in batches. If samples are run in batches, is there a recommended tool or pipeline for merging the VCF outputs of the batches?