GATK-SV HPC VM hardware requirements
Hi,
I would like to run GATK-SV on 30 WGS samples on our local HPC (where it is pre-installed). We can request anything from one core (4 GB of RAM) up to 16 cores (64 GB of RAM), and we can also request multiple instances of the 16-core version.
What is the minimum RAM/number of cores required to run GATK-SV on one sample vs. 30 samples?
Additionally, what size are the output VCF files and intermediate files, typically? (We also need to request storage space.)
Any help would be much appreciated.
Cheers,
Georgia
-
As you can see from the documentation in the GitHub repository https://github.com/broadinstitute/gatk-sv,
the GATK-SV pipeline has been tested only on GCP and is currently unsupported on any other execution platform. It consists of multiple steps of data collection and filtering across individual samples, so you may need to run each sample in parallel or separately depending on the resources you can get from the HPC. Each tool the workflow uses may need a different level of resources and may parallelize differently, so estimating what is ultimately necessary is not trivial.
Our main suggestion is to begin by setting up a Cromwell instance that uses your HPC as a back-end. As you start executing the different steps and tools, you may face errors and issues that each have to be addressed separately and will most likely differ from what the team experiences on GCP.
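For orientation, here is a minimal sketch of what such a back-end configuration could look like, modeled on the SLURM example in the Cromwell documentation; the scheduler flags, runtime-attribute names, and resource defaults below are assumptions you will need to adapt to your own cluster and scheduler:

```hocon
# Minimal Cromwell back-end sketch for a SLURM cluster (illustrative values only)
include required(classpath("application"))

backend {
  default = SLURM
  providers {
    SLURM {
      actor-factory = "cromwell.backend.impl.sfs.config.ConfigBackendLifecycleActorFactory"
      config {
        # Runtime attributes tasks may request; defaults here are placeholders
        runtime-attributes = """
        Int runtime_minutes = 600
        Int cpus = 2
        Int requested_memory_mb_per_core = 4000
        """

        # How Cromwell submits, kills, and polls jobs on the cluster
        submit = """
        sbatch -J ${job_name} -D ${cwd} -o ${out} -e ${err} \
          -t ${runtime_minutes} ${"-c " + cpus} \
          --mem-per-cpu=${requested_memory_mb_per_core} \
          --wrap "/bin/bash ${script}"
        """
        kill = "scancel ${job_id}"
        check-alive = "squeue -j ${job_id}"
        job-id-regex = "Submitted batch job (\\d+).*"
      }
    }
  }
}
```

You would then point Cromwell at this file when launching a workflow, e.g. `java -Dconfig.file=backend.conf -jar cromwell.jar run workflow.wdl -i inputs.json` (the file names here are placeholders for your own config, WDL, and inputs).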
Once you begin your journey, we would be glad to hear about your experiences, and we hope we can offer suggestions for any problems you encounter in the GATK-SV workflow.
Regards.