Following "GATK Best Practices for Structural Variation Discovery on Single Samples" without Docker
Hi,
I am trying to follow the "GATK Best Practices for Structural Variation Discovery on Single Samples", but I normally use a high-performance computing facility that does not support Docker. It is unclear to me how to generate the reference panel. Am I correct in thinking I can follow the Cromwell example using the conda version of GATK instead?
> python scripts/inputs/create_test_batch.py \
>     --execution-bucket gs://my-exec-bucket \
>     --final-workflow-outputs-dir gs://my-outputs-bucket \
>     metadata.json \
>     > inputs/values/my_ref_panel.json
> # Define your google project id (for Cromwell inputs) and Terra billing project (for workspace inputs)
> echo '{ "google_project_id": "my-google-project-id", "terra_billing_project_id": "my-terra-billing-project" }' > inputs/values/google_cloud.my_project.json
> # Build test files for batched workflows (google cloud project id required)
> python scripts/inputs/build_inputs.py \
>     inputs/values \
>     inputs/templates/test \
>     inputs/build/my_ref_panel/test \
>     -a '{ "test_batch" : "ref_panel_1kg", "cloud_env": "google_cloud.my_project" }'
> # Build test files for the single-sample workflow
> python scripts/inputs/build_inputs.py \
>     inputs/values \
>     inputs/templates/test/GATKSVPipelineSingleSample \
>     inputs/build/NA19240/test_my_ref_panel \
>     -a '{ "single_sample" : "test_single_sample_NA19240", "ref_panel" : "my_ref_panel" }'
> # Build files for a Terra workspace
> python scripts/inputs/build_inputs.py \
>     inputs/values \
>     inputs/templates/terra_workspaces/single_sample \
>     inputs/build/NA12878/terra_my_ref_panel \
>     -a '{ "single_sample" : "test_single_sample_NA12878", "ref_panel" : "my_ref_panel" }'
-
Hi Barbara Shih,
The standard GATK conda environment is not sufficient for creating the reference panel. Eventually you'll need all of the tools that are run in the GATK SV pipeline, and those are going to be difficult to set up outside of Docker. I'm going to ask our GATK SV development team if they have any suggestions for running the pipeline on-prem.
-Laura
-
Hi Barbara Shih,
The team has had limited success running the WDL with Cromwell locally (i.e. on a laptop) using Docker, but they think it could be done with a few small fixes. Running on an on-prem cluster without Docker would require a lot of changes to what we have today.
-L
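
For anyone trying to gauge how many tools would have to be installed by hand on a cluster without Docker, one rough starting point is to list the `docker` runtime attributes declared in the pipeline's WDL files. The Python sketch below is only an illustration and not part of the GATK SV repository: the function names `docker_attributes` and `scan_wdl_dir` are made up for this example, and it assumes each task declares its container on a single `docker: ...` line inside its runtime block. The values it finds may be WDL input variable names rather than image names, in which case the actual images would have to be looked up in the workflow inputs.

```python
import re
from pathlib import Path

# Matches a one-line runtime attribute such as
#   docker: some_docker_variable
#   docker: "ubuntu:18.04"
# Assumption: the attribute and its value sit on a single line.
DOCKER_ATTR = re.compile(r'^\s*docker\s*:\s*(.+?)\s*$', re.MULTILINE)


def docker_attributes(wdl_text):
    """Return the sorted, de-duplicated values of `docker:` attributes in WDL text."""
    return sorted({m.group(1).strip('"\'') for m in DOCKER_ATTR.finditer(wdl_text)})


def scan_wdl_dir(wdl_dir):
    """Map each .wdl file under wdl_dir to the docker attributes it declares."""
    result = {}
    for path in sorted(Path(wdl_dir).rglob("*.wdl")):
        values = docker_attributes(path.read_text())
        if values:
            result[str(path)] = values
    return result
```

Calling `scan_wdl_dir` on the directory that holds the workflow's `.wdl` files (the location depends on your checkout) returns a dictionary from file path to the container references used in that file. Each distinct entry corresponds to an environment that would need an equivalent non-Docker installation.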