MitochondriaPipeline on merged Bams
Hi everyone,
I'm trying to analyse a single-cell ATAC-seq dataset of ~2000 cells/BAM files using the MitochondriaPipeline.
While only ~600 of those cells/BAMs interest me, I figured it's nearly impossible to run the MitochondriaPipeline on all of them individually (that would be 600 separate runs).
Therefore I want to merge the BAM files (samtools merge) to make the analysis feasible.
The pipeline works fine on individual BAMs; however, when I give it a merged BAM (of e.g. two BAMs), it fails after a few minutes.
I checked the read groups: both input BAMs have one, and the merged BAM has two.
Versions:
GATK: 4.1.8.0
Cromwell: cromwell-51.jar
Command used:
"java -jar cromwell-51.jar run ./mitochondria_m2_wdl/MitochondriaPipeline.wdl --inputs ./mitochondria_m2_wdl/InputsMitochondriaPipeline.json"
Output (last part of it before failure):
...
# return exit code
exit $rc
[2020-11-12 11:54:14,75] [info] BackgroundConfigAsyncJobExecutionActor [c9a5400dAlignAndCall.GetContamination:NA:1]: job id: 100910
[2020-11-12 11:54:14,75] [info] BackgroundConfigAsyncJobExecutionActor [c9a5400dAlignAndCall.GetContamination:NA:1]: Status change from - to Done
[2020-11-12 11:54:16,27] [info] WorkflowManagerActor Workflow 35b98ea8-05f2-4050-868d-b3f30b42f4a2 failed (during ExecutingWorkflowState): cromwell.backend.standard.StandardAsyncExecutionActor$$anon$2: Failed to evaluate job outputs:
Bad output 'GetContamination.major_level': Failed to read_float("mean_het_major.txt") (reason 1 of 1): For input string: "0.935
0.935"
Bad output 'GetContamination.minor_level': Failed to read_float("mean_het_minor.txt") (reason 1 of 1): For input string: "0.049
0.025"
at cromwell.backend.standard.StandardAsyncExecutionActor.$anonfun$handleExecutionSuccess$1(StandardAsyncExecutionActor.scala:916)
at scala.util.Success.$anonfun$map$1(Try.scala:255)
at scala.util.Success.map(Try.scala:213)
at scala.concurrent.Future.$anonfun$map$1(Future.scala:292)
at scala.concurrent.impl.Promise.liftedTree1$1(Promise.scala:33)
at scala.concurrent.impl.Promise.$anonfun$transform$1(Promise.scala:33)
at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:64)
at akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:55)
at akka.dispatch.BatchingExecutor$BlockableBatch.$anonfun$run$1(BatchingExecutor.scala:92)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:85)
at akka.dispatch.BatchingExecutor$BlockableBatch.run(BatchingExecutor.scala:92)
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(ForkJoinExecutorConfigurator.scala:49)
at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
[2020-11-12 11:54:16,31] [info] WorkflowManagerActor WorkflowActor-35b98ea8-05f2-4050-868d-b3f30b42f4a2 is in a terminal state: WorkflowFailedState
[2020-11-12 11:54:21,36] [info] SingleWorkflowRunnerActor workflow finished with status 'Failed'.
[2020-11-12 11:54:24,75] [info] Workflow polling stopped
[2020-11-12 11:54:24,77] [info] 0 workflows released by cromid-3cf049e
[2020-11-12 11:54:24,77] [info] Shutting down WorkflowStoreActor - Timeout = 5 seconds
[2020-11-12 11:54:24,77] [info] Shutting down WorkflowLogCopyRouter - Timeout = 5 seconds
[2020-11-12 11:54:24,77] [info] Shutting down JobExecutionTokenDispenser - Timeout = 5 seconds
[2020-11-12 11:54:24,78] [info] JobExecutionTokenDispenser stopped
[2020-11-12 11:54:24,78] [info] Aborting all running workflows.
[2020-11-12 11:54:24,78] [info] WorkflowStoreActor stopped
[2020-11-12 11:54:24,79] [info] WorkflowLogCopyRouter stopped
[2020-11-12 11:54:24,79] [info] Shutting down WorkflowManagerActor - Timeout = 3600 seconds
[2020-11-12 11:54:24,79] [info] WorkflowManagerActor All workflows finished
[2020-11-12 11:54:24,79] [info] WorkflowManagerActor stopped
[2020-11-12 11:54:24,97] [info] Connection pools shut down
[2020-11-12 11:54:24,98] [info] Shutting down SubWorkflowStoreActor - Timeout = 1800 seconds
[2020-11-12 11:54:24,98] [info] Shutting down JobStoreActor - Timeout = 1800 seconds
[2020-11-12 11:54:24,98] [info] Shutting down CallCacheWriteActor - Timeout = 1800 seconds
[2020-11-12 11:54:24,98] [info] Shutting down ServiceRegistryActor - Timeout = 1800 seconds
[2020-11-12 11:54:24,98] [info] Shutting down DockerHashActor - Timeout = 1800 seconds
[2020-11-12 11:54:24,98] [info] Shutting down IoProxy - Timeout = 1800 seconds
[2020-11-12 11:54:24,98] [info] SubWorkflowStoreActor stopped
[2020-11-12 11:54:24,98] [info] CallCacheWriteActor Shutting down: 0 queued messages to process
[2020-11-12 11:54:24,98] [info] JobStoreActor stopped
[2020-11-12 11:54:24,98] [info] WriteMetadataActor Shutting down: 0 queued messages to process
[2020-11-12 11:54:24,98] [info] KvWriteActor Shutting down: 0 queued messages to process
[2020-11-12 11:54:24,98] [info] CallCacheWriteActor stopped
[2020-11-12 11:54:24,98] [info] IoProxy stopped
[2020-11-12 11:54:24,98] [info] DockerHashActor stopped
[2020-11-12 11:54:24,99] [info] ServiceRegistryActor stopped
[2020-11-12 11:54:25,01] [info] Shutting down connection pool: curAllocated=0 idleQueues.size=0 waitQueue.size=0 maxWaitQueueLimit=256 closed=false
[2020-11-12 11:54:25,01] [info] Shutting down connection pool: curAllocated=1 idleQueues.size=1 waitQueue.size=0 maxWaitQueueLimit=256 closed=false
[2020-11-12 11:54:25,01] [info] Shutting down connection pool: curAllocated=0 idleQueues.size=0 waitQueue.size=0 maxWaitQueueLimit=256 closed=false
[2020-11-12 11:54:25,01] [info] Shutting down connection pool: curAllocated=0 idleQueues.size=0 waitQueue.size=0 maxWaitQueueLimit=256 closed=false
[2020-11-12 11:54:25,04] [info] Database closed
[2020-11-12 11:54:25,04] [info] Stream materializer shut down
[2020-11-12 11:54:25,04] [info] WDL HTTP import resolver closed
Workflow 35b98ea8-05f2-4050-868d-b3f30b42f4a2 transitioned to state Failed
-------------------------------------------------------------------------------------------------------------------------
So the problem seems to be in the AlignAndCall.GetContamination step.
There isn't even a stderr file with a detailed error log to look at. Do you know what could trigger this behaviour? Or is it even possible to analyse merged BAM files?
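For context, the failure itself looks purely mechanical: WDL's read_float() expects the output file to hold a single value, but the log shows each file carrying two (possibly one per read group — that's a guess on my part). A quick sketch of the parse problem, using the file name from the log:

```shell
# read_float() expects exactly one number, but the log shows
# mean_het_major.txt carrying two lines ("0.935" twice). Reproducing that:
printf '0.935\n0.935\n' > mean_het_major.txt

lines=$(wc -l < mean_het_major.txt)
echo "mean_het_major.txt has ${lines} line(s)"
if [ "$lines" -ne 1 ]; then
  echo "read_float() would fail: the file holds more than one value"
fi
```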
Thanks for suggestions,
Valentin
-
Hello GATK team, Genevieve Brandt (she/her),
so I figured the more important questions in this issue are:
- Does the MitochondriaPipeline work on merged BAMs? Is that tested or known?
- Does it allow only one read group? My merged BAMs have two or more.
Answering those questions should settle it. If the pipeline doesn't work on merged BAMs, another alternative would be to make it run automatically for larger sample numbers.
So do you know of any way, or maybe a script, that could run this pipeline for e.g. 20 BAM files (individually, but) automatically?
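To sketch what I mean by "automatically", something along these lines might do it: generate one Cromwell inputs JSON per BAM from a template and launch the pipeline in a loop. The JSON key `MitochondriaPipeline.input_bam` and the `BAM_PLACEHOLDER` token are assumptions here — they would need to match the real keys in InputsMitochondriaPipeline.json:

```shell
#!/usr/bin/env bash
# Sketch of a per-sample driver: prepare one inputs file per BAM, then launch
# the pipeline for each. Key name and placeholder token are assumptions.
set -euo pipefail

# Demo fixtures so the loop below is self-contained; point these at real data.
mkdir -p bams
touch bams/sampleA.bam bams/sampleB.bam
printf '{"MitochondriaPipeline.input_bam": "BAM_PLACEHOLDER"}\n' > template.json

for bam in bams/*.bam; do
  sample=$(basename "$bam" .bam)
  # Substitute this BAM's path into a per-sample inputs JSON
  sed "s|BAM_PLACEHOLDER|${bam}|g" template.json > "inputs_${sample}.json"
  echo "prepared inputs_${sample}.json"
  # Uncomment to actually launch one run per BAM:
  # java -jar cromwell-51.jar run ./mitochondria_m2_wdl/MitochondriaPipeline.wdl \
  #   --inputs "inputs_${sample}.json"
done
```

Each iteration is independent, so the loop body could also be submitted to a cluster scheduler instead of running serially.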
Thanks
-
Thanks for the feedback, Genevieve Brandt (she/her).
Terra requires a Google billing account to be linked, right? I'm currently working on that.
-