BwaMemIndexImageCreator: Expected running time for 1 TB db
I've been running BwaMemIndexImageCreator on a 1 TB db, and it is still running after a month. Do you have any benchmark data that I can use to predict its expected finish time?
I am running it on a high-performance server:
64 cores
512 GB RAM + 4 TB swap memory
Thank you!
-
Thanks Renald James Legaspi for the update. It does look like it's still actively running so let it keep running for now and I'll reach out to Mark Walker regarding the 64 GB db.
-
The pre-built database is within the PathSeq resource bundle hosted on Google Cloud Platform: gs://gatk-best-practices/pathseq/resources/.
A 1 TB database can easily take longer than a week; the one we have took a few days. You are using a lot of swap memory, so the process could have slowed down. You might consider decreasing the database size, because 1 TB is very large and may not be practical to run. The readme in the PathSeq resource bundle bucket describes some strategies we used to get the database size down.
Let me know if you have any other questions.
Best,
Genevieve
-
I'm glad the resources are helpful! Let us know if you have other questions.
-
Do you think that BwaMemIndexImageCreator is still running or is it hung at a certain position?
We don't have any benchmark data for this tool.
Best,
Genevieve
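
One rough way to tell a running process from a hung one is to sample its accumulated CPU time twice and compare. The following is a minimal sketch, assuming a Linux server with /proc available; the helper names (parse_cpu_ticks, cpu_seconds, cpu_seconds_used) are hypothetical and not part of GATK or BWA.

```python
import os
import time


def parse_cpu_ticks(stat_line: str) -> int:
    """Return utime + stime (in clock ticks) from a /proc/<pid>/stat line."""
    # The command name (field 2) is wrapped in parentheses and may itself
    # contain spaces or parentheses, so split only what follows the last ')'.
    fields = stat_line[stat_line.rfind(")") + 2:].split()
    # fields[0] is field 3 (state); utime and stime are fields 14 and 15.
    return int(fields[11]) + int(fields[12])


def cpu_seconds(pid: int) -> float:
    """Total CPU seconds consumed so far by the process with this PID."""
    with open(f"/proc/{pid}/stat") as handle:
        ticks = parse_cpu_ticks(handle.read())
    return ticks / os.sysconf("SC_CLK_TCK")


def cpu_seconds_used(pid: int, interval: float = 60.0) -> float:
    """CPU seconds the process consumed during `interval` wall-clock seconds."""
    before = cpu_seconds(pid)
    time.sleep(interval)
    return cpu_seconds(pid) - before


# Example (assumed PID): print(cpu_seconds_used(12345, interval=60.0))
```

A result near zero over a long interval suggests the process is stalled or waiting, for example on swap I/O. A clearly positive result only shows that the process is consuming CPU; it does not say how far along the index construction is.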
-
Hi Genevieve,
I believe that it is still running the step
'[bwa_index] Construct BWT for the packed sequence...'
The memory and CPU are both still in use, as shown in the following photo.
Oh, thank you for letting me know; I will just do the benchmarking myself. Anyway, I've seen from this post here that the largest microbe db you have built for the PathSeq pipeline is around 64 GB. Is it publicly available, or is there any way I could look into it?
Thank you!
-
Hi Genevieve,
We have decided to terminate the run and proceed with truncating the db, as you suggested.
Thank you for the resource link. It is a great help to us.