Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

MarkDuplicatesSpark failed



    Genevieve Brandt (she/her)

    Hi lid.zigh,

    It can sometimes be complicated to find the root of the issue when running Spark tools. The part of your stack trace that seems most illuminating is here:

    21/11/21 05:44:43 ERROR ShuffleBlockFetcherIterator: Error occurred while fetching local blocks
    java.nio.file.NoSuchFileException: /scratch/lzeighami/WLCH3023/DownSample.60X/tmp/blockmgr-5057f706-90a5-4ef2-bc8f-7fd9d6e1832c/3a/shuffle_3_38190_0.index

    Spark is not finding one of its shuffle index files here. This could come from a few different problems with the command, but first I would recommend checking that your temporary directory has enough space to hold the temporary files that Spark creates. You'll also want to make sure that there is no limit on the number of files that can exist in the temporary directory.
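As a rough way to run both checks on the temporary directory, here is a minimal Python sketch. The function name `check_scratch` and the thresholds are hypothetical placeholders, not GATK or Spark recommendations. It reports only filesystem-level free space and free inodes on Unix-like systems. Per-user quotas set by a cluster administrator will not show up here and need the site's own quota tool.

```python
import os
import shutil


def check_scratch(path, min_free_gib=100.0, min_free_inodes=1_000_000):
    """Report free space and free inodes for the filesystem holding `path`.

    The default thresholds are illustrative assumptions; choose values
    that match the size of your own job.
    """
    usage = shutil.disk_usage(path)   # total, used, free (in bytes)
    stats = os.statvfs(path)          # Unix-only filesystem statistics

    free_gib = usage.free / 2**30

    # Some filesystems report 0 total inodes, meaning no fixed inode limit
    # is exposed; in that case the inode check is skipped.
    inode_limit_reported = stats.f_files > 0
    free_inodes = stats.f_favail if inode_limit_reported else None

    return {
        "free_gib": free_gib,
        "free_inodes": free_inodes,
        "enough_space": free_gib >= min_free_gib,
        "enough_inodes": free_inodes is None or free_inodes >= min_free_inodes,
    }
```

Calling `check_scratch("/tmp")`, or passing the directory given to `spark.local.dir`, before launching the job gives a quick indication of whether either limit is likely to be hit.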

    You might find more success changing your temporary directory to the system temp folder rather than one within your subdirectory. Potentially changing

    --conf 'spark.local.dir=tmp' \

    to

    --conf 'spark.local.dir=/tmp' \

    Let me know if this works or you have further issues/questions!



    Hello Genevieve,

    Thank you for your kind help. I will try my script with the system tmp folder instead of my own tmp folder and hope it solves the issue. I will keep you updated.

    Thank you,



    Genevieve Brandt (she/her)

    Sounds good, let me know!

