Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

MarkDuplicatesSpark running, but not sorting/creating deduped bam files

3 comments

  • Genevieve Brandt (she/her)

    Hi Vai Pathak,

    Spark opens a large number of temporary files while it runs, so your issue is most likely caused by your system's limit on how many files can be open at once. Raising that limit should get around the problem. We don't have any documentation about this on our site, but we found this article that can help:

    3 Methods to Change the Number of Open File Limit in Linux

    You can quickly check or set the limit on the number of open files with the ulimit command. If that resolves the issue, you can then make the change permanent in your system configuration files.

    Best,

    Genevieve

  • Vai Pathak

    Thanks Genevieve, that actually did the trick =) 

    Appreciate the help!

    Thanks, 

    Vai

  • Genevieve Brandt (she/her)

    Great, thank you for the update Vai Pathak!

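For anyone landing on this thread with the same symptom, here is a minimal sketch of the check-and-raise step that the first reply describes with ulimit, written in Python using the standard resource module (Unix only). The function name raise_open_file_limit and the target of 4096 are illustrative assumptions, not values recommended by GATK or taken from this thread. The change applies only to the current process and to any child processes it starts afterwards.

```python
import resource


def raise_open_file_limit(target=4096):
    """Raise the soft open-file limit of this process toward `target`.

    `target` is an arbitrary illustrative value, not a GATK recommendation.
    Returns the (soft, hard) limits in effect after the call.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)

    # An unprivileged process cannot raise its soft limit above the hard limit.
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)

    # Leave the limit alone if it is already at least as high as the target.
    if soft == resource.RLIM_INFINITY or soft >= target:
        return soft, hard

    resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    return resource.getrlimit(resource.RLIMIT_NOFILE)


if __name__ == "__main__":
    print("before:", resource.getrlimit(resource.RLIMIT_NOFILE))
    print("after: ", raise_open_file_limit())
```

Because resource limits are inherited, a job launched from the same script after this call would run under the raised soft limit. Going beyond the hard limit, or making the change permanent, still requires editing the system configuration as described in the reply above.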
