Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data


MarkDuplicatesSpark

  • Ury Alon

    Hi,

    I'm using MarkDuplicatesSpark to merge large per-lane BAM files.

    The command fails with the following output:

    ...
    WARN HtsjdkReadsRddStorage: Unrecognized write option: DISABLE
    Using GATK jar /opt/bin/gatk/gatk-package-4.2.0.0-local.jar
    Running:
    java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -Xmx1000G -Xms400G -jar /opt/bin/gatk/gatk-package-4.2.0.0-local.jar MarkDuplicatesSpark --spark-master local[96] --input in.1.bam --input in.2.bam --input in.bam --input in.4.bam --input in.5.bam --input in.5.bam --output out.bam --metrics-file out.txt --create-output-bam-splitting-index false --create-output-variant-index false --tmp-dir ./tmp --output-shard-tmp-dir ./tmp/shard --QUIET false --verbosity WARNING --spark-verbosity WARN --conf spark.network.timeout=10000s --conf spark.executor.heartbeatInterval=1000s
    Command exited with non-zero status 247

    I could not find any documentation for this exit status. Could you please assist?

    Thanks,

      Ury

     
