Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

PathSeqBwaSpark

1 comment

  • Amir Hozhabrpour

    Hi, I ran PathSeqBwaSpark on the tutorial sample BAM file on a Linux server with 12 threads and 130 GB of RAM. It uses only 63 GB of RAM. CPU usage shows 95%, but only one thread is in use, and that thread keeps moving between the 12 cores.

    About 130 hours have passed so far, and the run has still not finished.

    My command is:

    gatk --java-options "-Xmx130g -Xms95g -Djava.io.tmpdir=${TMP_DIR} -Dsamjdk.compression_level=1" PathSeqBwaSpark \
        --paired-input o1_reads_paired.bam \
        --unpaired-input o1_reads_unpaired.bam \
        --paired-output o2_reads_paired.bam \
        --unpaired-output o2_reads_unpaired.bam \
        --microbe-bwa-image /mnt/disk1/PathSeq/pathseq_microbe.fa.img \
        --microbe-fasta /mnt/disk1/PathSeq/pathseq_microbe.fa

     

    How can I speed up the analysis?

    Thank you for your reply.

     

     

     

