Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data

Sample does not have a non-negative sample median when running DenoiseReadCounts to test the example data Tutorial11682



  • Genevieve Brandt (she/her)

    Hi Yuanyuan Wu,

    Could you confirm that you are following all the steps in the tutorial you listed? Please give the previous commands as well.

    Thank you,


  • Yuanyuan Wu

    Yes, I can confirm that. The only difference is I used "CollectReadCounts" instead of "CollectFragmentCounts" because the GATK version is different.

    The previous steps: 

    gatk --java-options "-Xmx16G" PreprocessIntervals \
        -L /tutorial_11682/targets_C.interval_list \
        --reference /reference/Homo_sapiens_assembly38.fasta \
        --bin-length 0 \
        --interval-merging-rule OVERLAPPING_ONLY \
        --output /tutorial_11682/test/targets_C.preprocessed.interval_list

    gatk --java-options "-Xmx16G" CollectReadCounts \
        -L /tutorial_11682/tutorial_11682/targets_C.preprocessed.interval_list \
        -I /tutorial_11682/tutorial_11682/tumor.bam \
        --interval-merging-rule OVERLAPPING_ONLY \
        --output /tutorial_11682/test/tumor.counts.hdf5

    gatk --java-options "-Xmx16G" CreateReadCountPanelOfNormals \
        --input /tutorial_11682/HG00133.alt_bwamem_GRCh38DH.20150826.GBR.exome.counts.hdf5 \
        --input /tutorial_11682/HG00733.alt_bwamem_GRCh38DH.20150826.PUR.exome.counts.hdf5 \
        --input /tutorial_11682/NA19654.alt_bwamem_GRCh38DH.20150826.MXL.exome.counts.hdf5 \
        --minimum-interval-median-percentile 5.0 \
        --output /tutorial_11682/test/cnvponC.pon.hdf5


  • Genevieve Brandt (she/her)

    Hi Yuanyuan Wu,

    The error message is reporting that the sample in your tumor.counts.hdf5 file does not have a non-negative sample median (it is most likely zero). You can examine the count file to verify whether that is true.
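One possible way to make that check, offered as a rough sketch and not as a step from the tutorial: CollectReadCounts should accept --format TSV to write plain text instead of HDF5. The sketch assumes the TSV has SAM-style header lines starting with "@", then a column header line (CONTIG, START, END, COUNT), then one row per interval. The helper name median_count and the file name tumor.counts.tsv are hypothetical. It takes the median over every interval in the file, which may differ from the subset DenoiseReadCounts evaluates after applying the panel of normals.

```shell
# Assumed re-run of the counting step with TSV output (same inputs as before):
#   gatk CollectReadCounts -L targets_C.preprocessed.interval_list -I tumor.bam \
#       --interval-merging-rule OVERLAPPING_ONLY --format TSV \
#       --output tumor.counts.tsv

# median_count FILE: print the median of the COUNT column (4th field).
median_count() {
    # Drop "@" header lines and the column header, keep only the counts.
    grep -v '^@' "$1" |
        awk -F'\t' '$1 != "CONTIG" { print $4 }' |
        sort -n |
        awk '{ v[NR] = $1 }
             END {
                 if (NR == 0) { print "NA"; exit }
                 if (NR % 2) { print v[(NR + 1) / 2] } else { print (v[NR / 2] + v[NR / 2 + 1]) / 2 }
             }'
}

# Usage (hypothetical file name):
#   median_count tumor.counts.tsv
```

A result of 0 would mean that at least half of the intervals received no reads.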

    Could you post the stack trace from when you ran CollectReadCounts?

    Thank you,


  • Yuanyuan Wu

    Please see below for the output I got after running CollectReadCounts. How can I examine the count file to verify whether that is true?

    Using GATK jar /gatk/gatk-package-
    java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -Xmx16G -jar /gatk/gatk-package- CollectReadCounts -L /tutorial_11682/targets_C.preprocessed.interval_list -I /test/tumor.counts.hdf5
    23:23:56.580 INFO NativeLibraryLoader - Loading from jar:file:/gatk/gatk-package-!/com/intel/gkl/native/
    Jan 14, 2021 11:23:58 PM runningOnComputeEngine
    INFO: Failed to detect whether we are running on Google Compute Engine.
    23:23:58.331 INFO CollectReadCounts - ------------------------------------------------------------
    23:23:58.332 INFO CollectReadCounts - The Genome Analysis Toolkit (GATK) v4.1.2.0
    23:23:58.332 INFO CollectReadCounts - For support and documentation go to
    23:23:58.332 INFO CollectReadCounts - Executing as ywu244@login005 on Linux v3.10.0-957.21.3.el7.x86_64 amd64
    23:23:58.332 INFO CollectReadCounts - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_191-8u191-b12-0ubuntu0.16.04.1-b12
    23:23:58.332 INFO CollectReadCounts - Start Date/Time: January 14, 2021 11:23:56 PM UTC
    23:23:58.332 INFO CollectReadCounts - ------------------------------------------------------------
    23:23:58.332 INFO CollectReadCounts - ------------------------------------------------------------
    23:23:58.333 INFO CollectReadCounts - HTSJDK Version: 2.19.0
    23:23:58.333 INFO CollectReadCounts - Picard Version: 2.19.0
    23:23:58.333 INFO CollectReadCounts - HTSJDK Defaults.COMPRESSION_LEVEL : 2
    23:23:58.333 INFO CollectReadCounts - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
    23:23:58.333 INFO CollectReadCounts - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
    23:23:58.333 INFO CollectReadCounts - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
    23:23:58.333 INFO CollectReadCounts - Deflater: IntelDeflater
    23:23:58.333 INFO CollectReadCounts - Inflater: IntelInflater
    23:23:58.333 INFO CollectReadCounts - GCS max retries/reopens: 20
    23:23:58.333 INFO CollectReadCounts - Requester pays: disabled
    23:23:58.334 INFO CollectReadCounts - Initializing engine
    WARNING: BAM index file tutorial_11682/tumor.bai is older than BAM /tutorial_11682/tumor.bam
    23:23:59.162 INFO IntervalArgumentCollection - Processing 115110897 bp from intervals
    23:23:59.197 INFO CollectReadCounts - Done initializing engine
    23:23:59.200 INFO CollectReadCounts - Collecting read counts...
    23:23:59.201 INFO ProgressMeter - Starting traversal
    23:23:59.201 INFO ProgressMeter - Current Locus Elapsed Minutes Reads Processed Reads/Minute
    23:24:09.205 INFO ProgressMeter - chr17:59963133 0.2 2475000 14847030.6
    23:24:13.057 INFO CollectReadCounts - 1180729 read(s) filtered by: ((((WellformedReadFilter AND MappedReadFilter) AND NonZeroReferenceLengthAlignmentReadFilter) AND NotDuplicateReadFilter) AND MappingQualityReadFilter)
    920554 read(s) filtered by: (((WellformedReadFilter AND MappedReadFilter) AND NonZeroReferenceLengthAlignmentReadFilter) AND NotDuplicateReadFilter)
    28682 read(s) filtered by: ((WellformedReadFilter AND MappedReadFilter) AND NonZeroReferenceLengthAlignmentReadFilter)
    28682 read(s) filtered by: (WellformedReadFilter AND MappedReadFilter)
    28682 read(s) filtered by: MappedReadFilter
    891872 read(s) filtered by: NotDuplicateReadFilter
    260175 read(s) filtered by: MappingQualityReadFilter

    23:24:13.057 INFO ProgressMeter - chr22:26297032 0.2 3608572 15626033.5
    23:24:13.057 INFO ProgressMeter - Traversal complete. Processed 3608572 total reads in 0.2 minutes.
    23:24:13.058 INFO CollectReadCounts - Writing read counts to /tutorial_11682/test/tumor.counts.hdf5...
    log4j:WARN No appenders could be found for logger (org.broadinstitute.hdf5.HDF5Library).
    log4j:WARN Please initialize the log4j system properly.
    log4j:WARN See for more info.
    23:24:13.499 INFO CollectReadCounts - CollectReadCounts complete.
    23:24:13.500 INFO CollectReadCounts - Shutting down engine
    [January 14, 2021 11:24:13 PM UTC] done. Elapsed time: 0.28 minutes.

  • Genevieve Brandt (she/her)

    Hi Yuanyuan Wu,

    I brought this up with our developer team to confirm it is not a bug, and I think we have found the issue! The input for DenoiseReadCounts should be a file downloaded from the tutorial, hcc1143_T_clean.counts.hdf5, not the file tutorial_11682/test/tumor.counts.hdf5.

    For CollectFragmentCounts, there is a sentence we missed in the tutorial: "The tutorial does not use the resulting file in subsequent steps."

    So, the CollectReadCounts command is run as an example, but it is not meant to be used in the next steps. You can use the provided file (hcc1143_T_clean.counts.hdf5) as input to DenoiseReadCounts.
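As a sketch of what that looks like on the command line (the directory variables, the -Xmx16G setting, and the dry-run wrapper are illustrative assumptions; the output file names follow those used later in this thread), the following prints the DenoiseReadCounts call with the provided counts file as input. It only runs gatk when DRY_RUN=0 is set.

```shell
# Hypothetical locations; adjust to where the tutorial bundle was unpacked
# and where the panel of normals was written.
TUTORIAL_DIR="${TUTORIAL_DIR:-/tutorial_11682}"
OUT_DIR="${OUT_DIR:-/tutorial_11682/test}"

# Print the command; execute it only when DRY_RUN is set to 0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

denoise() {
    # Input is the provided hcc1143_T_clean.counts.hdf5, not tumor.counts.hdf5.
    run gatk --java-options "-Xmx16G" DenoiseReadCounts \
        -I "$TUTORIAL_DIR/hcc1143_T_clean.counts.hdf5" \
        --count-panel-of-normals "$OUT_DIR/cnvponC.pon.hdf5" \
        --standardized-copy-ratios "$OUT_DIR/hcc1143_T_clean.standardizedCR.tsv" \
        --denoised-copy-ratios "$OUT_DIR/hcc1143_T_clean.denoisedCR.tsv"
}

denoise
```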

    Hope this resolves the problem! Have a good weekend,


  • Yuanyuan Wu

    Ok, great. Thanks for your clarification. I need to run CollectReadCounts on the tumor samples from my own project anyway, am I right?

    Apart from that, I got two other errors when plotting figures with PlotDenoisedCopyRatios and PlotModeledSegments. The issue is that I do get the results from this step, but each tool also throws an error. Please see below.

    gatk PlotDenoisedCopyRatios \
        --standardized-copy-ratios /tutorial_11682/test_01122021/hcc1143_T_clean.standardizedCR.tsv \
        --denoised-copy-ratios /test_01122021/hcc1143_T_clean.denoisedCR.tsv \
        --sequence-dictionary /reference/Homo_sapiens_assembly38.dict \
        --minimum-contig-length 46709983 \
        --output /tutorial_11682/test/plots \
        --output-prefix hcc1143_T_clean
    Using GATK jar /gatk/gatk-package-
    java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tri/tutorial_11682/test/hcc1143_T_clean.standardizedCR.tsv --denoised-copy-ratios /tutorial_11682/test/hcc1143_T_clean.denoisedCR.tsv --sequence-dictionary /reference/Homo_sapiens_assembly38.dict --minimum-contig-length 46709983 --output tutorial_11682/test/plots --output-prefix hcc1143_T_clean
    17:02:14.660 INFO NativeLibraryLoader - Loading from jar:file:/gatk/gatk-package-!/com/intel/gkl/native/
    Jan 13, 2021 5:02:16 PM runningOnComputeEngine
    INFO: Failed to detect whether we are running on Google Compute Engine.
    17:02:16.325 INFO PlotDenoisedCopyRatios - ------------------------------------------------------------
    17:02:16.325 INFO PlotDenoisedCopyRatios - The Genome Analysis Toolkit (GATK) v4.1.2.0
    17:02:16.325 INFO PlotDenoisedCopyRatios - For support and documentation go to
    17:02:16.326 INFO PlotDenoisedCopyRatios - Executing as ywu244@login004 on Linux v3.10.0-957.21.3.el7.x86_64 amd64
    17:02:16.326 INFO PlotDenoisedCopyRatios - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_191-8u191-b12-0ubuntu0.16.04.1-b12
    17:02:16.326 INFO PlotDenoisedCopyRatios - Start Date/Time: January 13, 2021 5:02:14 PM UTC
    17:02:16.326 INFO PlotDenoisedCopyRatios - ------------------------------------------------------------
    17:02:16.326 INFO PlotDenoisedCopyRatios - ------------------------------------------------------------
    17:02:16.327 INFO PlotDenoisedCopyRatios - HTSJDK Version: 2.19.0
    17:02:16.327 INFO PlotDenoisedCopyRatios - Picard Version: 2.19.0
    17:02:16.327 INFO PlotDenoisedCopyRatios - HTSJDK Defaults.COMPRESSION_LEVEL : 2
    17:02:16.327 INFO PlotDenoisedCopyRatios - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
    17:02:16.327 INFO PlotDenoisedCopyRatios - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
    17:02:16.327 INFO PlotDenoisedCopyRatios - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
    17:02:16.328 INFO PlotDenoisedCopyRatios - Deflater: IntelDeflater
    17:02:16.328 INFO PlotDenoisedCopyRatios - Inflater: IntelInflater
    17:02:16.328 INFO PlotDenoisedCopyRatios - GCS max retries/reopens: 20
    17:02:16.328 INFO PlotDenoisedCopyRatios - Requester pays: disabled
    17:02:16.328 INFO PlotDenoisedCopyRatios - Initializing engine
    17:02:16.328 INFO PlotDenoisedCopyRatios - Done initializing engine
    17:02:16.373 INFO PlotDenoisedCopyRatios - Reading and validating input files...
    17:02:17.461 INFO PlotDenoisedCopyRatios - Contigs above length threshold: {chr1=248956422, chr2=242193529, chr3=198295559, chr4=190214555, chr5=181538259, chr6=170805979, chr7=159345973, chr8=145138636, chr9=138394717, chr10=133797422, chr11=135086622, chr12=133275309, chr13=114364328, chr14=107043718, chr15=101991189, chr16=90338345, chr17=83257441, chr18=80373285, chr19=58617616, chr20=64444167, chr21=46709983, chr22=50818468, chrX=156040895, chrY=57227415}
    17:02:17.558 INFO PlotDenoisedCopyRatios - Writing plots to /tutorial_11682/test/plots...
    17:02:20.487 INFO PlotDenoisedCopyRatios - PlotDenoisedCopyRatios complete.
    17:02:20.487 INFO PlotDenoisedCopyRatios - Shutting down engine
    [January 13, 2021 5:02:20 PM UTC] done. Elapsed time: 0.10 minutes.
    Exception in thread "Thread-1" htsjdk.samtools.util.RuntimeIOException: java.nio.file.NoSuchFileException: /tmp/Rlib.7048755237054159199
    at htsjdk.samtools.util.IOUtil.recursiveDelete(
    Caused by: java.nio.file.NoSuchFileException: /tmp/Rlib.7048755237054159199
    at sun.nio.fs.UnixException.translateToIOException(
    at sun.nio.fs.UnixException.rethrowAsIOException(
    at sun.nio.fs.UnixException.rethrowAsIOException(
    at sun.nio.fs.UnixFileAttributeViews$Basic.readAttributes(
    at sun.nio.fs.UnixFileSystemProvider.readAttributes(
    at sun.nio.fs.LinuxFileSystemProvider.readAttributes(
    at java.nio.file.Files.readAttributes(
    at java.nio.file.FileTreeWalker.getAttributes(
    at java.nio.file.FileTreeWalker.visit(
    at java.nio.file.FileTreeWalker.walk(
    at java.nio.file.Files.walkFileTree(
    at java.nio.file.Files.walkFileTree(
    at htsjdk.samtools.util.IOUtil.recursiveDelete(
    ... 3 more

    gatk PlotModeledSegments \
        --denoised-copy-ratios //tutorial_11682/test/hcc1143_T_clean.denoisedCR.tsv \
        --allelic-counts /tutorial_11683/test/hcc1143_T_clean.hets.tsv \
        --segments /tutorial_11683/test/hcc1143_T_clean.modelFinal.seg \
        --sequence-dictionary /reference/Homo_sapiens_assembly38.dict \
        --minimum-contig-length 46709983 \
        --output /tutorial_11683/test/plots \
        --output-prefix hcc1143_T_clean
    Using GATK jar /gatk/gatk-package-
    java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /gatk/gatk-package- PlotModeledSegments --denoised-copy-ratios /scratch/ywu244/training/fastqc/ael_Cavnar/12042020_CNV/tutorial_11682/test/hcc1143_T_clean.denoisedCR.tsv --allelic-counts /tutorial_11683/test/hcc1143_T_clean.hets.tsv --segments /scratch/ywu244/training/fastqc/Mic
    tutorial_11683/test/hcc1143_T_clean.modelFinal.seg --sequence-dictionary /reference/Homo_sapiens_assembly38.dict --minimum-contig-length 46709983 --output /tutorial_11683/test/plots --output-prefix hcc1143_T_clean
    20:35:51.183 INFO NativeLibraryLoader - Loading from jar:file:/gatk/gatk-package-!/com/intel/gkl/native/
    Jan 13, 2021 8:35:52 PM runningOnComputeEngine
    INFO: Failed to detect whether we are running on Google Compute Engine.
    20:35:52.846 INFO PlotModeledSegments - ------------------------------------------------------------
    20:35:52.847 INFO PlotModeledSegments - The Genome Analysis Toolkit (GATK) v4.1.2.0
    20:35:52.847 INFO PlotModeledSegments - For support and documentation go to
    20:35:52.847 INFO PlotModeledSegments - Executing as ywu244@login002 on Linux v3.10.0-957.21.3.el7.x86_64 amd64
    20:35:52.847 INFO PlotModeledSegments - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_191-8u191-b12-0ubuntu0.16.04.1-b12
    20:35:52.847 INFO PlotModeledSegments - Start Date/Time: January 13, 2021 8:35:51 PM UTC
    20:35:52.847 INFO PlotModeledSegments - ------------------------------------------------------------
    20:35:52.847 INFO PlotModeledSegments - ------------------------------------------------------------
    20:35:52.848 INFO PlotModeledSegments - HTSJDK Version: 2.19.0
    20:35:52.848 INFO PlotModeledSegments - Picard Version: 2.19.0
    20:35:52.849 INFO PlotModeledSegments - HTSJDK Defaults.COMPRESSION_LEVEL : 2
    20:35:52.849 INFO PlotModeledSegments - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
    20:35:52.849 INFO PlotModeledSegments - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
    20:35:52.849 INFO PlotModeledSegments - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
    20:35:52.849 INFO PlotModeledSegments - Deflater: IntelDeflater
    20:35:52.849 INFO PlotModeledSegments - Inflater: IntelInflater
    20:35:52.849 INFO PlotModeledSegments - GCS max retries/reopens: 20
    20:35:52.849 INFO PlotModeledSegments - Requester pays: disabled
    20:35:52.849 INFO PlotModeledSegments - Initializing engine
    20:35:52.849 INFO PlotModeledSegments - Done initializing engine
    20:35:52.851 INFO PlotModeledSegments - Reading and validating input files...
    20:35:53.747 INFO PlotModeledSegments - Contigs above length threshold: {chr1=248956422, chr2=242193529, chr3=198295559, chr4=190214555, chr5=181538259, chr6=170805979, chr7=159345973, chr8=145138636, chr9=138394717, chr10=133797422, chr11=135086622, chr12=133275309, chr13=114364328, chr14=107043718, chr15=101991189, chr16=90338345, chr17=83257441, chr18=80373285, chr19=58617616, chr20=64444167, chr21=46709983, chr22=50818468, chrX=156040895, chrY=57227415}
    20:35:53.852 INFO PlotModeledSegments - Writing plot to /12042020_CNV/tutorial_11683/test/plots/hcc1143_T_clean.modeled.png...
    20:35:56.358 INFO PlotModeledSegments - PlotModeledSegments complete.
    20:35:56.358 INFO PlotModeledSegments - Shutting down engine
    [January 13, 2021 8:35:56 PM UTC] done. Elapsed time: 0.09 minutes.
    Exception in thread "Thread-1" htsjdk.samtools.util.RuntimeIOException: java.nio.file.NoSuchFileException: /tmp/Rlib.3280229804452061538
    at htsjdk.samtools.util.IOUtil.recursiveDelete(
    Caused by: java.nio.file.NoSuchFileException: /tmp/Rlib.3280229804452061538
    at sun.nio.fs.UnixException.translateToIOException(
    at sun.nio.fs.UnixException.rethrowAsIOException(
    at sun.nio.fs.UnixException.rethrowAsIOException(
    at sun.nio.fs.UnixFileAttributeViews$Basic.readAttributes(
    at sun.nio.fs.UnixFileSystemProvider.readAttributes(
    at sun.nio.fs.LinuxFileSystemProvider.readAttributes(
    at java.nio.file.Files.readAttributes(
    at java.nio.file.FileTreeWalker.getAttributes(
    at java.nio.file.FileTreeWalker.visit(
    at java.nio.file.FileTreeWalker.walk(
    at java.nio.file.Files.walkFileTree(
    at java.nio.file.Files.walkFileTree(
    at htsjdk.samtools.util.IOUtil.recursiveDelete(
    ... 3 more

  • Genevieve Brandt (she/her)

    You may want to verify that R is working correctly; please see the Tools Involved section in the tutorial again. If you are having issues, try using GATK in a Docker container.
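A quick way to run such a check, as a sketch only: the helper name check_r_packages is hypothetical. The package names optparse and data.table are an assumption about what the plotting tools load, so confirm them against the Tools Involved section. The Docker image tag in the comment is likewise assumed to match the GATK version shown in the logs.

```shell
# check_r_packages PKG...: report whether Rscript is on PATH and whether
# each named R package can be loaded.
check_r_packages() {
    if ! command -v Rscript >/dev/null 2>&1; then
        echo "Rscript not found on PATH"
        return 1
    fi
    for pkg in "$@"; do
        # Rscript exits with status 0 only if the package namespace loads.
        if Rscript -e "quit(status = as.integer(!requireNamespace('$pkg', quietly = TRUE)))" >/dev/null 2>&1; then
            echo "ok: $pkg"
        else
            echo "missing: $pkg"
        fi
    done
}

# Assumed package list; see the tutorial's Tools Involved section.
check_r_packages optparse data.table || true

# Alternative (assumed image name and tag): run the tools inside the GATK
# Docker image, which bundles R and its packages.
#   docker run --rm -it broadinstitute/gatk:4.1.2.0
```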
