GATK 4.1.7.0-specific error: "htsjdk.samtools.SAMException: Cannot read non-existent file:"
These files work with multiple versions of GATK, from 4.0.0.0 back to the older Picard. However, I can't get the same command to work with 4.1.7.0. With the newest version I get:
htsjdk.samtools.SAMException: Cannot read non-existent file:
I'm using variables for the file paths and checking that the files exist. With older versions of the software everything is fine: I literally switch my conda environment to GATK 4.0.0.0, add "-launch" after gatk, and the whole thing works. That's it!
Can you please provide
a) GATK version used
4.1.7.0
b) Exact GATK commands used
BAIT=/data/temp/T/NGHC_16_C_R1_sub.bait.interval.list
TARGET=/data/temp/T/NGHC_16_C_R1_sub.target.interval.list
BAM=/data/temp/T/NGHC_16_C_fixmate_novosort_dupsrmFalse.bam
BAI=/data/temp/T/NGHC_16_C_fixmate_novosort_dupsrmFalse.bai
REF=/data1/BIOINFORMATICS/REFERENCES/NIMBLEGEN/HumanGenome/HG-38/hg38.fa
du -sh $BAIT
du -sh $TARGET
du -sh $BAM
du -sh $REF
du -sh $BAI
LINE="gatk CollectHsMetrics --BAIT_INTERVALS $BAIT --INPUT $BAM --OUTPUT stats.tsv --TARGET_INTERVALS $TARGET"
echo "# $LINE"
eval "$LINE"
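The du -sh calls above print sizes but do not stop the script when a file is missing. A stricter check could look like the sketch below. The function name require_readable is hypothetical and not part of GATK. Note that this check would not have caught the problem reported here, since all five files exist.

```shell
# Hypothetical helper (not part of GATK): fail fast if any input
# is missing, is not a regular file, or is not readable.
require_readable() {
    local f
    for f in "$@"; do
        if [ ! -f "$f" ] || [ ! -r "$f" ]; then
            echo "ERROR: cannot read $f" >&2
            return 1
        fi
    done
    return 0
}

# Usage with the variables defined in the script above:
# require_readable "$BAIT" "$TARGET" "$BAM" "$BAI" "$REF" || exit 1
```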
c) The entire error log if applicable.
./easyTest.sh
5.6M /data/temp/T/NGHC_16_C_R1_sub.bait.interval.list
5.6M /data/temp/T/NGHC_16_C_R1_sub.target.interval.list
1.2G /data/temp/T/NGHC_16_C_fixmate_novosort_dupsrmFalse.bam
3.0G /data1/BIOINFORMATICS/REFERENCES/NIMBLEGEN/HumanGenome/HG-38/hg38.fa
5.3M /data/temp/T/NGHC_16_C_fixmate_novosort_dupsrmFalse.bai
# gatk CollectHsMetrics --BAIT_INTERVALS /data/temp/T/NGHC_16_C_R1_sub.bait.interval.list --INPUT /data/temp/T/NGHC_16_C_fixmate_novosort_dupsrmFalse.bam --OUTPUT stats.tsv --TARGET_INTERVALS /data/temp/T/NGHC_16_C_R1_sub.target.interval.list
Using GATK jar /data1/BIOINFORMATICS/SOFTWARE/ANACONDA_JN/MINI-CONDA/envs/gatk-newest/share/gatk4-4.1.7.0-0/gatk-package-4.1.7.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /data1/BIOINFORMATICS/SOFTWARE/ANACONDA_JN/MINI-CONDA/envs/gatk-newest/share/gatk4-4.1.7.0-0/gatk-package-4.1.7.0-local.jar CollectHsMetrics --BAIT_INTERVALS /data/temp/T/NGHC_16_C_R1_sub.bait.interval.list --INPUT /data/temp/T/NGHC_16_C_fixmate_novosort_dupsrmFalse.bam --OUTPUT stats.tsv --TARGET_INTERVALS /data/temp/T/NGHC_16_C_R1_sub.target.interval.list
14:58:53.727 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/data1/BIOINFORMATICS/SOFTWARE/ANACONDA_JN/MINI-CONDA/envs/gatk-newest/share/gatk4-4.1.7.0-0/gatk-package-4.1.7.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
[Fri May 15 14:58:53 EDT 2020] CollectHsMetrics --BAIT_INTERVALS /data/temp/T/NGHC_16_C_R1_sub.bait.interval.list --TARGET_INTERVALS /data/temp/T/NGHC_16_C_R1_sub.target.interval.list --INPUT /data/temp/T/NGHC_16_C_fixmate_novosort_dupsrmFalse.bam --OUTPUT stats.tsv --METRIC_ACCUMULATION_LEVEL ALL_READS --NEAR_DISTANCE 250 --MINIMUM_MAPPING_QUALITY 20 --MINIMUM_BASE_QUALITY 20 --CLIP_OVERLAPPING_READS true --INCLUDE_INDELS false --COVERAGE_CAP 200 --SAMPLE_SIZE 10000 --ALLELE_FRACTION 0.001 --ALLELE_FRACTION 0.005 --ALLELE_FRACTION 0.01 --ALLELE_FRACTION 0.02 --ALLELE_FRACTION 0.05 --ALLELE_FRACTION 0.1 --ALLELE_FRACTION 0.2 --ALLELE_FRACTION 0.3 --ALLELE_FRACTION 0.5 --VERBOSITY INFO --QUIET false --VALIDATION_STRINGENCY STRICT --COMPRESSION_LEVEL 2 --MAX_RECORDS_IN_RAM 500000 --CREATE_INDEX false --CREATE_MD5_FILE false --GA4GH_CLIENT_SECRETS client_secrets.json --help false --version false --showHidden false --USE_JDK_DEFLATER false --USE_JDK_INFLATER false
May 15, 2020 2:58:54 PM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
INFO: Failed to detect whether we are running on Google Compute Engine.
[Fri May 15 14:58:54 EDT 2020] Executing as cxxrt@email on Linux 3.10.0-1062.1.2.el7.x86_64 amd64; OpenJDK 64-Bit Server VM 1.8.0_152-release-1056-b12; Deflater: Intel; Inflater: Intel; Provider GCS is available; Picard version: Version:4.1.7.0
[Fri May 15 14:58:54 EDT 2020] picard.analysis.directed.CollectHsMetrics done. Elapsed time: 0.01 minutes.
Runtime.totalMemory()=2623012864
To get help, see http://broadinstitute.github.io/picard/index.html#GettingHelp
htsjdk.samtools.SAMException: Cannot read non-existent file: file:///data/temp/T/@HD%09VN:1.4%09SO:unsorted
at htsjdk.samtools.util.IOUtil.assertFileIsReadable(IOUtil.java:498)
at htsjdk.samtools.util.IOUtil.assertFileIsReadable(IOUtil.java:485)
at picard.analysis.directed.CollectTargetedMetrics.doWork(CollectTargetedMetrics.java:115)
at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:305)
at org.broadinstitute.hellbender.cmdline.PicardCommandLineProgramExecutor.instanceMain(PicardCommandLineProgramExecutor.java:25)
at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:163)
at org.broadinstitute.hellbender.Main.mainEntry(Main.java:206)
at org.broadinstitute.hellbender.Main.main(Main.java:292)
-
Interesting. Can you try renaming "interval.list" to "interval_list" to see if that works? I wonder if you are encountering this issue.
-
That worked. Thank you Tiffany Miller!