Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data


Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

GATK GenomicsDBImport intervals

Answered

19 comments

  • Bekah W

    I've just found a way: I can use a VCF file. Is there a way of extracting all the intervals from all the .g.vcf files to input as an interval list, please?

     

  • Pamela Bretscher

    Hi Bekah W,

    Thank you for writing in. I wouldn't recommend extracting all of the intervals from your .g.vcf files, as this would result in very small intervals, which is an inefficient way to run GenomicsDBImport. The error about the invalid input looks like it may be because this interval is not present in your reference. Can you check the sequence dictionary for the naming of the chromosomes? (The chromosome may be listed as 1 rather than chr1, which would cause the error.)
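    For example, here is a quick way to list the contig names in a Picard-style sequence dictionary; the `reference.dict` below is a fabricated minimal example for illustration, and the SN: tags it prints are the names that --intervals must match exactly:

    ```shell
    # For illustration only: fabricate a tiny Picard-style sequence dictionary.
    cat > reference.dict <<'EOF'
    @HD	VN:1.6
    @SQ	SN:NC_027300.1	LN:159038749
    @SQ	SN:NC_027301.1	LN:72943711
    EOF
    # List the contig names (the SN: tag on each @SQ line).
    grep '^@SQ' reference.dict | awk '{for(i=1;i<=NF;i++) if($i ~ /^SN:/) print substr($i,4)}'
    ```

    Running this against your real .dict file (instead of the fabricated one) shows exactly which spelling the reference uses.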

    Kind regards,

    Pamela

  • Bekah W

    I tried using the NC_ accession, which is the actual name of the chromosome, but it gave the same error; writing "Chr" in the post was just a simpler example.

    SN:NC_027300.1

  • Bekah W

    I re-uploaded the files and, as a test with three files using just --intervals NC_027300.1, it ran to completion, but the resulting file contains nothing under the header line:
    #CHROM POS ID REF ALT QUAL FILTER INFO

     

  • Pamela Bretscher

    Hi Bekah W,

    I'm glad to hear that GenomicsDBImport is now running to completion with no errors. Could you provide more information about the output file you are getting, along with the full command you ran and the stack trace from the tool?

    Kind regards,

    Pamela

  • Bekah W

    The command is in the original post?

    What do you mean by more info about the output? And what is the stack trace?

  • Bekah W

    Is this the stack trace? Paths edited for confidentiality:

    Using GATK jar /opt/gridware/depots/54e7fb3c/el7/pkg/apps/gatk/4.2.2.0/noarch/gatk-4.2.2.0/gatk-package-4.2.2.0-local.jar
    Running:
    java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -Xmx4g -Xms4g -jar /opt/gridware/depots/54e7fb3c/el7/pkg/apps/gatk/4.2.2.0/noarch/gatk-4.2.2.0/gatk-package-4.2.2.0-local.jar GenomicsDBImport -genomicsdb-workspace-path /mnt/scratch2/users/USERID/PATHTOTEMPDIRECTORY
    --variant PATH TO VARIANT 1.g.vcf.gz --variant PATH TO VARIANT 2.g.vcf.gz --variant PATH TO VARIANT 3.g.vcf.gz   --intervals NC_027300.1
    12:02:39.466 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/opt/gridware/depots/54e7fb3c/el7/pkg/apps/gatk/4.2.2.0/noarch/gatk-4.2.2.0/gatk-package-4.2.2.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
    Nov 30, 2021 12:02:39 PM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
    INFO: Failed to detect whether we are running on Google Compute Engine.
    12:02:39.639 INFO GenomicsDBImport - ------------------------------------------------------------
    12:02:39.640 INFO GenomicsDBImport - The Genome Analysis Toolkit (GATK) v4.2.2.0
    12:02:39.640 INFO GenomicsDBImport - For support and documentation go to https://software.broadinstitute.org/gatk/
    12:02:39.640 INFO GenomicsDBImport - Executing as <NODE/NETWORK> on Linux v3.10.0-1160.2.1.el7.x86_64 amd64
    12:02:39.640 INFO GenomicsDBImport - Java runtime: Java HotSpot(TM) 64-Bit Server VM v1.8.0_151-b12
    12:02:39.640 INFO GenomicsDBImport - Start Date/Time: 30 November 2021 12:02:39 GMT
    12:02:39.640 INFO GenomicsDBImport - ------------------------------------------------------------
    12:02:39.640 INFO GenomicsDBImport - ------------------------------------------------------------
    12:02:39.641 INFO GenomicsDBImport - HTSJDK Version: 2.24.1
    12:02:39.641 INFO GenomicsDBImport - Picard Version: 2.25.4
    12:02:39.641 INFO GenomicsDBImport - Built for Spark Version: 2.4.5
    12:02:39.641 INFO GenomicsDBImport - HTSJDK Defaults.COMPRESSION_LEVEL : 2
    12:02:39.641 INFO GenomicsDBImport - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
    12:02:39.641 INFO GenomicsDBImport - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
    12:02:39.641 INFO GenomicsDBImport - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
    12:02:39.641 INFO GenomicsDBImport - Deflater: IntelDeflater
    12:02:39.641 INFO GenomicsDBImport - Inflater: IntelInflater
    12:02:39.641 INFO GenomicsDBImport - GCS max retries/reopens: 20
    12:02:39.641 INFO GenomicsDBImport - Requester pays: disabled
    12:02:39.641 INFO GenomicsDBImport - Initializing engine
    12:02:45.276 INFO IntervalArgumentCollection - Processing 159038749 bp from intervals
    12:02:45.277 INFO GenomicsDBImport - Done initializing engine
    12:02:45.576 INFO GenomicsDBLibLoader - GenomicsDB native library version : 1.4.1-d59e886
    12:02:45.580 INFO GenomicsDBImport - Vid Map JSON file will be written to /mnt/scratch2/USERID/PATHTOTEMPDIR/vidmap.json
    12:02:45.580 INFO GenomicsDBImport - Callset Map JSON file will be written to /mnt/scratch2/usersUSERID/PATHTOTEMPDIR/callset.json
    12:02:45.580 INFO GenomicsDBImport - Complete VCF Header will be written to /mnt/scratch2/users/USERID/PATHTOTEMPDIR/vcfheader.vcf
    12:02:45.580 INFO GenomicsDBImport - Importing to workspace - /mnt/scratch2/users/USERID/PATHTOTEMPDIR
    12:02:51.907 INFO GenomicsDBImport - Importing batch 1 with 3 samples
    12:04:31.799 INFO GenomicsDBImport - Done importing batch 1/1
    12:04:31.801 INFO GenomicsDBImport - Import completed!
    12:04:31.801 INFO GenomicsDBImport - Shutting down engine
    [30 November 2021 12:04:31 GMT] org.broadinstitute.hellbender.tools.genomicsdb.GenomicsDBImport done. Elapsed time: 1.87 minutes.
    Runtime.totalMemory()=4136108032
    Tool returned:
    true

  • Bekah W

    The temp directory created contains:
    a directory NC_027300.1$1$159038749, which contains the __0172b34e-004a-4f6e-8a11-62c90f8df98e140159776184064_1638273785929 and genomicsdb_meta_dir directories, as well as an __array_schema.tdb file

    files: callset.json
    {"callsets": [{"sample_name": "1","row_idx": 0,"idx_in_file": 0,"stream_name": "1_stream"},{"sample_name": "2","row_idx": 1,"idx_in_file": 0,"stream_name": "2_stream"},{"sample_name": "3","row_idx": 2,"idx_in_file": 0,"stream_name": "3_stream"}]}

    vidmap.json

    {"fields": [{"name": "ID","type": ["char"],"length": [{"variable_length_descriptor": "var"}]},{"name": "LowQual","type": ["Integer"],"vcf_field_class": ["FILTER"],"length": [{"variable_length_descriptor": "1"}]},{"name": "PASS","type": ["Integer"],"vcf_field_class": ["FILTER"],"length": [{"variable_length_descriptor": "1"}]},{"name": "AD","type": ["Integer"],"vcf_field_class": ["FORMAT"],"length": [{"variable_length_descriptor": "R"}]},{"name": "DP","type": ["Integer"],"vcf_field_class": ["FORMAT","INFO"],"length": [{"variable_length_descriptor": "1"}]},{"name": "GQ","type": ["Integer"],"vcf_field_class": ["FORMAT"],"length": [{"variable_length_descriptor": "1"}]},{"name": "GT","type": ["int"],"vcf_field_class": ["FORMAT"],"length": [{"variable_length_descriptor": "PP"}]},{"name": "MIN_DP","type": ["Integer"],"vcf_field_class": ["FORMAT"],"length": [{"variable_length_descriptor": "1"}]},{"name": "PGT","type": ["String"],"vcf_field_class": ["FORMAT"],"length": [{"variable_length_descriptor": "var"}]},{"name": "PID","type": ["String"],"vcf_field_class": ["FORMAT"],"length": [{"variable_length_descriptor": "var"}]},{"name": "PL","type": ["Integer"],"vcf_field_class": ["FORMAT"],"length": [{"variable_length_descriptor": "G"}]},{"name": "PS","type": ["Integer"],"vcf_field_class": ["FORMAT"],"length": [{"variable_length_descriptor": "1"}]},{"name": "SB","type": ["Integer"],"vcf_field_class": ["FORMAT"],"length": [{"variable_length_descriptor": "4"}]},{"name": "AS_InbreedingCoeff","type": ["Float"],"vcf_field_class": ["INFO"],"length": [{"variable_length_descriptor": "A"}]},{"name": "AS_QD","type": ["Float"],"vcf_field_class": ["INFO"],"length": [{"variable_length_descriptor": "A"}]},{"name": "AS_RAW_BaseQRankSum","type": ["float","int"],"vcf_field_class": ["INFO"],"length": [{"variable_length_descriptor": "R"},{"variable_length_descriptor": "var"}],"vcf_delimiter": ["|",","],"VCF_field_combine_operation": "histogram_sum"},{"name": "AS_RAW_MQ","type": 
["float"],"vcf_field_class": ["INFO"],"length": [{"variable_length_descriptor": "R"},{"variable_length_descriptor": "var"}],"vcf_delimiter": ["|",","],"VCF_field_combine_operation": "element_wise_sum","disable_remap_missing_with_non_ref": true},{"name": "AS_RAW_MQRankSum","type": ["float","int"],"vcf_field_class": ["INFO"],"length": [{"variable_length_descriptor": "R"},{"variable_length_descriptor": "var"}],"vcf_delimiter": ["|",","],"VCF_field_combine_operation": "histogram_sum"},{"name": "AS_RAW_ReadPosRankSum","type": ["float","int"],"vcf_field_class": ["INFO"],"length": [{"variable_length_descriptor": "R"},{"variable_length_descriptor": "var"}],"vcf_delimiter": ["|",","],"VCF_field_combine_operation": "histogram_sum"},{"name": "AS_SB_TABLE","type": ["int"],"vcf_field_class": ["INFO"],"length": [{"variable_length_descriptor": "R"},{"variable_length_descriptor": "var"}],"vcf_delimiter": ["|",","],"VCF_field_combine_operation": "element_wise_sum","disable_remap_missing_with_non_ref": true},{"name": "BaseQRankSum","type": ["Float"],"vcf_field_class": ["INFO"],"length": [{"variable_length_descriptor": "1"}]},{"name": "END","type": ["Integer"],"vcf_field_class": ["INFO"],"length": [{"variable_length_descriptor": "1"}]},{"name": "ExcessHet","type": ["Float"],"vcf_field_class": ["INFO"],"length": [{"variable_length_descriptor": "1"}]},{"name": "InbreedingCoeff","type": ["Float"],"vcf_field_class": ["INFO"],"length": [{"variable_length_descriptor": "1"}]},{"name": "MLEAC","type": ["Integer"],"vcf_field_class": ["INFO"],"length": [{"variable_length_descriptor": "A"}]},{"name": "MLEAF","type": ["Float"],"vcf_field_class": ["INFO"],"length": [{"variable_length_descriptor": "A"}]},{"name": "MQRankSum","type": ["Float"],"vcf_field_class": ["INFO"],"length": [{"variable_length_descriptor": "1"}]},{"name": "RAW_MQandDP","type": ["Integer"],"vcf_field_class": ["INFO"],"length": [{"variable_length_descriptor": "2"}],"VCF_field_combine_operation": "element_wise_sum"},{"name": 
"ReadPosRankSum","type": ["Float"],"vcf_field_class": ["INFO"],"length": [{"variable_length_descriptor": "1"}]}],"contigs": [{"name": "NC_027300.1","length": 159038749,"tiledb_column_offset": 0},{"name": "NC_027301.1","length": 72943711,"tiledb_column_offset": 159038749},{"name": etc ....

    vcfheader.vcf

    ##fileformat=VCFv4.2
    ##ALT=<ID=NON_REF,Description="Represents any possible alternative allele not already represented at this location by REF and ALT">
    ##FILTER=<ID=LowQual,Description="Low quality">
    ##FILTER=<ID=PASS,Description="All filters passed">
    ##FORMAT=<ID=AD,Number=R,Type=Integer,Description="Allelic depths for the ref and alt alleles in the order listed">
    ##FORMAT=<ID=DP,Number=1,Type=Integer,Description="Approximate read depth (reads with MQ=255 or with bad mates are filtered)">
    ##FORMAT=<ID=GQ,Number=1,Type=Integer,Description="Genotype Quality">
    ##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
    ##FORMAT=<ID=MIN_DP,Number=1,Type=Integer,Description="Minimum DP observed within the GVCF block">
    ##FORMAT=<ID=PGT,Number=1,Type=String,Description="Physical phasing haplotype information, describing how the alternate alleles are phased in relation to one another; will always be heterozygous and is not intended to describe called alleles">
    ##FORMAT=<ID=PID,Number=1,Type=String,Description="Physical phasing ID information, where each unique ID within a given sample (but not across samples) connects records within a phasing group">
    ##FORMAT=<ID=PL,Number=G,Type=Integer,Description="Normalized, Phred-scaled likelihoods for genotypes as defined in the VCF specification">
    ##FORMAT=<ID=PS,Number=1,Type=Integer,Description="Phasing set (typically the position of the first variant in the set)">
    ##FORMAT=<ID=SB,Number=4,Type=Integer,Description="Per-sample component statistics which comprise the Fisher's Exact Test to detect strand bias.">
    ##GATKCommandLine=<ID=GenomicsDBImport,CommandLine="GenomicsDBImport --genomicsdb-workspace-path /PATHTEMPDIR --variant PATH1.g.vcf.gz --variant PATH2.g.vcf.gz --variant /PATH3.g.vcf.gz --intervals NC_027300.1 --genomicsdb-segment-size 1048576 --genomicsdb-vcf-buffer-size 16384 --overwrite-existing-genomicsdb-workspace false --batch-size 0 --consolidate false --validate-sample-name-map false --merge-input-intervals false --reader-threads 1 --max-num-intervals-to-import-in-parallel 1 --merge-contigs-into-num-partitions 0 --genomicsdb-shared-posixfs-optimizations false --genomicsdb-use-gcs-hdfs-connector false --interval-set-rule UNION --interval-padding 0 --interval-exclusion-padding 0 --interval-merging-rule ALL --read-validation-stringency SILENT --seconds-between-progress-updates 10.0 --disable-sequence-dictionary-validation false --create-output-bam-index true --create-output-bam-md5 false --create-output-variant-index true --create-output-variant-md5 false --max-variants-per-shard 0 --lenient false --add-output-sam-program-record true --add-output-vcf-command-line true --cloud-prefetch-buffer 0 --cloud-index-prefetch-buffer 0 --disable-bam-index-caching false --sites-only-vcf-output false --help false --version false --showHidden false --verbosity INFO --QUIET false --use-jdk-deflater false --use-jdk-inflater false --gcs-max-retries 20 --gcs-project-for-requester-pays --disable-tool-default-read-filters false",Version="4.2.2.0",Date="30 November 2021 12:02:44 GMT">
    ##GATKCommandLine=<ID=HaplotypeCaller,CommandLine="HaplotypeCaller --emit-ref-confidence GVCF --output /PATH.g.vcf --input PATH.bam --reference PATH.fna --annotation-group StandardAnnotation --annotation-group AS_StandardAnnotation --use-posteriors-to-calculate-qual false --dont-use-dragstr-priors false --use-new-qual-calculator true --annotate-with-num-discovered-alleles false --heterozygosity 0.001 --indel-heterozygosity 1.25E-4 --heterozygosity-stdev 0.01 --standard-min-confidence-threshold-for-calling 30.0 --max-alternate-alleles 6 --max-genotype-count 1024 --sample-ploidy 2 --num-reference-samples-if-no-call 0 --genotype-assignment-method USE_PLS_TO_ASSIGN --contamination-fraction-to-filter 0.0 --output-mode EMIT_VARIANTS_ONLY --all-site-pls false --gvcf-gq-bands 1 --gvcf-gq-bands 2 --gvcf-gq-bands 3 --gvcf-gq-bands 4 --gvcf-gq-bands 5 --gvcf-gq-bands 6 --gvcf-gq-bands 7 --gvcf-gq-bands 8 --gvcf-gq-bands 9 --gvcf-gq-bands 10 --gvcf-gq-bands 11 --gvcf-gq-bands 12 --gvcf-gq-bands 13 --gvcf-gq-bands 14 --gvcf-gq-bands 15 --gvcf-gq-bands 16 --gvcf-gq-bands 17 --gvcf-gq-bands 18 --gvcf-gq-bands 19 --gvcf-gq-bands 20 --gvcf-gq-bands 21 --gvcf-gq-bands 22 --gvcf-gq-bands 23 --gvcf-gq-bands 24 --gvcf-gq-bands 25 --gvcf-gq-bands 26 --gvcf-gq-bands 27 --gvcf-gq-bands 28 --gvcf-gq-bands 29 --gvcf-gq-bands 30 --gvcf-gq-bands 31 --gvcf-gq-bands 32 --gvcf-gq-bands 33 --gvcf-gq-bands 34 --gvcf-gq-bands 35 --gvcf-gq-bands 36 --gvcf-gq-bands 37 --gvcf-gq-bands 38 --gvcf-gq-bands 39 --gvcf-gq-bands 40 --gvcf-gq-bands 41 --gvcf-gq-bands 42 --gvcf-gq-bands 43 --gvcf-gq-bands 44 --gvcf-gq-bands 45 --gvcf-gq-bands 46 --gvcf-gq-bands 47 --gvcf-gq-bands 48 --gvcf-gq-bands 49 --gvcf-gq-bands 50 --gvcf-gq-bands 51 --gvcf-gq-bands 52 --gvcf-gq-bands 53 --gvcf-gq-bands 54 --gvcf-gq-bands 55 --gvcf-gq-bands 56 --gvcf-gq-bands 57 --gvcf-gq-bands 58 --gvcf-gq-bands 59 --gvcf-gq-bands 60 --gvcf-gq-bands 70 --gvcf-gq-bands 80 --gvcf-gq-bands 90 --gvcf-gq-bands 99 --floor-blocks false 
--indel-size-to-eliminate-in-ref-model 10 --disable-optimizations false --dragen-mode false --apply-bqd false --apply-frd false --disable-spanning-event-genotyping false --transform-dragen-mapping-quality false --mapping-quality-threshold-for-genotyping 20 --max-effective-depth-adjustment-for-frd 0 --just-determine-active-regions false --dont-genotype false --do-not-run-physical-phasing false --do-not-correct-overlapping-quality false --use-filtered-reads-for-annotations false --adaptive-pruning false --do-not-recover-dangling-branches false --recover-dangling-heads false --kmer-size 10 --kmer-size 25 --dont-increase-kmer-sizes-for-cycles false --allow-non-unique-kmers-in-ref false --num-pruning-samples 1 --min-dangling-branch-length 4 --recover-all-dangling-branches false --max-num-haplotypes-in-population 128 --min-pruning 2 --adaptive-pruning-initial-error-rate 0.001 --pruning-lod-threshold 2.302585092994046 --pruning-seeding-lod-threshold 9.210340371976184 --max-unpruned-variants 100 --linked-de-bruijn-graph false --disable-artificial-haplotype-recovery false --enable-legacy-graph-cycle-detection false --debug-assembly false --debug-graph-transformations false --capture-assembly-failure-bam false --num-matching-bases-in-dangling-end-to-recover -1 --error-correction-log-odds -Infinity --error-correct-reads false --kmer-length-for-read-error-correction 25 --min-observations-for-kmer-to-be-solid 20 --base-quality-score-threshold 18 --dragstr-het-hom-ratio 2 --dont-use-dragstr-pair-hmm-scores false --pair-hmm-gap-continuation-penalty 10 --expected-mismatch-rate-for-read-disqualification 0.02 --pair-hmm-implementation FASTEST_AVAILABLE --pcr-indel-model CONSERVATIVE --phred-scaled-global-read-mismapping-rate 45 --disable-symmetric-hmm-normalizing false --disable-cap-base-qualities-to-map-quality false --enable-dynamic-read-disqualification-for-genotyping false --dynamic-read-disqualification-threshold 1.0 --native-pair-hmm-threads 4 
--native-pair-hmm-use-double-precision false --bam-writer-type CALLED_HAPLOTYPES --dont-use-soft-clipped-bases false --min-base-quality-score 10 --smith-waterman JAVA --max-mnp-distance 0 --force-call-filtered-alleles false --soft-clip-low-quality-ends false --allele-informative-reads-overlap-margin 2 --min-assembly-region-size 50 --max-assembly-region-size 300 --active-probability-threshold 0.002 --max-prob-propagation-distance 50 --force-active false --assembly-region-padding 100 --padding-around-indels 75 --padding-around-snps 20 --padding-around-strs 75 --max-extension-into-assembly-region-padding-legacy 25 --max-reads-per-alignment-start 50 --enable-legacy-assembly-region-trimming false --interval-set-rule UNION --interval-padding 0 --interval-exclusion-padding 0 --interval-merging-rule ALL --read-validation-stringency SILENT --seconds-between-progress-updates 10.0 --disable-sequence-dictionary-validation false --create-output-bam-index true --create-output-bam-md5 false --create-output-variant-index true --create-output-variant-md5 false --max-variants-per-shard 0 --lenient false --add-output-sam-program-record true --add-output-vcf-command-line true --cloud-prefetch-buffer 40 --cloud-index-prefetch-buffer -1 --disable-bam-index-caching false --sites-only-vcf-output false --help false --version false --showHidden false --verbosity INFO --QUIET false --use-jdk-deflater false --use-jdk-inflater false --gcs-max-retries 20 --gcs-project-for-requester-pays --disable-tool-default-read-filters false --minimum-mapping-quality 20 --disable-tool-default-annotations false --enable-all-annotations false --allow-old-rms-mapping-quality-annotation-data false",Version="4.2.2.0",Date="17 November 2021 17:45:14 GMT">
    ##GVCFBlock0-1=minGQ=0(inclusive),maxGQ=1(exclusive)
    ##GVCFBlock1-2=minGQ=1(inclusive),maxGQ=2(exclusive)
    ##GVCFBlock10-11=minGQ=10(inclusive),maxGQ=11(exclusive)
    ##GVCFBlock11-12=minGQ=11(inclusive),maxGQ=12(exclusive)
    ##GVCFBlock12-13=minGQ=12(inclusive),maxGQ=13(exclusive)
    ##GVCFBlock13-14=minGQ=13(inclusive),maxGQ=14(exclusive)
    ##GVCFBlock14-15=minGQ=14(inclusive),maxGQ=15(exclusive)
    ##GVCFBlock15-16=minGQ=15(inclusive),maxGQ=16(exclusive)
    ##GVCFBlock16-17=minGQ=16(inclusive),maxGQ=17(exclusive)
    ##GVCFBlock17-18=minGQ=17(inclusive),maxGQ=18(exclusive)
    ##GVCFBlock18-19=minGQ=18(inclusive),maxGQ=19(exclusive)
    ##GVCFBlock19-20=minGQ=19(inclusive),maxGQ=20(exclusive)
    ##GVCFBlock2-3=minGQ=2(inclusive),maxGQ=3(exclusive)
    ##GVCFBlock20-21=minGQ=20(inclusive),maxGQ=21(exclusive)
    ##GVCFBlock21-22=minGQ=21(inclusive),maxGQ=22(exclusive)
    ##GVCFBlock22-23=minGQ=22(inclusive),maxGQ=23(exclusive)
    ##GVCFBlock23-24=minGQ=23(inclusive),maxGQ=24(exclusive)
    ##GVCFBlock24-25=minGQ=24(inclusive),maxGQ=25(exclusive)
    ##GVCFBlock25-26=minGQ=25(inclusive),maxGQ=26(exclusive)
    ##GVCFBlock26-27=minGQ=26(inclusive),maxGQ=27(exclusive)
    ##GVCFBlock27-28=minGQ=27(inclusive),maxGQ=28(exclusive)
    ##GVCFBlock28-29=minGQ=28(inclusive),maxGQ=29(exclusive)
    ##GVCFBlock29-30=minGQ=29(inclusive),maxGQ=30(exclusive)
    ##GVCFBlock3-4=minGQ=3(inclusive),maxGQ=4(exclusive)
    ##GVCFBlock30-31=minGQ=30(inclusive),maxGQ=31(exclusive)
    ##GVCFBlock31-32=minGQ=31(inclusive),maxGQ=32(exclusive)
    ##GVCFBlock32-33=minGQ=32(inclusive),maxGQ=33(exclusive)
    ##GVCFBlock33-34=minGQ=33(inclusive),maxGQ=34(exclusive)
    ##GVCFBlock34-35=minGQ=34(inclusive),maxGQ=35(exclusive)
    ##GVCFBlock35-36=minGQ=35(inclusive),maxGQ=36(exclusive)
    ##GVCFBlock36-37=minGQ=36(inclusive),maxGQ=37(exclusive)
    ##GVCFBlock37-38=minGQ=37(inclusive),maxGQ=38(exclusive)
    ##GVCFBlock38-39=minGQ=38(inclusive),maxGQ=39(exclusive)
    ##GVCFBlock39-40=minGQ=39(inclusive),maxGQ=40(exclusive)
    ##GVCFBlock4-5=minGQ=4(inclusive),maxGQ=5(exclusive)
    ##GVCFBlock40-41=minGQ=40(inclusive),maxGQ=41(exclusive)
    ##GVCFBlock41-42=minGQ=41(inclusive),maxGQ=42(exclusive)
    ##GVCFBlock42-43=minGQ=42(inclusive),maxGQ=43(exclusive)
    ##GVCFBlock43-44=minGQ=43(inclusive),maxGQ=44(exclusive)
    ##GVCFBlock44-45=minGQ=44(inclusive),maxGQ=45(exclusive)
    ##GVCFBlock45-46=minGQ=45(inclusive),maxGQ=46(exclusive)
    ##GVCFBlock46-47=minGQ=46(inclusive),maxGQ=47(exclusive)
    ##GVCFBlock47-48=minGQ=47(inclusive),maxGQ=48(exclusive)
    ##GVCFBlock48-49=minGQ=48(inclusive),maxGQ=49(exclusive)
    ##GVCFBlock49-50=minGQ=49(inclusive),maxGQ=50(exclusive)
    ##GVCFBlock5-6=minGQ=5(inclusive),maxGQ=6(exclusive)
    ##GVCFBlock50-51=minGQ=50(inclusive),maxGQ=51(exclusive)
    ##GVCFBlock51-52=minGQ=51(inclusive),maxGQ=52(exclusive)
    ##GVCFBlock52-53=minGQ=52(inclusive),maxGQ=53(exclusive)
    ##GVCFBlock53-54=minGQ=53(inclusive),maxGQ=54(exclusive)
    ##GVCFBlock54-55=minGQ=54(inclusive),maxGQ=55(exclusive)
    ##GVCFBlock55-56=minGQ=55(inclusive),maxGQ=56(exclusive)
    ##GVCFBlock56-57=minGQ=56(inclusive),maxGQ=57(exclusive)
    ##GVCFBlock57-58=minGQ=57(inclusive),maxGQ=58(exclusive)
    ##GVCFBlock58-59=minGQ=58(inclusive),maxGQ=59(exclusive)
    ##GVCFBlock59-60=minGQ=59(inclusive),maxGQ=60(exclusive)
    ##GVCFBlock6-7=minGQ=6(inclusive),maxGQ=7(exclusive)
    ##GVCFBlock60-70=minGQ=60(inclusive),maxGQ=70(exclusive)
    ##GVCFBlock7-8=minGQ=7(inclusive),maxGQ=8(exclusive)
    ##GVCFBlock70-80=minGQ=70(inclusive),maxGQ=80(exclusive)
    ##GVCFBlock8-9=minGQ=8(inclusive),maxGQ=9(exclusive)
    ##GVCFBlock80-90=minGQ=80(inclusive),maxGQ=90(exclusive)
    ##GVCFBlock9-10=minGQ=9(inclusive),maxGQ=10(exclusive)
    ##GVCFBlock90-99=minGQ=90(inclusive),maxGQ=99(exclusive)
    ##GVCFBlock99-100=minGQ=99(inclusive),maxGQ=100(exclusive)
    ##INFO=<ID=AS_InbreedingCoeff,Number=A,Type=Float,Description="Allele-specific inbreeding coefficient as estimated from the genotype likelihoods per-sample when compared against the Hardy-Weinberg expectation">
    ##INFO=<ID=AS_QD,Number=A,Type=Float,Description="Allele-specific Variant Confidence/Quality by Depth">
    ##INFO=<ID=AS_RAW_BaseQRankSum,Number=1,Type=String,Description="raw data for allele specific rank sum test of base qualities">
    ##INFO=<ID=AS_RAW_MQ,Number=1,Type=String,Description="Allele-specfic raw data for RMS Mapping Quality">
    ##INFO=<ID=AS_RAW_MQRankSum,Number=1,Type=String,Description="Allele-specfic raw data for Mapping Quality Rank Sum">
    ##INFO=<ID=AS_RAW_ReadPosRankSum,Number=1,Type=String,Description="allele specific raw data for rank sum test of read position bias">
    ##INFO=<ID=AS_SB_TABLE,Number=1,Type=String,Description="Allele-specific forward/reverse read counts for strand bias tests. Includes the reference and alleles separated by |.">
    ##INFO=<ID=BaseQRankSum,Number=1,Type=Float,Description="Z-score from Wilcoxon rank sum test of Alt Vs. Ref base qualities">
    ##INFO=<ID=DP,Number=1,Type=Integer,Description="Approximate read depth; some reads may have been filtered">
    ##INFO=<ID=END,Number=1,Type=Integer,Description="Stop position of the interval">
    ##INFO=<ID=ExcessHet,Number=1,Type=Float,Description="Phred-scaled p-value for exact test of excess heterozygosity">
    ##INFO=<ID=InbreedingCoeff,Number=1,Type=Float,Description="Inbreeding coefficient as estimated from the genotype likelihoods per-sample when compared against the Hardy-Weinberg expectation">
    ##INFO=<ID=MLEAC,Number=A,Type=Integer,Description="Maximum likelihood expectation (MLE) for the allele counts (not necessarily the same as the AC), for each ALT allele, in the same order as listed">
    ##INFO=<ID=MLEAF,Number=A,Type=Float,Description="Maximum likelihood expectation (MLE) for the allele frequency (not necessarily the same as the AF), for each ALT allele, in the same order as listed">
    ##INFO=<ID=MQRankSum,Number=1,Type=Float,Description="Z-score From Wilcoxon rank sum test of Alt vs. Ref read mapping qualities">
    ##INFO=<ID=RAW_MQandDP,Number=2,Type=Integer,Description="Raw data (sum of squared MQ and total depth) for improved RMS Mapping Quality calculation. Incompatible with deprecated RAW_MQ formulation.">
    ##INFO=<ID=ReadPosRankSum,Number=1,Type=Float,Description="Z-score from Wilcoxon rank sum test of Alt vs. Ref read position bias">
    ##bcftools_normCommand=norm -m +any -O z -o PATHOUT.g.vcf.gz PATHIN.g.vcf; Date=Fri Nov 26 12:43:52 2021
    ##bcftools_normVersion=1.9+htslib-1.9
    ##bcftools_viewCommand=view -O z -o /PATHINOUT.g.vcf.gz PATHIN.g.vcf.gz; Date=Fri Nov 26 12:43:56 2021
    ##bcftools_viewVersion=1.9+htslib-1.9
    ##contig=<ID=NC_027300.1,length=159038749>
    ##contig=<ID=NC_027301.1,length=72943711>
    ##contig=<ID=NC_027302.1,length=92503428>
    ...

    ##contig=<ID=NC_001960.1,length=16665>
    ##source=GenomicsDBImport
    ##source=HaplotypeCaller
    #CHROM POS ID REF ALT QUAL FILTER INFO

    and __tiledb_workspace.tdb, which is empty.

  • Pamela Bretscher

    Hi Bekah W,

    Thank you for providing this stack trace and information. I had asked you to paste your full command because you indicated that the tool was giving a different output when running with the different interval. Could you try running GenomicsDBImport on all of your intended input data using this new interval command? If the output workspace is still empty, then the GATK team will look further into why that is.
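    For reference, a GATK-style intervals file is plain text with one interval per line, either a bare contig name or contig:start-end. The file name below is illustrative:

    ```shell
    # Illustrative intervals file: one interval per line.
    cat > intervals.list <<'EOF'
    NC_027300.1
    NC_027301.1:1-1000000
    EOF
    # It would then be passed to the tool as: --intervals intervals.list
    wc -l < intervals.list
    ```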

    Kind regards,

    Pamela

  • Bekah W

    I tried this yesterday, but now got a Java heap error and no output at all:

    Using GATK jar /opt/gridware/depots/54e7fb3c/el7/pkg/apps/gatk/4.2.2.0/noarch/gatk-4.2.2.0/gatk-package-4.2.2.0-local.jar
    Running:
    java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -Xmx4g -Xms4g -jar /opt/gridware/depots/54e7fb3c/el7/pkg/apps/gatk/4.2.2.0/noarch/gatk-4.2.2.0/gatk-package-4.2.2.0-local.jar GenomicsDBImport -genomicsdb-workspace-path /mnt/scratch2/users/PATH.g.vcf.gz --variant <MULTIPLE LINES OF --variant path to files HERE>
    --intervals <PATH TO INTERVAL LIST FILE HERE> 
    00:24:39.864 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/opt/gridware/depots/54e7fb3c/el7/pkg/apps/gatk/4.2.2.0/noarch/gatk-4.2.2.0/gatk-package-4.2.2.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
    Dec 01, 2021 12:24:39 AM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
    INFO: Failed to detect whether we are running on Google Compute Engine.
    00:24:39.984 INFO GenomicsDBImport - ------------------------------------------------------------
    00:24:39.984 INFO GenomicsDBImport - The Genome Analysis Toolkit (GATK) v4.2.2.0
    00:24:39.984 INFO GenomicsDBImport - For support and documentation go to https://software.broadinstitute.org/gatk/
    00:24:39.984 INFO GenomicsDBImport - Executing as 3055649@node103.pri.kelvin2.alces.network on Linux v3.10.0-1160.2.1.el7.x86_64 amd64
    00:24:39.984 INFO GenomicsDBImport - Java runtime: Java HotSpot(TM) 64-Bit Server VM v1.8.0_151-b12
    00:24:39.985 INFO GenomicsDBImport - Start Date/Time: 01 December 2021 00:24:39 GMT
    00:24:39.985 INFO GenomicsDBImport - ------------------------------------------------------------
    00:24:39.985 INFO GenomicsDBImport - ------------------------------------------------------------
    00:24:39.985 INFO GenomicsDBImport - HTSJDK Version: 2.24.1
    00:24:39.985 INFO GenomicsDBImport - Picard Version: 2.25.4
    00:24:39.985 INFO GenomicsDBImport - Built for Spark Version: 2.4.5
    00:24:39.985 INFO GenomicsDBImport - HTSJDK Defaults.COMPRESSION_LEVEL : 2
    00:24:39.985 INFO GenomicsDBImport - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
    00:24:39.985 INFO GenomicsDBImport - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
    00:24:39.986 INFO GenomicsDBImport - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
    00:24:39.986 INFO GenomicsDBImport - Deflater: IntelDeflater
    00:24:39.986 INFO GenomicsDBImport - Inflater: IntelInflater
    00:24:39.986 INFO GenomicsDBImport - GCS max retries/reopens: 20
    00:24:39.986 INFO GenomicsDBImport - Requester pays: disabled
    00:24:39.986 INFO GenomicsDBImport - Initializing engine
    00:25:16.489 INFO IntervalArgumentCollection - Processing 128067 bp from intervals
    00:25:16.497 INFO GenomicsDBImport - Done initializing engine
    00:25:16.807 INFO GenomicsDBLibLoader - GenomicsDB native library version : 1.4.1-d59e886
    00:25:16.810 INFO GenomicsDBImport - Vid Map JSON file will be written to PATH.json
    00:25:16.810 INFO GenomicsDBImport - Callset Map JSON file will be written to PATH.json
    00:25:16.810 INFO GenomicsDBImport - Complete VCF Header will be written to PATH.vcf
    00:25:16.810 INFO GenomicsDBImport - Importing to workspace - PATH/tempbrainFW
    00:27:44.066 INFO GenomicsDBImport - Shutting down engine
    [01 December 2021 00:27:44 GMT] org.broadinstitute.hellbender.tools.genomicsdb.GenomicsDBImport done. Elapsed time: 3.07 minutes.
    Runtime.totalMemory()=3910139904
    java.util.concurrent.CompletionException: java.lang.OutOfMemoryError: GC overhead limit exceeded
    at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:273)
    at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:280)
    at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1592)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
    Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
    at java.lang.Integer.valueOf(Integer.java:832)
    at htsjdk.variant.vcf.VCFContigHeaderLine.<init>(VCFContigHeaderLine.java:55)
    at htsjdk.variant.vcf.AbstractVCFCodec.parseHeaderFromLines(AbstractVCFCodec.java:201)
    at htsjdk.variant.vcf.VCFCodec.readActualHeader(VCFCodec.java:111)
    at htsjdk.tribble.AsciiFeatureCodec.readHeader(AsciiFeatureCodec.java:79)
    at htsjdk.tribble.AsciiFeatureCodec.readHeader(AsciiFeatureCodec.java:37)
    at htsjdk.tribble.TabixFeatureReader.readHeader(TabixFeatureReader.java:95)
    at htsjdk.tribble.TabixFeatureReader.<init>(TabixFeatureReader.java:82)
    at htsjdk.tribble.AbstractFeatureReader.getFeatureReader(AbstractFeatureReader.java:117)
    at org.broadinstitute.hellbender.tools.genomicsdb.GenomicsDBImport.getReaderFromPath(GenomicsDBImport.java:966)
    at org.broadinstitute.hellbender.tools.genomicsdb.GenomicsDBImport.getFeatureReadersSerially(GenomicsDBImport.java:950)
    at org.broadinstitute.hellbender.tools.genomicsdb.GenomicsDBImport.createSampleToReaderMap(GenomicsDBImport.java:721)
    at org.broadinstitute.hellbender.tools.genomicsdb.GenomicsDBImport$$Lambda$101/1516384232.apply(Unknown Source)
    at org.genomicsdb.importer.GenomicsDBImporter.lambda$null$4(GenomicsDBImporter.java:726)
    at org.genomicsdb.importer.GenomicsDBImporter$$Lambda$105/285964343.get(Unknown Source)
    at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
    ... 3 more

     

  • Pamela Bretscher

    Hi Bekah W,

    You may need to increase the amount of memory you allocate to the tool (i.e. increase the -Xmx value). Depending on how much memory you have on your machine, you may want to allocate up to about 80% of your available memory so that you don't run into the memory error.
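    For instance, assuming a machine with 64 GB of RAM, a sketch of the invocation might look like this (the file names, workspace path, and interval below are placeholders, not your actual inputs):

    ```shell
    # Sketch: raising the Java heap to 50g via --java-options.
    # sample*.g.vcf.gz, my_database, and NC_027300.1 are placeholders.
    gatk --java-options "-Xmx50g" GenomicsDBImport \
        -V sample1.g.vcf.gz \
        -V sample2.g.vcf.gz \
        --genomicsdb-workspace-path my_database \
        --intervals NC_027300.1
    ```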

    Kind regards,

    Pamela

  • Bekah W

    I tried this and it gave the same output even with a smaller number of input files. I've contacted my IT support to see if it is something on the system side. Thank you for your quick response.

    I also tried reducing the number of NC_xxxxx in the list of intervals, but by doing this am I just excluding scaffolds? I'm unsure of how to specify intervals - should they be written as:
    NC_1000.1 :NC_100000.1
    NC_100000.1:NC_200000.1
    NC_200000.1:NW_101000.1 etc..
    to make that an interval or is a shorter list of scaffolds in the format:
    NC_1000.1
    NC_100000.1
    NC_200000.1
    NW_101000.1  etc.... suitable

    Apologies for all the questions - I'm still uncertain even after reading the help documentation.

  • Pamela Bretscher

    Hi Bekah W,

    Thank you for your patience and no worries about the number of questions. Let me know if your IT support can give you any information about the memory error. As for the appropriate intervals that you should specify, I am consulting another member of the GATK team to look into it and will update you shortly.

    Kind regards,

    Pamela

  • Pamela Bretscher

    Hi Bekah W,

    I think you are confusing contigs with coordinates in your specification of intervals. In your most recent response, the first option would not be a valid interval because you are specifying "contig:contig". You can either just specify the contigs, as in your second option, or you can narrow down the intervals by adding coordinates (e.g. NC_1000.1:1-1000). I know you mentioned you had already tried to read the documentation, but this article is very helpful for explaining intervals.

    Kind regards,

    Pamela
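    To illustrate the second option, a minimal intervals file (the contig names here are hypothetical placeholders) could look like the one below, and would be passed to the tool with --intervals / -L:

    ```shell
    # Hypothetical intervals.list: one whole contig per line, or
    # contig:start-end to restrict to a coordinate range.
    cat > intervals.list <<'EOF'
    NC_027300.1
    NC_027301.1:1-100000
    EOF
    # It would then be passed as: --intervals intervals.list
    cat intervals.list
    ```

    Either form is accepted, and whole contigs and coordinate ranges can be mixed in the same list file.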

     

  • Bekah W

    Ah okay, thank you! That makes sense - I will see what happens. It ran okay with the second option, so I continued with that. However, I am now having issues with filtering the variants! I passed the output database from GenomicsDBImport with no issues to make a g.vcf and called the SNPs, but when I use this SNP call in VariantFiltration I get this error, and it outputs a .vcf file with a header but no variants:
    SCRIPT:
    gatk VariantFiltration \
    -R /PATH.fna \
    -V PATH.vcf \
    -O PATH.vcf \
    --filter-name "HF_QD" \
    --filter-expression "QD < 2" \
    --filter-name "HF_FS" \
    --filter-expression "FS < 30" \
    --filter-name "HF_SOR" \
    --filter-expression "SOR < 2" \
    --filter-name "HF_RPRS" \
    --filter-expression "ReadPosRankSum > -4";

    STACK:
    Using GATK jar /opt/gridware/depots/54e7fb3c/el7/pkg/apps/gatk/4.2.2.0/noarch/gatk-4.2.2.0/gatk-package-4.2.2.0-local.jar
    Running:
    java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -Xmx2g -Djava.io.tmpdir=/tmp -jar /opt/gridware/depots/54e7fb3c/el7/pkg/apps/gatk/4.2.2.0/noarch/gatk-4.2.2.0/gatk-package-4.2.2.0-local.jar VariantFiltration -R PATH.fna -V PATH.vcf -O PATH.vcf --filter-name HF_QD --filter-expression QD < 2 --filter-name HF_FS --filter-expression FS < 30 --filter-name HF_SOR --filter-expression SOR < 2 --filter-name HF_RPRS --filter-expression ReadPosRankSum > -4
    13:58:55.515 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/opt/gridware/depots/54e7fb3c/el7/pkg/apps/gatk/4.2.2.0/noarch/gatk-4.2.2.0/gatk-package-4.2.2.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
    Dec 03, 2021 1:58:55 PM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
    INFO: Failed to detect whether we are running on Google Compute Engine.
    13:58:55.752 INFO VariantFiltration - ------------------------------------------------------------
    13:58:55.752 INFO VariantFiltration - The Genome Analysis Toolkit (GATK) v4.2.2.0
    13:58:55.752 INFO VariantFiltration - For support and documentation go to https://software.broadinstitute.org/gatk/
    13:58:55.752 INFO VariantFiltration - Executing as 3055649@node120.pri.kelvin2.alces.network on Linux v3.10.0-1160.2.1.el7.x86_64 amd64
    13:58:55.752 INFO VariantFiltration - Java runtime: Java HotSpot(TM) 64-Bit Server VM v1.8.0_151-b12
    13:58:55.753 INFO VariantFiltration - Start Date/Time: 03 December 2021 13:58:55 GMT
    13:58:55.753 INFO VariantFiltration - ------------------------------------------------------------
    13:58:55.753 INFO VariantFiltration - ------------------------------------------------------------
    13:58:55.753 INFO VariantFiltration - HTSJDK Version: 2.24.1
    13:58:55.753 INFO VariantFiltration - Picard Version: 2.25.4
    13:58:55.753 INFO VariantFiltration - Built for Spark Version: 2.4.5
    13:58:55.753 INFO VariantFiltration - HTSJDK Defaults.COMPRESSION_LEVEL : 2
    13:58:55.753 INFO VariantFiltration - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
    13:58:55.753 INFO VariantFiltration - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
    13:58:55.753 INFO VariantFiltration - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
    13:58:55.754 INFO VariantFiltration - Deflater: IntelDeflater
    13:58:55.754 INFO VariantFiltration - Inflater: IntelInflater
    13:58:55.754 INFO VariantFiltration - GCS max retries/reopens: 20
    13:58:55.754 INFO VariantFiltration - Requester pays: disabled
    13:58:55.754 INFO VariantFiltration - Initializing engine
    13:59:01.211 INFO FeatureManager - Using codec VCFCodec to read file file:///PATH.vcf
    13:59:06.306 INFO VariantFiltration - Done initializing engine
    13:59:08.193 INFO ProgressMeter - Starting traversal
    13:59:08.194 INFO ProgressMeter - Current Locus Elapsed Minutes Variants Processed Variants/Minute
    13:59:08.565 INFO VariantFiltration - Shutting down engine
    [03 December 2021 13:59:08 GMT] org.broadinstitute.hellbender.tools.walkers.filters.VariantFiltration done. Elapsed time: 0.22 minutes.
    Runtime.totalMemory()=1917845504
    java.lang.NumberFormatException: For input string: "25.36"
    at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
    at java.lang.Long.parseLong(Long.java:589)
    at java.lang.Long.parseLong(Long.java:631)
    at org.apache.commons.jexl2.JexlArithmetic.toLong(JexlArithmetic.java:906)
    at org.apache.commons.jexl2.JexlArithmetic.compare(JexlArithmetic.java:718)
    at org.apache.commons.jexl2.JexlArithmetic.lessThan(JexlArithmetic.java:774)
    at org.apache.commons.jexl2.Interpreter.visit(Interpreter.java:967)
    at org.apache.commons.jexl2.parser.ASTLTNode.jjtAccept(ASTLTNode.java:18)
    at org.apache.commons.jexl2.Interpreter.interpret(Interpreter.java:232)
    at org.apache.commons.jexl2.ExpressionImpl.evaluate(ExpressionImpl.java:65)
    at htsjdk.variant.variantcontext.JEXLMap.evaluateExpression(JEXLMap.java:186)
    at htsjdk.variant.variantcontext.JEXLMap.get(JEXLMap.java:95)
    at htsjdk.variant.variantcontext.JEXLMap.get(JEXLMap.java:15)
    at htsjdk.variant.variantcontext.VariantContextUtils.match(VariantContextUtils.java:338)
    at org.broadinstitute.hellbender.tools.walkers.filters.VariantFiltration.matchesFilter(VariantFiltration.java:452)
    at org.broadinstitute.hellbender.tools.walkers.filters.VariantFiltration.filter(VariantFiltration.java:406)
    at org.broadinstitute.hellbender.tools.walkers.filters.VariantFiltration.apply(VariantFiltration.java:353)
    at org.broadinstitute.hellbender.engine.VariantWalker.lambda$traverse$0(VariantWalker.java:104)
    at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184)
    at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
    at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
    at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
    at java.util.Iterator.forEachRemaining(Iterator.java:116)
    at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
    at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
    at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
    at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151)
    at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174)
    at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
    at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418)
    at org.broadinstitute.hellbender.engine.VariantWalker.traverse(VariantWalker.java:102)
    at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:1085)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:140)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:192)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:211)
    at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)
    at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)
    at org.broadinstitute.hellbender.Main.main(Main.java:289)

  • Pamela Bretscher

    Hi Bekah W,

    I'm glad to hear the intervals are now working properly for you! I can see that your VariantFiltration command is failing with NumberFormatException: For input string: "25.36". I believe this is an incompatibility between the integer values you specify in your command (e.g. FS < 30) and the non-integer values the tool is encountering in your files (25.36). Could you try adding ".0" to all of the integer values in your command (e.g. FS < 30.0) to see if the tool runs properly? This article provides some more information about JEXL expression issues.
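    As a sketch, your command with floating-point thresholds would look like this (the paths are placeholders; the thresholds are kept exactly as in your script):

    ```shell
    # Same VariantFiltration call with ".0" added to every numeric threshold,
    # so JEXL compares doubles instead of trying to parse longs.
    # reference.fna, input.vcf, and filtered.vcf are placeholder paths.
    gatk VariantFiltration \
        -R reference.fna \
        -V input.vcf \
        -O filtered.vcf \
        --filter-name "HF_QD"   --filter-expression "QD < 2.0" \
        --filter-name "HF_FS"   --filter-expression "FS < 30.0" \
        --filter-name "HF_SOR"  --filter-expression "SOR < 2.0" \
        --filter-name "HF_RPRS" --filter-expression "ReadPosRankSum > -4.0"
    ```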

    Kind regards,

    Pamela

  • Bekah W

    Oh yes, sorry ... I ran an older version of the script that I had tested previously; the modified version has the .0 added! I'll see if this one works with the file! Sorry again!

     

  • Bekah W

    Yes, this runs now! Apologies again - entirely my fault!

  • Pamela Bretscher

    Bekah W, great! I'm glad to hear it's working for you now. Let me know if you have any additional questions.

    Kind regards,

    Pamela
