Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data


Error in GenomicsDBImport: Invalid deflate block found

Answered

6 comments

  • Genevieve Brandt (she/her)

    Hi Lauren Hennelly,

    Thanks for writing in to the forum! Let's see if we can get this figured out.

    I think one of your input files might be malformed. You can check your input GVCFs for this chromosome with ValidateVariants. You can also check the HaplotypeCaller logs from when you created these GVCFs to see if there were any issues. To narrow your search: it looks like the problem is in the second batch.
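
    For reference, here is a minimal sketch of such a check (the GVCF path, reference, and interval are placeholders to adapt to your data):

    gatk ValidateVariants \
        -V sample.g.vcf.gz \
        -R reference.fasta \
        -L <chromosome> \
        -gvcf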

    Let me know what you find and if you have any other questions.

    Best,

    Genevieve

  • Cristian Gonzalez-Colin

    Hi GATK community, 

    I have a similar problem with a cohort of 120 human samples: GenomicsDBImport ran fine on every chromosome except chr1. However, my command is slightly different.

    GATK version used: 

    > gatk --version
    Using GATK jar /home/cgonzalez/tools/gatk-4.2.2.0/gatk-package-4.2.2.0-local.jar
    Running:
        java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /home/cgonzalez/tools/gatk-4.2.2.0/gatk-package-4.2.2.0-local.jar --version
    Picked up JAVA_TOOL_OPTIONS: -Djava.io.tmpdir=/scratch
    The Genome Analysis Toolkit (GATK) v4.2.2.0
    HTSJDK Version: 2.24.1
    Picard Version: 2.25.4

    Command:

    gatk  GenomicsDBImport -V ${sample1} -V ${sample2} ... -V ${sampleN} --genomicsdb-workspace-path ${database} -L ${interval_file} --tmp-dir ${params.tempdir}

    When I run it on the whole chromosome, I get this output:

    LOG

    09:57:09.063 INFO  NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/home/cgonzalez/tools/gatk-4.2.2.0/gatk-package-4.2.2.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
    Mar 01, 2022 9:57:09 AM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
    INFO: Failed to detect whether we are running on Google Compute Engine.
    09:57:09.277 INFO  GenomicsDBImport - ------------------------------------------------------------
    09:57:09.279 INFO  GenomicsDBImport - The Genome Analysis Toolkit (GATK) v4.2.2.0
    09:57:09.279 INFO  GenomicsDBImport - For support and documentation go to https://software.broadinstitute.org/gatk/
    09:57:09.280 INFO  GenomicsDBImport - Executing as cgonzalez@compute-2-2.hpc.lji.org on Linux v3.10.0-1160.42.2.el7.x86_64 amd64
    09:57:09.281 INFO  GenomicsDBImport - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_232-b09
    09:57:09.282 INFO  GenomicsDBImport - Start Date/Time: March 1, 2022 9:57:08 AM PST
    09:57:09.283 INFO  GenomicsDBImport - ------------------------------------------------------------
    09:57:09.283 INFO  GenomicsDBImport - ------------------------------------------------------------
    09:57:09.284 INFO  GenomicsDBImport - HTSJDK Version: 2.24.1
    09:57:09.285 INFO  GenomicsDBImport - Picard Version: 2.25.4
    09:57:09.285 INFO  GenomicsDBImport - Built for Spark Version: 2.4.5
    09:57:09.286 INFO  GenomicsDBImport - HTSJDK Defaults.COMPRESSION_LEVEL : 2
    09:57:09.287 INFO  GenomicsDBImport - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
    09:57:09.288 INFO  GenomicsDBImport - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
    09:57:09.289 INFO  GenomicsDBImport - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
    09:57:09.289 INFO  GenomicsDBImport - Deflater: IntelDeflater
    09:57:09.290 INFO  GenomicsDBImport - Inflater: IntelInflater
    09:57:09.291 INFO  GenomicsDBImport - GCS max retries/reopens: 20
    09:57:09.291 INFO  GenomicsDBImport - Requester pays: disabled
    09:57:09.292 INFO  GenomicsDBImport - Initializing engine
    09:58:11.755 INFO  FeatureManager - Using codec BEDCodec to read file file:///home/cgonzalez/myscratch/R24/wgs/tmp/chr1.bed
    09:58:11.769 INFO  IntervalArgumentCollection - Processing 248956421 bp from intervals
    09:58:11.774 INFO  GenomicsDBImport - Done initializing engine
    09:58:12.170 INFO  GenomicsDBLibLoader - GenomicsDB native library version : 1.4.1-d59e886
    09:58:12.178 INFO  GenomicsDBImport - Vid Map JSON file will be written to /mnt/beegfs/lts/cgonzalez/R24/wgs/DICE_Cancer_WGS/2.Processed_data/vcf_database_chr1/vidmap.json
    09:58:12.179 INFO  GenomicsDBImport - Callset Map JSON file will be written to /mnt/beegfs/lts/cgonzalez/R24/wgs/DICE_Cancer_WGS/2.Processed_data/vcf_database_chr1/callset.json
    09:58:12.180 INFO  GenomicsDBImport - Complete VCF Header will be written to /mnt/beegfs/lts/cgonzalez/R24/wgs/DICE_Cancer_WGS/2.Processed_data/vcf_database_chr1/vcfheader.vcf
    09:58:12.180 INFO  GenomicsDBImport - Importing to workspace - /mnt/beegfs/lts/cgonzalez/R24/wgs/DICE_Cancer_WGS/2.Processed_data/vcf_database_chr1
    09:58:53.786 INFO  GenomicsDBImport - Importing batch 1 with 120 samples
    14:00:06.343 INFO  GenomicsDBImport - Shutting down engine
    [March 6, 2022 2:00:06 PM PST] org.broadinstitute.hellbender.tools.genomicsdb.GenomicsDBImport done. Elapsed time: 7,442.96 minutes.
    Runtime.totalMemory()=6135742464
    java.lang.RuntimeException: Invalid deflate block found.
        at com.intel.gkl.compression.IntelInflater.inflateNative(Native Method)
        at com.intel.gkl.compression.IntelInflater.inflate(IntelInflater.java:174)
        at htsjdk.samtools.util.BlockGunzipper.unzipBlock(BlockGunzipper.java:145)
        at htsjdk.samtools.util.BlockGunzipper.unzipBlock(BlockGunzipper.java:96)
        at htsjdk.samtools.util.BlockCompressedInputStream.inflateBlock(BlockCompressedInputStream.java:550)
        at htsjdk.samtools.util.BlockCompressedInputStream.processNextBlock(BlockCompressedInputStream.java:532)
        at htsjdk.samtools.util.BlockCompressedInputStream.nextBlock(BlockCompressedInputStream.java:468)
        at htsjdk.samtools.util.BlockCompressedInputStream.readBlock(BlockCompressedInputStream.java:458)
        at htsjdk.samtools.util.BlockCompressedInputStream.available(BlockCompressedInputStream.java:196)
        at htsjdk.samtools.util.BlockCompressedInputStream.read(BlockCompressedInputStream.java:241)
        at htsjdk.tribble.readers.TabixReader.readLine(TabixReader.java:215)
        at htsjdk.tribble.readers.TabixReader.access$300(TabixReader.java:48)
        at htsjdk.tribble.readers.TabixReader$IteratorImpl.next(TabixReader.java:434)
        at htsjdk.tribble.readers.TabixIteratorLineReader.readLine(TabixIteratorLineReader.java:46)
        at htsjdk.tribble.TabixFeatureReader$FeatureIterator.readNextRecord(TabixFeatureReader.java:170)
        at htsjdk.tribble.TabixFeatureReader$FeatureIterator.next(TabixFeatureReader.java:205)
        at htsjdk.tribble.TabixFeatureReader$FeatureIterator.next(TabixFeatureReader.java:149)
        at org.broadinstitute.hellbender.tools.genomicsdb.GenomicsDBImport$1$NoMnpIterator.next(GenomicsDBImport.java:984)
        at org.broadinstitute.hellbender.tools.genomicsdb.GenomicsDBImport$1$NoMnpIterator.next(GenomicsDBImport.java:975)
        at org.genomicsdb.importer.GenomicsDBImporterStreamWrapper.next(GenomicsDBImporterStreamWrapper.java:110)
        at org.genomicsdb.importer.GenomicsDBImporter.doSingleImport(GenomicsDBImporter.java:578)
        at org.genomicsdb.importer.GenomicsDBImporter.lambda$null$4(GenomicsDBImporter.java:730)
        at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1604)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)


    I tried splitting chr1 into six chunks and ran it again with this interval file:

    > cat hg38.bed
    chr1    1    41492737
    chr1    41492738    82985475
    chr1    82985476    124478213
    chr1    124478214    165970951
    chr1    165970952    207463689
    chr1    207463690    248956422
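
    (As an aside: non-overlapping chunks like these can also be generated automatically, e.g. with bedtools makewindows. Below is a sketch assuming a hypothetical two-column genome file, chr1.genome, holding the contig name and its length. Note that BED intervals are 0-based and half-open, so hand-built 1-based chunks can leave single-base gaps at the boundaries.)

    # chr1.genome contains a single line: chr1<TAB>248956422
    bedtools makewindows -g chr1.genome -w 41492737 > chr1_chunks.bed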

    It crashes while importing the sample batch during the 5th interval:

    LOG

    15:21:05.720 INFO  NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/home/cgonzalez/tools/gatk-4.2.2.0/gatk-package-4.2.2.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
    Mar 15, 2022 3:21:05 PM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
    INFO: Failed to detect whether we are running on Google Compute Engine.
    15:21:05.889 INFO  GenomicsDBImport - ------------------------------------------------------------
    15:21:05.890 INFO  GenomicsDBImport - The Genome Analysis Toolkit (GATK) v4.2.2.0
    15:21:05.891 INFO  GenomicsDBImport - For support and documentation go to https://software.broadinstitute.org/gatk/
    15:21:05.891 INFO  GenomicsDBImport - Executing as cgonzalez@gpu-3-2.hpc.lji.org on Linux v3.10.0-1160.42.2.el7.x86_64 amd64
    15:21:05.892 INFO  GenomicsDBImport - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_232-b09
    15:21:05.892 INFO  GenomicsDBImport - Start Date/Time: March 15, 2022 3:21:05 PM PDT
    15:21:05.893 INFO  GenomicsDBImport - ------------------------------------------------------------
    15:21:05.893 INFO  GenomicsDBImport - ------------------------------------------------------------
    15:21:05.894 INFO  GenomicsDBImport - HTSJDK Version: 2.24.1
    15:21:05.895 INFO  GenomicsDBImport - Picard Version: 2.25.4
    15:21:05.896 INFO  GenomicsDBImport - Built for Spark Version: 2.4.5
    15:21:05.896 INFO  GenomicsDBImport - HTSJDK Defaults.COMPRESSION_LEVEL : 2
    15:21:05.897 INFO  GenomicsDBImport - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
    15:21:05.897 INFO  GenomicsDBImport - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
    15:21:05.897 INFO  GenomicsDBImport - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
    15:21:05.898 INFO  GenomicsDBImport - Deflater: IntelDeflater
    15:21:05.899 INFO  GenomicsDBImport - Inflater: IntelInflater
    15:21:05.899 INFO  GenomicsDBImport - GCS max retries/reopens: 20
    15:21:05.900 INFO  GenomicsDBImport - Requester pays: disabled
    15:21:05.900 INFO  GenomicsDBImport - Initializing engine
    15:22:42.658 INFO  FeatureManager - Using codec BEDCodec to read file file:///home/cgonzalez/myscratch/R24/wgs/tmp/hg38.bed
    15:22:42.673 INFO  IntervalArgumentCollection - Processing 248956416 bp from intervals
    15:22:42.678 INFO  GenomicsDBImport - Done initializing engine
    15:22:43.300 INFO  GenomicsDBLibLoader - GenomicsDB native library version : 1.4.1-d59e886
    15:22:43.309 INFO  GenomicsDBImport - Vid Map JSON file will be written to /home/cgonzalez/myscratch/R24/wgs/tmp/vcf_database_chr1/vidmap.json
    15:22:43.310 INFO  GenomicsDBImport - Callset Map JSON file will be written to /home/cgonzalez/myscratch/R24/wgs/tmp/vcf_database_chr1/callset.json
    15:22:43.310 INFO  GenomicsDBImport - Complete VCF Header will be written to /home/cgonzalez/myscratch/R24/wgs/tmp/vcf_database_chr1/vcfheader.vcf
    15:22:43.311 INFO  GenomicsDBImport - Importing to workspace - /home/cgonzalez/myscratch/R24/wgs/tmp/vcf_database_chr1
    15:24:16.949 INFO  GenomicsDBImport - Importing batch 1 with 120 samples
    03:40:59.710 INFO  GenomicsDBImport - Importing batch 1 with 120 samples
    07:58:09.631 INFO  GenomicsDBImport - Importing batch 1 with 120 samples
    09:13:20.816 INFO  GenomicsDBImport - Importing batch 1 with 120 samples
    23:05:49.016 INFO  GenomicsDBImport - Importing batch 1 with 120 samples
    23:29:03.385 INFO  GenomicsDBImport - Shutting down engine
    [March 20, 2022 11:29:03 PM PDT] org.broadinstitute.hellbender.tools.genomicsdb.GenomicsDBImport done. Elapsed time: 7,687.96 minutes.
    Runtime.totalMemory()=10736893952
    java.lang.RuntimeException: Invalid deflate block found.
        at com.intel.gkl.compression.IntelInflater.inflateNative(Native Method)
        at com.intel.gkl.compression.IntelInflater.inflate(IntelInflater.java:174)
        at htsjdk.samtools.util.BlockGunzipper.unzipBlock(BlockGunzipper.java:145)
        at htsjdk.samtools.util.BlockGunzipper.unzipBlock(BlockGunzipper.java:96)
        at htsjdk.samtools.util.BlockCompressedInputStream.inflateBlock(BlockCompressedInputStream.java:550)
        at htsjdk.samtools.util.BlockCompressedInputStream.processNextBlock(BlockCompressedInputStream.java:532)
        at htsjdk.samtools.util.BlockCompressedInputStream.nextBlock(BlockCompressedInputStream.java:468)
        at htsjdk.samtools.util.BlockCompressedInputStream.readBlock(BlockCompressedInputStream.java:458)
        at htsjdk.samtools.util.BlockCompressedInputStream.available(BlockCompressedInputStream.java:196)
        at htsjdk.samtools.util.BlockCompressedInputStream.read(BlockCompressedInputStream.java:241)
        at htsjdk.tribble.readers.TabixReader.readLine(TabixReader.java:215)
        at htsjdk.tribble.readers.TabixReader.access$300(TabixReader.java:48)
        at htsjdk.tribble.readers.TabixReader$IteratorImpl.next(TabixReader.java:434)
        at htsjdk.tribble.readers.TabixIteratorLineReader.readLine(TabixIteratorLineReader.java:46)
        at htsjdk.tribble.TabixFeatureReader$FeatureIterator.readNextRecord(TabixFeatureReader.java:170)
        at htsjdk.tribble.TabixFeatureReader$FeatureIterator.next(TabixFeatureReader.java:205)
        at htsjdk.tribble.TabixFeatureReader$FeatureIterator.next(TabixFeatureReader.java:149)
        at org.broadinstitute.hellbender.tools.genomicsdb.GenomicsDBImport$1$NoMnpIterator.next(GenomicsDBImport.java:984)
        at org.broadinstitute.hellbender.tools.genomicsdb.GenomicsDBImport$1$NoMnpIterator.next(GenomicsDBImport.java:975)
        at org.genomicsdb.importer.GenomicsDBImporterStreamWrapper.next(GenomicsDBImporterStreamWrapper.java:110)
        at org.genomicsdb.importer.GenomicsDBImporter.doSingleImport(GenomicsDBImporter.java:578)
        at org.genomicsdb.importer.GenomicsDBImporter.lambda$null$4(GenomicsDBImporter.java:730)
        at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1604)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)


    I'm not sure whether the problem is with one of my input files in this specific region. Do you know what the issue could be?

    Lauren Hennelly, were you able to solve it?

  • Lauren Hennelly

    Hi Genevieve and Cristian,

    With Genevieve's advice to use ValidateVariants, I did end up solving the issue!

    I used 95 individuals in the GATK pipeline. After I ran ValidateVariants on the chromosome 36 GVCF files for all of my individuals, the output showed that only one individual gave the "Invalid deflate block found" error. There were no other issues with any other sample for chromosome 36.

    I ended up removing that individual from my dataset, and after rerunning GenomicsDBImport without it, everything worked perfectly.

    Here's the ValidateVariants command I used:


    echo "My SLURM_ARRAY_TASK_ID: " $SLURM_ARRAY_TASK_ID
    GVCF=$(sed "${SLURM_ARRAY_TASK_ID}q;d" list.txt)
    echo ${GVCF}

    gatk ValidateVariants \
    -V /home/hennelly/projects/GATK/GVCFfiles/${GVCF} \
    -L chr36 \
    -R /home/hennelly/fastqfiles/DogRefwithY/genomes/canFam3_withY.fa \
    -gvcf
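
    (For anyone reusing this pattern: saved as a script, it runs one validation per GVCF when submitted as a SLURM job array. A sketch, assuming a hypothetical script name validate_gvcfs.sh and 95 lines in list.txt:

    sbatch --array=1-95 validate_gvcfs.sh

    The sed "Nq;d" idiom prints only line N of list.txt, so each array task validates exactly one file.)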

    Hope that helps! 

    --Lauren

  • Pamela Bretscher

    Thank you for your input and for sharing how you solved the issue, Lauren Hennelly! Cristian Gonzalez-Colin, have you tried running ValidateVariants on your GVCF input files for chromosome 1 to pinpoint any potential issues?

    Kind regards,

    Pamela

  • Cristian Gonzalez-Colin

    Thanks, Pamela Bretscher and Lauren Hennelly, for your input. I was able to find the problematic donor.

    Pamela Bretscher, following Lauren's example, validation for this donor gave me this error:

    14:24:37.501 WARN  ValidateVariants - Current interval chr1:206489377-206489377 overlaps previous interval ending at 206489377
    14:24:37.733 WARN  ValidateVariants - Current interval chr1:206619239-206619239 overlaps previous interval ending at 206619239
    14:24:38.390 WARN  ValidateVariants - Current interval chr1:207100059-207100063 overlaps previous interval ending at 207100063
    14:24:38.390 WARN  ValidateVariants - Current interval chr1:207100091-207100091 overlaps previous interval ending at 207100094
    14:24:38.390 WARN  ValidateVariants - Current interval chr1:207100092-207100094 overlaps previous interval ending at 207100094
    14:24:38.422 WARN  ValidateVariants - Current interval chr1:207118166-207118166 overlaps previous interval ending at 207118169
    14:24:38.422 WARN  ValidateVariants - Current interval chr1:207118167-207118167 overlaps previous interval ending at 207118169
    14:24:38.422 WARN  ValidateVariants - Current interval chr1:207118168-207118169 overlaps previous interval ending at 207118169
    14:24:39.254 INFO  ValidateVariants - Shutting down engine
    [March 24, 2022 2:24:39 PM PDT] org.broadinstitute.hellbender.tools.walkers.variantutils.ValidateVariants done. Elapsed time: 4.32 minutes.
    Runtime.totalMemory()=375697408
    java.lang.RuntimeException: Invalid deflate block found.
        at com.intel.gkl.compression.IntelInflater.inflateNative(Native Method)
        at com.intel.gkl.compression.IntelInflater.inflate(IntelInflater.java:174)
        at htsjdk.samtools.util.BlockGunzipper.unzipBlock(BlockGunzipper.java:145)
        at htsjdk.samtools.util.BlockGunzipper.unzipBlock(BlockGunzipper.java:96)
        at htsjdk.samtools.util.BlockCompressedInputStream.inflateBlock(BlockCompressedInputStream.java:550)
        at htsjdk.samtools.util.BlockCompressedInputStream.processNextBlock(BlockCompressedInputStream.java:532)
        at htsjdk.samtools.util.BlockCompressedInputStream.nextBlock(BlockCompressedInputStream.java:468)
        at htsjdk.samtools.util.BlockCompressedInputStream.readBlock(BlockCompressedInputStream.java:458)
        at htsjdk.samtools.util.BlockCompressedInputStream.available(BlockCompressedInputStream.java:196)
        at htsjdk.samtools.util.BlockCompressedInputStream.read(BlockCompressedInputStream.java:241)
        at htsjdk.tribble.readers.TabixReader.readLine(TabixReader.java:215)
        at htsjdk.tribble.readers.TabixReader.access$300(TabixReader.java:48)
        at htsjdk.tribble.readers.TabixReader$IteratorImpl.next(TabixReader.java:434)
        at htsjdk.tribble.readers.TabixIteratorLineReader.readLine(TabixIteratorLineReader.java:46)
        at htsjdk.tribble.TabixFeatureReader$FeatureIterator.readNextRecord(TabixFeatureReader.java:170)
        at htsjdk.tribble.TabixFeatureReader$FeatureIterator.next(TabixFeatureReader.java:205)
        at htsjdk.tribble.TabixFeatureReader$FeatureIterator.next(TabixFeatureReader.java:149)
        at org.broadinstitute.hellbender.engine.FeatureIntervalIterator.loadNextFeature(FeatureIntervalIterator.java:98)
        at org.broadinstitute.hellbender.engine.FeatureIntervalIterator.loadNextNovelFeature(FeatureIntervalIterator.java:74)
        at org.broadinstitute.hellbender.engine.FeatureIntervalIterator.next(FeatureIntervalIterator.java:62)
        at org.broadinstitute.hellbender.engine.FeatureIntervalIterator.next(FeatureIntervalIterator.java:24)
        at java.util.Iterator.forEachRemaining(Iterator.java:116)
        at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
        at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
        at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
        at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
        at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
        at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
        at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:485)
        at org.broadinstitute.hellbender.engine.VariantWalker.traverse(VariantWalker.java:102)
        at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:1085)
        at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:140)
        at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:192)
        at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:211)
        at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)
        at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)
        at org.broadinstitute.hellbender.Main.main(Main.java:289)

    I checked the log from the HaplotypeCaller step and didn't find any issue when the GVCF was generated. Do you know how I can save this donor's data?

    Thanks again,

    Cristian

  • Pamela Bretscher

    Hi Cristian Gonzalez-Colin,

    Thank you for working through this suggestion and finding the problematic sample. There are a few things you can try to pinpoint the issue with this donor and keep the data. First, try the --bypass-feature-reader argument when running GenomicsDBImport. You can also run PrintBGZFBlockInformation on the file to pinpoint where the error might be. If nothing else works, try decompressing and recompressing the file with bgzip, or reindexing it with tabix, to see whether GATK can then import it. However, if the gzip blocks themselves are corrupt, there may not be much you can do, and the simplest solution is most likely to remove this donor from the analysis. Please let me know if you have any questions.
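
    A sketch of those last suggestions, assuming the donor's file is named donor.g.vcf.gz (a placeholder):

    # Walk the BGZF blocks to locate where the corruption occurs
    gatk PrintBGZFBlockInformation --bgzf-file donor.g.vcf.gz

    # Decompress and recompress; zcat will typically error out at a truly corrupt block
    zcat donor.g.vcf.gz > donor.g.vcf
    bgzip donor.g.vcf

    # Rebuild the tabix index for the recompressed file
    tabix -p vcf donor.g.vcf.gz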

    Kind regards,

    Pamela

