Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Base recalibration

7 comments

  • Genevieve Brandt (she/her)

    Hi Priyadarshini Thirunavukkarasu,

    Could you check that the known indels VCF file is not malformed by running ValidateVariants? If it is not malformed, you can also try re-indexing the VCF with IndexFeatureFile.

    Let me know if these solve the issue.

    Best,

    Genevieve

  • Priyadarshini Thirunavukkarasu

    Hello Genevieve

    I couldn't index the VCF file. I get the error:

    Cannot read file:///scicore/home/cichon/GROUP/memory_optimization/data/index_dict/Homo_sapiens_assembly38.known_indels.vcf.gz because no suitable codecs found

    I also tried running ValidateVariants and I got the same error:

    Cannot read file:///scicore/home/cichon/GROUP/memory_optimization/data/index_dict/Homo_sapiens_assembly38.known_indels.vcf.gz because no suitable codecs found

    Thanks

  • Genevieve Brandt (she/her)

    This file might be malformed. Try deleting it then redownloading it.
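
A quick way to test whether a download is malformed before fetching it again is sketched below. This is only a minimal sketch, assuming a POSIX-like shell with `gzip` available; `check_vcf_gz` is a hypothetical helper name and is not part of GATK. The "no suitable codecs found" message generally means GATK could not recognize the file's format, which can happen when a download is truncated or is not really a gzip-compressed VCF.

```shell
# Hypothetical helper (not a GATK tool): sanity-check a downloaded .vcf.gz
# before passing it to IndexFeatureFile or ValidateVariants.
check_vcf_gz() {
    local vcf="$1"

    # The file must exist and be non-empty.
    if [ ! -s "$vcf" ]; then
        echo "missing-or-empty"
        return 1
    fi

    # gzip -t reads the whole stream and fails on truncated or non-gzip data
    # (for example an HTML page saved under a .vcf.gz name).
    if ! gzip -t "$vcf" 2>/dev/null; then
        echo "not-valid-gzip"
        return 1
    fi

    # A VCF file starts with a "##fileformat=VCF..." header line.
    local first_line
    first_line="$(gzip -dc "$vcf" 2>/dev/null | head -n 1)"
    case "$first_line" in
        '##fileformat=VCF'*) echo "ok"; return 0 ;;
        *) echo "no-vcf-header"; return 1 ;;
    esac
}
```

For example, `check_vcf_gz Homo_sapiens_assembly38.known_indels.vcf.gz` prints `ok` only if the file decompresses cleanly and begins with a VCF header. A passing check does not guarantee the file is fully valid, so ValidateVariants is still worth running afterwards.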

  • Priyadarshini Thirunavukkarasu

    Thanks. Can you suggest a website or link where I can get an indels VCF file? I have downloaded this file many times before, and it seems to cause the same error each time. In any case, I will download it again and repeat the same step.

  • Priyadarshini Thirunavukkarasu

    Hello

    I downloaded the Homo_sapiens_assembly38.known_indels.vcf.gz file from https://console.cloud.google.com/storage/browser/genomics-public-data/resources/broad/hg38/v0?pli=1 and was able to create an index for it. However, when I tried to validate the VCF with the command below, I got this error: Input files reference and features have incompatible contigs: No overlapping contigs found

    gatk ValidateVariants \
        -R "/scicore/home/cichon/GROUP/memory_optimization/data/reference/gch38.fa" \
        -V "/scicore/home/cichon/GROUP/memory_optimization/data/index_dict/resources_broad_hg38_v0_Homo_sapiens_assembly38.dbsnp138.vcf"
    Using GATK jar /scicore/soft/apps/GATK/4.2.2.0-foss-2018b-Java-1.8/gatk-package-4.2.2.0-local.jar
    Running:
    java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /scicore/soft/apps/GATK/4.2.2.0-foss-2018b-Java-1.8/gatk-package-4.2.2.0-local.jar ValidateVariants -R /scicore/home/cichon/GROUP/memory_optimization/data/reference/gch38.fa -V /scicore/home/cichon/GROUP/memory_optimization/data/index_dict/resources_broad_hg38_v0_Homo_sapiens_assembly38.dbsnp138.vcf
    12:56:48.650 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/scicore/soft/apps/GATK/4.2.2.0-foss-2018b-Java-1.8/gatk-package-4.2.2.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
    Oct 04, 2021 12:56:48 PM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
    INFO: Failed to detect whether we are running on Google Compute Engine.
    12:56:48.795 INFO ValidateVariants - ------------------------------------------------------------
    12:56:48.796 INFO ValidateVariants - The Genome Analysis Toolkit (GATK) v4.2.2.0
    12:56:48.796 INFO ValidateVariants - For support and documentation go to https://software.broadinstitute.org/gatk/
    12:56:48.796 INFO ValidateVariants - Executing as thirun0000@login20.cluster.bc2.ch on Linux v3.10.0-1160.41.1.el7.x86_64 amd64
    12:56:48.796 INFO ValidateVariants - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_212-b03
    12:56:48.797 INFO ValidateVariants - Start Date/Time: October 4, 2021 12:56:48 PM CEST
    12:56:48.797 INFO ValidateVariants - ------------------------------------------------------------
    12:56:48.797 INFO ValidateVariants - ------------------------------------------------------------
    12:56:48.797 INFO ValidateVariants - HTSJDK Version: 2.24.1
    12:56:48.797 INFO ValidateVariants - Picard Version: 2.25.4
    12:56:48.798 INFO ValidateVariants - Built for Spark Version: 2.4.5
    12:56:48.798 INFO ValidateVariants - HTSJDK Defaults.COMPRESSION_LEVEL : 2
    12:56:48.798 INFO ValidateVariants - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
    12:56:48.798 INFO ValidateVariants - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
    12:56:48.798 INFO ValidateVariants - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
    12:56:48.798 INFO ValidateVariants - Deflater: IntelDeflater
    12:56:48.798 INFO ValidateVariants - Inflater: IntelInflater
    12:56:48.798 INFO ValidateVariants - GCS max retries/reopens: 20
    12:56:48.798 INFO ValidateVariants - Requester pays: disabled
    12:56:48.799 INFO ValidateVariants - Initializing engine
    12:56:49.262 INFO FeatureManager - Using codec VCFCodec to read file file:///scicore/home/cichon/GROUP/memory_optimization/data/index_dict/resources_broad_hg38_v0_Homo_sapiens_assembly38.dbsnp138.vcf
    12:56:49.956 INFO ValidateVariants - Shutting down engine
    [October 4, 2021 12:56:49 PM CEST] org.broadinstitute.hellbender.tools.walkers.variantutils.ValidateVariants done. Elapsed time: 0.02 minutes.
    Runtime.totalMemory()=162267136
    ***********************************************************************

    A USER ERROR has occurred: Input files reference and features have incompatible contigs: No overlapping contigs found.

    The VCF seems to be mapped to GRCh37, whereas the reference is GRCh38, so I am not able to create a dictionary for this VCF. Is there an indel VCF mapped to GRCh38?
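
The "No overlapping contigs found" error indicates that the VCF and the reference share no contig names. A different genome build is one possible cause; a different naming convention (for example `chr1` versus `1`) is another. The sketch below compares the two name lists directly. It assumes a POSIX-like shell and a FASTA index (`.fai`) next to the reference; the function names are hypothetical helpers, not GATK tools.

```shell
# Hypothetical helpers (not GATK tools) for comparing contig names.

# Print the contig names declared in a VCF header (##contig=<ID=...> lines).
# Reading stops at the first non-header line, so large files are not scanned fully.
vcf_contigs() {
    case "$1" in
        *.gz) gzip -dc "$1" ;;
        *)    cat "$1" ;;
    esac | awk '
        !/^#/ { exit }
        /^##contig=<ID=/ {
            sub(/^##contig=<ID=/, "")   # drop the prefix
            sub(/[,>].*/, "")           # drop everything after the ID
            print
        }' | sort -u
}

# Print the contig names from a FASTA index (.fai): the first tab-separated column.
fai_contigs() {
    cut -f 1 "$1" | sort -u
}

# Print the contig names present in both files.
# Prints nothing and returns non-zero when there is no overlap.
shared_contigs() {
    local tmp status=0
    tmp="$(mktemp)"
    fai_contigs "$2" > "$tmp"
    # -F: fixed strings, -x: whole-line match, -f: read patterns from a file
    vcf_contigs "$1" | grep -Fx -f "$tmp" || status=$?
    rm -f "$tmp"
    return "$status"
}
```

Running `vcf_contigs` on the VCF and `fai_contigs` on the reference index side by side shows which convention each file uses, and `shared_contigs my.vcf reference.fa.fai` (placeholder file names) lists any names they have in common. Note that a VCF without `##contig` header lines will produce an empty list here even if its records are fine.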

     

  • Genevieve Brandt (she/her)

    One of our users located the available files and listed them here: https://gatk.broadinstitute.org/hc/en-us/community/posts/360075305092/comments/360014557672

    Hope you can find what you need!

  • Priyadarshini Thirunavukkarasu

    Thanks
