Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data


Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

GenomicsDBImport just gives IntelInflater - Zero Bytes Written : 0 and GenotypeGVCFs becomes blank vcf

7 comments

  • Genevieve Brandt (she/her)

    Could you share the complete program log?

  • Brian Wiley

    Hi Genevieve,

    I have attached the log from the bsub command. I have also attached an example of one of the sample VCF files.

    Log file: https://wustl.box.com/s/wqierf165hth9vf17ng4vjlr0agp6loa 

    GVCF1: https://wustl.box.com/s/hbawc84mk408bp8n1wvr42llqxaf6syb 
    GVCF2: https://wustl.box.com/s/ktzvvczlk4jqke9q1xzszsm7zudti037 

  • Brian Wiley

    I should note that I did not know you could split multiallelics before running GenomicsDBImport, so I had to reverse the bcftools multiallelic split to be able to run it.

  • Genevieve Brandt (she/her)

    I can't access those files without making an account. If the program log is too long to be pasted, could you paste the part of the program log with the stack trace? I don't think these warning messages are necessarily related to the problem. 

    If that isn't possible, you can upload a bug report to our FTP site: https://gatk.broadinstitute.org/hc/en-us/articles/360035889671

    Thank you!

  • Brian Wiley

    Hi Genevieve,

    The issue was that I had normalized the variants with bcftools, and un-normalizing does not give you back the same file. It works now, but I have a follow-up question: my variants that have a non-reference allele lack these four INFO fields (QD, FS, SOR, and MQ), which are used for hard filtering here: https://gatk.broadinstitute.org/hc/en-us/articles/360035890471-Hard-filtering-germline-short-variants

    Is this because I used the below group annotation (-G) parameters in my call to HaplotypeCaller?

    $ gatk HaplotypeCaller --java-options "-Xmx32g" \
    -R GRCh38_full_analysis_set_plus_decoy_hla.fa \
    -I ${sample}_23153_0_0.bqsr.bam \
    -O ${sample}_23153_0_0.vcf.gz \
    -L tp53.gata2.canonical.splice.1_index.interval_list \
    -ERC GVCF \
    -G AS_StandardAnnotation \
    -G StandardAnnotation

    Sorry, here is an example of one of my variants:

    chr3 128486117 . G C,<NON_REF> 867.64 PASS AS_RAW_BaseQRankSum=|0.1,1|NaN;AS_RAW_MQ=140400.00|97200.00|0.00;AS_RAW_MQRankSum=|0.0,1|NaN;AS_RAW_ReadPosRankSum=|0.8,1|NaN;AS_SB_TABLE=15,24|15,12|0,0;BaseQRankSum=0.112;DP=69;ExcessHet=3.0103;MLEAC=1,0;MLEAF=0.500,0.00;MQRankSum=0.000;RAW_MQandDP=248400,69;ReadPosRankSum=0.855 GT:AD:DP:GQ:PL:SB 0/1:39,27,0:66:99:875,0,1379,992,1461,2453:15,24,15,12
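
A quick way to see which of the hard-filtering annotations a record carries is to parse its INFO column. The following is a minimal Python sketch, not part of GATK: the helper names `parse_info` and `missing_annotations` are hypothetical, and the INFO string is abbreviated from the record above.

```python
# Annotations named in the hard-filtering question in this thread.
HARD_FILTER_KEYS = ("QD", "FS", "SOR", "MQ")


def parse_info(info):
    """Split a VCF INFO string into a dict; flag-style keys map to True."""
    fields = {}
    for item in info.split(";"):
        if not item:
            continue
        key, sep, value = item.partition("=")
        fields[key] = value if sep else True
    return fields


def missing_annotations(info, keys=HARD_FILTER_KEYS):
    """Return the requested keys that are absent from the INFO string."""
    present = parse_info(info)
    return [k for k in keys if k not in present]


# INFO column of the per-sample GVCF record (abbreviated for readability).
gvcf_info = (
    "AS_RAW_MQ=140400.00|97200.00|0.00;BaseQRankSum=0.112;DP=69;"
    "ExcessHet=3.0103;MLEAC=1,0;MLEAF=0.500,0.00;MQRankSum=0.000;"
    "RAW_MQandDP=248400,69;ReadPosRankSum=0.855"
)
print(missing_annotations(gvcf_info))
```

For this record all four keys are reported as missing: the GVCF carries raw intermediates such as RAW_MQandDP and AS_RAW_MQ, but not the finalized QD, FS, SOR, and MQ values.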
  • Brian Wiley

    OK, so they are in the GenotypeGVCFs output after joint genotyping. So are these hard filters meant to be applied at the population level rather than at the individual level on HaplotypeCaller output? If so, it may be a good idea to add a sentence about this to the hard-filtering documentation.

  • Genevieve Brandt (she/her)

    I'm glad that you were able to get to the bottom of your issue! 

    Yes, these annotations are at the population level; the purpose is to filter out sites whose data cannot be trusted to give accurate results. Even with VQSR, it is variant sites that are filtered out, not an individual sample's data. I'll pass that along as a note to our documentation team, though, to hopefully spare people this confusion in the future!

    We also have documentation on these annotations, which can be found here: https://gatk.broadinstitute.org/hc/en-us/articles/4409907944219--Tool-Documentation-Index#VariantAnnotations
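
To make the site-level nature of these filters concrete, here is a minimal Python sketch that evaluates one site's INFO annotations against a set of thresholds. It is an illustration only: in practice this step is done with GATK's VariantFiltration tool. The threshold values and filter names are assumptions modeled on the SNP starting points in the hard-filtering article linked earlier in the thread, the function name `site_filters` is hypothetical, and the annotation values are made up for the example.

```python
# Assumed SNP thresholds: filter name -> (INFO key, predicate meaning "fails").
SNP_HARD_FILTERS = {
    "QD2": ("QD", lambda v: v < 2.0),
    "FS60": ("FS", lambda v: v > 60.0),
    "SOR3": ("SOR", lambda v: v > 3.0),
    "MQ40": ("MQ", lambda v: v < 40.0),
}


def site_filters(info_values, filters=SNP_HARD_FILTERS):
    """Return the names of the filters a site fails, or ["PASS"] if none.

    In this sketch a missing annotation is skipped, not treated as a failure.
    """
    failed = []
    for name, (key, fails) in filters.items():
        value = info_values.get(key)
        if value is not None and fails(float(value)):
            failed.append(name)
    return failed or ["PASS"]


# Made-up annotation values for two joint-genotyped sites.
good_site = {"QD": 13.1, "FS": 1.2, "SOR": 0.8, "MQ": 60.0}
bad_site = {"QD": 1.4, "FS": 72.5, "SOR": 0.8, "MQ": 60.0}

print(site_filters(good_site))
print(site_filters(bad_site))
```

The verdict is computed once per site from annotations that summarize all samples, which is why a failing site is filtered for every sample at that position rather than for one individual's genotype.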

