Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data


Created panel of normals returns empty vcf

Answered

3 comments

  • Genevieve Brandt (she/her)

    Hi Rafael Viana,

    I think there might have been an issue with your mapping step first because many reads were filtered out during Mutect2:

    3590243 read(s) filtered by: MappingQualityReadFilter

    It looks like the reads are not mapped well to the reference you are using. Is there an issue with the reference you edited and are using?

    Best,

    Genevieve

  • Rafael Viana

    Hi Genevieve,
    Thank you for the response.

    To produce the exomes we used a commercial kit (Agilent SureSelect v6).
    Reads were mapped to Homo sapiens GRCh38p13.rel102.dna.primary_assembly.
    Qualimap reports that about 97% of the WES target region is covered at 30x in all cases, and
    the mean mapping quality is consistently between 57.8 and 57.95.
    These samples are healthy individuals who were sequenced as part of germline trios (parents of sick children).
    A pipeline using other GATK tools ("HaplotypeCaller" -> "VariantRecalibrator" -> "VariantFiltration") was successful.
    The final number of mapped reads is around 40 million (paired-end, 150 bp), so I think losing roughly 3.6 million should not be so dramatic.
    Any idea what I can do? Should I lower the quality thresholds?
    Thank you for your time.

    Kind regards

  • Genevieve Brandt (she/her)

    Thanks for the clarification, Rafael Viana. In that case, it looks like the issue arises in GenomicsDBImport, since no variants are present in the input to CreateSomaticPanelOfNormals:

    16:14:54.444 INFO ProgressMeter - Traversal complete. Processed 0 total variants in 0.0 minutes.

    Here is our usage guide for GenomicsDBImport: https://gatk.broadinstitute.org/hc/en-us/articles/360056138571-GDBI-usage-and-performance-guidelines

    I would recommend decreasing the batch size and, if you are using a cluster or other shared file system, adding the --genomicsdb-shared-posixfs-optimizations true argument. Did it run successfully without --merge-input-intervals? If so, I would recommend trying it again with those other arguments and without merging intervals.

    Best,

    Genevieve
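
Before rerunning GenomicsDBImport, it may help to confirm that the per-normal Mutect2 VCFs going into it actually contain records, which would separate an empty-input problem from an import problem. The following is only a minimal sketch of such a check: the function names are hypothetical, it is not part of GATK, and it counts non-header lines rather than validating the VCF.

```python
import gzip
import sys
from pathlib import Path


def count_vcf_records(path):
    """Count non-header, non-blank lines in a plain or gzip/bgzip-compressed VCF."""
    path = Path(path)
    # bgzip output is a valid multi-member gzip stream, so gzip.open can read it.
    opener = gzip.open if path.suffix == ".gz" else open
    with opener(path, "rt") as handle:
        return sum(1 for line in handle if line.strip() and not line.startswith("#"))


def find_empty_vcfs(paths):
    """Return the paths (as strings) whose VCF contains no variant records."""
    return [str(p) for p in paths if count_vcf_records(p) == 0]


if __name__ == "__main__":
    # Usage (file names are placeholders): python check_vcfs.py normal1.vcf.gz normal2.vcf.gz
    for vcf_path in sys.argv[1:]:
        print(vcf_path, count_vcf_records(vcf_path))
```

If every input VCF reports records but CreateSomaticPanelOfNormals still processes 0 variants, that would be consistent with the problem sitting in the GenomicsDBImport step rather than in Mutect2.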

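
The last suggestion in the thread (smaller batch size, shared-filesystem optimizations, no interval merging) could be assembled into a GenomicsDBImport call roughly as sketched below. The helper name, file names, workspace path, and the batch size of 10 are all placeholders chosen for illustration, not values from the thread; the GATK options themselves are the ones named above plus the standard -R, -L, -V, --genomicsdb-workspace-path, and --batch-size arguments.

```python
import shlex


def build_genomicsdb_import_cmd(normal_vcfs, workspace, reference, intervals,
                                batch_size=10, shared_fs=True, merge_intervals=False):
    """Build an argv list for `gatk GenomicsDBImport` (hypothetical helper)."""
    cmd = [
        "gatk", "GenomicsDBImport",
        "-R", reference,
        "-L", intervals,
        "--genomicsdb-workspace-path", workspace,
        "--batch-size", str(batch_size),  # smaller batches, as suggested in the thread
    ]
    if shared_fs:
        # Recommended in the thread for clusters and other shared file systems.
        cmd += ["--genomicsdb-shared-posixfs-optimizations", "true"]
    if merge_intervals:
        # Left off by default: the thread suggests retrying without merging intervals.
        cmd.append("--merge-input-intervals")
    for vcf in normal_vcfs:
        cmd += ["-V", vcf]
    return cmd


if __name__ == "__main__":
    # All paths below are placeholders.
    example = build_genomicsdb_import_cmd(
        ["normal1.vcf.gz", "normal2.vcf.gz"],
        workspace="pon_db",
        reference="reference.fasta",
        intervals="targets.interval_list",
    )
    print(" ".join(shlex.quote(arg) for arg in example))
```

The resulting list can be passed to subprocess.run, or printed and pasted into a shell. The workspace would then be handed to CreateSomaticPanelOfNormals as -V gendb://pon_db.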
