Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data


Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

The logic of joint calling for germline short variants

10 comments

  • Kaalindi Misra

    Hello,

    Thank you for the explanation. I have a question about what types of samples I can include in a cohort-type analysis.

    Assume that they are all WES (same panel) and of the same ancestry, just sequenced in separate batches.

    If my cohort has 100 cases, made up of 8 families plus sporadic cases, should I run them all together, or divide them into one batch per family and one batch for the sporadic cases?

    My other question is: if I have 100 cases and 40 controls, can I do the 'joint genotyping' on them together, or should I do it separately?

    Thank you for your help.

  • Degang Wu

    Hello,

    Could anyone explain the meaning of "rescuing genotype calls" under the section titled "2. Greater sensitivity for low-frequency variants"? Does it mean that the genotype calls for low-frequency variants would be more accurate? Or does it mean that low-frequency variant sites would be detected more easily?

    Thanks for your help.

  • Kevin Esoh

    This is such a great explanation

  • John-Hanson Machado

    Best practices link results in a 404 error

  • Priyal Visavadiya JRF

    I had the same question about joint-calling a cohort. I assume that the diseased (case) and control samples should not be merged for cohort calling, on the thinking that high-confidence SNPs in the control samples might influence the variant calling procedure.

  • Saeed Farajzadeh Valilou

    I have a thousand whole-genome sequencing VCF files. Is there any way to determine with certainty whether these VCF files were individually called or jointly called? Is there a line in the header I can look at for this?

  • Degang Wu

    @Saeed GATK leaves the commands used to call the genotypes in the header, so you can infer from there whether your VCFs were joint-called or not.

  • Saeed Farajzadeh Valilou

    @Degang I have this line in the header for GenotypeGVCFs:

    ##GATKCommandLine=<ID=HaplotypeCaller,Version=2014.4-3.3.0-0-ga3711aa,

    But I am not sure what input was provided to GenotypeGVCFs: (1) a single single-sample GVCF, (2) a single multi-sample GVCF created by CombineGVCFs, or (3) a GenomicsDB workspace created by GenomicsDBImport. Is there a specific part of the command line that shows this?

     

  • Adrian Platts

    As mentioned in replies above sent in 2021, the link to best practices still seems to generate a 404 error.
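
For the question above about telling individually called VCFs from jointly called ones, the header check suggested in the thread might be sketched as follows. This is a minimal sketch, not a definitive test: it assumes the files carry `##GATKCommandLine` header lines with an `ID=` field (as in the line quoted above) and a standard `#CHROM` column line. The function names and the example header are hypothetical, and the recorded command-line contents differ between GATK versions, so the input passed to GenotypeGVCFs still has to be read from the recorded command by eye.

```python
import gzip
import re


def open_vcf(path):
    """Open a plain or gzip/bgzip-compressed VCF as text (hypothetical helper)."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path)


def summarize_vcf_header(lines):
    """Return (GATK tool IDs, sample names) found in VCF header lines."""
    tools = []
    samples = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("##GATKCommandLine"):
            # Pull the tool name out of the "<ID=...," field.
            match = re.search(r"<ID=([^,>]+)", line)
            if match:
                tools.append(match.group(1))
        elif line.startswith("#CHROM"):
            # The first 9 columns are fixed; the rest are sample names.
            samples = line.split("\t")[9:]
            break
        elif not line.startswith("#"):
            break  # reached variant records without a column line
    return tools, samples


# Hypothetical header, made up for illustration only.
example_header = [
    "##fileformat=VCFv4.2",
    '##GATKCommandLine=<ID=HaplotypeCaller,CommandLine="(elided)">',
    '##GATKCommandLine=<ID=GenotypeGVCFs,CommandLine="(elided)">',
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tsampleA\tsampleB\tsampleC",
]

tools, samples = summarize_vcf_header(example_header)
print("GATK tools recorded:", tools)
print("Number of samples:", len(samples))
```

For a real file, `summarize_vcf_header(open_vcf(path))` would be called on each VCF. More than one sample column together with a GenotypeGVCFs entry is consistent with joint genotyping, while a single-sample file is ambiguous, since GenotypeGVCFs can also be run on one single-sample GVCF.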
