Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

VCF - Variant Call Format

8 comments

  • Loren Cassin Sackett

    The last paragraph says "This is how people used to do variant analysis on large numbers of samples, but we do not recommend proceeding this way because that workflow suffers from serious methodological flaws." Could you please explain or link to a document that explains these methodological flaws? 

    4
  • ISmolicz

    As Loren Cassin Sackett has mentioned, it would be useful to have a further explanation. If it is not advised to combine VCFs for multiple samples, what would be the recommended method to compare variants in separate VCFs for multiple samples after variant calling, filtering and annotation has been completed?

    Thank you for your help.

    Kind regards.

    2
  • lid.zigh

    Hello, 

    I have the same question too.

    "If it is not advised to combine VCFs for multiple samples, what would be the recommended method to compare variants in separate VCFs for multiple samples after variant calling, filtering and annotation has been completed?"

    Could anyone answer this question?

    Thank you.

    1
  • Lynn Fink

    Is support planned for 5-methyl-cytosine bases in the VCF format? I am currently performing an accredited NGS methylation clinical assay on cancer specimens, sometimes in combination with an NGS genetic clinical assay on the same patient, and I would like to combine the output from both assays into a single VCF. I have base-resolution for the CpG sites I interrogate, but I can't see that the VCF format supports bases outside of A,C,G,T, and N so I don't know how to encode that information.

    0
  • Neil Humphryes-Kirilov

    I also have the same question about how to combine sample VCFs. I am using a public VCF dataset for which BAM files are not available, so I cannot do multi-sample variant calling. It would be great to know how best to tackle this.

    Also the link to the "Best Practices" article does not work for me.

    0
  • Roller, Eric

    Can you confirm that this definition of GQ is different than the GQ defined in the VCF spec? From the spec:

    GQ (Integer): Conditional genotype quality, encoded as a phred quality −10log10 p(genotype call is wrong, conditioned on the site’s being variant). 
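To make the comparison concrete, here is a minimal Python sketch of two candidate GQ definitions. The first follows the spec wording quoted above. The second is an assumption that is not confirmed anywhere in this thread: GQ taken as the difference between the second-lowest and the lowest normalized PL value, capped at 99. The function names and the example genotype likelihoods are hypothetical.

```python
import math

def phred(p):
    """Phred-scale a probability: -10 * log10(p)."""
    return -10.0 * math.log10(p)

def gq_from_posteriors(posteriors):
    """Spec-style GQ: Phred-scaled probability that the called genotype is wrong."""
    total = sum(posteriors)
    p_wrong = 1.0 - max(posteriors) / total
    return phred(p_wrong)

def gq_from_pls(pls, cap=99):
    """Assumed alternative: second-lowest PL minus lowest PL, capped at `cap`."""
    ordered = sorted(pls)
    return min(ordered[1] - ordered[0], cap)

# Hypothetical likelihoods for genotypes 0/0, 0/1, 1/1 (flat prior assumed).
likelihoods = [0.001, 0.6, 0.399]

# Normalized PLs: Phred-scale each likelihood relative to the best one,
# so the most likely genotype gets PL = 0.
pls = [round(phred(l / max(likelihoods))) for l in likelihoods]

gq_spec = gq_from_posteriors(likelihoods)
gq_pl = gq_from_pls(pls)
print(f"PLs: {pls}, spec-style GQ: {gq_spec:.2f}, PL-difference GQ: {gq_pl}")
```

With these made-up numbers the two definitions give different values, which is the discrepancy the question is about.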

    0
  • Roller, Eric

    If there is 50/50 chance that the variant is present, what should the QUAL be?

    I would expect -10log10(0.5) = ~3
    But if you are normalizing similar to how your PL values are normalized then this may end up being
    -10log10(0.5/0.5) = 0

    Which one does GATK report and can you confirm if this is compliant with the VCF spec:
    QUAL — quality: Phred-scaled quality score for the assertion made in ALT. i.e. −10log10 prob(call in ALT is wrong). If ALT is ‘.’ (no variant) then this is −10log10 prob(variant), and if ALT is not ‘.’ this is −10log10 prob(no variant). If unknown, the MISSING value must be specified. (Float)
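The two candidate values above can be reproduced with a short Python sketch. It only restates the arithmetic in the question; it does not show which value GATK actually reports. The function names are hypothetical, and the "normalized" variant assumes a PL-style normalization in which each probability is divided by the largest one.

```python
import math

def phred(p):
    """Phred-scale a probability: -10 * log10(p)."""
    return -10.0 * math.log10(p)

def qual_unnormalized(p_no_variant):
    """Spec wording when ALT is not '.': -10 * log10 prob(no variant)."""
    return phred(p_no_variant)

def qual_pl_normalized(p_no_variant):
    """Assumed PL-style normalization: divide by the larger of the two probabilities."""
    p_variant = 1.0 - p_no_variant
    # abs() only turns Python's -0.0 into 0.0 for display.
    return abs(phred(p_no_variant / max(p_variant, p_no_variant)))

p = 0.5  # 50/50 chance that the variant is present
print(f"unnormalized: {qual_unnormalized(p):.2f}")
print(f"PL-style normalized: {qual_pl_normalized(p):.2f}")
```

The first function gives roughly 3 and the second gives 0, matching the two possibilities raised in the question.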

    0
  • Miguel Soares

    Hello,

    The links to the Best Practices recommendation articles are not working. I have run all the steps of the pipeline, but I want to keep individual-level variation, or at least perform joint calling on the cohort without losing individual information.

    Can you suggest how to proceed?

    0