Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

(How to) Consolidate GVCFs for joint calling with GenotypeGVCFs

7 comments

  • Robert Butler

    The syntax in this article is incorrect: a comma-separated list of intervals will not work: `--intervals chr20,chr21`

    Instead, it seems that a list file is required? See: https://gatk.broadinstitute.org/hc/en-us/articles/360035531852-Intervals-and-interval-lists
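
    The list-file approach could look like the following minimal sketch. It assumes GATK accepts a plain-text file with one interval per line when the file name ends in .list, as the linked page describes. The file name intervals.list is hypothetical, the sample paths are taken from the article's example, and the command is only printed rather than run:

    ```shell
    # Write one interval per line to a hypothetical list file.
    printf '%s\n' chr20 chr21 > intervals.list

    # Print the command for inspection instead of running it.
    echo gatk GenomicsDBImport \
        -V data/gvcfs/mother.g.vcf \
        -V data/gvcfs/father.g.vcf \
        -V data/gvcfs/son.g.vcf \
        --genomicsdb-workspace-path my_database \
        --intervals intervals.list
    ```

    Dropping the leading echo would run the import, provided gatk is on the PATH and the inputs exist.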

  • Junhao Su

    An error occurred when I ran joint genotyping; it seems that -newQual is not a valid option for `GenotypeGVCFs`.

    Error message:

    A USER ERROR has occurred: -newQual is not a recognized option

  • Jonathan Klonowski

    Can you append a GVCF to a previously created gendb database?

    For example:

    gatk GenomicsDBImport \
        -V gendb://my_database \
        -V data/gvcfs/mother.g.vcf \
        -V data/gvcfs/father.g.vcf \
        -V data/gvcfs/son.g.vcf \
        --genomicsdb-workspace-path my_database \
        --intervals chr20,chr21

    That way you would not have to rerun GenomicsDBImport on all samples each time new samples are genotyped.

  • Sandra Bohn

    I tried this with the intention of combining gVCFs and using SelectVariants (as described in the addendum) with the --concordance flag to extract a set of variants from my gVCFs. However, the output had all missing data for the GT fields, and after some searching I found that SelectVariants does not output the GT field. Is there a way to extract GTs without using the genotyper? My gVCFs were produced by another caller.

    Also, I could not get chr20,chr21 to work either.

  • Jonathan Klonowski

    No one actually looks at the comments. Lol. Useless.

  • jingchun sun

    I combined some GVCFs with GenomicsDBImport and then continued to add other samples. However, I have found some mistakes in the earlier samples. How do I delete these samples from the GenomicsDBImport database?

  • Michael

    gatk GenomicsDBImport \
        -V data/gvcfs/mother.g.vcf \
        -V data/gvcfs/father.g.vcf \
        -V data/gvcfs/son.g.vcf \
        --genomicsdb-workspace-path my_database \
        --intervals chr20,chr21

    The example is possibly misleading because --intervals doesn't seem to take a comma-separated list of arguments. At least with --intervals NC_001133.9,NC_001134.8 in GATK v4.2.6.1, this gives:

    A USER ERROR has occurred: Badly formed genome unclippedLoc: Query interval "NC_001133.9,NC_001134.8" is not valid for this input.

    Maybe this is because of the formatting of the RefSeq IDs, or the example should possibly be given as:

    gatk GenomicsDBImport \
        -V data/gvcfs/mother.g.vcf \
        -V data/gvcfs/father.g.vcf \
        -V data/gvcfs/son.g.vcf \
        --genomicsdb-workspace-path my_database \
        --intervals chr20 --intervals chr21
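
    For scripts that already hold the contigs as a comma-separated string, one possible workaround is to expand that string into repeated --intervals flags before calling GATK. A minimal sketch, assuming a POSIX shell with sed; expand_intervals is a hypothetical helper name, not part of GATK, and the command is only printed rather than run:

    ```shell
    # Hypothetical helper: turn "chr20,chr21" into
    # "--intervals chr20 --intervals chr21".
    expand_intervals() {
        # Prefix the first entry, then replace each comma with a new flag.
        printf '%s\n' "$1" | sed 's/^/--intervals /; s/,/ --intervals /g'
    }

    # Print the command for inspection instead of running it.
    # The substitution is left unquoted on purpose so each flag and value
    # becomes its own argument.
    echo gatk GenomicsDBImport \
        -V data/gvcfs/mother.g.vcf \
        -V data/gvcfs/father.g.vcf \
        -V data/gvcfs/son.g.vcf \
        --genomicsdb-workspace-path my_database \
        $(expand_intervals "chr20,chr21")
    ```

    The same helper should also handle dotted contig names such as NC_001133.9,NC_001134.8, since only commas are rewritten.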

     
