Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

GenomicsDBImport: are all intervals being processed or is there an error?

Answered

4 comments

  • Genevieve Brandt (she/her)

    Hi ISmolicz, you can check the data in the GenomicsDB workspace with SelectVariants.

  • ISmolicz

    Thank you for your help, Genevieve Brandt. I checked the data and multiple genomic locations are present, although only one genomic location was listed in the error log.

    However, please may I ask how data for only one sample can be extracted from GenomicsDB using SelectVariants? I used the --sample-name option but the 'tumor_sample' name in the output VCF header was different to the sample I was requesting data for. Is this due to the 'tumor_sample' name automatically being set to the first sample imported into GenomicsDB?

    Thank you again.

  • Genevieve Brandt (she/her)

    ISmolicz

    The GATK support team is focused on resolving questions about GATK tool-specific errors and abnormal results from the tools. For all other questions, such as this one, we are building a backlog to work through when we have the capacity.

    Please continue to post your questions because we will be mining them for improvements to documentation, resources, and tools.

    We cannot guarantee a reply; however, we ask other community members to help out if they know the answer.

    For context, check out our support policy.

  • Genevieve Brandt (she/her)

    The progress meter will not update with all sites, so the GenomicsDB workspace could have been created just fine, depending on the size of your input VCFs. I'm not sure what happened with your sample names. It is probably easiest to explicitly specify the sample names you want in the GenomicsDB by using the --sample-name-map argument.

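For anyone following the two suggestions in this thread (checking the workspace with SelectVariants, and naming samples explicitly with --sample-name-map), here is a minimal Python sketch that only assembles the command lines and writes the tab-delimited sample map. The sample names, file paths, workspace name, and interval are hypothetical placeholders, and the helper function names are made up for illustration. The thread does not establish whether this changes the 'tumor_sample' header line, so treat it as a starting point rather than a confirmed fix.

```python
import shlex
import tempfile
from pathlib import Path
from typing import Dict, List, Optional


def write_sample_name_map(samples: Dict[str, str], map_path: Path) -> Path:
    """Write one 'sample_name<TAB>path_to_gvcf' line per sample."""
    lines = [f"{name}\t{gvcf}" for name, gvcf in samples.items()]
    map_path.write_text("\n".join(lines) + "\n")
    return map_path


def genomicsdb_import_cmd(workspace: str, sample_map: Path,
                          intervals: List[str]) -> List[str]:
    """Build a GenomicsDBImport command that takes samples from a sample map."""
    cmd = ["gatk", "GenomicsDBImport",
           "--genomicsdb-workspace-path", workspace,
           "--sample-name-map", str(sample_map)]
    for interval in intervals:
        cmd += ["-L", interval]  # one -L per interval to import
    return cmd


def select_variants_cmd(reference: str, workspace: str, output: str,
                        sample_name: Optional[str] = None) -> List[str]:
    """Build a SelectVariants command that reads from a GenomicsDB workspace."""
    cmd = ["gatk", "SelectVariants",
           "-R", reference,
           "-V", f"gendb://{workspace}",
           "-O", output]
    if sample_name is not None:
        cmd += ["--sample-name", sample_name]  # restrict output to one sample
    return cmd


if __name__ == "__main__":
    # Hypothetical inputs; replace with your own samples and paths.
    tmp_dir = Path(tempfile.mkdtemp())
    sample_map = write_sample_name_map(
        {"sampleA": "sampleA.g.vcf.gz", "sampleB": "sampleB.g.vcf.gz"},
        tmp_dir / "sample_map.txt",
    )
    import_cmd = genomicsdb_import_cmd("my_workspace", sample_map, ["chr20"])
    select_cmd = select_variants_cmd("ref.fasta", "my_workspace",
                                     "sampleA.vcf.gz", sample_name="sampleA")
    # Print the commands for review; run them yourself (or via subprocess.run)
    # once the paths point at real files.
    print(shlex.join(import_cmd))
    print(shlex.join(select_cmd))
```

Printing the commands first makes it easy to confirm that every sample you expect is listed in the map before importing, and that the name passed to --sample-name matches a name in the map exactly.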
