Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data

Error in Running command GtcToVcf

4 comments

  • Nur Adlina Binti Mohd Affian

    Hi GATK Team, 

    I am trying to convert a GTC file to a VCF file with GtcToVcf, but an error appears when I run the command. Could you please advise me on this?

  • David Roazen

    Hi Nur Adlina Binti Mohd Affian,

    It looks like the assembly (AS) tag in the sequence dictionary (i.e., the ".dict" file) for your fasta reference has the value "true". This particular tool requires the assembly tag to have the value "GRCh37".

    Regards,

    David
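
One way to check this before re-running the tool is to read the AS values straight out of the ".dict" file, which is a tab-separated SAM-style header with one @SQ line per contig. The following is a minimal sketch, not part of GATK: the function names `assembly_tags` and `set_assembly` and the two-contig example dictionary are hypothetical, and rewriting the file by hand is an assumption about what is acceptable for your reference. Regenerating the dictionary with the assembly name set explicitly (Picard's CreateSequenceDictionary documents a GENOME_ASSEMBLY argument) may be the cleaner route; check the tool documentation for your version.

```python
def assembly_tags(dict_text):
    """Return the set of AS values on @SQ lines (None if a line has no AS tag)."""
    values = set()
    for line in dict_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        # Each field after the record type looks like TAG:value; split on the
        # first colon only, because values such as UR:file:/path contain colons.
        fields = line.split("\t")[1:]
        tags = dict(f.split(":", 1) for f in fields if ":" in f)
        values.add(tags.get("AS"))
    return values


def set_assembly(dict_text, assembly):
    """Return dict_text with AS set to `assembly` on every @SQ line."""
    out = []
    for line in dict_text.splitlines():
        if line.startswith("@SQ"):
            # Drop any existing AS field, then append the requested one.
            fields = [f for f in line.split("\t") if not f.startswith("AS:")]
            fields.append("AS:" + assembly)
            line = "\t".join(fields)
        out.append(line)
    return "\n".join(out) + "\n"


# Hypothetical two-contig dictionary with the problem described in the thread.
example = (
    "@HD\tVN:1.6\n"
    "@SQ\tSN:1\tLN:249250621\tAS:true\n"
    "@SQ\tSN:2\tLN:243199373\tAS:true\n"
)

print(assembly_tags(example))   # AS values before the fix
fixed = set_assembly(example, "GRCh37")
print(assembly_tags(fixed))     # AS values after the fix
```

To use it on a real file, read the ".dict" file into a string, call `assembly_tags` to confirm what the tool is seeing, and write the output of `set_assembly` to a new file rather than overwriting the original.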

  • Nur Adlina Binti Mohd Affian

    Dear David, 

    I have successfully run the conversion. However, I found a lot of duplicate records in the VCF file. How can I remove the duplicates? Is there a command I can use to do so?

  • David Roazen

    Hi Nur Adlina Binti Mohd Affian,

    I believe that bcftools can do this -- specifically the "bcftools norm" command as discussed here: https://www.biostars.org/p/420990/

    Regards,

    David
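
To see how many records are affected before running bcftools, the idea can be illustrated in plain Python. This is a minimal sketch under a stated assumption: a "duplicate" here means a record with the same CHROM, POS, REF and ALT as an earlier one, which the thread does not confirm. It is not how bcftools works internally, and the function name `drop_duplicate_records` and the example records are hypothetical. For real data, the bcftools norm manual describes a duplicate-removal option (`--rm-dup`); check the manual for the values your version accepts.

```python
def drop_duplicate_records(vcf_lines):
    """Keep the first record per (CHROM, POS, REF, ALT); pass header lines through."""
    seen = set()
    kept = []
    for line in vcf_lines:
        if line.startswith("#"):
            # Meta-information and column header lines are never deduplicated.
            kept.append(line)
            continue
        chrom, pos, _id, ref, alt = line.split("\t")[:5]
        key = (chrom, pos, ref, alt)
        if key in seen:
            continue  # same site and alleles already emitted
        seen.add(key)
        kept.append(line)
    return kept


# Hypothetical records: the second data line repeats the first under another ID.
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t1000\trs1\tA\tG\t.\t.\t.",
    "1\t1000\trs1_dup\tA\tG\t.\t.\t.",
    "1\t2000\trs2\tC\tT\t.\t.\t.",
]

deduped = drop_duplicate_records(example)
print(len(example) - len(deduped), "duplicate record(s) dropped")
```

Keeping the first occurrence is a design choice for the sketch; if the duplicates carry different genotypes or IDs, decide which record to keep before discarding any.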
