Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data


Picard LiftOverVCF 2.22.3. hs37d5_to_GRCh38. Many mismatched reference alleles

3 comments

  • danilovkiri

    Hi.

    Try running `bcftools norm` before the liftover. It might help.

    Also, after normalizing with bcftools, have a look at the rejected VCF file (the REJECT argument specifies the file that will contain all rejected VCF entries). It might help you find the problem.

  • Argonaut44

    Thank you for the feedback. I will try bcftools norm. Should I also create a custom hs37d5_to_GRCh38 chain file, or is that not an option?

  • Argonaut44

    That was my mistake. I had read in a bioinformatics book that the reference file should be the one the VCF file was mapped to. As a result, I did not read the original GATK documentation properly and got a large number of rejected variants (only 18% were lifted over). Changing the reference file to the target FASTA (GRCh38) increased the successful liftover rate to ~95%.

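The resolution in this thread comes down to which FASTA goes where: `bcftools norm` works against the build the VCF is currently on, while the reference given to Picard LiftoverVcf must be the target build. The sketch below only assembles the two command lines and estimates the liftover rate from the output and REJECT files. All file names are hypothetical, a `picard` wrapper script on the PATH is assumed (otherwise substitute `java -jar picard.jar`), and the helper function names are made up for illustration.

```python
import gzip
from pathlib import Path


def normalize_command(input_vcf, output_vcf, source_fasta):
    """Build (not run) a bcftools norm call against the SOURCE build."""
    return [
        "bcftools", "norm",
        "-f", source_fasta,   # reference the VCF is currently on (e.g. hs37d5)
        "-m", "-any",         # split multiallelic records (optional choice)
        "-O", "z",            # write compressed VCF
        "-o", output_vcf,
        input_vcf,
    ]


def liftover_command(input_vcf, output_vcf, chain, target_fasta, reject_vcf):
    """Build (not run) a Picard LiftoverVcf call against the TARGET build."""
    return [
        "picard", "LiftoverVcf",  # assumption: `picard` wrapper on PATH
        f"I={input_vcf}",
        f"O={output_vcf}",
        f"CHAIN={chain}",
        f"REJECT={reject_vcf}",   # rejected records are written here
        f"R={target_fasta}",      # target build (e.g. GRCh38), not the source
    ]


def count_records(vcf_path):
    """Count non-header lines in a plain or gzip-compressed VCF."""
    path = Path(vcf_path)
    opener = gzip.open if path.suffix == ".gz" else open
    with opener(path, "rt") as handle:
        return sum(
            1 for line in handle
            if line.strip() and not line.startswith("#")
        )


def liftover_rate(lifted_vcf, rejected_vcf):
    """Fraction of records that were lifted over successfully."""
    lifted = count_records(lifted_vcf)
    rejected = count_records(rejected_vcf)
    total = lifted + rejected
    return lifted / total if total else 0.0
```

Each returned list can be passed to `subprocess.run(cmd, check=True)`. Afterwards, `liftover_rate("lifted.vcf", "rejected.vcf")` gives a figure comparable to the 18% and ~95% quoted above.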
