Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

LiftoverVcf (Picard)

6 comments

  • Riaz Gillani

    Hello - I am lifting over a VCF from UCSC hg19 coordinates (with "chr" prefixes) to b37 coordinates. My understanding is that to use this tool successfully, I would need a chain file named something like hg19tob37.chain.

    Is this correct? Does such a file exist, and if so, where can I find it? It is not at the UCSC links listed above.

    Thanks,

    Riaz

  • Riaz Gillani

    Some of the legacy forum posts suggest it should be here: ftp://ftp.broadinstitute.org/Liftover_Chain_Files

    However, that location does not seem to be accessible.

    Thanks,

    Riaz

  • Jose Ferrao

    Riaz,

    You should be able to find the chain files for lifting over from hg19 to other assemblies here: https://hgdownload.soe.ucsc.edu/goldenPath/hg19/liftOver/

    But I guess the hg19-to-b37 one is not there.

    This article on the GRCh37/hg19/b37/humanG1Kv37 reference discrepancies may help: https://gatk.broadinstitute.org/hc/en-us/articles/360035890711-GRCh37-hg19-b37-humanG1Kv37-Human-Reference-Discrepancies

    Jose
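
For the primary chromosomes, a chain file of the kind Riaz describes can in principle be written by hand, because an ungapped one-to-one mapping needs only a header line and a single block per contig. The sketch below is a minimal illustration in Python, not an official Broad or UCSC file: the function name `identity_chain`, the `rename` mapping and the example inputs are assumptions, and it presumes that each renamed contig has identical sequence in both builds. That assumption should be checked against the reference discrepancies article linked above. chrM is deliberately left out of the mapping, since the hg19 and b37 mitochondrial sequences differ.

```python
def identity_chain(lengths, rename):
    """Build UCSC chain text mapping each old contig onto a renamed contig.

    lengths: dict of old contig name -> length in bases (e.g. from a .fai index)
    rename:  dict of old contig name -> new contig name
    Assumption: the old and new contigs have identical sequence.
    """
    records = []
    for chain_id, (old, size) in enumerate(lengths.items(), start=1):
        if old not in rename:
            continue  # skip contigs with no declared equivalent, e.g. chrM
        new = rename[old]
        # Header fields: chain score tName tSize tStrand tStart tEnd
        #                qName qSize qStrand qStart qEnd id
        header = (
            f"chain {size} {old} {size} + 0 {size} "
            f"{new} {size} + 0 {size} {chain_id}"
        )
        # One ungapped block covering the whole contig, then a blank line.
        records.append(f"{header}\n{size}\n\n")
    return "".join(records)


# Illustrative input only; real lengths should come from the reference index.
lengths = {"chr1": 249250621, "chrM": 16571}
rename = {"chr1": "1"}  # chrM intentionally has no entry
chain_text = identity_chain(lengths, rename)
print(chain_text, end="")
```

The resulting text could be saved to a file and passed as CHAIN= to LiftoverVcf, with the b37 FASTA as R=. Whether this is appropriate for a given dataset depends on the assumption above holding for every contig in the mapping.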

     

  • mehar

    I get the following error: "The provided VCF file is malformed at approximately line number 3460: 0/0:.:.:.:.:.:. is not a valid start position in the VCF format".

    I have been running LiftoverVcf with the command below:

    bcftools view variants.bcf | picard LiftoverVcf I=/dev/stdin O=lifted_over.vcf CHAIN=canFam3ToCanFam4.over.chain.gz REJECT=rejected_variants.vcf R=target.fasta MAX_RECORDS_IN_RAM=100000

    It produces this output:

    [Thu Feb 04 15:09:13 EET 2021] picard.vcf.LiftoverVcf done. Elapsed time: 0.21 minutes.
    Runtime.totalMemory()=3603279872
    To get help, see http://broadinstitute.github.io/picard/index.html#GettingHelp
    Exception in thread "main" htsjdk.tribble.TribbleException: The provided VCF file is malformed at approximately line number 3460: 0/0:.:.:.:.:.:. is not a valid start position in the VCF format, for input source: file:///dev/stdin
        at htsjdk.variant.vcf.AbstractVCFCodec.generateException(AbstractVCFCodec.java:883)
        at htsjdk.variant.vcf.AbstractVCFCodec.parseVCFLine(AbstractVCFCodec.java:409)
        at htsjdk.variant.vcf.AbstractVCFCodec.decodeLine(AbstractVCFCodec.java:384)
        at htsjdk.variant.vcf.AbstractVCFCodec.decode(AbstractVCFCodec.java:328)
        at htsjdk.variant.vcf.AbstractVCFCodec.decode(AbstractVCFCodec.java:48)
        at htsjdk.tribble.AsciiFeatureCodec.decode(AsciiFeatureCodec.java:70)
        at htsjdk.tribble.AsciiFeatureCodec.decode(AsciiFeatureCodec.java:37)
        at htsjdk.tribble.TribbleIndexedFeatureReader$WFIterator.readNextRecord(TribbleIndexedFeatureReader.java:373)
        at htsjdk.tribble.TribbleIndexedFeatureReader$WFIterator.<init>(TribbleIndexedFeatureReader.java:342)
        at htsjdk.tribble.TribbleIndexedFeatureReader.iterator(TribbleIndexedFeatureReader.java:309)
        at htsjdk.variant.vcf.VCFFileReader.iterator(VCFFileReader.java:305)
        at picard.vcf.LiftoverVcf.doWork(LiftoverVcf.java:389)
        at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:305)
        at picard.cmdline.PicardCommandLine.instanceMain(PicardCommandLine.java:103)
        at picard.cmdline.PicardCommandLine.main(PicardCommandLine.java:113)
    crossmap  0.5.2  environment loaded

    The input BCF file is a merged multi-sample file and is piped in by bcftools. In the very first record, some samples have missing genotype fields written as `0/0:.:.:.:.:.:` and a few others as `./.:0,0:0:.:0,0,0`, and Picard doesn't seem to like this kind of genotype.

    Could someone explain how to run the liftover in such cases?
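
The exception says that a genotype string was read where the POS column was expected, so one way to narrow this down is to inspect the text that bcftools actually writes around the reported line. The following is a minimal diagnostic sketch in Python and is not part of Picard or htsjdk; the function name `find_suspect_records` and the sample lines are made up for illustration. It flags data lines whose POS field is not an integer or whose column count differs from the #CHROM header.

```python
def find_suspect_records(lines):
    """Yield (line_number, reason) for VCF data lines that look misaligned."""
    expected_columns = None
    for line_number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if line.startswith("##") or not line:
            continue  # meta-information lines and blank lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            expected_columns = len(fields)  # remember the header width
            continue
        if len(fields) < 2 or not fields[1].isdigit():
            yield line_number, "POS is not an integer"
        elif expected_columns is not None and len(fields) != expected_columns:
            yield line_number, f"{len(fields)} columns, header has {expected_columns}"


# Hypothetical records: line 4 has a genotype in the POS column,
# line 5 is missing one sample column.
example = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\n",
    "chr1\t100\t.\tA\tG\t.\t.\t.\tGT\t0/0\t./.\n",
    "chr1\t0/0:.:.:.:.:.:.\t.\tA\tG\t.\t.\t.\tGT\t0/0\t./.\n",
    "chr1\t300\t.\tA\tG\t.\t.\t.\tGT\t0/0\n",
]
for line_number, reason in find_suspect_records(example):
    print(line_number, reason)
```

In practice the lines would come from a plain VCF written by `bcftools view variants.bcf` rather than from a list. Writing that output to a file first, instead of piping it through /dev/stdin, may also make the reported line number easier to inspect.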

     

  • Steven Fleck

    I'm currently working with a VCF for a group of 292 plant samples. The current SNPs were called by mapping to a reference assembly with 1542 contigs, but I want to lift them over to the new Hi-C scaffolded chromosome-level assembly. The biggest problem I'm having is the requirement for a .chain file. Can this file be generated in some way? I'm having a hard time finding a tool to do this. Thanks for any help you can provide.

    EDIT: I found a way to do this with transanno:
    https://github.com/informationsea/transanno

  • Michael McQuillan

    Does the LiftoverVcf tool retain phase information in a phased VCF file? In other words, if I lift over a phased VCF file from hg38 to hg19, is the phase information still accurate in the lifted-over file?
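
One way to check this on a given dataset is to compare the genotype fields of each record before and after the liftover. The sketch below is a minimal Python illustration and makes several assumptions that are not taken from the Picard documentation: records are matched on a unique ID column, GT is present in FORMAT, and the helper names `genotypes_by_id` and `changed_genotypes` are hypothetical. It reports records whose GT strings differ between the two files, which is where the phase would need a closer look.

```python
def genotypes_by_id(lines):
    """Map variant ID -> list of GT strings, one per sample."""
    result = {}
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # header or blank line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 10 or fields[2] == ".":
            continue  # no sample columns, or no ID to match on
        keys = fields[8].split(":")
        if "GT" not in keys:
            continue
        gt_index = keys.index("GT")
        result[fields[2]] = [s.split(":")[gt_index] for s in fields[9:]]
    return result


def changed_genotypes(original_lines, lifted_lines):
    """Return {ID: (GTs before, GTs after)} for records whose GTs differ."""
    before = genotypes_by_id(original_lines)
    after = genotypes_by_id(lifted_lines)
    return {
        vid: (before[vid], after[vid])
        for vid in before
        if vid in after and before[vid] != after[vid]
    }


# Hypothetical records: rs2 has REF and ALT swapped after the liftover.
original = [
    "chr1\t100\trs1\tA\tG\t.\t.\t.\tGT\t0|1\t1|1\n",
    "chr1\t200\trs2\tC\tT\t.\t.\t.\tGT\t0|1\t0|0\n",
]
lifted = [
    "chr1\t150\trs1\tA\tG\t.\t.\t.\tGT\t0|1\t1|1\n",
    "chr1\t250\trs2\tT\tC\t.\t.\t.\tGT\t1|0\t1|1\n",
]
print(changed_genotypes(original, lifted))
```

A record that shows up here is not necessarily wrong. It only indicates that the genotype encoding changed and that the phase at that site should be checked by hand.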

     
