Liftover Picard hg19 to hg38: variants were not successfully lifted over
I am trying to lift over a sample GRCh37 VCF file that ships with the VEP installation to see how this tool works, but for some unknown reason it is not working for me. I also tried another, smaller test VCF file, with the same result.
a) GATK version used: v4.3.0.0-12 and Picard Version: 2.27.5
b) Exact command used:
java -Xmx8g -jar ~/picard/picard.jar LiftoverVcf \
I=input/homo_sapiens_hg19.vcf \
O=output/test_hg19_lifted.vcf \
CHAIN=hg19ToHg38.over.chain.gz \
REJECT=output/rejected_vars.vcf \
R=/root/.vep/homo_sapiens/Homo_sapiens.GRCh38.dna.primary_assembly.fa.gz
c) Entire program log:
Executing as ...@... on Linux 5.15.0-56-generic amd64; OpenJDK 64-Bit Server VM 1.8.0_152-release-1056-b12; Deflater: Intel; Inflater: Intel; Provider GCS is not available; Picard version: 2.27.5
INFO 2022-12-09 18:31:41 LiftoverVcf Loading up the target reference genome.
INFO 2022-12-09 18:32:15 LiftoverVcf Lifting variants over and sorting (not yet writing the output file.)
INFO 2022-12-09 18:32:15 LiftoverVcf Processed 173 variants.
INFO 2022-12-09 18:32:15 LiftoverVcf 173 variants failed to liftover.
INFO 2022-12-09 18:32:15 LiftoverVcf 0 variants lifted over but had mismatching reference alleles after lift over.
INFO 2022-12-09 18:32:15 LiftoverVcf 100.0000% of variants were not successfully lifted over and written to the output.
INFO 2022-12-09 18:32:15 LiftoverVcf liftover success by source contig:
INFO 2022-12-09 18:32:15 LiftoverVcf 21: 0 / 37 (0.0000%)
INFO 2022-12-09 18:32:15 LiftoverVcf 22: 0 / 136 (0.0000%)
INFO 2022-12-09 18:32:15 LiftoverVcf lifted variants by target contig:
INFO 2022-12-09 18:32:15 LiftoverVcf no successfully lifted variants
WARNING 2022-12-09 18:32:15 LiftoverVcf 0 variants with a swapped REF/ALT were identified, but were not recovered. See RECOVER_SWAPPED_REF_ALT and associated caveats.
INFO 2022-12-09 18:32:15 LiftoverVcf Writing out sorted records to final VCF.
[Fri Dec 09 18:32:15 UTC 2022] picard.vcf.LiftoverVcf done. Elapsed time: 0.57 minutes.
Runtime.totalMemory()=3167748096
d) This is the head of the output vcf file:
##fileformat=VCFv4.2
##INFO=<ID=ReverseComplementedAlleles,Number=0,Type=Flag,Description="The REF and the ALT alleles have been reverse complemented in liftover since the mapping from the previous reference to the current one was on the negative strand.">
##INFO=<ID=SwappedAlleles,Number=0,Type=Flag,Description="The REF and the ALT alleles have been swapped in liftover due to changes in the reference. It is possible that not all INFO annotations reflect this swap, and in the genotypes, only the GT, PL, and AD fields have been modified. You should check the TAGS_TO_REVERSE parameter that was used during the LiftOver to be sure.">
##contig=<ID=1,length=248956422>
.....
Any ideas, please?
-
In case someone finds this post, I finally managed to solve it by using a different chain file downloaded from Ensembl (GRCh37_to_GRCh38.chain.gz). The one I was using (downloaded from UCSC) was the one causing the issue.
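The thread does not say why the UCSC chain failed, but a likely explanation is contig naming: the log above shows the VCF using Ensembl-style names ("21", "22"), while the UCSC hg19ToHg38.over.chain.gz file uses "chr"-prefixed names ("chr21", "chr22"), so no variant finds a matching chain. Below is a minimal sketch for checking this before running LiftoverVcf. The function names are made up for illustration, and it assumes the source contig is the third field of each "chain" header line, as in the UCSC chain format.

```python
import gzip


def open_text(path):
    """Open a plain or gzip-compressed text file for reading."""
    return gzip.open(path, "rt") if str(path).endswith(".gz") else open(path, "r")


def chain_source_contigs(lines):
    """Collect source-assembly contig names from chain header lines.

    Header layout (UCSC chain format):
    chain score tName tSize tStrand tStart tEnd qName qSize qStrand qStart qEnd id
    In a liftover chain, tName is the contig in the assembly being lifted FROM.
    """
    names = set()
    for line in lines:
        fields = line.split()
        if len(fields) >= 3 and fields[0] == "chain":
            names.add(fields[2])
    return names


def vcf_contigs(lines):
    """Collect the contig names (CHROM column) used by the VCF records."""
    names = set()
    for line in lines:
        # Skip header lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        names.add(line.split("\t", 1)[0])
    return names


def contigs_missing_from_chain(vcf_lines, chain_lines):
    """Return the VCF contigs that have no chain at all, sorted by name."""
    return sorted(vcf_contigs(vcf_lines) - chain_source_contigs(chain_lines))


# Example usage with the files from the command above:
# with open_text("input/homo_sapiens_hg19.vcf") as vcf, \
#         open_text("hg19ToHg38.over.chain.gz") as chain:
#     print(contigs_missing_from_chain(vcf, chain))
```

If every contig in the VCF is reported as missing, the chain file and the VCF are using different naming conventions, and either a chain file with matching names (such as the Ensembl one) or a renamed VCF is needed. The REJECT file is also worth inspecting, since it records each rejected variant.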
-
Download link: