Can I use Picard LiftoverVcf to convert SNPs from one genome build to another? If yes, how? If not, what is the best alternative?
REQUIRED for all errors and issues:
a) GATK version used: GATK4
b) Exact command used:
c) Entire program log:
I have a few sample VCFs that are not of very good quality. They are 23andMe files in the following format:
rsID, chromosome number, position, genotype
I have tried remapping them using Galaxy, but it fails, and I guess the error is due to the format. The files contain only SNPs.
Any ideas, please? How can I make it work?
-
I also want to add that these VCFs are mapped to NCBI Build 36 (hg18) and need to be remapped to hg38.
I have a specific list of SNPs (according to hg38) in CSV format, which I need to filter from each of these VCFs after remapping.
Please suggest any alternative workflows that could help me make this work.
-
These files are not VCFs; they are in a tabular format. The VCF format requires:
1- A header section, with a sequence dictionary (contig lines). The dictionary is optional, but it is better to include it.
2- Eight mandatory columns in a fixed order: CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO.
You might want to check the spec to convert your tabular files to VCF manually:
https://samtools.github.io/hts-specs/VCFv4.2.pdf
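As a rough illustration of the two requirements above, here is a minimal Python sketch that writes a VCF header and the eight fixed columns, plus one genotype column. The contig length, the rsID, and the record values are made-up placeholders, and the function name is my own. REF and ALT have to come from your reference genome; they cannot be read from the 23andMe file.

```python
import io

# The eight fixed columns, in the order required by the VCF 4.2 spec.
FIXED_COLUMNS = ["CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO"]


def write_minimal_vcf(out, contigs, records, sample="SAMPLE"):
    """Write a minimal VCF header followed by one line per record.

    contigs: list of (name, length) pairs (placeholder values in this sketch).
    records: list of dicts with keys chrom, pos, id, ref, alt, gt.
    """
    out.write("##fileformat=VCFv4.2\n")
    # Sequence dictionary: one contig line per chromosome.
    for name, length in contigs:
        out.write(f"##contig=<ID={name},length={length}>\n")
    out.write('##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">\n')
    # Column header: fixed columns, then FORMAT and the sample name.
    out.write("#" + "\t".join(FIXED_COLUMNS + ["FORMAT", sample]) + "\n")
    for r in records:
        fields = [
            r["chrom"], str(r["pos"]), r["id"], r["ref"], r["alt"],
            ".", ".", ".",  # QUAL, FILTER, INFO left as missing
            "GT", r["gt"],
        ]
        out.write("\t".join(fields) + "\n")


# Example with placeholder data (not a real variant).
buf = io.StringIO()
write_minimal_vcf(
    buf,
    contigs=[("chr1", 1000000)],
    records=[{"chrom": "chr1", "pos": 12345, "id": "rs0000001",
              "ref": "A", "alt": "G", "gt": "0/1"}],
)
vcf_text = buf.getvalue()
```

For real files, the contig lines should list the actual chromosome names and lengths of the build the positions refer to.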
Once the files are in a proper VCF format, you can lift them over to the target build and then select variants by position using a BED file.
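If you just want to see what the position-based selection amounts to, here is a small Python sketch that keeps only the records whose chromosome and position appear in a CSV list. The CSV column names `chrom` and `pos` and the function names are assumptions for illustration; your CSV will likely need its own column names, and the chromosome naming ("chr1" vs "1") must match the lifted-over VCF.

```python
import csv
import io


def load_targets(csv_text):
    """Read (chrom, pos) pairs from CSV text.

    Assumes columns named 'chrom' and 'pos' (hypothetical names).
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return {(row["chrom"], int(row["pos"])) for row in reader}


def filter_vcf_lines(vcf_lines, targets):
    """Yield header lines unchanged, and records whose (CHROM, POS) is in targets."""
    for line in vcf_lines:
        if line.startswith("#"):
            yield line
            continue
        # CHROM and POS are the first two tab-separated fields.
        chrom, pos = line.split("\t", 2)[:2]
        if (chrom, int(pos)) in targets:
            yield line


# Example with made-up positions.
targets = load_targets("chrom,pos\nchr1,12345\nchr2,500\n")
vcf_lines = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t12345\trs0000001\tA\tG\t.\t.\t.",
    "chr1\t99999\trs0000002\tC\tT\t.\t.\t.",
]
kept = list(filter_vcf_lines(vcf_lines, targets))
```

Matching on positions only makes sense after the liftover, since the list is in hg38 coordinates.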
I hope this helps.
-
Thank you so much for this. I assumed these were VCF files, but they're not. That explains all the hassle.
Would you suggest converting the tab-delimited files to CSV and then writing them out as VCF? I've tried converting them directly, but it always gives an error (probably due to formatting issues in these files).
Since converting them to CSV would split the fields into separate, labeled columns, is that a good approach? I would really appreciate your input on this and any additional suggestions you might have for a beginner.
Thank you so much for your help.
-
Hi again.
Converting tab-separated files to VCF may not be a simple task, because the files do not contain the reference alleles that the conversion needs. I would strongly suggest first looking up the correct coordinates and reference alleles for those SNP IDs in your genome build of choice, and then attempting a manual conversion if necessary.
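To show why the reference allele matters, here is a hedged Python sketch of how a two-letter genotype could be turned into the ALT and GT fields once the reference allele is known. The function name is hypothetical, the reference allele is simply passed in (you would have to look it up yourself), and the sketch assumes the genotype is reported on the same strand as the reference.

```python
def genotype_to_vcf_fields(genotype, ref):
    """Return (ALT, GT) for a two-letter SNP genotype, or None if unsupported.

    genotype: e.g. "AG", as found in the genotype column.
    ref: reference allele at that position (must be looked up separately).
    """
    # Skip no-calls, indel codes, and single-allele calls in this sketch.
    if len(genotype) != 2 or any(base not in "ACGT" for base in genotype):
        return None
    # Collect the distinct non-reference alleles, in order of appearance.
    alts = []
    for base in genotype:
        if base != ref and base not in alts:
            alts.append(base)
    alleles = [ref] + alts
    # Allele indices: 0 is REF, 1.. are ALT alleles; sorted because unphased.
    indices = sorted(alleles.index(base) for base in genotype)
    gt = "/".join(str(i) for i in indices)
    alt = ",".join(alts) if alts else "."
    return alt, gt


# The same genotype gives different VCF fields depending on the reference allele.
het_ref_a = genotype_to_vcf_fields("AG", "A")
het_ref_g = genotype_to_vcf_fields("AG", "G")
```

With reference allele A, the genotype AG gives ALT=G; with reference allele G, it gives ALT=A. Without the reference allele the record cannot be written correctly.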
I hope this helps.
Regards.