How do I link adjacent variants in a HaplotypeCaller output VCF?
GATK version: gatk4-4.1.9.0-0 (installed in an Anaconda environment)
Data type: paired-end RNA-seq
Preprocessing:
1. Adapter trimming: flexbar
2. Alignment: hisat2
3. AddOrReplaceReadGroups
4. MarkDuplicates
5. SplitNCigarReads
6. BaseRecalibrator
7. ApplyBQSR
Finally, I ran HaplotypeCaller:
gatk HaplotypeCaller -R Homo_sapiens.GRCh37.75.dna.chromosome.all.fasta -I Tumor_RNA.recali.bam --dont-use-soft-clipped-bases --dbsnp dbsnp_138.b37.vcf -stand-call-conf 20 -O Tumor_RNA_sorted.HC.vcf --native-pair-hmm-threads 8 --assembly-region-out TumorRNA_assemblyregion_profile.tsv --bam-output Tumor_RNA.HC.bam --create-output-bam-index true --output-mode EMIT_ALL_ACTIVE_SITES
I noticed that the variants located at 229675327 and 229675327 are obviously linked together.
I would therefore expect them to be reported as a single record, e.g. ref:GG alt:AT.
Does HaplotypeCaller have an argument that outputs a VCF in which adjacent variants are linked together?
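To make the expected output concrete, here is a minimal sketch of the merge being asked for: SNVs at consecutive positions are combined into one multi-nucleotide record (for example, two G>A and G>T calls becoming GG>AT). This is only an illustration, not something HaplotypeCaller or bcftools does internally. The `Variant` type, the `merge_adjacent_snvs` function, and the coordinates are hypothetical, and the sketch assumes the input is sorted and that the adjacent alternate alleles sit on the same haplotype, which would need to be checked against phasing information in a real VCF.

```python
from typing import List, NamedTuple


class Variant(NamedTuple):
    """Hypothetical minimal variant record (not a real VCF parser type)."""
    chrom: str
    pos: int   # 1-based position, as in VCF
    ref: str
    alt: str


def merge_adjacent_snvs(variants: List[Variant]) -> List[Variant]:
    """Merge runs of SNVs at consecutive positions into single MNP records.

    Assumptions: input is sorted by (chrom, pos), and adjacent alternate
    alleles are on the same haplotype. Genotypes and phasing are ignored.
    """
    merged: List[Variant] = []
    for v in variants:
        prev = merged[-1] if merged else None
        is_snv = len(v.ref) == 1 and len(v.alt) == 1
        if (
            prev is not None
            and is_snv
            and prev.chrom == v.chrom
            and len(prev.ref) == len(prev.alt)      # previous record is an SNV or MNP
            and prev.pos + len(prev.ref) == v.pos   # v starts right after prev ends
        ):
            # Extend the previous record by one base on both alleles.
            merged[-1] = Variant(prev.chrom, prev.pos, prev.ref + v.ref, prev.alt + v.alt)
        else:
            merged.append(v)
    return merged


# Hypothetical coordinates, chosen only for illustration.
calls = [
    Variant("1", 100, "G", "A"),
    Variant("1", 101, "G", "T"),
    Variant("1", 105, "C", "T"),
]
for record in merge_adjacent_snvs(calls):
    print(record.chrom, record.pos, record.ref, record.alt)
```

The first two calls are adjacent and collapse into one record, while the third is left alone because it is not contiguous with the merged record.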
-
Hi Chiahsin Liu,
The GATK support team is focused on resolving questions about GATK tool-specific errors and abnormal results from the tools. For all other questions, such as this one, we are building a backlog to work through when we have the capacity.
Please continue to post your questions because we will be mining them for improvements to documentation, resources, and tools.
We cannot guarantee a reply; however, we ask other community members to help out if they know the answer.
For context, check out our support policy.
-
Thanks for replying.
Martin
-
You can try 'bcftools norm -m+both'; for more detail check