Same FASTQs and pipeline give different results
GATK version used: 4.1.0.0
Hello!
I'm trying to set up an analysis pipeline for panel sequencing that finds somatic variants from paired normal and tumor FASTQs. The pipeline now runs successfully, but I have a problem with the final result.
**Trouble:**
The variants in the final VCF are not reproducible between runs, even with the same FASTQs and the same pipeline.
When I run the pipeline several times with the same FASTQs and databases as input, the somatic variants in the final VCF are not always identical: the variant count is 100 in most runs, but occasionally it is 101, 105, and so on.
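To see exactly which records differ between two runs, the final VCFs can be compared by variant key. Below is a minimal sketch, assuming uncompressed VCF text and identifying a variant by (CHROM, POS, REF, ALT); the function names `read_variants` and `diff_runs` and the toy records are made up for illustration and are not part of GATK.

```python
import io


def read_variants(handle):
    """Collect (CHROM, POS, REF, ALT) keys from uncompressed VCF text."""
    keys = set()
    for line in handle:
        # Skip meta-information lines, the column header, and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        chrom, pos, _id, ref, alt = line.rstrip("\n").split("\t")[:5]
        keys.add((chrom, int(pos), ref, alt))
    return keys


def diff_runs(handle_a, handle_b):
    """Return (variants only in run A, variants only in run B), both sorted."""
    a = read_variants(handle_a)
    b = read_variants(handle_b)
    return sorted(a - b), sorted(b - a)


# Toy records standing in for two runs of the same pipeline (hypothetical data).
header = "##fileformat=VCFv4.2\n#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
run1 = io.StringIO(header + "chr1\t100\t.\tA\tT\t.\tPASS\t.\n")
run2 = io.StringIO(
    header
    + "chr1\t100\t.\tA\tT\t.\tPASS\t.\n"
    + "chr2\t200\t.\tG\tC\t.\tPASS\t.\n"
)

only_in_run1, only_in_run2 = diff_runs(run1, run2)
print("only in run 1:", only_in_run1)
print("only in run 2:", only_in_run2)
```

With real files, the `io.StringIO` objects would be replaced by `open("run1.vcf")` and `open("run2.vcf")` (placeholder file names).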
**Brief description of my pipeline:**
1) BWA-MEM for alignment;
2) BAM processing: sambamba for sorting, Picard for duplicate marking, GATK for BQSR;
3) Mutect2 (GATK 4.1.0.0) for somatic variant calling.
**PS:**
1) All software in the pipeline runs from a Docker image, so I'm fairly sure the runtime environment is consistent between runs;
2) When I start from the final BAMs and run only the Mutect2 variant calling, the result is reproducible;
3) Because BWA places reads with multiple equally good alignments at random, I do not filter reads with low mapping quality; it seems that Mutect2 filters these reads itself before processing the BAM.
4) The AF of the variants that differ between runs ranges from 2% to 10%.
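Since point 2 suggests the difference arises before variant calling, one way to narrow it down is to compare the alignment records of the same stage from two runs. Below is a minimal sketch, assuming each BAM has first been converted to SAM text (for example with `samtools view -h`); the helper names `alignment_keys` and `changed_reads` and the toy records are made up for illustration.

```python
def alignment_keys(sam_lines):
    """Collect (QNAME, FLAG, RNAME, POS, MAPQ, CIGAR) from SAM text lines."""
    keys = set()
    for line in sam_lines:
        # Header lines start with '@'; read names cannot contain '@' in SAM.
        if line.startswith("@") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        keys.add(tuple(fields[:6]))
    return keys


def changed_reads(sam_a, sam_b):
    """Return sorted read names whose alignment record differs between runs."""
    a = alignment_keys(sam_a)
    b = alignment_keys(sam_b)
    # Symmetric difference: records present in one run but not the other.
    return sorted({key[0] for key in a ^ b})


# Toy records standing in for the same stage of two runs (hypothetical data).
run1 = [
    "@HD\tVN:1.6\tSO:coordinate",
    "read1\t99\tchr1\t100\t60\t50M\t=\t300\t250\t*\t*",
    "read2\t0\tchr1\t500\t0\t50M\t*\t0\t0\t*\t*",
]
run2 = [
    "@HD\tVN:1.6\tSO:coordinate",
    "read1\t99\tchr1\t100\t60\t50M\t=\t300\t250\t*\t*",
    "read2\t0\tchr2\t900\t0\t50M\t*\t0\t0\t*\t*",  # placed elsewhere in run 2
]

print(changed_reads(run1, run2))
```

Running the comparison after alignment, after duplicate marking, and after BQSR would show the first stage at which the two runs diverge. Because FLAG is part of the key, a read that is marked as a duplicate in only one run is also reported.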
Thank you for the help!
-
Hi Wei,
Please follow the Best Practices pipeline we put together: https://gatk.broadinstitute.org/hc/en-us/articles/360035894731-Somatic-short-variant-discovery-SNVs-Indels-
We are unable to help with issues that arise in pipelines that do not follow these Best Practices steps. Please try the recommendations above and let us know if the issue persists.
-
Thanks for your reply. I'll try the best practices pipeline.