Version of GATK used: 3.8-1-0-gf15c1c3ef and 184.108.40.206.
Command used for local indel realignment: java -Xmx8g -jar /home/software/AlignmentPipeline/GenomeAnalysisTK-3.8-1-0-gf15c1c3ef/GenomeAnalysisTK.jar -T IndelRealigner -R /home/data/Shared/References/reference.fna -I ./Dir_sample/sample_rh.rmDupli.bam -targetIntervals ./Dir_sample/sample_rh.intervals -o ./Dir_sample/sample_rh.rmDupliIndelRealigned.bam
Command used for variant calling (mainly interested in SNPs):
/home/software/VariantCallingPipeline/SnpCalling/gatk-220.127.116.11/gatk --java-options "-Xmx30G -Djava.library.path=/home/software/VariantCallingPipeline/SnpCalling/gatk-18.104.22.168/libs -XX:+UseParallelGC -XX:ParallelGCThreads=2" HaplotypeCaller -I '+argList+" -O "+dest1+" -R "+argList+" --sample-name "+argList[-1]+" --emit-ref-confidence GVCF -pairHMM FASTEST_AVAILABLE --native-pair-hmm-threads 2 -L "+argList+":"+str(argList)+"-"+str(argList))
After running the two commands above and applying GATK's recommended filtering criteria, I noticed that thousands of SNPs have been called from soft-clipped bases. For example, in the attached images (JBrowse alignment and text file), variants have been called at positions 242433 and 242435, but no read is aligned at these positions in the BAM file; they are covered only by soft-clipped bases. I could of course turn on the parameter "--dont-use-soft-clipped-bases", but is that recommended? I would also be grateful if someone could explain this behaviour of GATK HaplotypeCaller to me.
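To make the observation concrete, here is a minimal Python sketch of how one might check whether a site falls inside a read's aligned span or only inside its soft-clipped ends, using the SAM POS and CIGAR fields. The function names (`clip_spans`, `classify_site`) and the example read coordinates are hypothetical and not taken from my data. The sketch assumes the read has at least one aligned base and that soft-clipped bases are projected onto the reference without gaps, which is only an approximation of what a viewer such as JBrowse displays.

```python
import re

# One CIGAR element: a length followed by an operation code (SAM specification).
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def clip_spans(pos, cigar):
    """Return (aligned_start, aligned_end, lead_clip, trail_clip).

    pos is the 1-based leftmost aligned position (SAM POS field).
    lead_clip / trail_clip are the lengths of the soft clips at each end.
    """
    ops = [(int(n), op) for n, op in CIGAR_RE.findall(cigar)]
    # Only M, D, N, = and X consume reference positions.
    ref_len = sum(n for n, op in ops if op in "MDN=X")

    def end_clip(sequence):
        # A soft clip may only be preceded by hard clips at a read end.
        for n, op in sequence:
            if op == "S":
                return n
            if op != "H":
                return 0
        return 0

    return pos, pos + ref_len - 1, end_clip(ops), end_clip(reversed(ops))


def classify_site(pos, cigar, site):
    """Classify a reference position relative to one read.

    Returns "aligned" if the site lies within the aligned reference span
    (deletions and skipped regions included), "soft-clipped" if it lies only
    under a projected soft clip, and "none" otherwise.
    """
    start, end, lead, trail = clip_spans(pos, cigar)
    if start <= site <= end:
        return "aligned"
    if start - lead <= site < start or end < site <= end + trail:
        return "soft-clipped"
    return "none"


# Hypothetical read: POS 242440 with 10 soft-clipped bases before 90 matches.
print(classify_site(242440, "10S90M", 242433))
```

Running this over every read that overlaps a called site would show whether that site is supported only by soft-clipped bases, which is the pattern I see at 242433 and 242435.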