CombineGVCFs fails for a batch of 2 samples where a sample shows a high number of alternate INDEL alleles
Hello GATK-Team,
A similar problem has been reported before, but I could not solve it for my case.
I ran the following commands on a 2-sample set, with the latest GATK and with GATK-3.6.0, to combine the gVCF files for the contig hs37d5:
../bin/gatk- CombineGVCFs -R /scratch/HS_Build37/BWA_INDEX_hs37d5/hs37d5.fa -V IPF0047.194173.endPosCorr.gvcf.gz -V IPF0048.194174.endPosCorr.gvcf.gz -L hs37d5 -O cohort.2.3.newGATK.hs37d5.gvcf
java -jar /nfs/goldstein/software/GATK/GATK-3.6.0-ArchivedVersion-g89b7209-patched/GenomeAnalysisTK.jar -T CombineGVCFs -R /scratch/HS_Build37/BWA_INDEX_hs37d5/hs37d5.fa -V IPF0047.194173.endPosCorr.gvcf.gz -V IPF0048.194174.endPosCorr.gvcf.gz -L hs37d5 > cohort.2.2.hs37d5.txt
The ERROR message received:
CombineGVCFs was successful up to hs37d5:24140804 and failed at hs37d5:24140805. I think the error is caused by a site with a high number of alternate indel alleles in one sample:
Screen shot of hs37d5:24140805 in Sample2
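For anyone who wants to check the same thing on their own files, here is a minimal sketch (plain Python, not part of GATK) that lists records in a gVCF with many ALT alleles. The function names `count_alt_alleles` and `sites_with_many_alts` and the `min_alts=6` threshold are my own assumptions for illustration; the symbolic `<NON_REF>` allele is not counted.

```python
import gzip


def count_alt_alleles(vcf_line):
    """Return (chrom, pos, n_alt) for one tab-separated VCF data line.

    The symbolic <NON_REF> allele carried by gVCF records is not counted.
    """
    fields = vcf_line.rstrip("\n").split("\t")
    chrom, pos, alt = fields[0], int(fields[1]), fields[4]
    alts = [a for a in alt.split(",") if a not in (".", "<NON_REF>")]
    return chrom, pos, len(alts)


def sites_with_many_alts(path, min_alts=6):
    """Yield (chrom, pos, n_alt) for records with at least min_alts ALT alleles.

    min_alts=6 is an arbitrary illustrative threshold, not a GATK limit.
    """
    # bgzip-compressed files are readable with the standard gzip module.
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        for line in handle:
            if line.startswith("#"):  # skip header lines
                continue
            chrom, pos, n_alt = count_alt_alleles(line)
            if n_alt >= min_alts:
                yield chrom, pos, n_alt
```

Usage would be something like `for site in sites_with_many_alts("IPF0048.194174.endPosCorr.gvcf.gz"): print(site)`, which scans the whole file; for a single position, `tabix` or `bcftools view -r` on an indexed file would be faster.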
Would really appreciate any help in this regard...
Can you please post the error message using the latest version? We do not support GATK3 anymore.
Error from gatk-
Hi info 2020
Can you please share the two gVCFs with us so we can reproduce the error on our end? You can find instructions for sharing your data here: