GenotypeGVCFs not getting all samples
REQUIRED for all errors and issues:
a) GATK version used: v4.1.4.1
b) Exact command used:
First genomicsDB:
gatk --java-options "-Xmx24g -Xms24g" GenomicsDBImport --genomicsdb-workspace-path my_database --batch-size 50 -L reference/exons.bed --sample-name-map vcf_names.map --tmp-dir=../secondarytmp --reader-threads 24
Then genotypegvcfs:
gatk --java-options "-Xmx4g" GenotypeGVCFs -R reference/Aedes_aegypti_lvpagwg_ref.fa -V gendb://my_database -L reference/exons.bed -O aedes_test2.vcf.gz
c) Entire program log: It runs for some time, then I get this error after the usual warnings:
WARNING: No valid combination operation found for INFO field DS - the field will NOT be part of INFO fields in the generated VCF records
WARNING: No valid combination operation found for INFO field InbreedingCoeff - the field will NOT be part of INFO fields in the generated VCF records
WARNING: No valid combination operation found for INFO field MLEAC - the field will NOT be part of INFO fields in the generated VCF records
WARNING: No valid combination operation found for INFO field MLEAF - the field will NOT be part of INFO fields in the generated VCF records
20:07:22.349 INFO GenotypeGVCFs - Shutting down engine
GENOMICSDB_TIMER,GenomicsDB iterator next() timer,Wall-clock time(s),0.0,Cpu time(s),0.0
[19 June 2023 20:07:22 UTC] org.broadinstitute.hellbender.tools.walkers.GenotypeGVCFs done. Elapsed time: 611.19 minutes.
Runtime.totalMemory()=3397386240
java.lang.IndexOutOfBoundsException: Index: 839975697, Size: 4
I end up with a 100-sample VCF even though I supplied 611 samples, as confirmed by this command:
python -c "import json; print(len(json.load(open('my_database/callset.json'))['callsets']))"
611
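To see which samples are missing rather than only how many, the names in callset.json can be compared with the sample columns of the output VCF header. The following is a minimal sketch, not a GATK utility: the function names are hypothetical, and it assumes that "callsets" is either a dict keyed by sample name or a list of records with a "sample_name" field, which may differ between GATK versions.

```python
import gzip
import json


def callset_samples(callset_json_path):
    """Return the set of sample names recorded in a GenomicsDB callset.json."""
    with open(callset_json_path) as handle:
        callsets = json.load(handle)["callsets"]
    # Assumption: "callsets" is either a dict keyed by sample name or a list
    # of records carrying a "sample_name" field, depending on the version.
    if isinstance(callsets, dict):
        return set(callsets)
    return {entry["sample_name"] for entry in callsets}


def vcf_samples(vcf_path):
    """Return the sample names from the #CHROM header line of a VCF."""
    opener = gzip.open if vcf_path.endswith(".gz") else open
    with opener(vcf_path, "rt") as handle:
        for line in handle:
            if line.startswith("#CHROM"):
                # The first nine columns are fixed; the rest are sample names.
                return set(line.rstrip("\n").split("\t")[9:])
            if not line.startswith("#"):
                # Reached the records without finding a header line.
                break
    return set()
```

For the files in this post, `sorted(callset_samples("my_database/callset.json") - vcf_samples("aedes_test2.vcf.gz"))` would list the samples that are in the workspace metadata but absent from the VCF.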
-
Since you import your samples in batches of 50, it looks like the third batch contains a sample with a problematic index or a malformed gVCF file. The import would then have crashed after two successful batches, which would explain why genotyping produces only 100 samples. Can you check the third batch of 50 gVCF files and their index files for validity?
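A first pass over that batch can be scripted. The sketch below is only a basic sanity check under stated assumptions, not a full validation: the helper names are hypothetical, the sample map is assumed to be tab-separated with the sample name in column 1 and the gVCF path in column 2, bgzipped gVCFs are assumed to have a .tbi index (plain-text ones a .idx), and batches are assumed to follow the line order of the map file. If GenomicsDBImport orders samples differently, check every batch instead.

```python
import os


def read_sample_map(map_path):
    """Parse a tab-separated sample name map into (sample, path) pairs."""
    entries = []
    with open(map_path) as handle:
        for line in handle:
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 2:
                continue  # skip blank or malformed lines
            entries.append((fields[0], fields[1]))
    return entries


def check_gvcf(path):
    """Return a list of problems found for one gVCF and its index."""
    if not os.path.isfile(path):
        return ["gVCF missing"]
    problems = []
    if os.path.getsize(path) == 0:
        problems.append("gVCF is empty")
    # Assumption: bgzipped gVCFs use a .tbi index, plain-text ones a .idx.
    index = path + (".tbi" if path.endswith(".gz") else ".idx")
    if not os.path.isfile(index):
        problems.append("index missing: " + index)
    elif os.path.getsize(index) == 0:
        problems.append("index is empty: " + index)
    elif os.path.getmtime(index) < os.path.getmtime(path):
        # An index older than its gVCF may no longer match the file.
        problems.append("index older than gVCF: " + index)
    return problems


def check_batch(entries, batch_number, batch_size=50):
    """Check the 1-based batch_number-th batch; return {sample: problems}."""
    start = (batch_number - 1) * batch_size
    report = {}
    for sample, path in entries[start:start + batch_size]:
        problems = check_gvcf(path)
        if problems:
            report[sample] = problems
    return report
```

With the map from the question, `check_batch(read_sample_map("vcf_names.map"), 3)` would report on samples 101 to 150. Relative paths in the map are resolved against the current directory, so run it from the directory where the import was launched. A clean report does not prove the files are well-formed; it only rules out missing, empty, or stale files.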