Disabled all read filters but reads are still being filtered
Hello,
I'm working with data that includes UMIs and ran a UMI-aware deduplication step. Still, HaplotypeCaller removes most of the reads, even though the log file shows "0 read(s) filtered by: NotDuplicateReadFilter". To find out what is happening, I used --disable-tool-default-read-filters, but reads are still being removed. I want to keep the high depth of the input data. I attached IGV read counts for the input and the output. Why is this happening, and how can I solve it?
Thank you in advance.
VCF:
chr13 32912299 . T C 1061.64 AC=1;AF=0.500;AN=2;BaseQRankSum=-11.406;DP=268;ExcessHet=0.0000;FS=22.545;MLEAC=1;MLEAF=0.500;MQ=60.00;MQRankSum=0.000;QD=4.15;ReadPosRankSum=3.299;SOR=0.330 GT:AD:DP:GQ:PL 0/1:161,95:256:99:1069,0,4790
input bam file:
haplotypecaller bamout:
REQUIRED for all errors and issues:
a) GATK version used: gatk-4.3.0.0
b) Exact command used:
$GATK --java-options "-Xmx"$memory"G -XX:ParallelGCThreads=$thread" HaplotypeCaller -R $GENOME_FASTA -I $BQSR_DIR/recal_reads_"$pid".bam --disable-tool-default-read-filters \
-bamout $VAR_DIR/hc_"$pid".bam -O $VAR_DIR/raw_variants_"$pid".vcf
c) Entire program log:
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -Xmx160G -XX:ParallelGCThreads=44 -jar /gpfs/shared/WES_analyses/tools/gatk-4.3.0.0/gatk-package-4.3.0.0-local.jar HaplotypeCaller -R /gpfs/shared/WES_analyses/refs/refs/broad_human_bundle_hg37/Homo_sapiens_assembly19.fasta -I /gpfs/shared/WES_analyses/Sets-annalysis/TGDA-B_26102023/S11/hg19/BQSR_output/recal_reads_S11.bam --disable-tool-default-read-filters -bamout /gpfs/shared/WES_analyses/Sets-annalysis/TGDA-B_26102023/_S11/hg19/Variants_output/hc_S11.bam -O /gpfs/shared/WES_analyses/Sets-annalysis/TGDA-B_26102023/S11/hg19/Variants_output/raw_variants_S11.vcf
11:26:53.896 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/gpfs/shared/WES_analyses/tools/gatk-4.3.0.0/gatk-package-4.3.0.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
11:26:54.329 INFO HaplotypeCaller - ------------------------------------------------------------
11:26:54.330 INFO HaplotypeCaller - The Genome Analysis Toolkit (GATK) v4.3.0.0
11:26:54.330 INFO HaplotypeCaller - For support and documentation go to https://software.broadinstitute.org/gatk/
11:26:54.330 INFO HaplotypeCaller - Executing as sselvi@server001.mesten.local on Linux v3.10.0-1062.12.1.el7.x86_64 amd64
11:26:54.330 INFO HaplotypeCaller - Java runtime: IBM J9 VM v8.0.5.30 - pxa6480sr5fp30-20190207_01(SR5 FP30)
11:26:54.330 INFO HaplotypeCaller - Start Date/Time: January 3, 2024 11:26:53 AM TRT
11:26:54.330 INFO HaplotypeCaller - ------------------------------------------------------------
11:26:54.330 INFO HaplotypeCaller - ------------------------------------------------------------
11:26:54.331 INFO HaplotypeCaller - HTSJDK Version: 3.0.1
11:26:54.331 INFO HaplotypeCaller - Picard Version: 2.27.5
11:26:54.331 INFO HaplotypeCaller - Built for Spark Version: 2.4.5
11:26:54.331 INFO HaplotypeCaller - HTSJDK Defaults.COMPRESSION_LEVEL : 2
11:26:54.331 INFO HaplotypeCaller - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
11:26:54.331 INFO HaplotypeCaller - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
11:26:54.331 INFO HaplotypeCaller - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
11:26:54.331 INFO HaplotypeCaller - Deflater: IntelDeflater
11:26:54.331 INFO HaplotypeCaller - Inflater: IntelInflater
11:26:54.331 INFO HaplotypeCaller - GCS max retries/reopens: 20
11:26:54.331 INFO HaplotypeCaller - Requester pays: disabled
11:26:54.331 INFO HaplotypeCaller - Initializing engine
11:26:54.786 INFO HaplotypeCaller - Done initializing engine
11:26:54.801 INFO HaplotypeCallerEngine - Disabling physical phasing, which is supported only for reference-model confidence output
11:26:54.838 INFO NativeLibraryLoader - Loading libgkl_utils.so from jar:file:/gpfs/shared/WES_analyses/tools/gatk-4.3.0.0/gatk-package-4.3.0.0-local.jar!/com/intel/gkl/native/libgkl_utils.so
11:26:54.839 INFO NativeLibraryLoader - Loading libgkl_pairhmm_omp.so from jar:file:/gpfs/shared/WES_analyses/tools/gatk-4.3.0.0/gatk-package-4.3.0.0-local.jar!/com/intel/gkl/native/libgkl_pairhmm_omp.so
11:26:54.862 INFO IntelPairHmm - Flush-to-zero (FTZ) is enabled when running PairHMM
11:26:54.863 INFO IntelPairHmm - Available threads: 44
11:26:54.863 INFO IntelPairHmm - Requested threads: 4
11:26:54.863 INFO PairHMM - Using the OpenMP multi-threaded AVX-accelerated native PairHMM implementation
11:26:54.897 INFO ProgressMeter - Starting traversal
11:26:54.897 INFO ProgressMeter - Current Locus Elapsed Minutes Regions Processed Regions/Minute
11:26:55.354 WARN InbreedingCoeff - InbreedingCoeff will not be calculated at position chrM:2354 and possibly subsequent; at least 10 samples must have called genotypes
11:27:04.897 INFO ProgressMeter - chr1:20540228 0.2 68540 411240.0
11:27:14.897 INFO ProgressMeter - chr1:45463556 0.3 151640 454920.0
11:27:24.898 INFO ProgressMeter - chr1:69413989 0.5 231500 463000.0
11:27:34.898 INFO ProgressMeter - chr1:94532093 0.7 315250 472863.2
11:27:44.898 INFO ProgressMeter - chr1:119832437 0.8 399600 479510.4
11:27:54.898 INFO ProgressMeter - chr1:145283674 1.0 484440 484431.9
11:28:04.898 INFO ProgressMeter - chr1:171296263 1.2 571160 489558.7
11:28:14.898 INFO ProgressMeter - chr1:196159485 1.3 654060 490538.9
11:28:24.898 INFO ProgressMeter - chr1:219470054 1.5 731780 487847.9
11:28:34.898 INFO ProgressMeter - chr1:244699525 1.7 815900 489535.1
11:28:44.898 INFO ProgressMeter - chr2:21911421 1.8 904120 493151.9
11:28:54.898 INFO ProgressMeter - chr2:47187524 2.0 988390 494190.9
11:29:04.898 INFO ProgressMeter - chr2:73752471 2.2 1076960 497054.6
11:29:14.898 INFO ProgressMeter - chr2:99529328 2.3 1162920 498390.7
11:29:24.899 INFO ProgressMeter - chr2:125775461 2.5 1250430 500168.7
11:29:34.898 INFO ProgressMeter - chr2:152335604 2.7 1339000 502121.9
11:29:44.898 INFO ProgressMeter - chr2:179140394 2.8 1428370 504127.6
11:29:54.898 INFO ProgressMeter - chr2:205058105 3.0 1514790 504927.2
11:30:04.898 INFO ProgressMeter - chr2:230034191 3.2 1598070 504651.0
11:30:14.898 INFO ProgressMeter - chr3:13602629 3.3 1687320 506193.5
11:30:24.898 INFO ProgressMeter - chr3:40133124 3.5 1775780 507363.3
11:30:34.899 INFO ProgressMeter - chr3:66443002 3.7 1863490 508222.2
11:30:44.898 INFO ProgressMeter - chr3:91405697 3.8 1946730 507840.4
11:30:54.898 INFO ProgressMeter - chr3:117876612 4.0 2034990 508745.4
11:31:04.899 INFO ProgressMeter - chr3:143633152 4.2 2120860 509004.4
11:31:14.898 INFO ProgressMeter - chr3:168770065 4.3 2204670 508768.0
11:31:24.898 INFO ProgressMeter - chr3:194649323 4.5 2290950 509098.1
11:31:34.898 INFO ProgressMeter - chr4:22326243 4.7 2376640 509278.2
11:31:44.899 INFO ProgressMeter - chr4:48944105 4.8 2465390 510078.9
11:31:54.898 INFO ProgressMeter - chr4:75847299 5.0 2555110 511020.3
11:32:04.898 INFO ProgressMeter - chr4:103330879 5.2 2646740 512270.6
11:32:14.899 INFO ProgressMeter - chr4:128203087 5.3 2729680 511811.8
11:32:24.899 INFO ProgressMeter - chr4:152905541 5.5 2812050 511278.7
11:32:34.899 INFO ProgressMeter - chr4:179401867 5.7 2900380 511828.8
11:32:44.899 INFO ProgressMeter - chr5:14598082 5.8 2988230 512265.1
11:32:54.900 INFO ProgressMeter - chr5:40818402 6.0 3075650 512604.1
11:33:04.900 INFO ProgressMeter - chr5:68270592 6.2 3167180 513592.6
11:33:14.901 INFO ProgressMeter - chr5:94544631 6.3 3254770 513907.0
11:33:24.905 INFO ProgressMeter - chr5:119896732 6.5 3339300 513729.2
11:33:34.905 INFO ProgressMeter - chr5:147505015 6.7 3431350 514693.5
11:33:44.905 INFO ProgressMeter - chr5:172634851 6.8 3515140 514401.9
11:33:54.904 INFO ProgressMeter - chr6:17217938 7.0 3600170 514301.4
11:34:04.905 INFO ProgressMeter - chr6:43535074 7.2 3687910 514582.5
11:34:14.905 INFO ProgressMeter - chr6:70188271 7.3 3776770 515004.7
11:34:24.906 INFO ProgressMeter - chr6:96054671 7.5 3863020 515059.0
11:34:34.907 INFO ProgressMeter - chr6:122442025 7.7 3951000 515336.6
11:34:44.908 INFO ProgressMeter - chr6:148455529 7.8 4037730 515442.8
11:34:54.908 INFO ProgressMeter - chr7:3770176 8.0 4125850 515719.4
11:35:04.908 INFO ProgressMeter - chr7:29007176 8.2 4210000 515498.6
11:35:14.908 INFO ProgressMeter - chr7:55981001 8.3 4299930 515980.2
11:35:24.908 INFO ProgressMeter - chr7:82130451 8.5 4387120 516120.6
11:35:34.909 INFO ProgressMeter - chr7:107399304 8.7 4471370 515916.4
11:35:44.908 INFO ProgressMeter - chr7:132692404 8.8 4555700 515728.9
11:35:54.908 INFO ProgressMeter - chr7:157866608 9.0 4639630 515503.9
11:36:04.908 INFO ProgressMeter - chr8:23998602 9.2 4723900 515324.2
11:36:14.908 INFO ProgressMeter - chr8:49327416 9.3 4808360 515171.3
11:36:24.908 INFO ProgressMeter - chr8:74592221 9.5 4892590 514999.5
11:36:34.908 INFO ProgressMeter - chr8:99992320 9.7 4977270 514880.2
11:36:44.908 INFO ProgressMeter - chr8:125209752 9.8 5061340 514702.9
11:36:54.908 INFO ProgressMeter - chr9:3861334 10.0 5144740 514464.6
11:37:04.908 INFO ProgressMeter - chr9:29246716 10.2 5229380 514356.0
11:37:14.909 INFO ProgressMeter - chr9:54695955 10.3 5314230 514271.2
11:37:24.908 INFO ProgressMeter - chr9:79726940 10.5 5397690 514056.7
11:37:34.908 INFO ProgressMeter - chr9:105134868 10.7 5482400 513966.2
11:37:44.908 INFO ProgressMeter - chr9:130264736 10.8 5566200 513794.4
11:37:54.908 INFO ProgressMeter - chr10:14200465 11.0 5650050 513632.3
11:38:04.908 INFO ProgressMeter - chr10:39284486 11.2 5733700 513457.2
11:38:14.908 INFO ProgressMeter - chr10:64530363 11.3 5817880 513334.0
11:38:24.908 INFO ProgressMeter - chr10:89584530 11.5 5901420 513158.8
11:38:34.908 INFO ProgressMeter - chr10:114960723 11.7 5986030 513080.2
11:38:44.908 INFO ProgressMeter - chr11:4353601 11.8 6069150 512878.0
11:38:54.908 INFO ProgressMeter - chr11:29778210 12.0 6153910 512818.0
11:39:04.908 INFO ProgressMeter - chr11:55288815 12.2 6238960 512783.5
11:39:14.909 INFO ProgressMeter - chr11:80713464 12.3 6323720 512726.4
11:39:24.908 INFO ProgressMeter - chr11:105976533 12.5 6407950 512628.5
11:39:34.908 INFO ProgressMeter - chr11:131144971 12.7 6491870 512508.6
11:39:44.908 INFO ProgressMeter - chr12:21336686 12.8 6575890 512399.7
11:39:54.908 INFO ProgressMeter - chr12:46535619 13.0 6659910 512293.5
11:40:04.908 INFO ProgressMeter - chr12:71806614 13.2 6744170 512208.3
11:40:14.908 INFO ProgressMeter - chr12:97127543 13.3 6828590 512137.2
11:40:24.908 INFO ProgressMeter - chr12:122428731 13.5 6912940 512062.7
11:40:34.908 INFO ProgressMeter - chr13:14210701 13.7 6998390 512070.4
11:40:45.657 INFO ProgressMeter - chr13:32895223 13.8 7060690 509944.4
11:40:58.740 INFO ProgressMeter - chr13:32913018 14.1 7060780 502044.6
11:41:09.412 INFO ProgressMeter - chr13:32931428 14.2 7060860 495780.8
11:41:19.411 INFO ProgressMeter - chr13:44734989 14.4 7100240 492779.1
11:41:29.412 INFO ProgressMeter - chr13:70694629 14.6 7186790 493082.3
11:41:39.411 INFO ProgressMeter - chr13:96273670 14.7 7272060 493291.9
11:41:49.412 INFO ProgressMeter - chr14:6608101 14.9 7357090 493480.2
11:41:59.412 INFO ProgressMeter - chr14:32141366 15.1 7442210 493670.8
11:42:09.414 INFO ProgressMeter - chr14:57440740 15.2 7526560 493805.6
11:42:19.414 INFO ProgressMeter - chr14:82955449 15.4 7611620 493984.6
11:42:29.414 INFO ProgressMeter - chr15:803401 15.6 7695640 494093.1
11:42:39.414 INFO ProgressMeter - chr15:26274339 15.7 7780550 494255.8
11:42:49.415 INFO ProgressMeter - chr15:51438917 15.9 7864460 494352.2
11:42:59.415 INFO ProgressMeter - chr15:76501040 16.1 7948030 494425.5
11:43:09.414 INFO ProgressMeter - chr16:180301 16.2 8035410 494731.9
11:43:19.414 INFO ProgressMeter - chr16:26629418 16.4 8123600 495081.3
11:43:29.414 INFO ProgressMeter - chr16:52730452 16.6 8210630 495353.8
11:43:39.414 INFO ProgressMeter - chr16:79414781 16.7 8299600 495736.8
11:43:49.414 INFO ProgressMeter - chr17:13842542 16.9 8382220 495736.6
11:43:59.414 INFO ProgressMeter - chr17:38602039 17.1 8464780 495732.9
11:44:09.919 INFO ProgressMeter - chr17:41256738 17.3 8473670 491216.8
11:44:19.919 INFO ProgressMeter - chr17:66956421 17.4 8559370 491436.7
11:44:29.919 INFO ProgressMeter - chr18:11184128 17.6 8644130 491599.0
11:44:39.919 INFO ProgressMeter - chr18:37614614 17.8 8732270 491948.7
11:44:49.919 INFO ProgressMeter - chr18:63908285 17.9 8819940 492265.6
11:44:59.920 INFO ProgressMeter - chr19:11559105 18.1 8905730 492472.3
11:45:09.921 INFO ProgressMeter - chr19:37207456 18.3 8991230 492659.8
11:45:19.920 INFO ProgressMeter - chr20:3293522 18.4 9075300 492766.2
11:45:29.920 INFO ProgressMeter - chr20:28124487 18.6 9158090 492801.9
11:45:39.920 INFO ProgressMeter - chr20:53322628 18.8 9242110 492902.5
11:45:49.920 INFO ProgressMeter - chr21:15504347 18.9 9326150 493002.3
11:45:59.920 INFO ProgressMeter - chr21:40021118 19.1 9407910 492981.0
11:46:09.920 INFO ProgressMeter - chr22:17094370 19.3 9491930 493077.5
11:46:19.921 INFO ProgressMeter - chr22:42005193 19.4 9574990 493122.4
11:46:29.926 INFO ProgressMeter - chrX:15590168 19.6 9657980 493161.3
11:46:39.927 INFO ProgressMeter - chrX:40901517 19.8 9742370 493272.5
11:46:49.926 INFO ProgressMeter - chrX:66114313 19.9 9826450 493366.3
11:46:59.927 INFO ProgressMeter - chrX:91200856 20.1 9910100 493437.1
11:47:09.927 INFO ProgressMeter - chrX:117160718 20.3 9996660 493650.4
11:47:19.926 INFO ProgressMeter - chrX:143426450 20.4 10084240 493910.3
11:47:29.926 INFO ProgressMeter - chrY:14044241 20.6 10170560 494104.7
11:47:39.926 INFO ProgressMeter - chrY:40221744 20.8 10257820 494341.3
11:47:49.928 INFO ProgressMeter - chr6_cox_hap2:327301 20.9 10343580 494502.4
11:47:59.926 INFO ProgressMeter - chr6_ssto_hap7:1775401 21.1 10426860 494543.3
11:48:04.148 INFO HaplotypeCaller - 0 read(s) filtered by: AllowAllReadsReadFilter
11:48:04.148 INFO ProgressMeter - chrUn_gl000249:36301 21.2 10459967 494463.3
11:48:04.148 INFO ProgressMeter - Traversal complete. Processed 10459967 total regions in 21.2 minutes.
11:48:04.327 INFO VectorLoglessPairHMM - Time spent in setup for JNI call : 0.034278959000000005
11:48:04.327 INFO PairHMM - Total compute time in PairHMM computeLogLikelihoods() : 20.205034784000002
11:48:04.327 INFO SmithWatermanAligner - Total compute time in java Smith-Waterman : 11.08 sec
11:48:04.607 INFO HaplotypeCaller - Shutting down engine
[January 3, 2024 11:48:04 AM TRT] org.broadinstitute.hellbender.tools.walkers.haplotypecaller.HaplotypeCaller done. Elapsed time: 21.18 minutes.
Runtime.totalMemory()=1049165824
-
Hi Sinem Selvi
You might also want to disable downsampling in HaplotypeCaller by adding the
--max-reads-per-alignment-start 0
parameter.
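As a sketch, the command from the question with this parameter added might look like the following. The default values for the variables and the `run` dry-run wrapper are placeholders introduced here for illustration, not part of the original post, and the JVM options from the original command are left out for brevity. With `DRY_RUN=1` (the default in this sketch) the command is only printed so it can be reviewed first.

```shell
#!/bin/sh
# Placeholder defaults (hypothetical); override them in your environment.
GATK=${GATK:-gatk}
GENOME_FASTA=${GENOME_FASTA:-ref.fasta}
BQSR_DIR=${BQSR_DIR:-bqsr}
VAR_DIR=${VAR_DIR:-variants}
pid=${pid:-S11}

# Dry-run wrapper: prints the command when DRY_RUN=1, executes it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Same call as in the question, plus the flag that turns off downsampling.
call_variants() {
  run "$GATK" HaplotypeCaller \
    -R "$GENOME_FASTA" \
    -I "$BQSR_DIR/recal_reads_${pid}.bam" \
    --disable-tool-default-read-filters \
    --max-reads-per-alignment-start 0 \
    -bamout "$VAR_DIR/hc_${pid}.bam" \
    -O "$VAR_DIR/raw_variants_${pid}.vcf"
}

call_variants
```

Setting `DRY_RUN=0` before calling `call_variants` would run GATK for real.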
-
Hi SkyWarrior
Thank you so much. I think that was one of the reasons. The depth increased to 694, but it is still lower than in the input BAM file. I will keep this parameter for this data. Do you have any other suggestions as to why the remaining reads are removed?
Thank you in advance
-
Hi again.
Although you are disabling your read filters and downsampling, local reassembly and PairHMM still do their job, which depends not only on read filters but also on base qualities and mapping qualities. The local reassembly algorithm discards many non-useful reads and k-mers based on base and mapping qualities, so the depth you see in IGV will not match the depth in the bamout or the HaplotypeCaller VCF exactly. I hope this answers your question.
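One way to see this gap in numbers is to compare the depth fields of the record posted in the question. The sketch below is an illustration only: the helper name `depth_summary` is made up, and it assumes the record is whitespace-separated with INFO, FORMAT and the sample as the last three columns, as in the post. As far as I understand the GATK documentation, the INFO-level DP counts reads before the caller's internal filtering, while the sample-level DP and AD count only the reads that passed it and were informative for genotyping.

```shell
#!/bin/sh
# The VCF record exactly as posted in the question (from the first run).
record='chr13 32912299 . T C 1061.64 AC=1;AF=0.500;AN=2;BaseQRankSum=-11.406;DP=268;ExcessHet=0.0000;FS=22.545;MLEAC=1;MLEAF=0.500;MQ=60.00;MQRankSum=0.000;QD=4.15;ReadPosRankSum=3.299;SOR=0.330 GT:AD:DP:GQ:PL 0/1:161,95:256:99:1069,0,4790'

# Print INFO DP, sample DP and the sum of the AD values for one record.
depth_summary() {
  printf '%s\n' "$1" | awk '{
    info = $(NF-2); fmt = $(NF-1); sample = $NF

    # Site-level depth: the DP=... entry in the INFO column.
    n = split(info, kv, ";")
    for (i = 1; i <= n; i++)
      if (kv[i] ~ /^DP=/) info_dp = substr(kv[i], 4)

    # Sample-level fields: match FORMAT keys to the sample values.
    nk = split(fmt, keys, ":"); split(sample, vals, ":")
    for (i = 1; i <= nk; i++) {
      if (keys[i] == "DP") fmt_dp = vals[i]
      if (keys[i] == "AD") ad = vals[i]
    }

    # Sum the per-allele depths.
    na = split(ad, a, ","); ad_sum = 0
    for (i = 1; i <= na; i++) ad_sum += a[i]

    printf "INFO_DP=%s FORMAT_DP=%s AD_SUM=%d\n", info_dp, fmt_dp, ad_sum
  }'
}

depth_summary "$record"
```

For the posted record this prints `INFO_DP=268 FORMAT_DP=256 AD_SUM=256`, so 12 of the reads counted at the site level do not appear in the sample-level depth. Both numbers are far below the depth seen in IGV because that run still had downsampling enabled.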
-
Thank you SkyWarrior
The problem was solved after I supplied the target regions. Do you know why the target file affected the depth at the variants so much?
-
Hello @Sinem Selvi.
If you are looking at the bamout from HaplotypeCaller, we don't really expect it to be representative of every read in your sample, because a number of internal filters are applied to the reads independently of the input filtering. Specifically, we filter reads based on MappingQuality, length after trimming low quality, overhanging bases, excessive low BQ, etc. These filters are part of the genotyping code and thus are not disabled by the `--disable-tool-default-read-filters` argument. Furthermore, reads can fail to be realigned to their best-scoring haplotypes, or be poorly concordant with any haplotype, which can also cause them to be dropped from the bamout.
Without more context from your site it is hard to say exactly what caused it in this particular case. Very often the signal you are seeing here (lots of reads not in the bamout) is a sign that there might have been a problem with the local assembly. Many of the assembly-related arguments in this article might be worth testing if you need to run this particular case to ground:
https://gatk.broadinstitute.org/hc/en-us/articles/360043491652-When-HaplotypeCaller-and-Mutect2-do-not-call-an-expected-variant
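A possible way to test such arguments one at a time is to rerun HaplotypeCaller on a small window around the site and compare the resulting bamouts. The sketch below makes several assumptions: the window in `REGION`, the file names, the `hc_at_site` helper and the `run` dry-run wrapper are all hypothetical, and the two assembly flags shown are only examples that are worth confirming against `gatk HaplotypeCaller --help` for your version. With `DRY_RUN=1` (the default in this sketch) the commands are only printed.

```shell
#!/bin/sh
# Placeholder defaults (hypothetical); override them in your environment.
GATK=${GATK:-gatk}
GENOME_FASTA=${GENOME_FASTA:-ref.fasta}
INPUT_BAM=${INPUT_BAM:-recal_reads.bam}
# Hypothetical window around the variant at chr13:32912299.
REGION=${REGION:-chr13:32912000-32912600}

# Dry-run wrapper: prints the command when DRY_RUN=1, executes it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# hc_at_site LABEL [EXTRA_FLAGS...]
# Calls the window once, writing debug_LABEL.bam and debug_LABEL.vcf,
# and appends any extra flags to the baseline command.
hc_at_site() {
  label=$1
  shift
  run "$GATK" HaplotypeCaller \
    -R "$GENOME_FASTA" \
    -I "$INPUT_BAM" \
    -L "$REGION" \
    --max-reads-per-alignment-start 0 \
    -bamout "debug_${label}.bam" \
    -O "debug_${label}.vcf" \
    "$@"
}

hc_at_site baseline
hc_at_site linked --linked-de-bruijn-graph
hc_at_site dangling --recover-all-dangling-branches
```

Keeping one flag per run makes it easier to tell which argument, if any, changes the number of reads that reach the bamout.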