After running gatk ApplyBQSR, the total read count decreases
Starting from a BAM whose duplicates had been marked by Picard MarkDuplicates, I ran gatk BaseRecalibrator and gatk ApplyBQSR. Both steps finished without errors. But when I counted the reads with samtools flagstat, I found that the total differs between the duplicate-marked BAM and the duplicate-marked, BQSR-recalibrated BAM.
Before BQSR:
127811016 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
1560108 + 0 supplementary
43584620 + 0 duplicates
127399340 + 0 mapped (99.68% : N/A)
126250908 + 0 paired in sequencing
63125454 + 0 read1
63125454 + 0 read2
122834746 + 0 properly paired (97.29% : N/A)
125781140 + 0 with itself and mate mapped
58092 + 0 singletons (0.05% : N/A)
1768220 + 0 with mate mapped to a different chr
1307454 + 0 with mate mapped to a different chr (mapQ>=5)
After BQSR:
51227137 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
107003 + 0 supplementary
25453177 + 0 duplicates
51205021 + 0 mapped (99.96% : N/A)
51120134 + 0 paired in sequencing
25566161 + 0 read1
25553973 + 0 read2
50430240 + 0 properly paired (98.65% : N/A)
51073415 + 0 with itself and mate mapped
24603 + 0 singletons (0.05% : N/A)
191793 + 0 with mate mapped to a different chr
187655 + 0 with mate mapped to a different chr (mapQ>=5)
Why? The totals should be the same! I look forward to your reply. Thank you!
REQUIRED for all errors and issues:
a) GATK version used: v4.2.6.1
b) Exact command used:
./gatk-4.2.6.1/gatk BaseRecalibrator -I test.markdup.bam -R ./grch37_ensemble/db/Homo_sapiens.GRCh37.fa --known-sites af-only-gnomad.hg19.vcf.gz -L OncoCGP668_v1.0.probe.grch37.bed -O test.baserecall.table && \
./gatk-4.2.6.1/gatk ApplyBQSR -R ./grch37_ensemble/db/Homo_sapiens.GRCh37.fa -I test.markdup.bam --bqsr-recal-file test.baserecall.table -L OncoCGP668_v1.0.probe.grch37.bed -O test.markdup.bqsr.bam && \
samtools flagstat test.markdup.bam > test.markdup.stat && \
samtools flagstat test.markdup.bqsr.bam > test.markdup.bqsr.stat
c) Entire program log:
14:34:17.572 INFO BaseRecalibrator - ------------------------------------------------------------
14:34:17.573 INFO BaseRecalibrator - The Genome Analysis Toolkit (GATK) v4.2.6.1
14:34:17.573 INFO BaseRecalibrator - For support and documentation go to https://software.broadinstitute.org/gatk/
14:34:17.573 INFO BaseRecalibrator - Executing as swzhang@node3 on Linux v3.10.0-1160.71.1.el7.x86_64 amd64
14:34:17.573 INFO BaseRecalibrator - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_332-b09
14:34:17.573 INFO BaseRecalibrator - Start Date/Time: April 7, 2024 2:34:17 PM CST
14:34:17.573 INFO BaseRecalibrator - ------------------------------------------------------------
14:34:17.573 INFO BaseRecalibrator - ------------------------------------------------------------
14:34:17.574 INFO BaseRecalibrator - HTSJDK Version: 2.24.1
14:34:17.574 INFO BaseRecalibrator - Picard Version: 2.27.1
14:34:17.574 INFO BaseRecalibrator - Built for Spark Version: 2.4.5
14:34:17.574 INFO BaseRecalibrator - HTSJDK Defaults.COMPRESSION_LEVEL : 2
14:34:17.574 INFO BaseRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
14:34:17.574 INFO BaseRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
14:34:17.574 INFO BaseRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
14:34:17.574 INFO BaseRecalibrator - Deflater: IntelDeflater
14:34:17.574 INFO BaseRecalibrator - Inflater: IntelInflater
14:34:17.574 INFO BaseRecalibrator - GCS max retries/reopens: 20
14:34:17.574 INFO BaseRecalibrator - Requester pays: disabled
14:34:17.574 INFO BaseRecalibrator - Initializing engine
14:34:17.962 INFO FeatureManager - Using codec VCFCodec to read file file:///ifs3/seqdata/swzhang/database/variant/hg19/gnom_af_vcf/af-only-gnomad.hg19.vcf.gz
14:34:18.020 INFO FeatureManager - Using codec BEDCodec to read file file:///ifs3/seqdata/swzhang/database/cancer/bed/hg19/OncoCGP668_v1.0/OncoCGP668_v1.0.probe.grch37.bed
14:34:18.072 INFO IntervalArgumentCollection - Processing 2965395 bp from intervals
14:34:18.081 INFO BaseRecalibrator - Done initializing engine
14:34:18.084 INFO BaseRecalibrationEngine - The covariates being used here:
14:34:18.084 INFO BaseRecalibrationEngine - ReadGroupCovariate
14:34:18.084 INFO BaseRecalibrationEngine - QualityScoreCovariate
14:34:18.084 INFO BaseRecalibrationEngine - ContextCovariate
14:34:18.084 INFO BaseRecalibrationEngine - CycleCovariate
14:34:18.087 INFO ProgressMeter - Starting traversal
14:34:18.088 INFO ProgressMeter - Current Locus Elapsed Minutes Reads Processed Reads/Minute
14:34:28.090 INFO ProgressMeter - 1:16257955 0.2 349000 2093790.6
14:34:38.092 INFO ProgressMeter - 1:40363301 0.3 762000 2285657.2
14:34:48.103 INFO ProgressMeter - 1:120461023 0.5 1191000 2380888.9
14:34:58.115 INFO ProgressMeter - 1:175996656 0.7 1655000 2480887.4
14:35:08.119 INFO ProgressMeter - 10:43617405 0.8 2122000 2544873.1
14:35:18.130 INFO ProgressMeter - 11:64138657 1.0 2578000 2576196.7
14:35:28.131 INFO ProgressMeter - 11:118359917 1.2 3048000 2610967.5
14:35:38.132 INFO ProgressMeter - 12:6715385 1.3 3580000 2683524.1
14:35:48.159 INFO ProgressMeter - 12:49440147 1.5 4099000 2730512.6
14:35:58.173 INFO ProgressMeter - 12:120805843 1.7 4581000 2746293.1
14:36:08.182 INFO ProgressMeter - 13:32891683 1.8 5062000 2758733.4
14:36:18.199 INFO ProgressMeter - 14:30066721 2.0 5560000 2777454.0
14:36:28.215 INFO ProgressMeter - 15:42002966 2.2 6046000 2787738.1
14:36:38.216 INFO ProgressMeter - 15:88553161 2.3 6609000 2829841.3
14:36:48.229 INFO ProgressMeter - 15:88662559 2.5 7231000 2889760.7
14:36:58.231 INFO ProgressMeter - 16:2226067 2.7 7698000 2884172.3
14:37:08.233 INFO ProgressMeter - 16:68844063 2.8 8196000 2890240.7
14:37:18.252 INFO ProgressMeter - 17:7980236 3.0 8646000 2879392.6
14:37:28.260 INFO ProgressMeter - 17:37872738 3.2 9157000 2889084.0
14:37:38.273 INFO ProgressMeter - 17:40855660 3.3 9693000 2905227.2
14:37:48.305 INFO ProgressMeter - 17:56440741 3.5 10230000 2919853.9
14:37:58.319 INFO ProgressMeter - 18:19756950 3.7 10731000 2923566.6
14:38:08.335 INFO ProgressMeter - 19:5206633 3.8 11223000 2924611.1
14:38:18.342 INFO ProgressMeter - 19:10602639 4.0 11708000 2923905.5
14:38:28.356 INFO ProgressMeter - 19:15302929 4.2 12201000 2925104.3
14:38:38.369 INFO ProgressMeter - 19:30308146 4.3 12680000 2923006.0
14:38:48.378 INFO ProgressMeter - 19:41096194 4.5 13140000 2916867.1
14:38:58.396 INFO ProgressMeter - 19:45924421 4.7 13651000 2922000.1
14:39:08.407 INFO ProgressMeter - 2:29416339 4.8 14147000 2923749.4
14:39:18.419 INFO ProgressMeter - 2:128038026 5.0 14625000 2921776.3
14:39:28.435 INFO ProgressMeter - 20:31368162 5.2 15133000 2925692.9
14:39:38.437 INFO ProgressMeter - 20:62298851 5.3 15628000 2927057.7
14:39:48.458 INFO ProgressMeter - 21:45656739 5.5 16118000 2927263.4
14:39:58.473 INFO ProgressMeter - 22:36685087 5.7 16532000 2914120.5
14:40:08.489 INFO ProgressMeter - 3:10183762 5.8 16971000 2905984.9
14:40:18.519 INFO ProgressMeter - 3:49940938 6.0 17459000 2906361.8
14:40:28.525 INFO ProgressMeter - 3:142180753 6.2 17991000 2914025.6
14:40:38.538 INFO ProgressMeter - 3:189582051 6.3 18498000 2917282.2
14:40:48.541 INFO ProgressMeter - 5:236397 6.5 19024000 2923381.1
14:40:58.545 INFO ProgressMeter - 5:121780352 6.7 19523000 2925108.1
14:41:08.549 INFO ProgressMeter - 5:180058699 6.8 20029000 2927781.2
14:41:18.550 INFO ProgressMeter - 6:117739570 7.0 20505000 2926074.0
14:41:28.558 INFO ProgressMeter - 7:55220249 7.2 21003000 2927458.2
14:41:38.564 INFO ProgressMeter - 7:116334210 7.3 21554000 2936012.3
14:41:48.580 INFO ProgressMeter - 8:29195939 7.5 22061000 2938260.7
14:41:58.604 INFO ProgressMeter - 8:38315021 7.7 22540000 2936712.2
14:42:08.606 INFO ProgressMeter - 8:68968116 7.8 23057000 2940206.3
14:42:18.606 INFO ProgressMeter - 8:145739582 8.0 23516000 2936331.2
14:42:28.608 INFO ProgressMeter - 9:98224233 8.2 23995000 2935054.5
14:42:38.626 INFO ProgressMeter - 9:139410032 8.3 24480000 2934448.4
14:42:48.635 INFO ProgressMeter - X:66863024 8.5 24984000 2936145.0
14:42:55.381 INFO BaseRecalibrator - 556088 read(s) filtered by: MappingQualityNotZeroReadFilter
0 read(s) filtered by: MappingQualityAvailableReadFilter
0 read(s) filtered by: MappedReadFilter
0 read(s) filtered by: NotSecondaryAlignmentReadFilter
25310235 read(s) filtered by: NotDuplicateReadFilter
0 read(s) filtered by: PassesVendorQualityCheckReadFilter
0 read(s) filtered by: WellformedReadFilter
25866323 total reads filtered
14:42:55.381 INFO ProgressMeter - GL000251.1:3939294 8.6 25360814 2941560.9
14:42:55.381 INFO ProgressMeter - Traversal complete. Processed 25360814 total reads in 8.6 minutes.
14:42:55.430 INFO BaseRecalibrator - Calculating quantized quality scores...
14:42:55.442 INFO BaseRecalibrator - Writing recalibration report...
14:42:55.590 INFO BaseRecalibrator - ...done!
14:42:55.590 INFO BaseRecalibrator - BaseRecalibrator was able to recalibrate 25360814 reads
14:42:55.590 INFO BaseRecalibrator - Shutting down engine
[April 7, 2024 2:42:55 PM CST] org.broadinstitute.hellbender.tools.walkers.bqsr.BaseRecalibrator done. Elapsed time: 8.64 minutes.
Runtime.totalMemory()=2469920768
Tool returned:
SUCCESS
14:42:58.281 INFO ApplyBQSR - ------------------------------------------------------------
14:42:58.282 INFO ApplyBQSR - The Genome Analysis Toolkit (GATK) v4.2.6.1
14:42:58.282 INFO ApplyBQSR - For support and documentation go to https://software.broadinstitute.org/gatk/
14:42:58.282 INFO ApplyBQSR - Executing as swzhang@node3 on Linux v3.10.0-1160.71.1.el7.x86_64 amd64
14:42:58.282 INFO ApplyBQSR - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_332-b09
14:42:58.282 INFO ApplyBQSR - Start Date/Time: April 7, 2024 2:42:58 PM CST
14:42:58.282 INFO ApplyBQSR - ------------------------------------------------------------
14:42:58.283 INFO ApplyBQSR - ------------------------------------------------------------
14:42:58.283 INFO ApplyBQSR - HTSJDK Version: 2.24.1
14:42:58.283 INFO ApplyBQSR - Picard Version: 2.27.1
14:42:58.284 INFO ApplyBQSR - Built for Spark Version: 2.4.5
14:42:58.284 INFO ApplyBQSR - HTSJDK Defaults.COMPRESSION_LEVEL : 2
14:42:58.284 INFO ApplyBQSR - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
14:42:58.284 INFO ApplyBQSR - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
14:42:58.284 INFO ApplyBQSR - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
14:42:58.284 INFO ApplyBQSR - Deflater: IntelDeflater
14:42:58.284 INFO ApplyBQSR - Inflater: IntelInflater
14:42:58.284 INFO ApplyBQSR - GCS max retries/reopens: 20
14:42:58.284 INFO ApplyBQSR - Requester pays: disabled
14:42:58.284 INFO ApplyBQSR - Initializing engine
WARNING: BAM index file /ifs3/seqdata/swzhang/project/OncoCGP668_Novaseq_TMB_MSI_LC_20240326/OncoCGP668_Novaseq_run_tumor_only/Mapping/TMB-5-DNA_KY283_0228/TMB-5-DNA_KY283_0228.markdup.bai is older than BAM /ifs3/seqdata/swzhang/project/OncoCGP668_Novaseq_TMB_MSI_LC_20240326/OncoCGP668_Novaseq_run_tumor_only/Mapping/TMB-5-DNA_KY283_0228/TMB-5-DNA_KY283_0228.markdup.bam
14:42:58.699 INFO FeatureManager - Using codec BEDCodec to read file file:///ifs3/seqdata/swzhang/database/cancer/bed/hg19/OncoCGP668_v1.0/OncoCGP668_v1.0.probe.grch37.bed
14:42:58.765 INFO IntervalArgumentCollection - Processing 2965395 bp from intervals
14:42:58.776 INFO ApplyBQSR - Done initializing engine
14:42:59.563 INFO ProgressMeter - Starting traversal
14:42:59.564 INFO ProgressMeter - Current Locus Elapsed Minutes Reads Processed Reads/Minute
14:43:09.572 INFO ProgressMeter - 1:16255569 0.2 666000 3993204.8
14:43:19.582 INFO ProgressMeter - 1:36932412 0.3 1384000 4148473.8
14:43:29.583 INFO ProgressMeter - 1:65349086 0.5 2042000 4081551.1
14:43:39.593 INFO ProgressMeter - 1:156834151 0.7 2664000 3993105.0
14:43:49.605 INFO ProgressMeter - 1:176132910 0.8 3330000 3992726.0
14:43:59.608 INFO ProgressMeter - 10:43601969 1.0 4085000 4082006.5
14:44:09.615 INFO ProgressMeter - 11:17741415 1.2 4873000 4173816.2
14:44:19.624 INFO ProgressMeter - 11:77034335 1.3 5637000 4224581.6
14:44:29.635 INFO ProgressMeter - 11:128649526 1.5 6433000 4285286.1
14:44:39.646 INFO ProgressMeter - 12:12023909 1.7 7171000 4299074.8
14:44:49.648 INFO ProgressMeter - 12:49431716 1.8 7927000 4320518.9
14:44:59.650 INFO ProgressMeter - 12:69965028 2.0 8704000 4348919.5
14:45:09.666 INFO ProgressMeter - 12:133244201 2.2 9492000 4377522.1
14:45:19.672 INFO ProgressMeter - 13:32954097 2.3 10255000 4391643.5
14:45:29.683 INFO ProgressMeter - 14:23451358 2.5 11023000 4405734.2
14:45:39.683 INFO ProgressMeter - 15:40675191 2.7 11799000 4421336.6
14:45:49.688 INFO ProgressMeter - 15:88493677 2.8 12568000 4432531.6
14:45:59.693 INFO ProgressMeter - 15:88576286 3.0 13366000 4452142.6
14:46:09.704 INFO ProgressMeter - 15:88646923 3.2 14163000 4469256.7
14:46:19.710 INFO ProgressMeter - 16:2103320 3.3 14943000 4479629.9
14:46:29.721 INFO ProgressMeter - 16:9857103 3.5 15751000 4496923.7
14:46:39.730 INFO ProgressMeter - 16:72993390 3.7 16546000 4509225.0
14:46:49.735 INFO ProgressMeter - 17:8113448 3.8 17315000 4513600.8
14:46:59.738 INFO ProgressMeter - 17:37687361 4.0 18083000 4517493.6
14:47:09.739 INFO ProgressMeter - 17:38510797 4.2 18878000 4527568.8
14:47:19.739 INFO ProgressMeter - 17:41230373 4.3 19665000 4535041.9
14:47:29.749 INFO ProgressMeter - 17:58678145 4.5 20457000 4542887.3
14:47:39.756 INFO ProgressMeter - 17:78897272 4.7 21241000 4548523.9
14:47:49.757 INFO ProgressMeter - 19:2211040 4.8 22047000 4558414.6
14:47:59.764 INFO ProgressMeter - 19:5256059 5.0 22805000 4557976.5
14:48:09.770 INFO ProgressMeter - 19:11031493 5.2 23597000 4564128.4
14:48:19.782 INFO ProgressMeter - 19:15298017 5.3 24410000 4573773.4
14:48:29.785 INFO ProgressMeter - 19:18279890 5.5 25173000 4573846.0
14:48:39.797 INFO ProgressMeter - 19:36218337 5.7 25984000 4582285.0
14:48:49.800 INFO ProgressMeter - 19:42776257 5.8 26772000 4586393.2
14:48:59.800 INFO ProgressMeter - 19:47422013 6.0 27575000 4592822.5
14:49:09.810 INFO ProgressMeter - 2:25463127 6.2 28331000 4591176.1
14:49:19.811 INFO ProgressMeter - 2:61148898 6.3 29124000 4595539.2
14:49:29.816 INFO ProgressMeter - 2:209113316 6.5 29940000 4603179.5
14:49:39.818 INFO ProgressMeter - 20:36031230 6.7 30708000 4603276.9
14:49:49.824 INFO ProgressMeter - 20:62311147 6.8 31525000 4610490.9
14:49:59.827 INFO ProgressMeter - 21:42877868 7.0 32320000 4614253.5
14:50:09.832 INFO ProgressMeter - 22:29690254 7.2 33120000 4618527.6
14:50:19.833 INFO ProgressMeter - 22:42524126 7.3 33938000 4625081.5
14:50:29.834 INFO ProgressMeter - 3:47125812 7.5 34737000 4628833.0
14:50:39.844 INFO ProgressMeter - 3:52620619 7.7 35517000 4629834.0
14:50:49.850 INFO ProgressMeter - 3:142232505 7.8 36310000 4632500.2
14:50:59.857 INFO ProgressMeter - 3:185798292 8.0 37094000 4633931.0
14:51:09.858 INFO ProgressMeter - 4:126238153 8.2 37911000 4639389.1
14:51:19.869 INFO ProgressMeter - 5:1295935 8.3 38723000 4643936.5
14:51:29.870 INFO ProgressMeter - 5:149503790 8.5 39523000 4646985.6
14:51:39.876 INFO ProgressMeter - 6:397157 8.7 40296000 4646750.4
14:51:49.881 INFO ProgressMeter - 6:94066521 8.8 41113000 4651528.5
14:51:59.887 INFO ProgressMeter - 6:163148582 9.0 41914000 4654327.1
14:52:09.898 INFO ProgressMeter - 7:55272236 9.2 42740000 4659715.7
14:52:19.903 INFO ProgressMeter - 7:116335625 9.3 43556000 4663899.3
14:52:29.903 INFO ProgressMeter - 7:151932961 9.5 44379000 4668697.0
14:52:39.906 INFO ProgressMeter - 8:38175494 9.7 45207000 4673830.3
14:52:49.915 INFO ProgressMeter - 8:48686786 9.8 46022000 4677420.7
14:52:59.916 INFO ProgressMeter - 8:95403904 10.0 46834000 4680654.0
14:53:09.923 INFO ProgressMeter - 9:5456151 10.2 47663000 4685406.5
14:53:19.925 INFO ProgressMeter - 9:98242668 10.3 48464000 4687335.3
14:53:29.929 INFO ProgressMeter - 9:139401047 10.5 49291000 4691662.8
14:53:39.933 INFO ProgressMeter - X:47038866 10.7 50060000 4690428.0
14:53:49.941 INFO ProgressMeter - X:107406172 10.8 50879000 4693807.9
14:53:54.305 INFO ApplyBQSR - 0 read(s) filtered by: WellformedReadFilter
14:53:54.305 INFO ProgressMeter - GL000256.1:3850996 10.9 51227137 4694418.4
14:53:54.305 INFO ProgressMeter - Traversal complete. Processed 51227137 total reads in 10.9 minutes.
14:53:54.354 INFO ApplyBQSR - Shutting down engine
[April 7, 2024 2:53:54 PM CST] org.broadinstitute.hellbender.tools.walkers.bqsr.ApplyBQSR done. Elapsed time: 10.94 minutes.
Runtime.totalMemory()=1786773504
-
Hi shouweizhang
It looks like you passed a BED file to the ApplyBQSR step with -L, which restricts the reads written to the output to those regions. We do not recommend that practice. If you remove the BED file from the ApplyBQSR step, the read count will be the same before and after.
Regards.
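For reference, the corrected ApplyBQSR call might look like the sketch below. It is the command quoted in the question with only the -L argument removed; the BaseRecalibrator step is left unchanged. The GATK variable and the apply_bqsr_whole_bam function name are illustrative assumptions, not part of the original post.

```shell
# Path to the gatk launcher. The default is the path used in the question;
# making it overridable through GATK is an assumption for illustration.
GATK="${GATK:-./gatk-4.2.6.1/gatk}"

# Hypothetical helper: run ApplyBQSR on the whole BAM, without -L,
# so the output is not restricted to the intervals in the BED file.
apply_bqsr_whole_bam() {
    local ref="$1" in_bam="$2" recal_table="$3" out_bam="$4"
    "$GATK" ApplyBQSR \
        -R "$ref" \
        -I "$in_bam" \
        --bqsr-recal-file "$recal_table" \
        -O "$out_bam"
}

# Usage with the file names from the question:
# apply_bqsr_whole_bam ./grch37_ensemble/db/Homo_sapiens.GRCh37.fa \
#     test.markdup.bam test.baserecall.table test.markdup.bqsr.bam
```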
-
Thank you! After I removed the "-L" parameter, the BAMs before and after BQSR have the same number of reads.
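A check like this can be scripted. The sketch below reads the first line of two samtools flagstat reports (the "in total" line, in the format shown in the question) and compares the totals. The function names flagstat_total and compare_flagstat_totals are hypothetical helpers, not samtools or GATK commands.

```shell
# Hypothetical helper: print the total read count (QC-passed + QC-failed)
# taken from the first line of a samtools flagstat report, e.g.
# "127811016 + 0 in total (QC-passed reads + QC-failed reads)".
flagstat_total() {
    local passed plus failed rest
    read -r passed plus failed rest < "$1"
    echo $(( passed + failed ))
}

# Hypothetical helper: compare the totals of two flagstat reports.
# Returns non-zero when the totals differ.
compare_flagstat_totals() {
    local before after
    before="$(flagstat_total "$1")"
    after="$(flagstat_total "$2")"
    if [ "$before" -eq "$after" ]; then
        echo "OK: both BAMs contain $before reads"
    else
        echo "MISMATCH: $before reads before, $after reads after"
        return 1
    fi
}

# Usage with the report names from the question:
# compare_flagstat_totals test.markdup.stat test.markdup.bqsr.stat
```

Run on the two reports posted in the question, this would report a mismatch of 127811016 reads before versus 51227137 reads after.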