The expected variants were not detected for pig RNA-seq by GATK
REQUIRED for all errors and issues:
a) GATK version used: v4.1.9
b) Exact command used: gatk HaplotypeCaller -R Sus_scrofa.fa -I BQSR_SAMEA5337669-STARsorted.bam -O SAMEA5337669.vcf.gz --dont-use-soft-clipped-bases --output-mode EMIT_ALL_CONFIDENT_SITES -bamout SAMEA5337669.realign.bam
c) Entire program log:
Using GATK jar /home/agis/likui_group/yaowenye/software/gatk-4.1.9.0/gatk-package-4.1.9.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /home/agis/likui_group/yaowenye/software/gatk-4.1.9.0/gatk-package-4.1.9.0-local.jar ApplyBQSR -R /home/agis/likui_group/yaowenye/project/Sscrofa11.1/Sus_scrofa.Sscrofa11.1.dna.toplevel.fa -I SplitNCigarReads_SAMEA5337669-STARsorted.bam --bqsr-recal-file SAMEA5337669-recal.table -O BQSR_SAMEA5337669-STARsorted.bam
21:22:04.312 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/home/agis/likui_group/yaowenye/software/gatk-4.1.9.0/gatk-package-4.1.9.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
Oct 10, 2022 9:22:04 PM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
INFO: Failed to detect whether we are running on Google Compute Engine.
21:22:04.497 INFO HaplotypeCaller - ------------------------------------------------------------
21:22:04.498 INFO HaplotypeCaller - The Genome Analysis Toolkit (GATK) v4.1.9.0
21:22:04.498 INFO HaplotypeCaller - For support and documentation go to https://software.broadinstitute.org/gatk/
21:22:04.498 INFO HaplotypeCaller - Executing as yaowenye@comput42 on Linux v3.10.0-862.el7.x86_64 amd64
21:22:04.498 INFO HaplotypeCaller - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_152-release-1056-b12
21:22:04.498 INFO HaplotypeCaller - Start Date/Time: October 10, 2022 9:22:04 PM CST
21:22:04.498 INFO HaplotypeCaller - ------------------------------------------------------------
21:22:04.499 INFO HaplotypeCaller - ------------------------------------------------------------
21:22:04.499 INFO HaplotypeCaller - HTSJDK Version: 2.23.0
21:22:04.499 INFO HaplotypeCaller - Picard Version: 2.23.3
21:22:04.499 INFO HaplotypeCaller - HTSJDK Defaults.COMPRESSION_LEVEL : 2
21:22:04.499 INFO HaplotypeCaller - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
21:22:04.499 INFO HaplotypeCaller - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
21:22:04.499 INFO HaplotypeCaller - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
21:22:04.500 INFO HaplotypeCaller - Deflater: IntelDeflater
21:22:04.500 INFO HaplotypeCaller - Inflater: IntelInflater
21:22:04.500 INFO HaplotypeCaller - GCS max retries/reopens: 20
21:22:04.500 INFO HaplotypeCaller - Requester pays: disabled
21:22:04.500 INFO HaplotypeCaller - Initializing engine
21:22:05.045 INFO FeatureManager - Using codec VCFCodec to read file file:///home/agis/likui_group/yaowenye/project/ASE/02.RNA.known.sites/sus_scrofa.snp.edited.vcf.gz
21:22:05.161 WARN IndexUtils - Feature file "/home/agis/likui_group/yaowenye/project/ASE/02.RNA.known.sites/sus_scrofa.snp.edited.vcf.gz" appears to contain no sequence dictionary. Attempting to retrieve a sequence dictionary from the associated index file
21:22:05.377 WARN IndexUtils - Index file /home/agis/likui_group/yaowenye/project/ASE/02.RNA.known.sites/sus_scrofa.snp.edited.vcf.gz.tbi is out of date (index older than input file). Use IndexFeatureFile to make a new index.
21:22:05.436 INFO HaplotypeCaller - Done initializing engine
21:22:05.463 INFO HaplotypeCallerEngine - Disabling physical phasing, which is supported only for reference-model confidence output
21:22:05.572 INFO NativeLibraryLoader - Loading libgkl_utils.so from jar:file:/home/agis/likui_group/yaowenye/software/gatk-4.1.9.0/gatk-package-4.1.9.0-local.jar!/com/intel/gkl/native/libgkl_utils.so
21:22:05.735 INFO NativeLibraryLoader - Loading libgkl_pairhmm_omp.so from jar:file:/home/agis/likui_group/yaowenye/software/gatk-4.1.9.0/gatk-package-4.1.9.0-local.jar!/com/intel/gkl/native/libgkl_pairhmm_omp.so
21:22:05.825 INFO IntelPairHmm - Using CPU-supported AVX-512 instructions
21:22:05.825 INFO IntelPairHmm - Flush-to-zero (FTZ) is enabled when running PairHMM
21:22:05.826 INFO IntelPairHmm - Available threads: 2
21:22:05.826 INFO IntelPairHmm - Requested threads: 4
21:22:05.826 WARN IntelPairHmm - Using 2 available threads, but 4 were requested
21:22:05.826 INFO PairHMM - Using the OpenMP multi-threaded AVX-accelerated native PairHMM implementation
21:22:05.915 INFO ProgressMeter - Starting traversal
21:22:05.915 INFO ProgressMeter - Current Locus Elapsed Minutes Regions Processed Regions/Minute
21:22:07.202 WARN InbreedingCoeff - InbreedingCoeff will not be calculated; at least 10 samples must have called genotypes
21:22:15.915 INFO ProgressMeter - 1:13676305 0.2 46030 276180.0
21:22:25.915 INFO ProgressMeter - 1:41906728 0.3 140770 422310.0
21:22:35.920 INFO ProgressMeter - 1:73121790 0.5 245480 490878.2
21:22:45.920 INFO ProgressMeter - 1:101530706 0.7 340810 511151.1
21:22:55.921 INFO ProgressMeter - 1:128260272 0.8 430800 516898.0
21:23:05.921 INFO ProgressMeter - 1:161898418 1.0 543570 543515.6
21:23:15.921 INFO ProgressMeter - 1:188965103 1.2 634690 543973.4
21:23:25.921 INFO ProgressMeter - 1:220551642 1.3 740560 555378.3
21:23:35.921 INFO ProgressMeter - 1:249927264 1.5 839300 559496.0
21:23:45.928 INFO ProgressMeter - 1:270206516 1.7 907970 544711.2
21:23:56.692 INFO ProgressMeter - 2:6334510 1.8 943990 511292.1
21:24:06.695 INFO ProgressMeter - 2:13327549 2.0 968400 481073.0
21:24:16.695 INFO ProgressMeter - 2:39006107 2.2 1054760 483908.9
21:24:26.699 INFO ProgressMeter - 2:61207228 2.3 1129780 481498.5
21:24:36.762 INFO ProgressMeter - 2:73288425 2.5 1171650 466028.5
21:24:46.774 INFO ProgressMeter - 2:82661135 2.7 1204470 449264.3
21:24:56.776 INFO ProgressMeter - 2:115874208 2.8 1315760 462045.8
21:25:06.778 INFO ProgressMeter - 2:141966048 3.0 1403670 465660.0
21:25:16.777 INFO ProgressMeter - 3:6753547 3.2 1460400 459096.1
21:25:26.784 INFO ProgressMeter - 3:17325052 3.3 1496650 447052.6
21:25:36.793 INFO ProgressMeter - 3:33858412 3.5 1553150 441909.5
21:25:46.793 INFO ProgressMeter - 3:47364942 3.7 1599520 434498.7
21:25:56.793 INFO ProgressMeter - 3:71601149 3.8 1681380 436952.8
21:26:06.793 INFO ProgressMeter - 3:98451710 4.0 1771840 441345.4
21:26:16.794 INFO ProgressMeter - 3:121059658 4.2 1848270 442030.6
21:26:26.794 INFO ProgressMeter - 4:5405803 4.3 1906450 438467.6
21:26:36.807 INFO ProgressMeter - 4:35988676 4.5 2008980 444970.0
21:26:46.808 INFO ProgressMeter - 4:67746816 4.7 2115470 451875.5
21:26:56.812 INFO ProgressMeter - 4:93490010 4.8 2202250 454232.9
21:27:06.817 INFO ProgressMeter - 4:107629342 5.0 2250720 448794.6
21:27:16.845 INFO ProgressMeter - 5:22117 5.2 2329270 449478.0
21:27:26.853 INFO ProgressMeter - 5:11220104 5.3 2367670 442640.6
21:27:36.854 INFO ProgressMeter - 5:21729937 5.5 2403790 435812.6
21:27:46.854 INFO ProgressMeter - 5:48292582 5.7 2493210 438766.5
21:27:56.854 INFO ProgressMeter - 5:68332744 5.8 2560980 437850.5
21:28:06.854 INFO ProgressMeter - 5:93295993 6.0 2645220 439723.1
21:28:16.959 INFO ProgressMeter - 6:10717867 6.2 2719100 439694.5
21:28:26.970 INFO ProgressMeter - 6:28546557 6.4 2779840 437706.9
21:28:36.970 INFO ProgressMeter - 6:48524650 6.5 2847620 436913.5
21:28:46.978 INFO ProgressMeter - 6:59605167 6.7 2885950 431745.1
21:28:56.983 INFO ProgressMeter - 6:75376558 6.9 2939670 429079.0
21:29:06.983 INFO ProgressMeter - 6:89300393 7.0 2987600 425717.5
21:29:16.985 INFO ProgressMeter - 6:115879077 7.2 3077140 428302.6
21:29:26.986 INFO ProgressMeter - 6:148711253 7.4 3187210 433564.2
21:29:36.987 INFO ProgressMeter - 6:168067683 7.5 3252880 432687.5
21:29:46.996 INFO ProgressMeter - 7:20820371 7.7 3332380 433639.2
21:29:56.996 INFO ProgressMeter - 7:35788066 7.9 3383590 430956.5
21:30:06.999 INFO ProgressMeter - 7:54886121 8.0 3448420 430081.2
21:30:17.000 INFO ProgressMeter - 7:77687118 8.2 3525500 430740.1
21:30:27.000 INFO ProgressMeter - 7:100832556 8.4 3603820 431522.0
21:30:37.006 INFO ProgressMeter - 8:3798691 8.5 3687390 432884.6
21:30:47.006 INFO ProgressMeter - 8:31485318 8.7 3780300 435275.2
21:30:57.006 INFO ProgressMeter - 8:64598395 8.9 3891300 439619.6
21:31:07.006 INFO ProgressMeter - 8:94411123 9.0 3991450 442600.2
21:31:17.006 INFO ProgressMeter - 8:127396572 9.2 4102000 446605.0
21:31:27.006 INFO ProgressMeter - 9:11465311 9.4 4179680 446952.1
21:31:37.006 INFO ProgressMeter - 9:40860588 9.5 4278450 449502.8
21:31:47.007 INFO ProgressMeter - 9:67115583 9.7 4366890 450898.3
21:31:57.007 INFO ProgressMeter - 9:102004021 9.9 4483710 455128.1
21:32:07.007 INFO ProgressMeter - 9:126859472 10.0 4567600 455930.2
21:32:17.021 INFO ProgressMeter - 10:13816731 10.2 4656650 457202.2
21:32:27.022 INFO ProgressMeter - 10:31312927 10.4 4716040 455577.5
21:32:37.022 INFO ProgressMeter - 10:54104073 10.5 4792940 455669.8
21:32:47.022 INFO ProgressMeter - 11:8684917 10.7 4873670 456117.6
21:32:57.022 INFO ProgressMeter - 11:38729707 10.9 4974570 458410.4
21:33:07.022 INFO ProgressMeter - 11:72262542 11.0 5086930 461673.8
21:33:17.043 INFO ProgressMeter - 12:6248422 11.2 5131910 458801.6
21:33:27.053 INFO ProgressMeter - 12:21797604 11.4 5185110 456745.3
21:33:37.056 INFO ProgressMeter - 12:38900368 11.5 5243250 455182.1
21:33:47.141 INFO ProgressMeter - 12:52872349 11.7 5291340 452750.5
21:33:57.141 INFO ProgressMeter - 13:13560858 11.9 5366750 452746.4
21:34:07.146 INFO ProgressMeter - 13:32728422 12.0 5431900 451885.7
21:34:17.146 INFO ProgressMeter - 13:59442475 12.2 5521840 453085.8
21:34:27.146 INFO ProgressMeter - 13:81096185 12.4 5595010 452896.1
21:34:37.146 INFO ProgressMeter - 13:116331586 12.5 5712990 456290.3
21:34:47.146 INFO ProgressMeter - 13:142695250 12.7 5801910 457304.8
21:34:57.146 INFO ProgressMeter - 13:180075446 12.9 5926930 461101.5
21:35:07.150 INFO ProgressMeter - 13:207445858 13.0 6018990 462267.3
21:35:17.156 INFO ProgressMeter - 14:24316755 13.2 6103970 462865.5
21:35:27.159 INFO ProgressMeter - 14:45241490 13.4 6174860 462396.1
21:35:37.158 INFO ProgressMeter - 14:62250265 13.5 6232840 460984.4
21:35:47.159 INFO ProgressMeter - 14:89076512 13.7 6323240 461975.2
21:35:57.162 INFO ProgressMeter - 14:113707560 13.9 6406410 462419.2
21:36:07.167 INFO ProgressMeter - 14:140351748 14.0 6496100 463316.6
21:36:17.185 INFO ProgressMeter - 15:29723056 14.2 6600520 465224.5
21:36:27.184 INFO ProgressMeter - 15:57934133 14.4 6695310 466426.4
21:36:37.184 INFO ProgressMeter - 15:87123239 14.5 6793520 467836.2
21:36:47.189 INFO ProgressMeter - 15:117095959 14.7 6894260 469383.6
21:36:57.189 INFO ProgressMeter - 15:139376748 14.9 6969470 469180.3
21:37:07.208 INFO ProgressMeter - 16:27946733 15.0 7066770 470442.1
21:37:17.208 INFO ProgressMeter - 16:55563042 15.2 7159740 471401.0
21:37:27.208 INFO ProgressMeter - 17:5321057 15.4 7259430 472776.6
21:37:37.213 INFO ProgressMeter - 17:32962699 15.5 7352400 473687.3
21:37:47.221 INFO ProgressMeter - 17:50796478 15.7 7413070 472518.2
21:37:57.221 INFO ProgressMeter - 18:11337386 15.9 7494140 472664.3
21:38:07.221 INFO ProgressMeter - 18:42217281 16.0 7597820 474218.6
21:38:17.221 INFO ProgressMeter - X:8187085 16.2 7671720 473901.3
21:38:27.221 INFO ProgressMeter - X:36695318 16.4 7767390 474921.6
21:38:37.221 INFO ProgressMeter - X:60243315 16.5 7846990 474948.6
21:38:47.238 INFO ProgressMeter - X:96169182 16.7 7967180 477399.2
21:38:57.242 INFO ProgressMeter - X:125387413 16.9 8065340 478500.4
21:39:07.242 INFO ProgressMeter - Y:39508260 17.0 8198990 481666.9
21:39:17.471 INFO ProgressMeter - MT:9631 17.2 8212530 477678.2
21:39:27.487 INFO ProgressMeter - AEMK02000361.1:1982027 17.4 8241140 474733.3
21:39:37.486 INFO ProgressMeter - AEMK02000171.1:88201 17.5 8337900 475739.6
21:39:47.486 INFO ProgressMeter - AEMK02000312.1:20101 17.7 8428850 476398.7
21:39:48.465 INFO HaplotypeCaller - 955146 read(s) filtered by: MappingQualityReadFilter
0 read(s) filtered by: MappingQualityAvailableReadFilter
0 read(s) filtered by: MappedReadFilter
0 read(s) filtered by: NotSecondaryAlignmentReadFilter
101998 read(s) filtered by: NotDuplicateReadFilter
0 read(s) filtered by: PassesVendorQualityCheckReadFilter
0 read(s) filtered by: NonZeroReferenceLengthAlignmentReadFilter
0 read(s) filtered by: GoodCigarReadFilter
0 read(s) filtered by: WellformedReadFilter
1057144 total reads filtered
21:39:48.466 INFO ProgressMeter - AEMK02000519.1:12601 17.7 8435938 476359.5
21:39:48.466 INFO ProgressMeter - Traversal complete. Processed 8435938 total regions in 17.7 minutes.
21:39:48.580 INFO VectorLoglessPairHMM - Time spent in setup for JNI call : 0.19054606100000002
21:39:48.580 INFO PairHMM - Total compute time in PairHMM computeLogLikelihoods() : 23.596010852000003
21:39:48.580 INFO SmithWatermanAligner - Total compute time in java Smith-Waterman : 52.69 sec
21:39:50.107 INFO HaplotypeCaller - Shutting down engine
[October 10, 2022 9:39:50 PM CST] org.broadinstitute.hellbender.tools.walkers.haplotypecaller.HaplotypeCaller done. Elapsed time: 17.76 minutes.
Runtime.totalMemory()=4287102976
I am trying to detect variants in pig RNA-seq data, following the RNAseq variant detection best practices (MarkDuplicates, SplitNCigarReads, BaseRecalibrator, ApplyBQSR, HaplotypeCaller). Some variants are not detected in certain regions, even though these regions are not repeat regions and have good mapping quality. The variants are visible in the input BAM file (according to the IGV image) and are also detected by Freebayes. I also tried the suggestions on the troubleshooting pages https://gatk.broadinstitute.org/hc/en-us/community/posts/360077647812-Why-do-a-clear-expected-variant-not-show-up-in-the-Mutect2-vcf-file and https://gatk.broadinstitute.org/hc/en-us/articles/360035891111-Expected-variant-at-a-specific-site-was-not-called
but none of them worked. The details are as follows:
1) the first command line
gatk HaplotypeCaller --tmp-dir ./tmp
-R ./Sus_scrofa.Sscrofa11.1.dna.toplevel.fa -I BQSR_SAMEA5337669-STARsorted.bam -O SAMEA5337669.vcf.gz --dont-use-soft-clipped-bases --output-mode EMIT_ALL_CONFIDENT_SITES -bamout SAMEA5337669.realign.bam
Here, 3 variants in the region were not detected by GATK.
In the Freebayes VCF output:
1 95750731 . C T 409.741 . AB=0.487179;ABP=3.06598;AC=1;AF=0.5;AN=2;AO=19;CIGAR=1X;DP=39;DPB=39;DPRA=0;EPP=8.61041;EPPR=24.2907;GTI=0;LEN=1;MEANALT=1;MQM=60;MQMR=60;NS=1;NUMALT=1;ODDS=90.7687;PAIRED=1;PAIREDR=1;PAO=0;PQA=0;PQR=0;PRO=0;QA=691;QR=724;RO=20;RPL=2;RPP=28.7251;RPPR=30.8051;RPR=17;RUN=1;SAF=4;SAP=16.8392;SAR=15;SRF=1;SRP=38.1882;SRR=19;TYPE=snp;technology.illumina=1 GT:DP:RO:QR:AO:QA:GL 0/1:39:20:724:19:691:-50.7797,nan,-53.7473
1 95750749 . G A 439.814 . AB=0.567568;ABP=4.47751;AC=1;AF=0.5;AN=2;AO=21;CIGAR=1X;DP=37;DPB=37;DPRA=0;EPP=26.2761;EPPR=3.55317;GTI=0;LEN=1;MEANALT=1;MQM=60;MQMR=60;NS=1;NUMALT=1;ODDS=73.0495;PAIRED=1;PAIREDR=1;PAO=0;PQA=0;PQR=0;PRO=0;QA=758;QR=565;RO=16;RPL=3;RPP=26.2761;RPPR=5.18177;RPR=18;RUN=1;SAF=2;SAP=32.8939;SAR=19;SRF=1;SRP=29.6108;SRR=15;TYPE=snp;technology.illumina=1 GT:DP:RO:QR:AO:QA:GL 0/1:37:16:565:21:758:-57.4047,nan,-40.0367
1 95750761 . A G 710.785 . AB=0.681818;ABP=15.6443;AC=1;AF=0.5;AN=2;AO=30;CIGAR=1X;DP=44;DPB=44;DPRA=0;EPP=10.2485;EPPR=3.0103;GTI=0;LEN=1;MEANALT=1;MQM=60;MQMR=60;NS=1;NUMALT=1;ODDS=48.71;PAIRED=1;PAIREDR=1;PAO=0;PQA=0;PQR=0;PRO=0;QA=1102;QR=468;RO=14;RPL=9;RPP=13.4334;RPPR=5.49198;RPR=21;RUN=1;SAF=1;SAP=59.7581;SAR=29;SRF=2;SRP=18.5208;SRR=12;TYPE=snp;technology.illumina=1 GT:DP:RO:QR:AO:QA:GL 0/1:44:14:468:30:1102:-86.2455,nan,-29.1865
1 95750823 . C A 151.236 . AB=0.37931;ABP=6.67934;AC=1;AF=0.5;AN=2;AO=11;CIGAR=1X;DP=29;DPB=29;DPRA=0;EPP=26.8965;EPPR=20.3821;GTI=0;LEN=1;MEANALT=1;MQM=60;MQMR=60;NS=1;NUMALT=1;ODDS=34.8234;PAIRED=1;PAIREDR=1;PAO=0;PQA=0;PQR=0;PRO=0;QA=407;QR=666;RO=18;RPL=11;RPP=26.8965;RPPR=33.8935;RPR=0;RUN=1;SAF=0;SAP=26.8965;SAR=11;SRF=2;SRP=26.6552;SRR=16;TYPE=snp;technology.illumina=1 GT:DP:RO:QR:AO:QA:GL 0/1:29:18:666:11:407:-28.2484,nan,-51.5447
1 95750831 . T C 60.6972 . AB=0.291667;ABP=12.0581;AC=1;AF=0.5;AN=2;AO=7;CIGAR=1X;DP=24;DPB=24;DPRA=0;EPP=18.2106;EPPR=24.5973;GTI=0;LEN=1;MEANALT=1;MQM=60;MQMR=60;NS=1;NUMALT=1;ODDS=13.9761;PAIRED=1;PAIREDR=1;PAO=0;PQA=0;PQR=0;PRO=0;QA=255;QR=621;RO=17;RPL=7;RPP=18.2106;RPPR=31.7504;RPR=0;RUN=1;SAF=0;SAP=18.2106;SAR=7;SRF=1;SRP=31.7504;SRR=16;TYPE=snp;technology.illumina=1 GT:DP:RO:QR:AO:QA:GL 0/1:24:17:621:7:255:-16.0769,nan,-48.9995
In the GATK output VCF file:
1 95750823 . C A 361.64 . AC=1;AF=0.500;AN=2;BaseQRankSum=3.997;DP=38;ExcessHet=0.0000;FS=2.905;MLEAC=1;MLEAF=0.500;MQ=60.00;MQRankSum=0.000;QD=12.47;ReadPosRankSum=1.502;SOR=1.702 GT:AD:DP:GQ:PL 0/1:18,11:29:99:369,0,702
1 95750831 . T C 361.64 . AC=1;AF=0.500;AN=2;BaseQRankSum=2.642;DP=29;ExcessHet=0.0000;FS=2.905;MLEAC=1;MLEAF=0.500;MQ=60.00;MQRankSum=0.000;QD=12.47;ReadPosRankSum=0.519;SOR=1.702 GT:AD:DP:GQ:PL 0/1:18,11:29:99:369,0,702
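The comparison above can also be done mechanically. The following is a minimal sketch, assuming both VCFs come from the same sample; the helper name `missing_sites` and the two small site files are hypothetical stand-ins that hold only the CHROM and POS columns of the records quoted above. It compares positions only, not alleles.

```shell
# missing_sites A B
# Print CHROM:POS for every record in file A that has no record at the
# same position in file B. Header lines starting with '#' are skipped.
missing_sites() {
  awk '/^#/ { next }
       { key = $1 ":" $2 }
       FILENAME == ARGV[1] { seen[key] = 1; next }   # first argument: file B
       !(key in seen) { print key }' "$2" "$1"
}

# Positions copied from the Freebayes and GATK records quoted above.
tmp=$(mktemp -d)
printf '1\t95750731\n1\t95750749\n1\t95750761\n1\t95750823\n1\t95750831\n' > "$tmp/freebayes.sites"
printf '1\t95750823\n1\t95750831\n' > "$tmp/gatk.sites"

missing_sites "$tmp/freebayes.sites" "$tmp/gatk.sites"
# prints:
# 1:95750731
# 1:95750749
# 1:95750761
rm -rf "$tmp"
```

The same function should work on full VCF files, since only the first two columns are read.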
2) Add “--debug True” to the first command line.
The output VCF did not change.
According to the .out output, the region chr1:95750678-95750940 was identified as an active region, but it was trimmed, and only chr1:95750803-95750851 was retained for the final haplotypes. My questions are: why was the upstream part of the region trimmed, and why are reads that exist in the input BAM absent from the realigned BAM?
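One way to examine this region in isolation is to rerun HaplotypeCaller on just that interval and write a separate bamout for IGV. The following is only a sketch, not a confirmed fix: `hc_region_cmd` is a hypothetical helper that prints the command instead of running it, the `region_test` output prefix is made up, and `--force-active` and `--disable-optimizations` are HaplotypeCaller options that should be checked against `gatk HaplotypeCaller --help` for the installed version before use.

```shell
# hc_region_cmd REF BAM INTERVAL PREFIX
# Print (do not run) a HaplotypeCaller command limited to INTERVAL.
# Arguments are inserted as-is, so paths containing spaces are not handled.
hc_region_cmd() {
  printf 'gatk HaplotypeCaller -R %s -I %s -L %s -O %s.vcf.gz -bamout %s.bamout.bam --dont-use-soft-clipped-bases --force-active --disable-optimizations\n' \
    "$1" "$2" "$3" "$4" "$4"
}

# Interval taken from the active region reported in the debug output.
hc_region_cmd Sus_scrofa.Sscrofa11.1.dna.toplevel.fa \
  BQSR_SAMEA5337669-STARsorted.bam \
  1:95750678-95750940 \
  region_test
```

Printing the command first makes it easy to review before running it; comparing the resulting bamout with the input BAM in IGV should show which reads were kept for that interval.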
3) Add “--linked-de-bruijn-graph True” to the first command line
4) GVCF mode (-ERC GVCF)
1 95750731 . C <NON_REF> . . END=95750731 GT:DP:GQ:MIN_DP:PL 0/0:35:0:35:0,0,111
1 95750749 . G <NON_REF> . . END=95750749 GT:DP:GQ:MIN_DP:PL 0/0:37:0:37:0,0,0
1 95750761 . A <NON_REF> . . END=95750761 GT:DP:GQ:MIN_DP:PL 0/0:44:0:44:0,0,0
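To read these records at a glance, the depth and genotype quality can be pulled out of the sample column. The following is a small sketch under stated assumptions: `gvcf_gq` is a hypothetical helper, the input is the three records above pasted with spaces as they appear in this post (awk splits on any whitespace, so tab-separated VCF lines work too), and only the first sample column is read.

```shell
# gvcf_gq FILE
# For each non-header record, print CHROM:POS with the DP and GQ values of
# the first sample, using the FORMAT column to locate each field.
gvcf_gq() {
  awk '/^#/ { next }
       {
         n = split($9, keys, ":")      # FORMAT keys, e.g. GT:DP:GQ:MIN_DP:PL
         split($10, vals, ":")         # values for the first sample
         dp = gq = "."
         for (i = 1; i <= n; i++) {
           if (keys[i] == "DP") dp = vals[i]
           if (keys[i] == "GQ") gq = vals[i]
         }
         print $1 ":" $2, "DP=" dp, "GQ=" gq
       }' "$1"
}

tmp=$(mktemp)
cat > "$tmp" <<'EOF'
1 95750731 . C <NON_REF> . . END=95750731 GT:DP:GQ:MIN_DP:PL 0/0:35:0:35:0,0,111
1 95750749 . G <NON_REF> . . END=95750749 GT:DP:GQ:MIN_DP:PL 0/0:37:0:37:0,0,0
1 95750761 . A <NON_REF> . . END=95750761 GT:DP:GQ:MIN_DP:PL 0/0:44:0:44:0,0,0
EOF
gvcf_gq "$tmp"
# prints:
# 1:95750731 DP=35 GQ=0
# 1:95750749 DP=37 GQ=0
# 1:95750761 DP=44 GQ=0
rm -f "$tmp"
```

All three sites have reads (DP 35 to 44) but GQ=0, and the first two PL values are both 0, so the homozygous-reference and heterozygous genotypes were scored as equally likely rather than the sites being confidently reference.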
I also tried “--pruning-lod-threshold 1.3”, “--min-pruning 0”, “--recover-all-dangling-branches True”, and different GATK versions (v4.1.3, v4.3.0), but the output did not change.
So I wonder what I should do next to detect these variants, and how this behavior can be explained.
-
Hi Wen-ye Yao,
Thank you for writing to the GATK forum! I hope that we can help you sort this out.
Before I bring your issues to our developers for review, please respond with your entire program log/stack trace. It is difficult for the team to accurately diagnose potential problems without seeing your log.
I look forward to hearing from you! Please do not hesitate to include any additional questions as they arise.
Best,
Anthony -
Hi Anthony Dias-Ciarla :
Thank you for your message.
I have added the HaplotypeCaller log to the post. I hope this is what you need.
-
Hi Wen-ye Yao,
Thank you for your patience and for providing your stack trace! I brought this to our developers and received some feedback to share.
With RNA-seq data, you typically have to run some preprocessing tools on it. Reads often need to be split. It is common for reads to have many ends in the areas that are not expressed.
You can use our RNA Seq Pipeline to perform this preprocessing instead of trying to do it independently. One step in this pipeline that is particularly useful is SplitNCigarReads. You must run it on your data before you run tools like VQSR or HaplotypeCaller. So please first try running this on your data.
I hope this helps! Please let me know if this leads you to success. In the meantime, please do not hesitate to reach out with any questions as they arise.
Best,
Anthony -
Hi Anthony:
Thank you for your comments. I did follow the RNAseq variant detection best practices for my data analysis from the beginning, and the problem I described is still there after following the pipeline. The following are the complete commands I used:
gatk AddOrReplaceReadGroups -I SAMEA5337669-sort.bam -O rg_added_SAMEA5337669-STARsorted.bam -RGID 4 -RGLB lib1 -RGPL illumina -RGPU run -RGSM 20 -CREATE_INDEX true -VALIDATION_STRINGENCY SILENT -SORT_ORDER coordinate
#MarkDuplicates
gatk MarkDuplicates -I rg_added_SAMEA5337669-STARsorted.bam -O dedupped_SAMEA5337669-STARsorted.bam -CREATE_INDEX true -VALIDATION_STRINGENCY SILENT --READ_NAME_REGEX null -M dedupped_SAMEA5337669-STARsorted.marked_dup_metrics.txt
#SplitNCigarReads
gatk SplitNCigarReads --tmp-dir ./tmp -R Sus_scrofa.Sscrofa11.1.dna.toplevel.fa -I dedupped_SAMEA5337669-STARsorted.bam -O SplitNCigarReads_SAMEA5337669-STARsorted.bam
#BaseRecalibrator
gatk BaseRecalibrator --tmp-dir ./tmp -I SplitNCigarReads_SAMEA5337669-STARsorted.bam -R Sus_scrofa.Sscrofa11.1.dna.toplevel.fa --known-sites ./sus_scrofa.snp.edited.vcf.gz -O SAMEA5337669-recal.table
gatk ApplyBQSR -R Sus_scrofa.Sscrofa11.1.dna.toplevel.fa -I SplitNCigarReads_SAMEA5337669-STARsorted.bam --bqsr-recal-file SAMEA5337669-recal.table -O BQSR_SAMEA5337669-STARsorted.bam
gatk HaplotypeCaller --tmp-dir ./tmp -R Sus_scrofa.Sscrofa11.1.dna.toplevel.fa -I BQSR_SAMEA5337669-STARsorted.bam -O SAMEA5337669.vcf.gz --dbsnp sus_scrofa.snp.edited.vcf.gz --dont-use-soft-clipped-bases --output-mode EMIT_ALL_CONFIDENT_SITES
-
Hi Wen-ye Yao,
Thank you for your response! It seems you intended to include your entire program log in your last reply, but it still needs to be included.
Please reply with the entire program log, including all errors/warnings/exceptions, so that we can accurately help you troubleshoot.
I look forward to hearing back from you!
Best,
Anthony -
Hi Anthony:
I have posted the log for the GATK HaplotypeCaller step. I was unsure whether the entire log you mentioned should include the STAR mapping step or only the GATK programs, so I am posting the whole program log from the GATK AddOrReplaceReadGroups step through the GATK HaplotypeCaller step.
21:16:02.178 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/home/agis/likui_group/yaowenye/software/gatk-4.1.9.0/gatk-package-4.1.9.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
[Mon Oct 10 21:16:02 CST 2022] AddOrReplaceReadGroups --INPUT SAMEA5337669-STARAligned.sortedByCoord.out.bam --OUTPUT rg_added_SAMEA5337669-STARsorted.bam --SORT_ORDER coordinate --RGID 4 --RGLB lib1 --RGPL illumina --RGPU run --RGSM 20 --VALIDATION_STRINGENCY SILENT --CREATE_INDEX true --VERBOSITY INFO --QUIET false --COMPRESSION_LEVEL 2 --MAX_RECORDS_IN_RAM 500000 --CREATE_MD5_FILE false --GA4GH_CLIENT_SECRETS client_secrets.json --help false --version false --showHidden false --USE_JDK_DEFLATER false --USE_JDK_INFLATER false
Oct 10, 2022 9:16:03 PM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
INFO: Failed to detect whether we are running on Google Compute Engine.
[Mon Oct 10 21:16:03 CST 2022] Executing as yaowenye@comput42 on Linux 3.10.0-862.el7.x86_64 amd64; OpenJDK 64-Bit Server VM 1.8.0_152-release-1056-b12; Deflater: Intel; Inflater: Intel; Provider GCS is available; Picard version: Version:4.1.9.0
INFO 2022-10-10 21:16:03 AddOrReplaceReadGroups Created read-group ID=4 PL=illumina LB=lib1 SM=20
INFO 2022-10-10 21:16:09 AddOrReplaceReadGroups Processed 1,000,000 records. Elapsed time: 00:00:05s. Time for last 1,000,000: 5s. Last read position: 13:189,450,349
INFO 2022-10-10 21:16:12 AddOrReplaceReadGroups Processed 2,000,000 records. Elapsed time: 00:00:09s. Time for last 1,000,000: 3s. Last read position: AEMK02000695.1:7,284
[Mon Oct 10 21:16:14 CST 2022] picard.sam.AddOrReplaceReadGroups done. Elapsed time: 0.20 minutes.
Runtime.totalMemory()=1756364800
Using GATK jar /home/agis/likui_group/yaowenye/software/gatk-4.1.9.0/gatk-package-4.1.9.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /home/agis/likui_group/yaowenye/software/gatk-4.1.9.0/gatk-package-4.1.9.0-local.jar AddOrReplaceReadGroups -I SAMEA5337669-STARAligned.sortedByCoord.out.bam -O rg_added_SAMEA5337669-STARsorted.bam -RGID 4 -RGLB lib1 -RGPL illumina -RGPU run -RGSM 20 -CREATE_INDEX true -VALIDATION_STRINGENCY SILENT -SORT_ORDER coordinate
21:16:17.255 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/home/agis/likui_group/yaowenye/software/gatk-4.1.9.0/gatk-package-4.1.9.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
[Mon Oct 10 21:16:17 CST 2022] MarkDuplicates --INPUT rg_added_SAMEA5337669-STARsorted.bam --OUTPUT dedupped_SAMEA5337669-STARsorted.bam --METRICS_FILE dedupped_SAMEA5337669-STARsorted.marked_dup_metrics.txt --VALIDATION_STRINGENCY SILENT --CREATE_INDEX true --MAX_SEQUENCES_FOR_DISK_READ_ENDS_MAP 50000 --MAX_FILE_HANDLES_FOR_READ_ENDS_MAP 8000 --SORTING_COLLECTION_SIZE_RATIO 0.25 --TAG_DUPLICATE_SET_MEMBERS false --REMOVE_SEQUENCING_DUPLICATES false --TAGGING_POLICY DontTag --CLEAR_DT true --DUPLEX_UMI false --ADD_PG_TAG_TO_READS true --REMOVE_DUPLICATES false --ASSUME_SORTED false --DUPLICATE_SCORING_STRATEGY SUM_OF_BASE_QUALITIES --PROGRAM_RECORD_ID MarkDuplicates --PROGRAM_GROUP_NAME MarkDuplicates --OPTICAL_DUPLICATE_PIXEL_DISTANCE 100 --MAX_OPTICAL_DUPLICATE_SET_SIZE 300000 --VERBOSITY INFO --QUIET false --COMPRESSION_LEVEL 2 --MAX_RECORDS_IN_RAM 500000 --CREATE_MD5_FILE false --GA4GH_CLIENT_SECRETS client_secrets.json --help false --version false --showHidden false --USE_JDK_DEFLATER false --USE_JDK_INFLATER false
Oct 10, 2022 9:16:17 PM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
INFO: Failed to detect whether we are running on Google Compute Engine.
[Mon Oct 10 21:16:17 CST 2022] Executing as yaowenye@comput42 on Linux 3.10.0-862.el7.x86_64 amd64; OpenJDK 64-Bit Server VM 1.8.0_152-release-1056-b12; Deflater: Intel; Inflater: Intel; Provider GCS is available; Picard version: Version:4.1.9.0
INFO 2022-10-10 21:16:17 MarkDuplicates Start of doWork freeMemory: 2444882192; totalMemory: 2470445056; maxMemory: 28631367680
INFO 2022-10-10 21:16:17 MarkDuplicates Reading input file and constructing read end information.
INFO 2022-10-10 21:16:17 MarkDuplicates Will retain up to 103736839 data points before spilling to disk.
INFO 2022-10-10 21:16:22 MarkDuplicates Read 1,000,000 records. Elapsed time: 00:00:04s. Time for last 1,000,000: 4s. Last read position: 13:189,450,349
INFO 2022-10-10 21:16:22 MarkDuplicates Tracking 19 as yet unmatched pairs. 19 records in RAM.
INFO 2022-10-10 21:16:25 MarkDuplicates Read 2,000,000 records. Elapsed time: 00:00:07s. Time for last 1,000,000: 3s. Last read position: AEMK02000695.1:7,284
INFO 2022-10-10 21:16:25 MarkDuplicates Tracking 308 as yet unmatched pairs. 308 records in RAM.
INFO 2022-10-10 21:16:25 MarkDuplicates Read 2236717 records. 0 pairs never matched.
INFO 2022-10-10 21:16:26 MarkDuplicates After buildSortedReadEndLists freeMemory: 2421916640; totalMemory: 3468165120; maxMemory: 28631367680
INFO 2022-10-10 21:16:26 MarkDuplicates Will retain up to 894730240 duplicate indices before spilling to disk.
INFO 2022-10-10 21:16:28 MarkDuplicates Traversing read pair information and detecting duplicates.
INFO 2022-10-10 21:16:29 MarkDuplicates Traversing fragment information and detecting duplicates.
INFO 2022-10-10 21:16:29 MarkDuplicates Sorting list of duplicate records.
INFO 2022-10-10 21:16:31 MarkDuplicates After generateDuplicateIndexes freeMemory: 3501386520; totalMemory: 10702290944; maxMemory: 28631367680
INFO 2022-10-10 21:16:31 MarkDuplicates Marking 282928 records as duplicates.
WARNING 2022-10-10 21:16:31 MarkDuplicates Skipped optical duplicate cluster discovery; library size estimation may be inaccurate!
INFO 2022-10-10 21:16:31 MarkDuplicates Reads are assumed to be ordered by: coordinate
INFO 2022-10-10 21:16:40 MarkDuplicates Writing complete. Closing input iterator.
INFO 2022-10-10 21:16:40 MarkDuplicates Duplicate Index cleanup.
INFO 2022-10-10 21:16:40 MarkDuplicates Getting Memory Stats.
INFO 2022-10-10 21:16:40 MarkDuplicates Before output close freeMemory: 12098169544; totalMemory: 12186025984; maxMemory: 28631367680
INFO 2022-10-10 21:16:40 MarkDuplicates Closed outputs. Getting more Memory Stats.
INFO 2022-10-10 21:16:40 MarkDuplicates After output close freeMemory: 11996340584; totalMemory: 12048662528; maxMemory: 28631367680
[Mon Oct 10 21:16:40 CST 2022] picard.sam.markduplicates.MarkDuplicates done. Elapsed time: 0.39 minutes.
Runtime.totalMemory()=12048662528
Using GATK jar /home/agis/likui_group/yaowenye/software/gatk-4.1.9.0/gatk-package-4.1.9.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /home/agis/likui_group/yaowenye/software/gatk-4.1.9.0/gatk-package-4.1.9.0-local.jar MarkDuplicates -I rg_added_SAMEA5337669-STARsorted.bam -O dedupped_SAMEA5337669-STARsorted.bam -CREATE_INDEX true -VALIDATION_STRINGENCY SILENT --READ_NAME_REGEX null -M dedupped_SAMEA5337669-STARsorted.marked_dup_metrics.txt
21:16:44.556 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/home/agis/likui_group/yaowenye/software/gatk-4.1.9.0/gatk-package-4.1.9.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
Oct 10, 2022 9:16:44 PM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
INFO: Failed to detect whether we are running on Google Compute Engine.
21:16:44.820 INFO SplitNCigarReads - ------------------------------------------------------------
21:16:44.821 INFO SplitNCigarReads - The Genome Analysis Toolkit (GATK) v4.1.9.0
21:16:44.821 INFO SplitNCigarReads - For support and documentation go to https://software.broadinstitute.org/gatk/
21:16:44.821 INFO SplitNCigarReads - Executing as yaowenye@comput42 on Linux v3.10.0-862.el7.x86_64 amd64
21:16:44.821 INFO SplitNCigarReads - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_152-release-1056-b12
21:16:44.821 INFO SplitNCigarReads - Start Date/Time: October 10, 2022 9:16:44 PM CST
21:16:44.822 INFO SplitNCigarReads - ------------------------------------------------------------
21:16:44.822 INFO SplitNCigarReads - ------------------------------------------------------------
21:16:44.822 INFO SplitNCigarReads - HTSJDK Version: 2.23.0
21:16:44.822 INFO SplitNCigarReads - Picard Version: 2.23.3
21:16:44.822 INFO SplitNCigarReads - HTSJDK Defaults.COMPRESSION_LEVEL : 2
21:16:44.822 INFO SplitNCigarReads - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
21:16:44.822 INFO SplitNCigarReads - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
21:16:44.823 INFO SplitNCigarReads - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
21:16:44.823 INFO SplitNCigarReads - Deflater: IntelDeflater
21:16:44.823 INFO SplitNCigarReads - Inflater: IntelInflater
21:16:44.823 INFO SplitNCigarReads - GCS max retries/reopens: 20
21:16:44.823 INFO SplitNCigarReads - Requester pays: disabled
21:16:44.823 INFO SplitNCigarReads - Initializing engine
21:16:45.716 INFO SplitNCigarReads - Done initializing engine
21:16:45.810 INFO ProgressMeter - Starting traversal
21:16:45.810 INFO ProgressMeter - Current Locus Elapsed Minutes Reads Processed Reads/Minute
21:16:55.873 INFO ProgressMeter - 2:9752880 0.2 121000 724045.1
21:17:06.088 INFO ProgressMeter - 2:103802097 0.3 204000 603609.8
21:17:16.201 INFO ProgressMeter - 3:131269692 0.5 316000 623868.9
21:17:26.205 INFO ProgressMeter - 5:85284657 0.7 484000 718900.9
21:17:36.225 INFO ProgressMeter - 6:92243697 0.8 605000 720023.8
21:17:46.272 INFO ProgressMeter - 8:38989418 1.0 732000 726406.7
21:17:56.327 INFO ProgressMeter - 13:3426068 1.2 941000 800658.0
21:18:06.346 INFO ProgressMeter - 15:79446478 1.3 1098000 818019.3
21:18:16.563 INFO ProgressMeter - AEMK02000489.1:44287 1.5 1830000 1209944.0
21:18:19.563 INFO SplitNCigarReads - 0 read(s) filtered by: AllowAllReadsReadFilter
21:18:19.878 INFO OverhangFixingManager - Overhang Fixing Manager saved 471 reads in the first pass
21:18:19.882 INFO SplitNCigarReads - Starting traversal pass 2
21:18:27.039 INFO ProgressMeter - 2:26787 1.7 2357000 1397030.5
21:18:37.046 INFO ProgressMeter - 2:143124918 1.9 2488000 1342023.6
21:18:47.093 INFO ProgressMeter - 4:89297314 2.0 2627000 1299605.1
21:18:57.106 INFO ProgressMeter - 5:80209723 2.2 2754000 1258539.9
21:19:07.129 INFO ProgressMeter - 6:87818207 2.4 2869000 1218103.9
21:19:17.173 INFO ProgressMeter - 7:98110444 2.5 2984000 1182859.6
21:19:27.234 INFO ProgressMeter - 12:35924633 2.7 3175000 1180121.9
21:19:37.361 INFO ProgressMeter - 14:106921439 2.9 3339000 1167816.0
21:19:47.363 INFO ProgressMeter - MT:10479 3.0 3550000 1173211.1
21:19:57.822 INFO ProgressMeter - AEMK02000695.1:7519 3.2 4315000 1348360.3
21:20:01.128 INFO SplitNCigarReads - 0 read(s) filtered by: AllowAllReadsReadFilter
21:20:01.128 INFO ProgressMeter - unmapped 3.3 4542796 1395507.6
21:20:01.128 INFO ProgressMeter - Traversal complete. Processed 4542796 total reads in 3.3 minutes.
21:20:07.804 INFO SplitNCigarReads - Shutting down engine
[October 10, 2022 9:20:07 PM CST] org.broadinstitute.hellbender.tools.walkers.rnaseq.SplitNCigarReads done. Elapsed time: 3.39 minutes.
Runtime.totalMemory()=5970067456
Using GATK jar /home/agis/likui_group/yaowenye/software/gatk-4.1.9.0/gatk-package-4.1.9.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /home/agis/likui_group/yaowenye/software/gatk-4.1.9.0/gatk-package-4.1.9.0-local.jar SplitNCigarReads --tmp-dir /home/agis/likui_group/yaowenye/project/tmp -R /home/agis/likui_group/yaowenye/project/Sscrofa11.1/Sus_scrofa.Sscrofa11.1.dna.toplevel.fa -I dedupped_SAMEA5337669-STARsorted.bam -O SplitNCigarReads_SAMEA5337669-STARsorted.bam
21:20:14.566 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/home/agis/likui_group/yaowenye/software/gatk-4.1.9.0/gatk-package-4.1.9.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
Oct 10, 2022 9:20:14 PM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
INFO: Failed to detect whether we are running on Google Compute Engine.
21:20:14.852 INFO BaseRecalibrator - ------------------------------------------------------------
21:20:14.853 INFO BaseRecalibrator - The Genome Analysis Toolkit (GATK) v4.1.9.0
21:20:14.853 INFO BaseRecalibrator - For support and documentation go to https://software.broadinstitute.org/gatk/
21:20:14.853 INFO BaseRecalibrator - Executing as yaowenye@comput42 on Linux v3.10.0-862.el7.x86_64 amd64
21:20:14.853 INFO BaseRecalibrator - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_152-release-1056-b12
21:20:14.853 INFO BaseRecalibrator - Start Date/Time: October 10, 2022 9:20:14 PM CST
21:20:14.854 INFO BaseRecalibrator - ------------------------------------------------------------
21:20:14.854 INFO BaseRecalibrator - ------------------------------------------------------------
21:20:14.854 INFO BaseRecalibrator - HTSJDK Version: 2.23.0
21:20:14.854 INFO BaseRecalibrator - Picard Version: 2.23.3
21:20:14.854 INFO BaseRecalibrator - HTSJDK Defaults.COMPRESSION_LEVEL : 2
21:20:14.854 INFO BaseRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
21:20:14.854 INFO BaseRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
21:20:14.855 INFO BaseRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
21:20:14.855 INFO BaseRecalibrator - Deflater: IntelDeflater
21:20:14.855 INFO BaseRecalibrator - Inflater: IntelInflater
21:20:14.855 INFO BaseRecalibrator - GCS max retries/reopens: 20
21:20:14.855 INFO BaseRecalibrator - Requester pays: disabled
21:20:14.855 INFO BaseRecalibrator - Initializing engine
21:20:15.451 INFO FeatureManager - Using codec VCFCodec to read file file:///home/agis/likui_group/yaowenye/project/ASE/02.RNA.known.sites/sus_scrofa.snp.edited.vcf.gz
21:20:15.602 WARN IndexUtils - Feature file "/home/agis/likui_group/yaowenye/project/ASE/02.RNA.known.sites/sus_scrofa.snp.edited.vcf.gz" appears to contain no sequence dictionary. Attempting to retrieve a sequence dictionary from the associated index file
21:20:15.823 WARN IndexUtils - Index file /home/agis/likui_group/yaowenye/project/ASE/02.RNA.known.sites/sus_scrofa.snp.edited.vcf.gz.tbi is out of date (index older than input file). Use IndexFeatureFile to make a new index.
21:20:15.831 INFO BaseRecalibrator - Done initializing engine
21:20:15.899 INFO BaseRecalibrationEngine - The covariates being used here:
21:20:15.899 INFO BaseRecalibrationEngine - ReadGroupCovariate
21:20:15.899 INFO BaseRecalibrationEngine - QualityScoreCovariate
21:20:15.899 INFO BaseRecalibrationEngine - ContextCovariate
21:20:15.899 INFO BaseRecalibrationEngine - CycleCovariate
21:20:15.918 INFO ProgressMeter - Starting traversal
21:20:15.919 INFO ProgressMeter - Current Locus Elapsed Minutes Reads Processed Reads/Minute
21:20:25.925 INFO ProgressMeter - 2:108516603 0.2 259000 1553378.6
21:20:35.945 INFO ProgressMeter - 5:87486823 0.3 595000 1782682.5
21:20:46.022 INFO ProgressMeter - 9:15541190 0.5 939000 1871636.4
21:20:56.120 INFO ProgressMeter - 13:189282167 0.7 1216000 1814880.2
21:21:06.138 INFO ProgressMeter - X:16449353 0.8 1470000 1756342.3
21:21:11.347 INFO BaseRecalibrator - 41833 read(s) filtered by: MappingQualityNotZeroReadFilter
0 read(s) filtered by: MappingQualityAvailableReadFilter
0 read(s) filtered by: MappedReadFilter
619864 read(s) filtered by: NotSecondaryAlignmentReadFilter
300868 read(s) filtered by: NotDuplicateReadFilter
0 read(s) filtered by: PassesVendorQualityCheckReadFilter
0 read(s) filtered by: WellformedReadFilter
962565 total reads filtered
21:21:11.348 INFO ProgressMeter - AEMK02000695.1:7243 0.9 1687886 1827079.0
21:21:11.348 INFO ProgressMeter - Traversal complete. Processed 1687886 total reads in 0.9 minutes.
21:21:11.398 INFO BaseRecalibrator - Calculating quantized quality scores...
21:21:11.414 INFO BaseRecalibrator - Writing recalibration report...
21:21:11.961 INFO BaseRecalibrator - ...done!
21:21:11.961 INFO BaseRecalibrator - BaseRecalibrator was able to recalibrate 1687886 reads
21:21:11.961 INFO BaseRecalibrator - Shutting down engine
[October 10, 2022 9:21:11 PM CST] org.broadinstitute.hellbender.tools.walkers.bqsr.BaseRecalibrator done. Elapsed time: 0.96 minutes.
Runtime.totalMemory()=6302466048
Using GATK jar /home/agis/likui_group/yaowenye/software/gatk-4.1.9.0/gatk-package-4.1.9.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /home/agis/likui_group/yaowenye/software/gatk-4.1.9.0/gatk-package-4.1.9.0-local.jar BaseRecalibrator --tmp-dir /home/agis/likui_group/yaowenye/project/tmp -I SplitNCigarReads_SAMEA5337669-STARsorted.bam -R /home/agis/likui_group/yaowenye/project/Sscrofa11.1/Sus_scrofa.Sscrofa11.1.dna.toplevel.fa --known-sites /home/agis/likui_group/yaowenye/project/ASE/02.RNA.known.sites/sus_scrofa.snp.edited.vcf.gz -O SAMEA5337669-recal.table
21:21:15.197 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/home/agis/likui_group/yaowenye/software/gatk-4.1.9.0/gatk-package-4.1.9.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
Oct 10, 2022 9:21:15 PM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
INFO: Failed to detect whether we are running on Google Compute Engine.
21:21:15.462 INFO ApplyBQSR - ------------------------------------------------------------
21:21:15.463 INFO ApplyBQSR - The Genome Analysis Toolkit (GATK) v4.1.9.0
21:21:15.463 INFO ApplyBQSR - For support and documentation go to https://software.broadinstitute.org/gatk/
21:21:15.463 INFO ApplyBQSR - Executing as yaowenye@comput42 on Linux v3.10.0-862.el7.x86_64 amd64
21:21:15.463 INFO ApplyBQSR - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_152-release-1056-b12
21:21:15.463 INFO ApplyBQSR - Start Date/Time: October 10, 2022 9:21:15 PM CST
21:21:15.463 INFO ApplyBQSR - ------------------------------------------------------------
21:21:15.463 INFO ApplyBQSR - ------------------------------------------------------------
21:21:15.464 INFO ApplyBQSR - HTSJDK Version: 2.23.0
21:21:15.464 INFO ApplyBQSR - Picard Version: 2.23.3
21:21:15.464 INFO ApplyBQSR - HTSJDK Defaults.COMPRESSION_LEVEL : 2
21:21:15.464 INFO ApplyBQSR - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
21:21:15.464 INFO ApplyBQSR - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
21:21:15.464 INFO ApplyBQSR - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
21:21:15.464 INFO ApplyBQSR - Deflater: IntelDeflater
21:21:15.465 INFO ApplyBQSR - Inflater: IntelInflater
21:21:15.465 INFO ApplyBQSR - GCS max retries/reopens: 20
21:21:15.465 INFO ApplyBQSR - Requester pays: disabled
21:21:15.465 INFO ApplyBQSR - Initializing engine
21:21:15.987 INFO ApplyBQSR - Done initializing engine
21:21:16.024 INFO ProgressMeter - Starting traversal
21:21:16.024 INFO ProgressMeter - Current Locus Elapsed Minutes Reads Processed Reads/Minute
21:21:26.043 INFO ProgressMeter - 5:709258 0.2 514000 3078458.8
21:21:36.046 INFO ProgressMeter - 12:1322606 0.3 1119000 3353311.4
21:21:46.050 INFO ProgressMeter - AEMK02000489.1:6378 0.5 1729000 3455005.7
21:21:56.057 INFO ProgressMeter - AEMK02000695.1:7130 0.7 2344000 3513189.4
21:22:00.843 INFO ApplyBQSR - 6407 read(s) filtered by: WellformedReadFilter
21:22:00.843 INFO ProgressMeter - unmapped 0.7 2644044 3539629.2
21:22:00.844 INFO ProgressMeter - Traversal complete. Processed 2644044 total reads in 0.7 minutes.
21:22:00.891 INFO ApplyBQSR - Shutting down engine
[October 10, 2022 9:22:00 PM CST] org.broadinstitute.hellbender.tools.walkers.bqsr.ApplyBQSR done. Elapsed time: 0.76 minutes.
Runtime.totalMemory()=2823815168
Using GATK jar /home/agis/likui_group/yaowenye/software/gatk-4.1.9.0/gatk-package-4.1.9.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /home/agis/likui_group/yaowenye/software/gatk-4.1.9.0/gatk-package-4.1.9.0-local.jar ApplyBQSR -R /home/agis/likui_group/yaowenye/project/Sscrofa11.1/Sus_scrofa.Sscrofa11.1.dna.toplevel.fa -I SplitNCigarReads_SAMEA5337669-STARsorted.bam --bqsr-recal-file SAMEA5337669-recal.table -O BQSR_SAMEA5337669-STARsorted.bam
21:22:04.312 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/home/agis/likui_group/yaowenye/software/gatk-4.1.9.0/gatk-package-4.1.9.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
Oct 10, 2022 9:22:04 PM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
INFO: Failed to detect whether we are running on Google Compute Engine.
21:22:04.497 INFO HaplotypeCaller - ------------------------------------------------------------
21:22:04.498 INFO HaplotypeCaller - The Genome Analysis Toolkit (GATK) v4.1.9.0
21:22:04.498 INFO HaplotypeCaller - For support and documentation go to https://software.broadinstitute.org/gatk/
21:22:04.498 INFO HaplotypeCaller - Executing as yaowenye@comput42 on Linux v3.10.0-862.el7.x86_64 amd64
21:22:04.498 INFO HaplotypeCaller - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_152-release-1056-b12
21:22:04.498 INFO HaplotypeCaller - Start Date/Time: October 10, 2022 9:22:04 PM CST
21:22:04.498 INFO HaplotypeCaller - ------------------------------------------------------------
21:22:04.499 INFO HaplotypeCaller - ------------------------------------------------------------
21:22:04.499 INFO HaplotypeCaller - HTSJDK Version: 2.23.0
21:22:04.499 INFO HaplotypeCaller - Picard Version: 2.23.3
21:22:04.499 INFO HaplotypeCaller - HTSJDK Defaults.COMPRESSION_LEVEL : 2
21:22:04.499 INFO HaplotypeCaller - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
21:22:04.499 INFO HaplotypeCaller - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
21:22:04.499 INFO HaplotypeCaller - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
21:22:04.500 INFO HaplotypeCaller - Deflater: IntelDeflater
21:22:04.500 INFO HaplotypeCaller - Inflater: IntelInflater
21:22:04.500 INFO HaplotypeCaller - GCS max retries/reopens: 20
21:22:04.500 INFO HaplotypeCaller - Requester pays: disabled
21:22:04.500 INFO HaplotypeCaller - Initializing engine
21:22:05.045 INFO FeatureManager - Using codec VCFCodec to read file file:///home/agis/likui_group/yaowenye/project/ASE/02.RNA.known.sites/sus_scrofa.snp.edited.vcf.gz
21:22:05.161 WARN IndexUtils - Feature file "/home/agis/likui_group/yaowenye/project/ASE/02.RNA.known.sites/sus_scrofa.snp.edited.vcf.gz" appears to contain no sequence dictionary. Attempting to retrieve a sequence dictionary from the associated index file
21:22:05.377 WARN IndexUtils - Index file /home/agis/likui_group/yaowenye/project/ASE/02.RNA.known.sites/sus_scrofa.snp.edited.vcf.gz.tbi is out of date (index older than input file). Use IndexFeatureFile to make a new index.
21:22:05.436 INFO HaplotypeCaller - Done initializing engine
21:22:05.463 INFO HaplotypeCallerEngine - Disabling physical phasing, which is supported only for reference-model confidence output
21:22:05.572 INFO NativeLibraryLoader - Loading libgkl_utils.so from jar:file:/home/agis/likui_group/yaowenye/software/gatk-4.1.9.0/gatk-package-4.1.9.0-local.jar!/com/intel/gkl/native/libgkl_utils.so
21:22:05.735 INFO NativeLibraryLoader - Loading libgkl_pairhmm_omp.so from jar:file:/home/agis/likui_group/yaowenye/software/gatk-4.1.9.0/gatk-package-4.1.9.0-local.jar!/com/intel/gkl/native/libgkl_pairhmm_omp.so
21:22:05.825 INFO IntelPairHmm - Using CPU-supported AVX-512 instructions
21:22:05.825 INFO IntelPairHmm - Flush-to-zero (FTZ) is enabled when running PairHMM
21:22:05.826 INFO IntelPairHmm - Available threads: 2
21:22:05.826 INFO IntelPairHmm - Requested threads: 4
21:22:05.826 WARN IntelPairHmm - Using 2 available threads, but 4 were requested
21:22:05.826 INFO PairHMM - Using the OpenMP multi-threaded AVX-accelerated native PairHMM implementation
21:22:05.915 INFO ProgressMeter - Starting traversal
21:22:05.915 INFO ProgressMeter - Current Locus Elapsed Minutes Regions Processed Regions/Minute
21:22:07.202 WARN InbreedingCoeff - InbreedingCoeff will not be calculated; at least 10 samples must have called genotypes
21:22:15.915 INFO ProgressMeter - 1:13676305 0.2 46030 276180.0
21:22:25.915 INFO ProgressMeter - 1:41906728 0.3 140770 422310.0
21:22:35.920 INFO ProgressMeter - 1:73121790 0.5 245480 490878.2
21:22:45.920 INFO ProgressMeter - 1:101530706 0.7 340810 511151.1
21:22:55.921 INFO ProgressMeter - 1:128260272 0.8 430800 516898.0
21:23:05.921 INFO ProgressMeter - 1:161898418 1.0 543570 543515.6
21:23:15.921 INFO ProgressMeter - 1:188965103 1.2 634690 543973.4
21:23:25.921 INFO ProgressMeter - 1:220551642 1.3 740560 555378.3
21:23:35.921 INFO ProgressMeter - 1:249927264 1.5 839300 559496.0
21:23:45.928 INFO ProgressMeter - 1:270206516 1.7 907970 544711.2
21:23:56.692 INFO ProgressMeter - 2:6334510 1.8 943990 511292.1
21:24:06.695 INFO ProgressMeter - 2:13327549 2.0 968400 481073.0
21:24:16.695 INFO ProgressMeter - 2:39006107 2.2 1054760 483908.9
21:24:26.699 INFO ProgressMeter - 2:61207228 2.3 1129780 481498.5
21:24:36.762 INFO ProgressMeter - 2:73288425 2.5 1171650 466028.5
21:24:46.774 INFO ProgressMeter - 2:82661135 2.7 1204470 449264.3
21:24:56.776 INFO ProgressMeter - 2:115874208 2.8 1315760 462045.8
21:25:06.778 INFO ProgressMeter - 2:141966048 3.0 1403670 465660.0
21:25:16.777 INFO ProgressMeter - 3:6753547 3.2 1460400 459096.1
21:25:26.784 INFO ProgressMeter - 3:17325052 3.3 1496650 447052.6
21:25:36.793 INFO ProgressMeter - 3:33858412 3.5 1553150 441909.5
21:25:46.793 INFO ProgressMeter - 3:47364942 3.7 1599520 434498.7
21:25:56.793 INFO ProgressMeter - 3:71601149 3.8 1681380 436952.8
21:26:06.793 INFO ProgressMeter - 3:98451710 4.0 1771840 441345.4
21:26:16.794 INFO ProgressMeter - 3:121059658 4.2 1848270 442030.6
21:26:26.794 INFO ProgressMeter - 4:5405803 4.3 1906450 438467.6
21:26:36.807 INFO ProgressMeter - 4:35988676 4.5 2008980 444970.0
21:26:46.808 INFO ProgressMeter - 4:67746816 4.7 2115470 451875.5
21:26:56.812 INFO ProgressMeter - 4:93490010 4.8 2202250 454232.9
21:27:06.817 INFO ProgressMeter - 4:107629342 5.0 2250720 448794.6
21:27:16.845 INFO ProgressMeter - 5:22117 5.2 2329270 449478.0
21:27:26.853 INFO ProgressMeter - 5:11220104 5.3 2367670 442640.6
21:27:36.854 INFO ProgressMeter - 5:21729937 5.5 2403790 435812.6
21:27:46.854 INFO ProgressMeter - 5:48292582 5.7 2493210 438766.5
21:27:56.854 INFO ProgressMeter - 5:68332744 5.8 2560980 437850.5
21:28:06.854 INFO ProgressMeter - 5:93295993 6.0 2645220 439723.1
21:28:16.959 INFO ProgressMeter - 6:10717867 6.2 2719100 439694.5
21:28:26.970 INFO ProgressMeter - 6:28546557 6.4 2779840 437706.9
21:28:36.970 INFO ProgressMeter - 6:48524650 6.5 2847620 436913.5
21:28:46.978 INFO ProgressMeter - 6:59605167 6.7 2885950 431745.1
21:28:56.983 INFO ProgressMeter - 6:75376558 6.9 2939670 429079.0
21:29:06.983 INFO ProgressMeter - 6:89300393 7.0 2987600 425717.5
21:29:16.985 INFO ProgressMeter - 6:115879077 7.2 3077140 428302.6
21:29:26.986 INFO ProgressMeter - 6:148711253 7.4 3187210 433564.2
21:29:36.987 INFO ProgressMeter - 6:168067683 7.5 3252880 432687.5
21:29:46.996 INFO ProgressMeter - 7:20820371 7.7 3332380 433639.2
21:29:56.996 INFO ProgressMeter - 7:35788066 7.9 3383590 430956.5
21:30:06.999 INFO ProgressMeter - 7:54886121 8.0 3448420 430081.2
21:30:17.000 INFO ProgressMeter - 7:77687118 8.2 3525500 430740.1
21:30:27.000 INFO ProgressMeter - 7:100832556 8.4 3603820 431522.0
21:30:37.006 INFO ProgressMeter - 8:3798691 8.5 3687390 432884.6
21:30:47.006 INFO ProgressMeter - 8:31485318 8.7 3780300 435275.2
21:30:57.006 INFO ProgressMeter - 8:64598395 8.9 3891300 439619.6
21:31:07.006 INFO ProgressMeter - 8:94411123 9.0 3991450 442600.2
21:31:17.006 INFO ProgressMeter - 8:127396572 9.2 4102000 446605.0
21:31:27.006 INFO ProgressMeter - 9:11465311 9.4 4179680 446952.1
21:31:37.006 INFO ProgressMeter - 9:40860588 9.5 4278450 449502.8
21:31:47.007 INFO ProgressMeter - 9:67115583 9.7 4366890 450898.3
21:31:57.007 INFO ProgressMeter - 9:102004021 9.9 4483710 455128.1
21:32:07.007 INFO ProgressMeter - 9:126859472 10.0 4567600 455930.2
21:32:17.021 INFO ProgressMeter - 10:13816731 10.2 4656650 457202.2
21:32:27.022 INFO ProgressMeter - 10:31312927 10.4 4716040 455577.5
21:32:37.022 INFO ProgressMeter - 10:54104073 10.5 4792940 455669.8
21:32:47.022 INFO ProgressMeter - 11:8684917 10.7 4873670 456117.6
21:32:57.022 INFO ProgressMeter - 11:38729707 10.9 4974570 458410.4
21:33:07.022 INFO ProgressMeter - 11:72262542 11.0 5086930 461673.8
21:33:17.043 INFO ProgressMeter - 12:6248422 11.2 5131910 458801.6
21:33:27.053 INFO ProgressMeter - 12:21797604 11.4 5185110 456745.3
21:33:37.056 INFO ProgressMeter - 12:38900368 11.5 5243250 455182.1
21:33:47.141 INFO ProgressMeter - 12:52872349 11.7 5291340 452750.5
21:33:57.141 INFO ProgressMeter - 13:13560858 11.9 5366750 452746.4
21:34:07.146 INFO ProgressMeter - 13:32728422 12.0 5431900 451885.7
21:34:17.146 INFO ProgressMeter - 13:59442475 12.2 5521840 453085.8
21:34:27.146 INFO ProgressMeter - 13:81096185 12.4 5595010 452896.1
21:34:37.146 INFO ProgressMeter - 13:116331586 12.5 5712990 456290.3
21:34:47.146 INFO ProgressMeter - 13:142695250 12.7 5801910 457304.8
21:34:57.146 INFO ProgressMeter - 13:180075446 12.9 5926930 461101.5
21:35:07.150 INFO ProgressMeter - 13:207445858 13.0 6018990 462267.3
21:35:17.156 INFO ProgressMeter - 14:24316755 13.2 6103970 462865.5
21:35:27.159 INFO ProgressMeter - 14:45241490 13.4 6174860 462396.1
21:35:37.158 INFO ProgressMeter - 14:62250265 13.5 6232840 460984.4
21:35:47.159 INFO ProgressMeter - 14:89076512 13.7 6323240 461975.2
21:35:57.162 INFO ProgressMeter - 14:113707560 13.9 6406410 462419.2
21:36:07.167 INFO ProgressMeter - 14:140351748 14.0 6496100 463316.6
21:36:17.185 INFO ProgressMeter - 15:29723056 14.2 6600520 465224.5
21:36:27.184 INFO ProgressMeter - 15:57934133 14.4 6695310 466426.4
21:36:37.184 INFO ProgressMeter - 15:87123239 14.5 6793520 467836.2
21:36:47.189 INFO ProgressMeter - 15:117095959 14.7 6894260 469383.6
21:36:57.189 INFO ProgressMeter - 15:139376748 14.9 6969470 469180.3
21:37:07.208 INFO ProgressMeter - 16:27946733 15.0 7066770 470442.1
21:37:17.208 INFO ProgressMeter - 16:55563042 15.2 7159740 471401.0
21:37:27.208 INFO ProgressMeter - 17:5321057 15.4 7259430 472776.6
21:37:37.213 INFO ProgressMeter - 17:32962699 15.5 7352400 473687.3
21:37:47.221 INFO ProgressMeter - 17:50796478 15.7 7413070 472518.2
21:37:57.221 INFO ProgressMeter - 18:11337386 15.9 7494140 472664.3
21:38:07.221 INFO ProgressMeter - 18:42217281 16.0 7597820 474218.6
21:38:17.221 INFO ProgressMeter - X:8187085 16.2 7671720 473901.3
21:38:27.221 INFO ProgressMeter - X:36695318 16.4 7767390 474921.6
21:38:37.221 INFO ProgressMeter - X:60243315 16.5 7846990 474948.6
21:38:47.238 INFO ProgressMeter - X:96169182 16.7 7967180 477399.2
21:38:57.242 INFO ProgressMeter - X:125387413 16.9 8065340 478500.4
21:39:07.242 INFO ProgressMeter - Y:39508260 17.0 8198990 481666.9
21:39:17.471 INFO ProgressMeter - MT:9631 17.2 8212530 477678.2
21:39:27.487 INFO ProgressMeter - AEMK02000361.1:1982027 17.4 8241140 474733.3
21:39:37.486 INFO ProgressMeter - AEMK02000171.1:88201 17.5 8337900 475739.6
21:39:47.486 INFO ProgressMeter - AEMK02000312.1:20101 17.7 8428850 476398.7
21:39:48.465 INFO HaplotypeCaller - 955146 read(s) filtered by: MappingQualityReadFilter
0 read(s) filtered by: MappingQualityAvailableReadFilter
0 read(s) filtered by: MappedReadFilter
0 read(s) filtered by: NotSecondaryAlignmentReadFilter
101998 read(s) filtered by: NotDuplicateReadFilter
0 read(s) filtered by: PassesVendorQualityCheckReadFilter
0 read(s) filtered by: NonZeroReferenceLengthAlignmentReadFilter
0 read(s) filtered by: GoodCigarReadFilter
0 read(s) filtered by: WellformedReadFilter
1057144 total reads filtered
21:39:48.466 INFO ProgressMeter - AEMK02000519.1:12601 17.7 8435938 476359.5
21:39:48.466 INFO ProgressMeter - Traversal complete. Processed 8435938 total regions in 17.7 minutes.
21:39:48.580 INFO VectorLoglessPairHMM - Time spent in setup for JNI call : 0.19054606100000002
21:39:48.580 INFO PairHMM - Total compute time in PairHMM computeLogLikelihoods() : 23.596010852000003
21:39:48.580 INFO SmithWatermanAligner - Total compute time in java Smith-Waterman : 52.69 sec
21:39:50.107 INFO HaplotypeCaller - Shutting down engine
[October 10, 2022 9:39:50 PM CST] org.broadinstitute.hellbender.tools.walkers.haplotypecaller.HaplotypeCaller done. Elapsed time: 17.76 minutes.
Runtime.totalMemory()=4287102976
-
Hi Wen-ye Yao,
Thank you for being so patient while we reviewed your program log! I have some feedback and next steps for you to try.
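You could try adding the --force-active argument when running HaplotypeCaller. With this flag, every region is treated as active, so the run will be much slower, but no potential region of interest is skipped. To keep the run time manageable, you can restrict the run to specific intervals with -L if you want to zero in on a particular region.
You could also try lowering the minimum mapping quality threshold (--minimum-mapping-quality). Your log shows 955146 reads removed by MappingQualityReadFilter; with a lower threshold, HaplotypeCaller will keep more of those reads.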
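As a rough sketch of how both suggestions might be combined, assuming the file names from your original command: the interval, the mapping quality value, and the output file names below are placeholders, not recommendations, so please substitute values that fit your data. The script only builds and prints the command so you can review it before running it.

```shell
#!/usr/bin/env bash
# Sketch only. REF and BAM come from the original command in this thread;
# INTERVAL, MIN_MAPQ and the output names are hypothetical placeholders.
REF=Sus_scrofa.fa
BAM=BQSR_SAMEA5337669-STARsorted.bam
INTERVAL="1:1000000-1100000"   # placeholder region around the expected variant
MIN_MAPQ=10                    # placeholder; lower than the threshold used in the logged run

cmd=(gatk HaplotypeCaller
  -R "$REF"
  -I "$BAM"
  -O SAMEA5337669.force_active.vcf.gz
  -L "$INTERVAL"
  --force-active
  --minimum-mapping-quality "$MIN_MAPQ"
  --dont-use-soft-clipped-bases
  -bamout SAMEA5337669.force_active.realign.bam)

# Print the assembled command for review.
# To execute it, run: "${cmd[@]}"
printf '%s ' "${cmd[@]}"
echo
```

Restricting the run with -L keeps the slowdown from --force-active manageable, and the -bamout file lets you check in a genome browser whether reads supporting the expected variant survive realignment.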
I hope this helps! Please let me know if this leads you to success. If not, we will go back to the drawing board.
I look forward to your response!
Best,
Anthony
-
Hi Wen-ye Yao,
We haven't heard from you in a while, so we're going to close out this ticket. If you still require assistance, simply respond to this email and we'll be happy to pick up where we left off!
Kind regards,
Anthony