why does HaplotypeCaller still report a diploid genotype even though I specified haploid by -ploidy 1
REQUIRED for all errors and issues:
a) GATK version used:
The Genome Analysis Toolkit (GATK) v4.4.0.0
b) Exact command used:
gatk --java-options "-Xmx10g" HaplotypeCaller \
-R $genome_idx \
-I $output_dir/gatk_dataPreprocess/${sample_prefix}${sample}_recal.bam \
-O $output_dir/gatk_callVariants/${sample_prefix}${sample}.rawVariants_recal.gvcf \
-ERC GVCF \
-ploidy 1
c) Entire program log:
output example is like this:
HROM POS ID REF ALT QUAL FILTER INFO FORMAT seed_S1
chrI 1 . C <NON_REF> . . END=5 GT:DP:GQ:MIN_DP:PL 0/0:149:0:84:0,0,0
chrI 6 . C <NON_REF> . . END=18 GT:DP:GQ:MIN_DP:PL 0/0:316:99:180:0,120,1800
.
.
.
chrI 385 . C CACTACCACACATT,CACCACTACCACACATT,<NON_REF> 3982.60 . BaseQRankSum=1.256;DP=2210;ExcessHet=0.0000;MLEAC=0,1,0;MLEAF=0.00,0.500,0.00;MQRankSum=-37.182;RAW_MQandDP=7020373,2210;ReadPosRankSum=-4.924 GT:AD:DP:GQ:PL:SB 0/2:1799,2,200,0:2001:99:3990,7909,83254,0,75383,74865,8763,83418,75539,84145:933,866,72,130
-
Hi Yq Lan
This is an unexpected result. Can you share the header portion of your bam file? Also can you try the same command using the latest GATK version 4.5.0.0? You can see the header portion by using
samtools view -H bamfile.bam
-
Hi Gökalp Çelik
Thanks for your reply. here is the header of my bam file:
@HD VN:1.6 GO:none SO:coordinate
@SQ SN:chrI LN:230218
@SQ SN:chrII LN:813184
@SQ SN:chrIII LN:316620
@SQ SN:chrIV LN:1531933
@SQ SN:chrV LN:576874
@SQ SN:chrVI LN:270161
@SQ SN:chrVII LN:1090940
@SQ SN:chrVIII LN:562643
@SQ SN:chrIX LN:439888
@SQ SN:chrX LN:745751
@SQ SN:chrXI LN:666816
@SQ SN:chrXII LN:1078177
@SQ SN:chrXIII LN:924431
@SQ SN:chrXIV LN:784333
@SQ SN:chrXV LN:1091291
@SQ SN:chrXVI LN:948066
@SQ SN:chrM LN:85779
@RG ID:seed_S1 SM:seed_S1 LB:pe100 PL:Novaseq6000
@PG ID:bwa PN:bwa VN:0.7.17-r1198-dirty CL:bwa mem -Y -M -B 3 -K 100000000 -t 4 -R @RG\tID:seed_S1\tSM:seed_S1\tLB:pe100\tPL:Novaseq6000 /project2/araman/yqlan/yeastAdaptation_seqData/yeast_reference_genome-genes/sacCer3_bwa-mem_idx/sacCer3.fa /project2/araman/yqlan/yeastAdaptation_seqData/wholeGenomeSeq_results/20240131_LH00315_0080_B22H3W5LT3-AR-Lan-8s/fastp/AR-YL-8s-seed_S1_L001_R1_001_fastped.fastq.gz /project2/araman/yqlan/yeastAdaptation_seqData/wholeGenomeSeq_results/20240131_LH00315_0080_B22H3W5LT3-AR-Lan-8s/fastp/AR-YL-8s-seed_S1_L001_R2_001_fastped.fastq.gz
@PG ID:samtools PN:samtools PP:bwa VN:1.18 CL:samtools view -bS -
@PG ID:samtools.1 PN:samtools PP:samtools VN:1.18 CL:samtools sort -o /project2/araman/yqlan/yeastAdaptation_seqData/wholeGenomeSeq_results/20240131_LH00315_0080_B22H3W5LT3-AR-Lan-8s/bwa-mem/AR-YL-8s-seed_S1_sorted.bam -
@PG ID:samtools.2 PN:samtools PP:samtools.1 VN:1.18 CL:samtools sort -n -o AR-YL-8s-seed_S1_sorted2n.bam AR-YL-8s-seed_S1_sorted.bam
@PG ID:GATK ApplyBQSR VN:4.4.0.0 CL:ApplyBQSR --output /project2/araman/yqlan/yeastAdaptation_seqData/wholeGenomeSeq_results/20240131_LH00315_0080_B22H3W5LT3-AR-Lan-8s/gatk_dataPreprocess/AR-YL-8s-seed_S1_recal.bam --bqsr-recal-file /project2/araman/yqlan/yeastAdaptation_seqData/wholeGenomeSeq_results/20240131_LH00315_0080_B22H3W5LT3-AR-Lan-8s/gatk_dataPreprocess/AR-YL-8s-seed_S1.recal_data.table --input /project2/araman/yqlan/yeastAdaptation_seqData/wholeGenomeSeq_results/20240131_LH00315_0080_B22H3W5LT3-AR-Lan-8s/gatk_dataPreprocess/AR-YL-8s-seed_S1_markdup.bam --reference /project2/araman/yqlan/yeastAdaptation_seqData/yeast_reference_genome-genes/sacCer3_WGS_idx/sacCer3.fa --preserve-qscores-less-than 6 --use-original-qualities false --quantize-quals 0 --round-down-quantized false --emit-original-quals false --global-qscore-prior -1.0 --interval-set-rule UNION --interval-padding 0 --interval-exclusion-padding 0 --interval-merging-rule ALL --read-validation-stringency SILENT --seconds-between-progress-updates 10.0 --disable-sequence-dictionary-validation false --create-output-bam-index true --create-output-bam-md5 false --create-output-variant-index true --create-output-variant-md5 false --max-variants-per-shard 0 --lenient false --add-output-sam-program-record true --add-output-vcf-command-line true --cloud-prefetch-buffer 40 --cloud-index-prefetch-buffer -1 --disable-bam-index-caching false --sites-only-vcf-output false --help false --version false --showHidden false --verbosity INFO --QUIET false --use-jdk-deflater false --use-jdk-inflater false --gcs-max-retries 20 --gcs-project-for-requester-pays --disable-tool-default-read-filters false PN:GATK ApplyBQSR
@PG ID:samtools.3 PN:samtools PP:samtools.2 VN:1.18 CL:samtools view -H AR-YL-8s-seed_S1_recal.bam -
Hi again. Can you also share the header of your GVCF output as well? I am trying to replicate the issue but so far I could not get the same result. I suspect there is something about a parameter passing issue to the main command line.
-
Hi Gökalp Çelik,
Thanks, here is header of the GVCF file.
##fileformat=VCFv4.2
##ALT=<ID=NON_REF,Description="Represents any possible alternative allele not already represented at this location by REF and ALT">
##FILTER=<ID=LowQual,Description="Low quality">
##FORMAT=<ID=AD,Number=R,Type=Integer,Description="Allelic depths for the ref and alt alleles in the order listed">
##FORMAT=<ID=DP,Number=1,Type=Integer,Description="Approximate read depth (reads with MQ=255 or with bad mates are filtered)">
##FORMAT=<ID=GQ,Number=1,Type=Integer,Description="Genotype Quality">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##FORMAT=<ID=MIN_DP,Number=1,Type=Integer,Description="Minimum DP observed within the GVCF block">
##FORMAT=<ID=PGT,Number=1,Type=String,Description="Physical phasing haplotype information, describing how the alternate alleles are phased in relation to one another; will always be heterozygous and is not intended to describe called alleles">
##FORMAT=<ID=PID,Number=1,Type=String,Description="Physical phasing ID information, where each unique ID within a given sample (but not across samples) connects records within a phasing group">
##FORMAT=<ID=PL,Number=G,Type=Integer,Description="Normalized, Phred-scaled likelihoods for genotypes as defined in the VCF specification">
##FORMAT=<ID=PS,Number=1,Type=Integer,Description="Phasing set (typically the position of the first variant in the set)">
##FORMAT=<ID=SB,Number=4,Type=Integer,Description="Per-sample component statistics which comprise the Fisher's Exact Test to detect strand bias.">
##GATKCommandLine=<ID=HaplotypeCaller,CommandLine="HaplotypeCaller --emit-ref-confidence GVCF --output /project2/araman/yqlan/yeastAdaptation_seqData/wholeGenomeSeq_results/20240131_LH00315_0080_B22H3W5LT3-AR-Lan-8s/gatk_callVariants/AR-YL-8s-seed_S1.rawVariants_recal.gvcf --input /project2/araman/yqlan/yeastAdaptation_seqData/wholeGenomeSeq_results/20240131_LH00315_0080_B22H3W5LT3-AR-Lan-8s/gatk_dataPreprocess/AR-YL-8s-seed_S1_recal.bam --reference /project2/araman/yqlan/yeastAdaptation_seqData/yeast_reference_genome-genes/sacCer3_WGS_idx/sacCer3.fa --use-posteriors-to-calculate-qual false --dont-use-dragstr-priors false --use-new-qual-calculator true --annotate-with-num-discovered-alleles false --heterozygosity 0.001 --indel-heterozygosity 1.25E-4 --heterozygosity-stdev 0.01 --standard-min-confidence-threshold-for-calling 30.0 --max-alternate-alleles 6 --max-genotype-count 1024 --sample-ploidy 2 --num-reference-samples-if-no-call 0 --genotype-assignment-method USE_PLS_TO_ASSIGN --contamination-fraction-to-filter 0.0 --output-mode EMIT_VARIANTS_ONLY --all-site-pls false --flow-likelihood-parallel-threads 0 --flow-likelihood-optimized-comp false --flow-use-t0-tag false --flow-probability-threshold 0.003 --flow-remove-non-single-base-pair-indels false --flow-remove-one-zero-probs false --flow-quantization-bins 121 --flow-fill-empty-bins-value 0.001 --flow-symmetric-indel-probs false --flow-report-insertion-or-deletion false --flow-disallow-probs-larger-than-call false --flow-lump-probs false --flow-retain-max-n-probs-base-format false --flow-probability-scaling-factor 10 --flow-order-cycle-length 4 --flow-number-of-uncertain-flows-to-clip 0 --flow-nucleotide-of-first-uncertain-flow T --keep-boundary-flows false --gvcf-gq-bands 1 --gvcf-gq-bands 2 --gvcf-gq-bands 3 --gvcf-gq-bands 4 --gvcf-gq-bands 5 --gvcf-gq-bands 6 --gvcf-gq-bands 7 --gvcf-gq-bands 8 --gvcf-gq-bands 9 --gvcf-gq-bands 10 --gvcf-gq-bands 11 --gvcf-gq-bands 12 --gvcf-gq-bands 13 --gvcf-gq-bands 14 --gvcf-gq-bands 15 --gvcf-gq-bands 16 --gvcf-gq-bands 17 --gvcf-gq-bands 18 --gvcf-gq-bands 19 --gvcf-gq-bands 20 --gvcf-gq-bands 21 --gvcf-gq-bands 22 --gvcf-gq-bands 23 --gvcf-gq-bands 24 --gvcf-gq-bands 25 --gvcf-gq-bands 26 --gvcf-gq-bands 27 --gvcf-gq-bands 28 --gvcf-gq-bands 29 --gvcf-gq-bands 30 --gvcf-gq-bands 31 --gvcf-gq-bands 32 --gvcf-gq-bands 33 --gvcf-gq-bands 34 --gvcf-gq-bands 35 --gvcf-gq-bands 36 --gvcf-gq-bands 37 --gvcf-gq-bands 38 --gvcf-gq-bands 39 --gvcf-gq-bands 40 --gvcf-gq-bands 41 --gvcf-gq-bands 42 --gvcf-gq-bands 43 --gvcf-gq-bands 44 --gvcf-gq-bands 45 --gvcf-gq-bands 46 --gvcf-gq-bands 47 --gvcf-gq-bands 48 --gvcf-gq-bands 49 --gvcf-gq-bands 50 --gvcf-gq-bands 51 --gvcf-gq-bands 52 --gvcf-gq-bands 53 --gvcf-gq-bands 54 --gvcf-gq-bands 55 --gvcf-gq-bands 56 --gvcf-gq-bands 57 --gvcf-gq-bands 58 --gvcf-gq-bands 59 --gvcf-gq-bands 60 --gvcf-gq-bands 70 --gvcf-gq-bands 80 --gvcf-gq-bands 90 --gvcf-gq-bands 99 --floor-blocks false --indel-size-to-eliminate-in-ref-model 10 --disable-optimizations false --dragen-mode false --flow-mode NONE --apply-bqd false --apply-frd false --disable-spanning-event-genotyping false --transform-dragen-mapping-quality false --mapping-quality-threshold-for-genotyping 20 --max-effective-depth-adjustment-for-frd 0 --just-determine-active-regions false --dont-genotype false --do-not-run-physical-phasing false --do-not-correct-overlapping-quality false --use-filtered-reads-for-annotations false --use-flow-aligner-for-stepwise-hc-filtering false --adaptive-pruning false --do-not-recover-dangling-branches false --recover-dangling-heads false --kmer-size 10 --kmer-size 25 --dont-increase-kmer-sizes-for-cycles false --allow-non-unique-kmers-in-ref false --num-pruning-samples 1 --min-dangling-branch-length 4 --recover-all-dangling-branches false --max-num-haplotypes-in-population 128 --min-pruning 2 --adaptive-pruning-initial-error-rate 0.001 --pruning-lod-threshold 2.302585092994046 --pruning-seeding-lod-threshold 9.210340371976184 --max-unpruned-variants 100 --linked-de-bruijn-graph false --disable-artificial-haplotype-recovery false --enable-legacy-graph-cycle-detection false --debug-assembly false --debug-graph-transformations false --capture-assembly-failure-bam false --num-matching-bases-in-dangling-end-to-recover -1 --error-correction-log-odds -Infinity --error-correct-reads false --kmer-length-for-read-error-correction 25 --min-observations-for-kmer-to-be-solid 20 --likelihood-calculation-engine PairHMM --base-quality-score-threshold 18 --dragstr-het-hom-ratio 2 --dont-use-dragstr-pair-hmm-scores false --pair-hmm-gap-continuation-penalty 10 --expected-mismatch-rate-for-read-disqualification 0.02 --pair-hmm-implementation FASTEST_AVAILABLE --pcr-indel-model CONSERVATIVE --phred-scaled-global-read-mismapping-rate 45 --disable-symmetric-hmm-normalizing false --disable-cap-base-qualities-to-map-quality false --enable-dynamic-read-disqualification-for-genotyping false --dynamic-read-disqualification-threshold 1.0 --native-pair-hmm-threads 4 --native-pair-hmm-use-double-precision false --flow-hmm-engine-min-indel-adjust 6 --flow-hmm-engine-flat-insertion-penatly 45 --flow-hmm-engine-flat-deletion-penatly 45 --pileup-detection false --pileup-detection-enable-indel-pileup-calling false --num-artificial-haplotypes-to-add-per-allele 5 --artifical-haplotype-filtering-kmer-size 10 --pileup-detection-snp-alt-threshold 0.1 --pileup-detection-indel-alt-threshold 0.5 --pileup-detection-absolute-alt-depth 0.0 --pileup-detection-snp-adjacent-to-assembled-indel-range 5 --pileup-detection-bad-read-tolerance 0.0 --pileup-detection-proper-pair-read-badness true --pileup-detection-edit-distance-read-badness-threshold 0.08 --pileup-detection-chimeric-read-badness true --pileup-detection-template-mean-badness-threshold 0.0 --pileup-detection-template-std-badness-threshold 0.0 --bam-writer-type CALLED_HAPLOTYPES --dont-use-soft-clipped-bases false --override-fragment-softclip-check false --min-base-quality-score 10 --smith-waterman JAVA --max-mnp-distance 0 --force-call-filtered-alleles false --reference-model-deletion-quality 30 --soft-clip-low-quality-ends false --allele-informative-reads-overlap-margin 2 --smith-waterman-dangling-end-match-value 25 --smith-waterman-dangling-end-mismatch-penalty -50 --smith-waterman-dangling-end-gap-open-penalty -110 --smith-waterman-dangling-end-gap-extend-penalty -6 --smith-waterman-haplotype-to-reference-match-value 200 --smith-waterman-haplotype-to-reference-mismatch-penalty -150 --smith-waterman-haplotype-to-reference-gap-open-penalty -260 --smith-waterman-haplotype-to-reference-gap-extend-penalty -11 --smith-waterman-read-to-haplotype-match-value 10 --smith-waterman-read-to-haplotype-mismatch-penalty -15 --smith-waterman-read-to-haplotype-gap-open-penalty -30 --smith-waterman-read-to-haplotype-gap-extend-penalty -5 --flow-assembly-collapse-hmer-size 0 --flow-assembly-collapse-partial-mode false --flow-filter-alleles false --flow-filter-alleles-qual-threshold 30.0 --flow-filter-alleles-sor-threshold 3.0 --flow-filter-lone-alleles false --flow-filter-alleles-debug-graphs false --min-assembly-region-size 50 --max-assembly-region-size 300 --active-probability-threshold 0.002 --max-prob-propagation-distance 50 --force-active false --assembly-region-padding 100 --padding-around-indels 75 --padding-around-snps 20 --padding-around-strs 75 --max-extension-into-assembly-region-padding-legacy 25 --max-reads-per-alignment-start 50 --enable-legacy-assembly-region-trimming false --interval-set-rule UNION --interval-padding 0 --interval-exclusion-padding 0 --interval-merging-rule ALL --read-validation-stringency SILENT --seconds-between-progress-updates 10.0 --disable-sequence-dictionary-validation false --create-output-bam-index true --create-output-bam-md5 false --create-output-variant-index true --create-output-variant-md5 false --max-variants-per-shard 0 --lenient false --add-output-sam-program-record true --add-output-vcf-command-line true --cloud-prefetch-buffer 40 --cloud-index-prefetch-buffer -1 --disable-bam-index-caching false --sites-only-vcf-output false --help false --version false --showHidden false --verbosity INFO --QUIET false --use-jdk-deflater false --use-jdk-inflater false --gcs-max-retries 20 --gcs-project-for-requester-pays --disable-tool-default-read-filters false --minimum-mapping-quality 20 --disable-tool-default-annotations false --enable-all-annotations false --allow-old-rms-mapping-quality-annotation-data false",Version="4.4.0.0",Date="March 13, 2024 at 5:58:18 PM CDT">
##GVCFBlock0-1=minGQ=0(inclusive),maxGQ=1(exclusive)
##GVCFBlock1-2=minGQ=1(inclusive),maxGQ=2(exclusive)
##GVCFBlock10-11=minGQ=10(inclusive),maxGQ=11(exclusive)
##GVCFBlock11-12=minGQ=11(inclusive),maxGQ=12(exclusive)
##GVCFBlock12-13=minGQ=12(inclusive),maxGQ=13(exclusive)
##GVCFBlock13-14=minGQ=13(inclusive),maxGQ=14(exclusive)
##GVCFBlock14-15=minGQ=14(inclusive),maxGQ=15(exclusive)
##GVCFBlock15-16=minGQ=15(inclusive),maxGQ=16(exclusive)
##GVCFBlock16-17=minGQ=16(inclusive),maxGQ=17(exclusive)
##GVCFBlock17-18=minGQ=17(inclusive),maxGQ=18(exclusive)
##GVCFBlock18-19=minGQ=18(inclusive),maxGQ=19(exclusive)
##GVCFBlock19-20=minGQ=19(inclusive),maxGQ=20(exclusive)
##GVCFBlock2-3=minGQ=2(inclusive),maxGQ=3(exclusive)
##GVCFBlock20-21=minGQ=20(inclusive),maxGQ=21(exclusive)
##GVCFBlock21-22=minGQ=21(inclusive),maxGQ=22(exclusive)
##GVCFBlock22-23=minGQ=22(inclusive),maxGQ=23(exclusive)
##GVCFBlock23-24=minGQ=23(inclusive),maxGQ=24(exclusive)
##GVCFBlock24-25=minGQ=24(inclusive),maxGQ=25(exclusive)
##GVCFBlock25-26=minGQ=25(inclusive),maxGQ=26(exclusive)
##GVCFBlock26-27=minGQ=26(inclusive),maxGQ=27(exclusive)
##GVCFBlock27-28=minGQ=27(inclusive),maxGQ=28(exclusive)
##GVCFBlock28-29=minGQ=28(inclusive),maxGQ=29(exclusive)
##GVCFBlock29-30=minGQ=29(inclusive),maxGQ=30(exclusive)
##GVCFBlock3-4=minGQ=3(inclusive),maxGQ=4(exclusive)
##GVCFBlock30-31=minGQ=30(inclusive),maxGQ=31(exclusive)
##GVCFBlock31-32=minGQ=31(inclusive),maxGQ=32(exclusive)
##GVCFBlock32-33=minGQ=32(inclusive),maxGQ=33(exclusive)
##GVCFBlock33-34=minGQ=33(inclusive),maxGQ=34(exclusive)
##GVCFBlock34-35=minGQ=34(inclusive),maxGQ=35(exclusive)
##GVCFBlock35-36=minGQ=35(inclusive),maxGQ=36(exclusive)
##GVCFBlock36-37=minGQ=36(inclusive),maxGQ=37(exclusive)
##GVCFBlock37-38=minGQ=37(inclusive),maxGQ=38(exclusive)
##GVCFBlock38-39=minGQ=38(inclusive),maxGQ=39(exclusive)
##GVCFBlock39-40=minGQ=39(inclusive),maxGQ=40(exclusive)
##GVCFBlock4-5=minGQ=4(inclusive),maxGQ=5(exclusive)
##GVCFBlock40-41=minGQ=40(inclusive),maxGQ=41(exclusive)
##GVCFBlock41-42=minGQ=41(inclusive),maxGQ=42(exclusive)
##GVCFBlock42-43=minGQ=42(inclusive),maxGQ=43(exclusive)
##GVCFBlock43-44=minGQ=43(inclusive),maxGQ=44(exclusive)
##GVCFBlock44-45=minGQ=44(inclusive),maxGQ=45(exclusive)
##GVCFBlock45-46=minGQ=45(inclusive),maxGQ=46(exclusive)
##GVCFBlock46-47=minGQ=46(inclusive),maxGQ=47(exclusive)
##GVCFBlock47-48=minGQ=47(inclusive),maxGQ=48(exclusive)
##GVCFBlock48-49=minGQ=48(inclusive),maxGQ=49(exclusive)
##GVCFBlock49-50=minGQ=49(inclusive),maxGQ=50(exclusive)
##GVCFBlock5-6=minGQ=5(inclusive),maxGQ=6(exclusive)
##GVCFBlock50-51=minGQ=50(inclusive),maxGQ=51(exclusive)
##GVCFBlock51-52=minGQ=51(inclusive),maxGQ=52(exclusive)
##GVCFBlock52-53=minGQ=52(inclusive),maxGQ=53(exclusive)
##GVCFBlock53-54=minGQ=53(inclusive),maxGQ=54(exclusive)
##GVCFBlock54-55=minGQ=54(inclusive),maxGQ=55(exclusive)
##GVCFBlock55-56=minGQ=55(inclusive),maxGQ=56(exclusive)
##GVCFBlock56-57=minGQ=56(inclusive),maxGQ=57(exclusive)
##GVCFBlock57-58=minGQ=57(inclusive),maxGQ=58(exclusive)
##GVCFBlock58-59=minGQ=58(inclusive),maxGQ=59(exclusive)
##GVCFBlock59-60=minGQ=59(inclusive),maxGQ=60(exclusive)
##GVCFBlock6-7=minGQ=6(inclusive),maxGQ=7(exclusive)
##GVCFBlock60-70=minGQ=60(inclusive),maxGQ=70(exclusive)
##GVCFBlock7-8=minGQ=7(inclusive),maxGQ=8(exclusive)
##GVCFBlock70-80=minGQ=70(inclusive),maxGQ=80(exclusive)
##GVCFBlock8-9=minGQ=8(inclusive),maxGQ=9(exclusive)
##GVCFBlock80-90=minGQ=80(inclusive),maxGQ=90(exclusive)
##GVCFBlock9-10=minGQ=9(inclusive),maxGQ=10(exclusive)
##GVCFBlock90-99=minGQ=90(inclusive),maxGQ=99(exclusive)
##GVCFBlock99-100=minGQ=99(inclusive),maxGQ=100(exclusive)
##INFO=<ID=BaseQRankSum,Number=1,Type=Float,Description="Z-score from Wilcoxon rank sum test of Alt Vs. Ref base qualities">
##INFO=<ID=DP,Number=1,Type=Integer,Description="Approximate read depth; some reads may have been filtered">
##INFO=<ID=END,Number=1,Type=Integer,Description="Stop position of the interval">
##INFO=<ID=ExcessHet,Number=1,Type=Float,Description="Phred-scaled p-value for exact test of excess heterozygosity">
##INFO=<ID=InbreedingCoeff,Number=1,Type=Float,Description="Inbreeding coefficient as estimated from the genotype likelihoods per-sample when compared against the Hardy-Weinberg expectation">
##INFO=<ID=MLEAC,Number=A,Type=Integer,Description="Maximum likelihood expectation (MLE) for the allele counts (not necessarily the same as the AC), for each ALT allele, in the same order as listed">
##INFO=<ID=MLEAF,Number=A,Type=Float,Description="Maximum likelihood expectation (MLE) for the allele frequency (not necessarily the same as the AF), for each ALT allele, in the same order as listed">
##INFO=<ID=MQRankSum,Number=1,Type=Float,Description="Z-score From Wilcoxon rank sum test of Alt vs. Ref read mapping qualities">
##INFO=<ID=RAW_MQandDP,Number=2,Type=Integer,Description="Raw data (sum of squared MQ and total depth) for improved RMS Mapping Quality calculation. Incompatible with deprecated RAW_MQ formulation.">
##INFO=<ID=ReadPosRankSum,Number=1,Type=Float,Description="Z-score from Wilcoxon rank sum test of Alt vs. Ref read position bias">
##contig=<ID=chrI,length=230218>
##contig=<ID=chrII,length=813184>
##contig=<ID=chrIII,length=316620>
##contig=<ID=chrIV,length=1531933>
##contig=<ID=chrV,length=576874>
##contig=<ID=chrVI,length=270161>
##contig=<ID=chrVII,length=1090940>
##contig=<ID=chrVIII,length=562643>
##contig=<ID=chrIX,length=439888>
##contig=<ID=chrX,length=745751>
##contig=<ID=chrXI,length=666816>
##contig=<ID=chrXII,length=1078177>
##contig=<ID=chrXIII,length=924431>
##contig=<ID=chrXIV,length=784333>
##contig=<ID=chrXV,length=1091291>
##contig=<ID=chrXVI,length=948066>
##contig=<ID=chrM,length=85779>
##source=HaplotypeCaller
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT seed_S1 -
Hi again.
Judging by the command line string in the header your --sample-ploidy is still set to 2 so you might need to check your environment and the way command line parameters are fed to GATK. Looks like the command parameter you mentioned up above was not passed to GATK properly.
##GATKCommandLine=<ID=HaplotypeCaller,CommandLine="HaplotypeCaller ..... --sample-ploidy 2
It should look like this
##GATKCommandLine=<ID=HaplotypeCaller,CommandLine="HaplotypeCaller ..... --sample-ploidy 1
I hope this helps
-
Hi Gökalp Çelik,
Thanks so much, I think it is because there are 2 lines of annotation before line '-ploidy 1' in the command, which makes the '-ploidy 1' option not work.
the wrong command i used is like this:
gatk --java-options "-Xmx10g" HaplotypeCaller \
-R $genome_idx \
-I $output_dir/gatk_dataPreprocess/${sample_prefix}-${sample}_recal.bam \
-O $output_dir/gatk_callVariants/${sample_prefix}-${sample}.rawVariants_recal.gvcf \
-ERC GVCF \
# for GVCF mode, see https://gatk.broadinstitute.org/hc/en-us/articles/360035531532-HaplotypeCaller-Reference-Confidence-Model-GVCF-mode
# and https://gatk.broadinstitute.org/hc/en-us/articles/360035531812-GVCF-Genomic-Variant-Call-Format
-ploidy 1
Please sign in to leave a comment.
6 comments