VariantFiltration Invalid Argument
Can you please provide
a) GATK version used 4.1.4.1
b) Exact GATK commands used
c) The entire error log if applicable.
I have the following snakemake rule:
rule gatk_filter:
    input:
        genome = "../../../../data/external/genome.fa",
        vcf = expand("../../../../data/final/pipeline_2/bwa_mem_freebayes_{sample}.vcf", sample=SAMPLES)
    output:
        vcf = expand("../../../../data/final/pipeline_2/bwa_mem_freebayes_{sample}.filtered.vcf", sample=SAMPLES)
    params:
        filters = {"myfilter": "QA > 9 || DP > 4 || AF > 0.01 || AC > 1"}
    conda:
        "../../../environment.yml"
    shell:
        "gatk VariantFiltration -R {input.genome} -V {input.vcf} --filter-expression {params.filters} -O {output.vcf}"
Any time I try to run it, I get an error:
A USER ERROR has occurred: invalid argument 'QA > 9,'.
***********************************************************************
Set the system property GATK_STACKTRACE_ON_USER_EXCEPTION (--java-options '-DGATK_STACKTRACE_ON_USER_EXCEPTION=true') to print the stack trace.
[Wed Apr 8 11:50:04 2020]
Error in rule gatk_filter:
jobid: 0
output: ../../../../data/final/pipeline_2/bwa_mem_freebayes_B.filtered.vcf
conda-env: /home/name/project/evaluating_the_performance_of_variant_calling_pipelines/src/pipelines/subworkflows/bwa_mem_freebayes/.snakemake/conda/24d9fba4
RuleException:
CalledProcessError in line 126 of /home/name/project/evaluating_the_performance_of_variant_calling_pipelines/src/pipelines/subworkflows/bwa_mem_freebayes/Snakefile:
Command 'source activate /home/name/project/evaluating_the_performance_of_variant_calling_pipelines/src/pipelines/subworkflows/bwa_mem_freebayes/.snakemake/conda/24d9fba4; set -euo pipefail; gatk VariantFiltration -R ../../../../data/external/genome.fa -V ../../../../data/final/pipeline_2/bwa_mem_freebayes_B.vcf --filter-expression {'myfilter': 'QA > 9', 'dp': 'DP > 4', 'af': 'AF > 0.01', 'ac': 'AC > 1'} -O ../../../../data/final/pipeline_2/bwa_mem_freebayes_B.filtered.vcf ' returned non-zero exit status 1.
File "/home/name/project/evaluating_the_performance_of_variant_calling_pipelines/src/pipelines/subworkflows/bwa_mem_freebayes/Snakefile", line 126, in __rule_gatk_filter
File "/home/name/miniconda3/envs/evaluating_the_performance_of_vcp/lib/python3.6/concurrent/futures/thread.py", line 56, in run
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: /home/name/project/evaluating_the_performance_of_variant_calling_pipelines/src/pipelines/subworkflows/bwa_mem_freebayes/.snakemake/log/2020-04-08T115003.331551.snakemake.log
I don't understand why the argument is invalid. I was following the tool description page and copied the example filter expression, but it does not work with that either. Please help; all of my other rules run without an issue.
-
Hi mons7re
What is the QA field you are filtering on? Is that field present in your vcf?
-
I think so. Here is a row of my vcf:
AB=0.0952381;ABP=32.8939;AC=1;AF=0.5;AN=2;AO=2;CIGAR=1X;DP=21;DPB=21;DPRA=0;EPP=7.35324;EPPR=4.03889;GTI=0;LEN=1;MEANALT=1;MQM=60;MQMR=58.5789;NS=1;NUMALT=1;ODDS=13.3916;PAIRED=1;PAIREDR=1;PAO=0;PQA=0;PQR=0;PRO=0;QA=73;QR=633;RO=19;RPL=1;RPP=3.0103;RPPR=12.2676;RPR=1;RUN=1;SAF=1;SAP=3.0103;SAR=1;SRF=6;SRP=8.61041;SRR=13;TYPE=snp;technology.illumina=1
Is there a better tool to filter quality, AC, AF and DP? I'm sorry if I am asking stupid questions, I am not experienced in bioinformatics, just working on a project!
-
Can you please provide the header and the first few records of the VCF file? This does not look right. The VCF format is described here: https://samtools.github.io/hts-specs/VCFv4.2.pdf
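-
A note on the command echoed in the error log above: the value after --filter-expression is the Python dict itself, so the shell appears to split it into several words, and one of those words is exactly the 'QA > 9,' that GATK rejects. The sketch below reproduces that splitting with shlex, then builds one --filter-name/--filter-expression pair per filter with shell quoting. It is a minimal illustration, not a tested fix for this pipeline: build_filter_args and out.vcf are hypothetical names, and the assumption is that {params.filters} is expanded with str() of the dict, as the log suggests.

```python
import shlex

# Filter names and expressions as they appear in the command echoed by the error log.
filters = {"myfilter": "QA > 9", "dp": "DP > 4", "af": "AF > 0.01", "ac": "AC > 1"}

# Assumption: "{params.filters}" puts str(dict) into the shell command, as the log shows.
broken = f"gatk VariantFiltration --filter-expression {filters} -O out.vcf"
broken_tokens = shlex.split(broken)  # approximates how a POSIX shell splits the words
print(broken_tokens)  # contains the stray token 'QA > 9,' named in the USER ERROR


def build_filter_args(filters):
    """Hypothetical helper: one quoted --filter-name/--filter-expression pair per entry."""
    parts = []
    for name, expr in filters.items():
        parts += ["--filter-name", shlex.quote(name),
                  "--filter-expression", shlex.quote(expr)]
    return " ".join(parts)


fixed = f"gatk VariantFiltration {build_filter_args(filters)} -O out.vcf"
fixed_tokens = shlex.split(fixed)
print(fixed_tokens)  # each expression now arrives as a single argument
```

If this matches what is happening, one option would be to set filters in params to the string returned by such a helper, so that {params.filters} expands to ready-made, quoted arguments. Whether the expressions themselves evaluate correctly against this VCF is a separate question.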