USER ERROR has occurred: Sorry, we only support a single reads input for for this spark tool.
When I run
gatk HaplotypeCallerSpark -R newzf20/GCF_008822105.2_bTaeGut2.pat.W.v2_genomic.fa -I RSFV1A_match.bam -I RSFV1B_match.bam -L NC_044998.1 --stand-call-conf 30 -mbq 20 --spark-runner LOCAL --spark-master local[2] --conf spark.executor.memoryOverhead=600 -O raw_variantsZF_NC_044998.1test.vcf
The program complains:
A USER ERROR has occurred: Sorry, we only support a single reads input for for this spark tool.
So I have to run it with only one individual/sample. Is this the expected behavior?
Also, note that I'm running Spark locally there. What would the master_url be to run the cluster version, as in:
./gatk HaplotypeCallerSpark -I hdfs://path/to/input.bam -O hdfs://path/to/output.bam \
-- \
--spark-runner SPARK --spark-master <master_url> \
--num-executors 5 --executor-cores 2 --executor-memory 4g \
--conf spark.executor.memoryOverhead=600
-
Hi madzayasodara, HaplotypeCaller is meant to be run on one sample at a time, with the samples joint-genotyped together later in the pipeline. Please see this doc: https://gatk.broadinstitute.org/hc/en-us/articles/360035890411-Calling-variants-on-cohorts-of-samples-using-the-HaplotypeCaller-in-GVCF-mode
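As a rough sketch of that workflow for the two BAMs in the question: run HaplotypeCaller once per sample in GVCF mode, merge the per-sample GVCFs, then joint-genotype. This assumes the regular (non-Spark) HaplotypeCaller. The `.g.vcf.gz` and `cohort` file names and the `joint_call` function are made up for illustration, and moving the calling-confidence threshold to the genotyping step is an assumption about what was intended. By default the script only prints the commands so they can be reviewed first.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Dry run by default: commands are printed, not executed.
# Run with GATK=gatk in the environment to actually execute them.
GATK="${GATK:-echo gatk}"

REF=newzf20/GCF_008822105.2_bTaeGut2.pat.W.v2_genomic.fa
INTERVAL=NC_044998.1

joint_call() {
    # 1. One HaplotypeCaller run per sample, emitting a GVCF (-ERC GVCF).
    #    Output names are hypothetical.
    for sample in RSFV1A RSFV1B; do
        $GATK HaplotypeCaller -R "$REF" -I "${sample}_match.bam" \
            -L "$INTERVAL" -mbq 20 -ERC GVCF -O "${sample}.g.vcf.gz"
    done

    # 2. Merge the per-sample GVCFs into one cohort GVCF (hypothetical name).
    $GATK CombineGVCFs -R "$REF" -L "$INTERVAL" \
        -V RSFV1A.g.vcf.gz -V RSFV1B.g.vcf.gz -O cohort.g.vcf.gz

    # 3. Joint genotyping; the confidence threshold is applied here.
    $GATK GenotypeGVCFs -R "$REF" -L "$INTERVAL" -V cohort.g.vcf.gz \
        -stand-call-conf 30 -O raw_variantsZF_NC_044998.1test.vcf
}

joint_call
```

For larger cohorts the doc linked above describes GenomicsDBImport as an alternative to CombineGVCFs for the merge step.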