GatherVcfs (version 4.1.8.1) fails with java.lang.NullPointerException
Hi GATK Team,
I tried to run GatherVcfs (version 4.1.8.1) to gather the phased per-chromosome VCF files, whose headers I had replaced, but got the error "There was a problem with gathering the INPUT.java.lang.NullPointerException".
My script is as follows:
gatk=/PATH/gatk
time $gatk --java-options "-Xmx1G" GatherVcfs \
-I JC.chr1.filter.vcf.gz \
-I JC.chr2.filter.vcf.gz \
-I JC.chr3.filter.vcf.gz \
-I JC.chr4.filter.vcf.gz \
-I JC.chr5.filter.vcf.gz \
-I JC.chr6.filter.vcf.gz \
-I JC.chr7.filter.vcf.gz \
-I JC.chr8.filter.vcf.gz \
-I JC.chr9.filter.vcf.gz \
-I JC.chr10.filter.vcf.gz \
-I JC.chr11.filter.vcf.gz \
-I JC.chr12.filter.vcf.gz \
-I JC.chr13.filter.vcf.gz \
-I JC.chr14.filter.vcf.gz \
-I JC.chr15.filter.vcf.gz \
-I JC.chr16.filter.vcf.gz \
-I JC.chr17.filter.vcf.gz \
-I JC.chr18.filter.vcf.gz \
-I JC.chr19.filter.vcf.gz \
-I JC.chr20.filter.vcf.gz \
-I JC.chr21.filter.vcf.gz \
-I JC.chr22.filter.vcf.gz \
--CREATE_INDEX false \
-O SW709.filter.phased.vcf.gz
The run log is as follows:
12:57:20.481 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/PATH/gatk-4.1.8.1/gatk-package-4.1.8.1-local.jar!/com/intel/gkl/native/libgkl_compression.so
[Sat Aug 22 12:57:20 CST 2020] GatherVcfs --INPUT JC.chr1.filter.vcf.gz --INPUT JC.chr2.filter.vcf.gz --INPUT JC.chr3.filter.vcf.gz --INPUT JC.chr4.filter.vcf.gz --INPUT JC.chr5.filter.vcf.gz --INPUT JC.chr6.filter.vcf.gz --INPUT JC.chr7.filter.vcf.gz --INPUT JC.chr8.filter.vcf.gz --INPUT JC.chr9.filter.vcf.gz --INPUT JC.chr10.filter.vcf.gz --INPUT JC.chr11.filter.vcf.gz --INPUT JC.chr12.filter.vcf.gz --INPUT JC.chr13.filter.vcf.gz --INPUT JC.chr14.filter.vcf.gz --INPUT JC.chr15.filter.vcf.gz --INPUT JC.chr16.filter.vcf.gz --INPUT JC.chr17.filter.vcf.gz --INPUT JC.chr18.filter.vcf.gz --INPUT JC.chr19.filter.vcf.gz --INPUT JC.chr20.filter.vcf.gz --INPUT JC.chr21.filter.vcf.gz --INPUT JC.chr22.filter.vcf.gz --OUTPUT SW709.filter.phased.vcf.gz --CREATE_INDEX false --REORDER_INPUT_BY_FIRST_VARIANT false --VERBOSITY INFO --QUIET false --VALIDATION_STRINGENCY STRICT --COMPRESSION_LEVEL 2 --MAX_RECORDS_IN_RAM 500000 --CREATE_MD5_FILE false --GA4GH_CLIENT_SECRETS client_secrets.json --help false --version false --showHidden false --USE_JDK_DEFLATER false --USE_JDK_INFLATER false
Aug 22, 2020 12:57:21 PM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
INFO: Failed to detect whether we are running on Google Compute Engine.
[Sat Aug 22 12:57:21 CST 2020] Executing as qianxiaobo@cngb-compute-m16-7.cngb.sz.hpc on Linux 2.6.32-696.30.1.el6.x86_64 amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_101-b13; Deflater: Intel; Inflater: Intel; Provider GCS is available; Picard version: Version:4.1.8.1
INFO 2020-08-22 12:57:21 GatherVcfs Checking inputs.
INFO 2020-08-22 12:57:22 GatherVcfs Checking file headers and first records to ensure compatibility.
ERROR 2020-08-22 12:57:22 GatherVcfs There was a problem with gathering the INPUT.java.lang.NullPointerException
[Sat Aug 22 12:57:22 CST 2020] picard.vcf.GatherVcfs done. Elapsed time: 0.04 minutes.
Runtime.totalMemory()=643301376
To get help, see http://broadinstitute.github.io/picard/index.html#GettingHelp
Using GATK jar /PATH/gatk-4.1.8.1/gatk-package-4.1.8.1-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -Xmx1G -jar /PATH/gatk-4.1.8.1/gatk-package-4.1.8.1-local.jar GatherVcfs -I JC.chr1.filter.vcf.gz -I JC.chr2.filter.vcf.gz -I JC.chr3.filter.vcf.gz -I JC.chr4.filter.vcf.gz -I JC.chr5.filter.vcf.gz -I JC.chr6.filter.vcf.gz -I JC.chr7.filter.vcf.gz -I JC.chr8.filter.vcf.gz -I JC.chr9.filter.vcf.gz -I JC.chr10.filter.vcf.gz -I JC.chr11.filter.vcf.gz -I JC.chr12.filter.vcf.gz -I JC.chr13.filter.vcf.gz -I JC.chr14.filter.vcf.gz -I JC.chr15.filter.vcf.gz -I JC.chr16.filter.vcf.gz -I JC.chr17.filter.vcf.gz -I JC.chr18.filter.vcf.gz -I JC.chr19.filter.vcf.gz -I JC.chr20.filter.vcf.gz -I JC.chr21.filter.vcf.gz -I JC.chr22.filter.vcf.gz --CREATE_INDEX false -O SW709.filter.phased.vcf.gz
There is no other ERROR information in the log file, so I do not know how to solve this.
Please note that I replaced the header of each chromosome VCF file with the following:
##fileformat=VCFv4.2
##filedate=20200624
##source="beagle.13Mar20.38e.jar"
##INFO=<ID=AF,Number=A,Type=Float,Description="Estimated ALT Allele Frequencies">
##INFO=<ID=DR2,Number=1,Type=Float,Description="Dosage R-Squared: estimated squared correlation between estimated REF dose [P(RA) + 2*P(RR)] and true REF dose">
##INFO=<ID=IMP,Number=0,Type=Flag,Description="Imputed marker">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##FORMAT=<ID=DS,Number=A,Type=Float,Description="estimated ALT dose [P(RA) + 2*P(AA)]">
I am not sure whether this affects GatherVcfs.
I look forward to your reply.
-
Hi QIANxiaobo, could you check whether your VCF files are properly sorted? We have a tool, SortVcf, that does this. Also, what do you mean by changing the header? What command did you use?
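For a quick look before running SortVcf, the check suggested above could be sketched in Python roughly as follows. This is only an illustration, not how GATK or Picard performs the check: the function names `open_vcf` and `check_sorting` are made up for this example, and the script only tests that each contig appears as one contiguous block with non-decreasing positions. It also lists the contigs declared in `##contig` header lines, on the assumption (not confirmed in this thread) that a replaced header without such lines is worth ruling out.

```python
import gzip
import re

# Matches header lines such as ##contig=<ID=chr1,length=248956422>
CONTIG_ID = re.compile(r"^##contig=<ID=([^,>]+)")


def open_vcf(path):
    """Open a plain or gzip/bgzip-compressed VCF as text (hypothetical helper)."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path, "rt")


def check_sorting(path):
    """Return (declared_contigs, first_problem).

    declared_contigs: contig IDs found in ##contig header lines, in file order.
    first_problem: (line_number, contig, pos) of the first record that is out of
    order, or None if every contig forms one block with non-decreasing positions.
    """
    declared = []
    seen = set()
    last_contig, last_pos = None, 0
    with open_vcf(path) as handle:
        for line_number, line in enumerate(handle, start=1):
            if line.startswith("#"):
                match = CONTIG_ID.match(line)
                if match:
                    declared.append(match.group(1))
                continue
            if not line.strip():
                continue
            contig, pos = line.split("\t", 2)[:2]
            pos = int(pos)
            if contig != last_contig:
                if contig in seen:
                    # The contig reappears after records from another contig.
                    return declared, (line_number, contig, pos)
                seen.add(contig)
                last_contig = contig
            elif pos < last_pos:
                # Position goes backwards within the same contig.
                return declared, (line_number, contig, pos)
            last_pos = pos
    return declared, None
```

Calling `check_sorting("JC.chr1.filter.vcf.gz")` returns the declared contigs and the first offending record, if any. An empty contig list means the header has no `##contig` lines. The script does not compare the contig order of the records against the order in the header, so a clean result here does not replace SortVcf or ValidateVariants.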
-
Dear Brandt,
I ran into the same issue and then tried to use SortVcf to sort every input before using GatherVcfs to concatenate them.
Then I hit this error (I added the Java option -DGATK_STACKTRACE_ON_USER_EXCEPTION=true):
To get help, see http://broadinstitute.github.io/picard/index.html#GettingHelp
java.lang.NullPointerException
    at htsjdk.variant.variantcontext.VariantContextComparator.compare(VariantContextComparator.java:87)
    at htsjdk.variant.variantcontext.VariantContextComparator.compare(VariantContextComparator.java:22)
    at java.base/java.util.TimSort.countRunAndMakeAscending(TimSort.java:355)
    at java.base/java.util.TimSort.sort(TimSort.java:234)
    at java.base/java.util.ArraysParallelSortHelpers$FJObject$Sorter.compute(ArraysParallelSortHelpers.java:145)
    at java.base/java.util.concurrent.CountedCompleter.exec(CountedCompleter.java:746)
    at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290)
    at java.base/java.util.concurrent.ForkJoinTask.doInvoke(ForkJoinTask.java:408)
    at java.base/java.util.concurrent.ForkJoinTask.invoke(ForkJoinTask.java:736)
    at java.base/java.util.Arrays.parallelSort(Arrays.java:1183)
    at htsjdk.samtools.util.SortingCollection.spillToDisk(SortingCollection.java:247)
    at htsjdk.samtools.util.SortingCollection.add(SortingCollection.java:182)
    at picard.vcf.SortVcf.sortInputs(SortVcf.java:165)
    at picard.vcf.SortVcf.doWork(SortVcf.java:98)
    at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:305)
    at org.broadinstitute.hellbender.cmdline.PicardCommandLineProgramExecutor.instanceMain(PicardCommandLineProgramExecutor.java:25)
    at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)
    at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)
    at org.broadinstitute.hellbender.Main.main(Main.java:289)
The GATK version I used is: 4.1.8.1
What could be wrong with my VCF file?
-
Hi Yangyxt,
What are you trying to combine here? GatherVcfs only works on VCFs that have exactly the same set of samples and totally discrete sets of loci.
You can check your VCFs for any other issues with ValidateVariants.
Best,
Genevieve
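The two requirements mentioned above (identical samples, non-overlapping loci) can be checked roughly before calling GatherVcfs. The Python sketch below is only an illustration under stated assumptions: `summarize` and `check_gatherable` are hypothetical names, the sample comparison is a plain comparison of the columns after FORMAT on the `#CHROM` line, and overlap is only tested between consecutive inputs on the same contig. It is not the logic GatherVcfs itself uses.

```python
import gzip


def open_vcf(path):
    """Open a plain or gzip/bgzip-compressed VCF as text (hypothetical helper)."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path, "rt")


def summarize(path):
    """Return (samples, first_locus, last_locus); loci are (contig, pos) tuples."""
    samples, first, last = None, None, None
    with open_vcf(path) as handle:
        for line in handle:
            if not line.strip():
                continue
            fields = line.rstrip("\n").split("\t")
            if line.startswith("#CHROM"):
                samples = fields[9:]  # sample names follow the FORMAT column
                continue
            if line.startswith("#"):
                continue
            locus = (fields[0], int(fields[1]))
            if first is None:
                first = locus
            last = locus
    return samples, first, last


def check_gatherable(paths):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    summaries = [summarize(p) for p in paths]
    reference_samples = summaries[0][0]
    for path, (samples, _, _) in zip(paths, summaries):
        if samples is None:
            problems.append(f"{path}: no #CHROM header line found")
        elif samples != reference_samples:
            problems.append(f"{path}: sample columns differ from {paths[0]}")
    # Consecutive inputs on the same contig must not overlap or go backwards.
    for i in range(1, len(paths)):
        prev_last = summaries[i - 1][2]
        cur_first = summaries[i][1]
        if (prev_last and cur_first
                and prev_last[0] == cur_first[0]
                and cur_first[1] <= prev_last[1]):
            problems.append(
                f"{paths[i]}: first record {cur_first} is not after "
                f"last record {prev_last} of {paths[i - 1]}"
            )
    return problems
```

For the inputs in the original post this could be called as `check_gatherable([f"JC.chr{i}.filter.vcf.gz" for i in range(1, 23)])`. The sketch reads every file in full, so it is slow on large VCFs, and ValidateVariants remains the more thorough check.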