GenotypeGVCFs stalls while using --all-sites
Hello,
I am using GATK4.1.0.0 and trying to combine GVCFs using GenotypeGVCFs. I need to genotype at all sites (not just SNPs) for popgen measures (pi, dxy). When I run without --all-sites, it runs fine, but when I run with --all-sites it stalls after 200K. I have also tried with --include-non-variant-sites. Same problem. I've also tried with different chromosomes. Here is the command, where $ref and $db directories are defined earlier in the script. The output is also pasted below. Any advice would be very much appreciated! Thanks!
gatk --java-options "-Xmx8G" GenotypeGVCFs \
-R $ref \
-V gendb://$db \
-L Chr01 \
--all-sites \
-O Chr01_allsites_raw.vcf
Sep 15, 2020 5:26:28 PM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
WARNING: Failed to detect whether we are running on Google Compute Engine.
java.net.ConnectException: Network is unreachable
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:589)
at sun.net.NetworkClient.doConnect(NetworkClient.java:175)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:432)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:527)
at sun.net.www.http.HttpClient.<init>(HttpClient.java:211)
at sun.net.www.http.HttpClient.New(HttpClient.java:308)
at sun.net.www.http.HttpClient.New(HttpClient.java:326)
at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:1169)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1105)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:999)
at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:933)
at shaded.cloud_nio.com.google.api.client.http.javanet.NetHttpRequest.execute(NetHttpRequest.java:104)
at shaded.cloud_nio.com.google.api.client.http.HttpRequest.execute(HttpRequest.java:981)
at shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials.runningOnComputeEngine(ComputeEngineCredentials.java:210)
at shaded.cloud_nio.com.google.auth.oauth2.DefaultCredentialsProvider.tryGetComputeCredentials(DefaultCredentialsProvider.java:290)
at shaded.cloud_nio.com.google.auth.oauth2.DefaultCredentialsProvider.getDefaultCredentialsUnsynchronized(DefaultCredentialsProvider.java:207)
at shaded.cloud_nio.com.google.auth.oauth2.DefaultCredentialsProvider.getDefaultCredentials(DefaultCredentialsProvider.java:124)
at shaded.cloud_nio.com.google.auth.oauth2.GoogleCredentials.getApplicationDefault(GoogleCredentials.java:127)
at shaded.cloud_nio.com.google.auth.oauth2.GoogleCredentials.getApplicationDefault(GoogleCredentials.java:100)
at com.google.cloud.ServiceOptions.defaultCredentials(ServiceOptions.java:304)
at com.google.cloud.ServiceOptions.<init>(ServiceOptions.java:278)
at com.google.cloud.storage.StorageOptions.<init>(StorageOptions.java:83)
at com.google.cloud.storage.StorageOptions.<init>(StorageOptions.java:31)
at com.google.cloud.storage.StorageOptions$Builder.build(StorageOptions.java:78)
at org.broadinstitute.hellbender.utils.gcs.BucketUtils.setGlobalNIODefaultOptions(BucketUtils.java:353)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:182)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:210)
at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:162)
at org.broadinstitute.hellbender.Main.mainEntry(Main.java:205)
at org.broadinstitute.hellbender.Main.main(Main.java:291)
17:26:28.688 INFO GenotypeGVCFs - ------------------------------------------------------------
17:26:28.689 INFO GenotypeGVCFs - The Genome Analysis Toolkit (GATK) v4.1.0.0
17:26:28.689 INFO GenotypeGVCFs - For support and documentation go to https://software.broadinstitute.org/gatk/
17:26:28.689 INFO GenotypeGVCFs - Executing as rachbay@r080.pvt.bridges.psc.edu on Linux v3.10.0-957.27.2.el7.x86_64 amd64
17:26:28.689 INFO GenotypeGVCFs - Java runtime: Java HotSpot(TM) 64-Bit Server VM v1.8.0_73-b02
17:26:28.689 INFO GenotypeGVCFs - Start Date/Time: September 15, 2020 5:26:28 PM EDT
17:26:28.689 INFO GenotypeGVCFs - ------------------------------------------------------------
17:26:28.689 INFO GenotypeGVCFs - ------------------------------------------------------------
17:26:28.690 INFO GenotypeGVCFs - HTSJDK Version: 2.18.2
17:26:28.690 INFO GenotypeGVCFs - Picard Version: 2.18.25
17:26:28.690 INFO GenotypeGVCFs - HTSJDK Defaults.COMPRESSION_LEVEL : 2
17:26:28.690 INFO GenotypeGVCFs - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
17:26:28.690 INFO GenotypeGVCFs - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
17:26:28.690 INFO GenotypeGVCFs - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
17:26:28.690 INFO GenotypeGVCFs - Deflater: IntelDeflater
17:26:28.690 INFO GenotypeGVCFs - Inflater: IntelInflater
17:26:28.690 INFO GenotypeGVCFs - GCS max retries/reopens: 20
17:26:28.690 INFO GenotypeGVCFs - Requester pays: disabled
17:26:28.690 INFO GenotypeGVCFs - Initializing engine
WARNING: No valid combination operation found for INFO field DS - the field will NOT be part of INFO fields in the generated VCF records
WARNING: No valid combination operation found for INFO field InbreedingCoeff - the field will NOT be part of INFO fields in the generated VCF records
WARNING: No valid combination operation found for INFO field MLEAC - the field will NOT be part of INFO fields in the generated VCF records
WARNING: No valid combination operation found for INFO field MLEAF - the field will NOT be part of INFO fields in the generated VCF records
WARNING: No valid combination operation found for INFO field DS - the field will NOT be part of INFO fields in the generated VCF records
WARNING: No valid combination operation found for INFO field InbreedingCoeff - the field will NOT be part of INFO fields in the generated VCF records
WARNING: No valid combination operation found for INFO field MLEAC - the field will NOT be part of INFO fields in the generated VCF records
WARNING: No valid combination operation found for INFO field MLEAF - the field will NOT be part of INFO fields in the generated VCF records
17:27:15.128 INFO IntervalArgumentCollection - Processing 42612672 bp from intervals
17:27:15.138 INFO GenotypeGVCFs - Done initializing engine
17:27:15.344 INFO ProgressMeter - Starting traversal
17:27:15.345 INFO ProgressMeter - Current Locus Elapsed Minutes Variants Processed Variants/Minute
WARNING: No valid combination operation found for INFO field DS - the field will NOT be part of INFO fields in the generated VCF records
WARNING: No valid combination operation found for INFO field InbreedingCoeff - the field will NOT be part of INFO fields in the generated VCF records
WARNING: No valid combination operation found for INFO field MLEAC - the field will NOT be part of INFO fields in the generated VCF records
WARNING: No valid combination operation found for INFO field MLEAF - the field will NOT be part of INFO fields in the generated VCF records
Chromosome Chr01 position 74432 (TileDB column 74431) has too many alleles in the combined VCF record : 61 : current limit : 50. Fields, such as PL, with length equal to the number of genotypes will NOT be added for this location.
Chromosome Chr01 position 143975 (TileDB column 143974) has too many alleles in the combined VCF record : 55 : current limit : 50. Fields, such as PL, with length equal to the number of genotypes will NOT be added for this location.
GENOMICSDB_TIMER,GenomicsDB iterator next() timer,Wall-clock time(s),685.9508288570295,Cpu time(s),252.06587346199646
17:42:10.112 INFO ProgressMeter - Chr01:1000 14.9 1000 67.1
17:42:29.410 INFO ProgressMeter - Chr01:2000 15.2 2000 131.3
17:43:18.380 INFO ProgressMeter - Chr01:6000 16.1 6000 373.8
17:43:29.260 INFO ProgressMeter - Chr01:8000 16.2 8000 492.9
17:44:01.384 INFO ProgressMeter - Chr01:9000 16.8 9000 536.8
17:44:18.603 INFO ProgressMeter - Chr01:10000 17.1 10000 586.4
17:44:34.819 INFO ProgressMeter - Chr01:11000 17.3 11000 634.9
17:44:51.079 INFO ProgressMeter - Chr01:12000 17.6 12000 682.0
17:45:04.837 INFO ProgressMeter - Chr01:14000 17.8 14000 785.4
17:45:15.947 INFO ProgressMeter - Chr01:15000 18.0 15000 832.9
17:45:34.110 INFO ProgressMeter - Chr01:17000 18.3 17000 928.3
17:45:46.279 INFO ProgressMeter - Chr01:20000 18.5 20000 1080.2
17:45:58.621 INFO ProgressMeter - Chr01:24000 18.7 24000 1282.0
17:46:11.520 INFO ProgressMeter - Chr01:28000 18.9 28000 1478.6
17:46:25.252 INFO ProgressMeter - Chr01:32000 19.2 32000 1669.7
17:46:35.329 INFO ProgressMeter - Chr01:34000 19.3 34000 1758.6
17:46:50.055 INFO ProgressMeter - Chr01:36000 19.6 36000 1838.8
17:47:01.808 INFO ProgressMeter - Chr01:37000 19.8 37000 1871.1
17:47:11.945 INFO ProgressMeter - Chr01:38000 19.9 38000 1905.4
17:47:29.569 INFO ProgressMeter - Chr01:40000 20.2 40000 1976.6
17:47:39.999 INFO ProgressMeter - Chr01:42000 20.4 42000 2057.7
17:47:53.201 INFO ProgressMeter - Chr01:44000 20.6 44000 2132.7
17:48:14.118 INFO ProgressMeter - Chr01:45000 21.0 45000 2145.0
17:48:40.184 INFO ProgressMeter - Chr01:46000 21.4 46000 2148.1
17:48:53.388 INFO ProgressMeter - Chr01:47000 21.6 47000 2172.5
17:49:11.885 INFO ProgressMeter - Chr01:49000 21.9 49000 2233.1
17:49:31.071 INFO ProgressMeter - Chr01:51000 22.3 51000 2290.9
17:49:43.809 INFO ProgressMeter - Chr01:53000 22.5 53000 2358.2
17:49:54.697 INFO ProgressMeter - Chr01:54000 22.7 54000 2383.5
17:50:21.743 INFO ProgressMeter - Chr01:55000 23.1 55000 2380.3
17:50:37.773 INFO ProgressMeter - Chr01:57000 23.4 57000 2438.6
17:51:03.435 INFO ProgressMeter - Chr01:59000 23.8 59000 2478.8
17:51:19.038 INFO ProgressMeter - Chr01:60000 24.1 60000 2493.6
17:51:50.084 INFO ProgressMeter - Chr01:61000 24.6 61000 2481.8
17:52:22.112 INFO ProgressMeter - Chr01:62000 25.1 62000 2468.9
17:52:32.430 INFO ProgressMeter - Chr01:64000 25.3 64000 2531.2
17:52:46.686 INFO ProgressMeter - Chr01:68000 25.5 68000 2664.3
17:52:58.951 INFO ProgressMeter - Chr01:72000 25.7 72000 2798.6
17:53:07.034 WARN MinimalGenotypingEngine - Attempting to genotype more than 50 alleles. Site will be skipped at location Chr01:74432
17:53:09.390 INFO ProgressMeter - Chr01:75000 25.9 75000 2895.7
17:53:22.277 INFO ProgressMeter - Chr01:79000 26.1 79000 3025.0
17:53:32.367 INFO ProgressMeter - Chr01:82000 26.3 82000 3119.8
17:53:42.633 INFO ProgressMeter - Chr01:84000 26.5 84000 3175.2
17:53:53.576 INFO ProgressMeter - Chr01:87000 26.6 87000 3266.1
17:54:03.672 INFO ProgressMeter - Chr01:90000 26.8 90000 3357.5
17:54:16.631 INFO ProgressMeter - Chr01:94000 27.0 94000 3478.7
17:54:28.404 INFO ProgressMeter - Chr01:97000 27.2 97000 3563.9
17:54:42.970 INFO ProgressMeter - Chr01:101000 27.5 101000 3678.0
17:54:55.067 INFO ProgressMeter - Chr01:105000 27.7 105000 3795.8
17:55:07.141 INFO ProgressMeter - Chr01:108000 27.9 108000 3876.1
17:55:18.010 INFO ProgressMeter - Chr01:111000 28.0 111000 3958.0
17:55:30.441 INFO ProgressMeter - Chr01:115000 28.3 115000 4070.6
17:55:42.281 INFO ProgressMeter - Chr01:119000 28.4 119000 4182.9
17:55:54.376 INFO ProgressMeter - Chr01:123000 28.7 123000 4293.1
17:56:07.351 INFO ProgressMeter - Chr01:127000 28.9 127000 4399.5
17:56:17.666 INFO ProgressMeter - Chr01:130000 29.0 130000 4476.8
17:56:28.967 INFO ProgressMeter - Chr01:133000 29.2 133000 4550.6
17:56:46.308 INFO ProgressMeter - Chr01:135000 29.5 135000 4573.8
17:56:59.394 INFO ProgressMeter - Chr01:137000 29.7 137000 4607.5
17:57:17.916 INFO ProgressMeter - Chr01:139000 30.0 139000 4626.7
17:57:31.680 INFO ProgressMeter - Chr01:141000 30.3 141000 4657.7
17:57:46.253 WARN MinimalGenotypingEngine - Attempting to genotype more than 50 alleles. Site will be skipped at location Chr01:143975
17:57:46.929 INFO ProgressMeter - Chr01:144000 30.5 144000 4717.2
17:58:00.245 INFO ProgressMeter - Chr01:148000 30.7 148000 4813.3
17:58:11.713 INFO ProgressMeter - Chr01:151000 30.9 151000 4880.5
17:58:22.299 INFO ProgressMeter - Chr01:154000 31.1 154000 4949.2
17:58:37.859 INFO ProgressMeter - Chr01:157000 31.4 157000 5003.9
17:58:48.262 INFO ProgressMeter - Chr01:160000 31.5 160000 5071.5
17:59:00.787 INFO ProgressMeter - Chr01:164000 31.8 164000 5164.2
17:59:14.582 INFO ProgressMeter - Chr01:167000 32.0 167000 5220.8
17:59:26.127 INFO ProgressMeter - Chr01:168000 32.2 168000 5220.7
17:59:44.531 INFO ProgressMeter - Chr01:170000 32.5 170000 5233.0
18:00:01.756 INFO ProgressMeter - Chr01:172000 32.8 172000 5248.1
18:00:14.106 INFO ProgressMeter - Chr01:174000 33.0 174000 5276.0
18:00:26.962 INFO ProgressMeter - Chr01:178000 33.2 178000 5362.5
18:00:38.501 INFO ProgressMeter - Chr01:181000 33.4 181000 5421.4
18:00:49.075 INFO ProgressMeter - Chr01:184000 33.6 184000 5482.4
18:01:00.092 INFO ProgressMeter - Chr01:187000 33.7 187000 5541.4
18:01:11.563 INFO ProgressMeter - Chr01:190000 33.9 190000 5598.6
18:01:22.578 INFO ProgressMeter - Chr01:193000 34.1 193000 5656.4
18:01:34.090 INFO ProgressMeter - Chr01:196000 34.3 196000 5712.2
18:01:47.073 INFO ProgressMeter - Chr01:200000 34.5 200000 5792.3
WARNING: No valid combination operation found for INFO field DS - the field will NOT be part of INFO fields in the generated VCF records
WARNING: No valid combination operation found for INFO field InbreedingCoeff - the field will NOT be part of INFO fields in the generated VCF records
WARNING: No valid combination operation found for INFO field MLEAC - the field will NOT be part of INFO fields in the generated VCF records
WARNING: No valid combination operation found for INFO field MLEAF - the field will NOT be part of INFO fields in the generated VCF records
-
Rachael Bay have you tried increasing your memory allocation to be larger (--java-options "-Xmx8G") ? If there is an issue with memory allocation another trick is to specify a temporary directory with the option --tmp-dir
-
Hi Genevieve Brandt - thanks for the response! We've tried with up to 96G for java and both with and without the --tmp-dir option. Any other ideas?
-
Rachael Bay since it stalls at the same location each time, are there many alleles at that location? It might be useful to check out DepthofCoverage or another tool and look at that area in the genome.
Another thing you can try is to update your GATK version since we continually improve the tools.
-
Hi Genevieve Brandt, I don't think it's the location, since we stall at the exact same position (200000) when we try a different chromosome. I've also tried updating my GATK version and same issue. We're really stuck here, any help would be great!
-
Rachael Bay have you had any successful runs using --all-sites with any of your chromosomes? Or do they all fail at position 200000?
Does the job fail, or just stall? And for how long?
-
Genevieve Brandt I have only tried the first 2 chromosomes, but they both stalled at position 200000. The jobs do not actually fail, they just stall. They reach 200000 quickly (an hour or so) and I let them run for 24 hours.
-
Rachael Bay thank you for this info. I am continuing to look into solutions and will let you know when I get more information.
-
Hi Rachael Bay,
Sometimes early positions in the chromosome do not have data and we are wondering if this is an issue with writing an enormous amount of information once it starts outputting data.
- Could you try writing to a .gz VCF file? It should have better performance.
- How big is your genomicsDB, how many samples are you working with?
- Could you give more information about your machine that you are running this on? How much memory is available to the machine and is it the exact amount that you are giving for the java memory? The machine will need slightly more memory than you give in the java options to account for the C++ memory.
- Could you start in the middle of the chromosome and see if it runs? Perhaps from position 500,000 to 1,000,000
-
Hi Genevieve Brandt,
*I tried writing to a .gz file - same issue
*The genomicsDB directory is large - looks like 793G. We have 471 individuals.
*I am running on the Bridges cluster through the NSF XSEDE program. I have tried both regular memory and large memory. On large memory I requested 128G from the machine while asking for 96 in the java options, so there should be extra memory there.
*Does the genomicsDB have to be remade to start in the middle of the chromosome? I tried specifying the interval you suggested above with Haplotype caller, but it JUST stalled (never got the progress meter showing position)
-
Hi Rachael Bay, thank you for those updates. I have some follow up questions to try to get to the bottom of this.
- We want to determine if this is an issue with the genomicsDB or GenotypeGVCFs with --all-sites. Could you run SelectVariants at position 200k-201k and see if it gets any results?
- What species are you working with? Is there anything weird happening at the reference at position 200k?
-
Hi Genevieve Brandt. When I look at that region, it actually looks like the last position is 1999988. This is exactly the same across multiple chromosomes (I have tried three of six). I am working with Zostera marina, but I don't really see why there would be something weird happening at that position in multiple chromosomes.
-
Did you run SelectVariants yet?
I am confused about what you mean by the last position is 1999988, since we are looking into position 200K? Am I misunderstanding, or did you possibly make a typo?
-
Genevieve Brandt Sorry, it was a typo sort of. Just looking right now. The last position recorded for Chr01 is 199988 and for Chr03 is 199998, so not EXACTLY the same position. Let me know if it would be at all useful for me to try this on the other chromosomes. I'm really at a loss for what else to try!
-
Also, I guess I'm confused about how running SelectVariants would see anything different since it would just return a subset of the raw vcf file, right?
-
Rachael Bay thanks for clarifying!
Running SelectVariants, though not helpful for your final result, helps us in our troubleshooting of this issue. We are trying to determine if this runtime issue is from the GenomicsDB or if it is from GenotypeGVCFs.
-
Genevieve Brandt thanks, that make sense. When I run SelectVariants for Chr01:200000-201000 it stalls with no variants output.
-
Rachael Bay thanks for the update, I'll bring this up with my team and let you know when I have more information
-
Hi Rachael Bay, here are a few more things that you could check that would give us more information:
- While you run GenotypeGVCFs and it is stalled, could you inspect the java process and what is going on? The command is jstack with the process ID: (https://docs.oracle.com/en/java/javase/13/docs/specs/man/jstack.html). Look at the jstack a few different times to try to figure out what is going on while it is stalled.
- Could you give more information about how you ran the import to your GenomicsDB? Was it in batches?
- Could you create a new GenomicsDB workspace with all the individuals on a small interval, for example 195,000-200,000 and see if the same issue persists?
And something to try for a possible improvement:
- What version of GATK did you use to create the GenomicsDB? We have been improving GenomicsDB quite a bit and our most up to date version is 4.1.9.0. If you re-run the import you may see better performance and it may help the GenotypeGVCFs process. [You will need to use the option --genomicsdb-shared-posixfs-optimizations for your cluster]
Please sign in to leave a comment.
18 comments