SelectVariants error
I need some help with GATK 4.1.7.0. I am running it on multiple samples of a plant genome with 8 chromosomes and 7000 contigs. I used the pipeline from https://gencore.bio.nyu.edu/variant-calling-pipeline-gatk4/
I am using 6 different genotypes, so I ran HaplotypeCaller in -ERC GVCF mode with the command:
java -jar gatk-package-4.1.7.0-local.jar HaplotypeCaller -R Ca_genome.fasta -I C104_deduped.bam -O C104.raw_variants.g.vcf -ERC GVCF
This command generated a raw g.vcf file of approx. 8 GB. I then combined all the g.vcf files with CombineGVCFs, which generated a 56 GB final raw VCF file. Now I am extracting SNPs and indels for hard filtering using the command:
java -jar gatk-package-4.1.7.0-local.jar SelectVariants -R Ca_genome.fasta -V RAW_VCF/All.cohort.g.vcf --select-type SNP -O All.RAW.INDEL.vcf
This command generated a file of just 400 KB and took 7.6 mins. The log is below:
14:24:18.743 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/home/nchakraborty/Desktop/gatk-4.1.7.0/gatk-package-4.1.7.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
Jul 25, 2020 2:24:18 PM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
INFO: Failed to detect whether we are running on Google Compute Engine.
14:24:18.839 INFO SelectVariants - ------------------------------------------------------------
14:24:18.840 INFO SelectVariants - The Genome Analysis Toolkit (GATK) v4.1.7.0
14:24:18.840 INFO SelectVariants - For support and documentation go to https://software.broadinstitute.org/gatk/
14:24:18.840 INFO SelectVariants - Executing as nchakraborty@NChakraborty on Linux v5.3.0-62-generic amd64
14:24:18.840 INFO SelectVariants - Java runtime: OpenJDK 64-Bit Server VM v11.0.7+10-post-Ubuntu-2ubuntu218.04
14:24:18.840 INFO SelectVariants - Start Date/Time: 25 July 2020 at 2:24:18 PM IST
14:24:18.840 INFO SelectVariants - ------------------------------------------------------------
14:24:18.840 INFO SelectVariants - ------------------------------------------------------------
14:24:18.841 INFO SelectVariants - HTSJDK Version: 2.21.2
14:24:18.841 INFO SelectVariants - Picard Version: 2.21.9
14:24:18.841 INFO SelectVariants - HTSJDK Defaults.COMPRESSION_LEVEL : 2
14:24:18.841 INFO SelectVariants - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
14:24:18.841 INFO SelectVariants - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
14:24:18.841 INFO SelectVariants - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
14:24:18.841 INFO SelectVariants - Deflater: IntelDeflater
14:24:18.841 INFO SelectVariants - Inflater: IntelInflater
14:24:18.841 INFO SelectVariants - GCS max retries/reopens: 20
14:24:18.841 INFO SelectVariants - Requester pays: disabled
14:24:18.841 INFO SelectVariants - Initializing engine
14:24:19.109 INFO FeatureManager - Using codec VCFCodec to read file file:///home/nchakraborty/Desktop/gatk-4.1.7.0/../RT/RAW_VCF/All.cohort.g.vcf
14:24:21.518 INFO SelectVariants - Done initializing engine
14:24:21.608 INFO ProgressMeter - Starting traversal
14:24:21.608 INFO ProgressMeter - Current Locus Elapsed Minutes Variants Processed Variants/Minute
14:32:13.059 INFO ProgressMeter - unmapped 7.9 0 0.0
14:32:13.059 INFO ProgressMeter - Traversal complete. Processed 0 total variants in 7.9 minutes.
14:32:13.068 INFO SelectVariants - Shutting down engine
[25 July 2020 at 2:32:13 PM IST] org.broadinstitute.hellbender.tools.walkers.variantutils.SelectVariants done. Elapsed time: 7.91 minutes.
Runtime.totalMemory()=1950351360
Where am I making a mistake? As far as I understand, a raw SNP file should not be this small. Here are the last few lines of the output file:
##contig=<ID=NC_011163.1,length=125319,assembly=Ca_genome.fasta>
##reference=file:///home/nchakraborty/Desktop/gatk-4.1.7.0/../RT/Ca_genome.fasta
##source=CombineGVCFs
##source=SelectVariants
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT C104_Sample6 CPS1_Sample1 ICCV2_Sample3 JG62_Sample4 K850_Sample2 WR315_Sample5
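The tail above ends at the `#CHROM` header line, which suggests the output holds no variant records at all. A minimal sketch for checking this is to count the non-header lines of a VCF. The helper name `count_records` is hypothetical, and the file names in the usage comments are the ones from the commands above.

```shell
# Count data records (lines that do not start with '#') in a plain-text VCF.
count_records() {
    # grep -c prints the number of matching lines; it exits non-zero when
    # that number is 0, so "|| true" keeps the helper usable under "set -e".
    grep -vc '^#' "$1" || true
}

# Example usage:
#   count_records All.RAW.INDEL.vcf          # the SelectVariants output
#   count_records RAW_VCF/All.cohort.g.vcf   # the CombineGVCFs output (slow on 56 GB)
```

A count of 0 for the output combined with a non-zero count for the input would point at the selection step rather than at the input file.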
-
Please see our Best Practices documentation for the up-to-date recommended usage for what you are trying to do: https://gatk.broadinstitute.org/hc/en-us/articles/360035535932-Germline-short-variant-discovery-SNPs-Indels-
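In that workflow, the combined GVCF is joint-genotyped with GenotypeGVCFs before any variant selection or filtering, and the commands in the question go straight from CombineGVCFs to SelectVariants. That skipped step is a likely cause of the empty output, though not a confirmed one. The sketch below shows the order of steps using the reference and input names from the question. The output names `All.genotyped.vcf`, `All.RAW.SNP.vcf` and `All.RAW.INDEL.vcf`, the `run` and `genotype_and_select` helpers, and the `DRY_RUN` switch are assumptions made for illustration. Note also that the command in the question selects SNPs but writes to a file named `INDEL`; the sketch keeps the two apart.

```shell
# Sketch only: joint-genotype the combined GVCF, then split SNPs and indels.
GATK_JAR="${GATK_JAR:-gatk-package-4.1.7.0-local.jar}"
REF="${REF:-Ca_genome.fasta}"
COMBINED="${COMBINED:-RAW_VCF/All.cohort.g.vcf}"
GENOTYPED="${GENOTYPED:-All.genotyped.vcf}"   # hypothetical output name

# With DRY_RUN=1 the command is printed instead of executed.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

genotype_and_select() {
    # 1. Joint genotyping turns the combined GVCF into a regular multi-sample VCF.
    run java -jar "$GATK_JAR" GenotypeGVCFs \
        -R "$REF" -V "$COMBINED" -O "$GENOTYPED"
    # 2. Select SNPs from the genotyped VCF (not from the GVCF).
    run java -jar "$GATK_JAR" SelectVariants \
        -R "$REF" -V "$GENOTYPED" --select-type-to-include SNP -O All.RAW.SNP.vcf
    # 3. Select indels into a separate file.
    run java -jar "$GATK_JAR" SelectVariants \
        -R "$REF" -V "$GENOTYPED" --select-type-to-include INDEL -O All.RAW.INDEL.vcf
}

# Example usage:
#   DRY_RUN=1 genotype_and_select   # print the three commands
#   genotype_and_select             # run them
```

Printing the commands first with `DRY_RUN=1` makes it easy to confirm the paths before starting a long run on a 56 GB input.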