GtcToVcf converts an Illumina GTC file to a VCF file using several supporting files. A GTC file is an Illumina-specific file containing called genotypes in AA/AB/BB format. A VCF (Variant Call Format) file is a text file that stores how a sequenced sample differs from the reference genome.
Usage example:
java -jar picard.jar GtcToVcf \
INPUT=input.gtc \
REFERENCE_SEQUENCE=reference.fasta \
OUTPUT=output.vcf \
EXTENDED_ILLUMINA_MANIFEST=chip_name.extended.csv \
CLUSTER_FILE=chip_name.egt \
ILLUMINA_NORMALIZATION_MANIFEST=chip_name.bpm.csv \
SAMPLE_ALIAS=my_sample_alias
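The usage example uses the `NAME=value` form, while the argument table below lists the names in `--NAME` form. As a minimal sketch, assuming the `gatk` launcher script is on the PATH (this page documents the tool as shipped with GATK), the same conversion might be written with `--NAME value` pairs. The `run` helper, the `gtc_to_vcf` function and the `DRY_RUN` variable are illustrative and not part of GATK or Picard; by default the sketch only prints the command.

```shell
#!/bin/sh
# Hypothetical helper (not part of GATK or Picard): print the command when
# DRY_RUN is 1 (the default here), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Same placeholder file names as the usage example above.
gtc_to_vcf() {
    run gatk GtcToVcf \
        --INPUT input.gtc \
        --REFERENCE_SEQUENCE reference.fasta \
        --OUTPUT output.vcf \
        --EXTENDED_ILLUMINA_MANIFEST chip_name.extended.csv \
        --CLUSTER_FILE chip_name.egt \
        --ILLUMINA_NORMALIZATION_MANIFEST chip_name.bpm.csv \
        --SAMPLE_ALIAS my_sample_alias
}

gtc_to_vcf
```

Setting `DRY_RUN=0` in the environment would execute the command instead of printing it.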
Category Genotyping Arrays Manipulation
Overview
Class to convert a GTC file and a BPM file to a VCF file.
GtcToVcf (Picard) specific arguments
This table summarizes the command-line arguments that are specific to this tool. For more details on each argument, see the Argument details list below the table.
Argument name(s) | Default value | Summary
---|---|---
Required Arguments | |
--CLUSTER_FILE -CF | null | An Illumina cluster file (egt)
--EXTENDED_ILLUMINA_MANIFEST -MANIFEST | null | An Extended Illumina Manifest file (csv). This is an extended version of the Illumina manifest; it contains additional reference-specific fields
--ILLUMINA_NORMALIZATION_MANIFEST -NORM_MANIFEST | null | An Illumina bead pool manifest (a manifest containing the Illumina normalization ids) (bpm.csv)
--INPUT -I | null | GTC file to be converted
--OUTPUT -O | null | The output VCF file to write.
--REFERENCE_SEQUENCE -R | null | Reference sequence file.
--SAMPLE_ALIAS | null | The sample alias
Optional Tool Arguments | |
--ANALYSIS_VERSION_NUMBER | null | The analysis version of the data used to generate this VCF
--arguments_file | [] | read one or more arguments files and add them to the command line
--DO_NOT_ALLOW_CALLS_ON_ZEROED_OUT_ASSAYS | false | Causes the program to fail if it finds a case where there is a call on an assay that is flagged as 'zeroed-out' in the Illumina cluster file.
--EXPECTED_GENDER -E_GENDER | null | The expected gender for this sample.
--FINGERPRINT_GENOTYPES_VCF_FILE -FP_VCF | null | The fingerprint VCF for this sample
--GENDER_GTC -G_GTC | null | An optional GTC file that was generated by calling the chip using a cluster file designed to optimize gender calling.
--help -h | false | display the help message
--version | false | display the version number for this tool
Optional Common Arguments | |
--COMPRESSION_LEVEL | 5 | Compression level for all compressed files created (e.g. BAM and VCF).
--CREATE_INDEX | false | Whether to create a BAM index when writing a coordinate-sorted BAM file.
--CREATE_MD5_FILE | false | Whether to create an MD5 digest for any BAM or FASTQ files created.
--GA4GH_CLIENT_SECRETS | client_secrets.json | Google Genomics API client_secrets.json file path.
--MAX_RECORDS_IN_RAM | 500000 | When writing files that need to be sorted, this will specify the number of records stored in RAM before spilling to disk. Increasing this number reduces the number of file handles needed to sort the file, and increases the amount of RAM needed.
--QUIET | false | Whether to suppress job-summary info on System.err.
--TMP_DIR | [] | One or more directories with space available to be used by this program for temporary storage of working files
--USE_JDK_DEFLATER -use_jdk_deflater | false | Use the JDK Deflater instead of the Intel Deflater for writing compressed output
--USE_JDK_INFLATER -use_jdk_inflater | false | Use the JDK Inflater instead of the Intel Inflater for reading compressed input
--VALIDATION_STRINGENCY | STRICT | Validation stringency for all SAM files read by this program. Setting stringency to SILENT can improve performance when processing a BAM file in which variable-length data (read, qualities, tags) do not otherwise need to be decoded.
--VERBOSITY | INFO | Control verbosity of logging.
Advanced Arguments | |
--showHidden | false | display hidden arguments
Argument details
Arguments in this list are specific to this tool. Keep in mind that other arguments are available that are shared with other tools (e.g. command-line GATK arguments); see the Optional Common Arguments rows in the table above.
--ANALYSIS_VERSION_NUMBER / NA
The analysis version of the data used to generate this VCF
Integer null
--arguments_file / NA
read one or more arguments files and add them to the command line
List[File] []
--CLUSTER_FILE / -CF
An Illumina cluster file (egt)
R File null
--COMPRESSION_LEVEL / NA
Compression level for all compressed files created (e.g. BAM and VCF).
int 5 [ [ -∞ ∞ ] ]
--CREATE_INDEX / NA
Whether to create a BAM index when writing a coordinate-sorted BAM file.
Boolean false
--CREATE_MD5_FILE / NA
Whether to create an MD5 digest for any BAM or FASTQ files created.
boolean false
--DO_NOT_ALLOW_CALLS_ON_ZEROED_OUT_ASSAYS / NA
Causes the program to fail if it finds a case where there is a call on an assay that is flagged as 'zeroed-out' in the Illumina cluster file.
boolean false
--EXPECTED_GENDER / -E_GENDER
The expected gender for this sample.
String null
--EXTENDED_ILLUMINA_MANIFEST / -MANIFEST
An Extended Illumina Manifest file (csv). This is an extended version of the Illumina manifest; it contains additional reference-specific fields
R File null
--FINGERPRINT_GENOTYPES_VCF_FILE / -FP_VCF
The fingerprint VCF for this sample
File null
--GA4GH_CLIENT_SECRETS / NA
Google Genomics API client_secrets.json file path.
String client_secrets.json
--GENDER_GTC / -G_GTC
An optional GTC file that was generated by calling the chip using a cluster file designed to optimize gender calling.
File null
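The optional arguments `--EXPECTED_GENDER`, `--FINGERPRINT_GENOTYPES_VCF_FILE` and `--GENDER_GTC` described above all default to null, so a wrapper script can add each one only when a value is available. The following is a minimal sketch: the function name `optional_args`, the three shell variables and the file name `gender_calls.gtc` are hypothetical, and no particular value for `--EXPECTED_GENDER` is assumed.

```shell
#!/bin/sh
# Hypothetical helper: print the optional arguments, one token per line,
# for whichever of the three variables are set and non-empty.
optional_args() {
    if [ -n "${EXPECTED_GENDER:-}" ]; then
        printf '%s\n' --EXPECTED_GENDER "$EXPECTED_GENDER"
    fi
    if [ -n "${GENDER_GTC:-}" ]; then
        printf '%s\n' --GENDER_GTC "$GENDER_GTC"
    fi
    if [ -n "${FP_VCF:-}" ]; then
        printf '%s\n' --FINGERPRINT_GENOTYPES_VCF_FILE "$FP_VCF"
    fi
}

# Example: only the gender GTC is supplied (placeholder file name).
GENDER_GTC=gender_calls.gtc
optional_args
```

The printed tokens can then be appended to the GtcToVcf command line. Paths containing spaces would need more careful quoting than this sketch provides.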
--help / -h
display the help message
boolean false
--ILLUMINA_NORMALIZATION_MANIFEST / -NORM_MANIFEST
An Illumina bead pool manifest (a manifest containing the Illumina normalization ids) (bpm.csv)
R File null
--INPUT / -I
GTC file to be converted
R File null
--MAX_RECORDS_IN_RAM / NA
When writing files that need to be sorted, this will specify the number of records stored in RAM before spilling to disk. Increasing this number reduces the number of file handles needed to sort the file, and increases the amount of RAM needed.
Integer 500000 [ [ -∞ ∞ ] ]
--OUTPUT / -O
The output VCF file to write.
R File null
--QUIET / NA
Whether to suppress job-summary info on System.err.
Boolean false
--REFERENCE_SEQUENCE / -R
Reference sequence file.
R File null
--SAMPLE_ALIAS / NA
The sample alias
R String null
--showHidden / -showHidden
display hidden arguments
boolean false
--TMP_DIR / NA
One or more directories with space available to be used by this program for temporary storage of working files
List[File] []
--USE_JDK_DEFLATER / -use_jdk_deflater
Use the JDK Deflater instead of the Intel Deflater for writing compressed output
Boolean false
--USE_JDK_INFLATER / -use_jdk_inflater
Use the JDK Inflater instead of the Intel Inflater for reading compressed input
Boolean false
--VALIDATION_STRINGENCY / NA
Validation stringency for all SAM files read by this program. Setting stringency to SILENT can improve performance when processing a BAM file in which variable-length data (read, qualities, tags) do not otherwise need to be decoded.
The --VALIDATION_STRINGENCY argument is an enumerated type (ValidationStringency), which can have one of the following values:
- STRICT
- LENIENT
- SILENT
ValidationStringency STRICT
--VERBOSITY / NA
Control verbosity of logging.
The --VERBOSITY argument is an enumerated type (LogLevel), which can have one of the following values:
- ERROR
- WARNING
- INFO
- DEBUG
LogLevel INFO
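Both `--VALIDATION_STRINGENCY` and `--VERBOSITY` are enumerated types with the values listed above, so a wrapper script can reject a mistyped value before the tool is started. A minimal sketch follows; the two function names are hypothetical and simply mirror the enumerations on this page.

```shell
#!/bin/sh
# Hypothetical checks mirroring the enumerated values listed on this page.
is_valid_stringency() {
    case "$1" in
        STRICT|LENIENT|SILENT) return 0 ;;
        *) return 1 ;;
    esac
}

is_valid_verbosity() {
    case "$1" in
        ERROR|WARNING|INFO|DEBUG) return 0 ;;
        *) return 1 ;;
    esac
}

# Example: a lowercase value is rejected because this check is case-sensitive.
if is_valid_stringency "silent"; then
    echo "accepted"
else
    echo "rejected: expected STRICT, LENIENT or SILENT"
fi
```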
--version / NA
display the version number for this tool
boolean false
GATK version 4.1.3.0 built at Sat, 23 Nov 2019 16:20:54 -0500.