Getting Started
Best Practices Workflows
- Getting started with GATK4
  GATK, properly pronounced "Gee-ay-tee-kay" (/dʒi•eɪ•ti•keɪ/) and not "Gat-ka...
- About the GATK Best Practices
  This document provides important context information about how the GATK Best ...
- GATK Best Practices for Structural Variation Discovery on Single Samples
  GATK-SV is a structural variation discovery pipeline for Illumina short-read ...
- Mitochondrial short variant discovery (SNVs + Indels)
  The mitochondrial genome poses several challenges to the identification and u...
- Somatic short variant discovery (SNVs + Indels)
  Identify somatic short variants (SNVs and Indels) in one or more tumor sample...
- Germline short variant discovery (SNPs + Indels)
  Purpose: Identify germline short variants (SNPs and Indels) in one or more in...
Tutorials
- (How to) Run germline single sample short variant discovery in DRAGEN mode
  DRAGEN-GATK introduced several new changes to GATK, including two new tools, ...
- (How to) Generate an unmapped BAM from FASTQ or aligned BAM
  Objective: Here we outline how to generate an unmapped BAM (uBAM) from either...
- (Notebook) Intro to using Mutect2 for somatic data
  In this hands-on tutorial (the Terra Workspace of which is available here) ...
- (How to) Install all software packages required to follow the GATK Best Practices
  Objective: Install all software packages required to follow the GATK Best Pra...
- (How to) Map and clean up short read sequence data efficiently
  In this tut...
- (How to) Map reads to a reference with alternate contigs like GRCh38
  This exploratory tutorial provides instructions and example data to map shor...
Computing Platforms
- GATK on IBM Cloud
  Running Cromwell on IBM Cloud: IBM Cloud (formerly called IBM Bluemix and IBM...
- GATK on the cloud with Terra
  Terra (formerly called FireCloud) is a cloud-based bioinformatics platform th...
- Running GATK on the cloud (Overview)
  There are many ways to run GATK for your analyses, and the best option for yo...
- GATK on the cloud with Azure
  We aim to provide the research community with a range of options for running ...
- GATK on local HPC infrastructure
  GATK can be deployed on high performance computing (HPC) systems using an HPC...
- GATK on Alibaba Cloud
  Alibaba Cloud, the largest cloud provider in China, has developed open-source...
FAQ
- Which training sets and arguments should I use for running VQSR?
  This document describes the resource datasets and arguments that we recommend...
- FAQ for Mutect2
  Here is a collection of questions related to Mutect2 that we frequently find ...
- What is physical phasing?
  In the format field of a PGT (Pre-Implantation Genetic Testing) VCF, you may ...
- Where can I get the GATK source code? Is it open-source?
  YES! Starting with GATK4 it is fully open-source under a BSD 3-clause license...
- Are there Best Practices for calling variants in RNAseq data?
  We are working on updating our recommended workflow for calling variants in R...
- Does GATK work on non-diploid organisms?
  YES! In general most GATK tools don't care about ploidy. The major exception...
Note that the information in this documentation guide is targeted at end-users. For developers, the source code and related resources are available on GitHub.