Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

(How to) Call common and rare germline copy number variants

6 comments

  • Enrico Cocchi

    Is there any way to get the log2 output instead of the CN from PostprocessGermlineCNVCalls?

  • Calvin Hung

    Hi, I believe the *.tsv files in tutorial_11684.tar.gz, whether from the GoogleDrive or from the FTP site, are in a deprecated format and have not run through GermlineCNVCaller since GATK v4.1.x.x. I worked around this by fixing the format myself. You might want to update the tutorial files as well.

  • Ruqian Lyu

    Hi, 

    Thanks for the great tutorial.

    I'm trying to run the pipeline on 300 low-coverage samples (~5X). At the GermlineCNVCaller step, the tool keeps increasing the number of epochs because CNV calling has not converged; it is now at 50 epochs. Is this expected, or is it possible that the optimisation procedure has been "trapped"?


  • Ju Jose

    Thanks for the tutorial! Could you help me understand how the NA19017.chr20sub.bam file was prepared? Does it contain just BWA-mapped reads? Did it go through the sorting and duplicate-marking steps?

  • astiac

    I would like to know where I can get the following files:

    mappability-track regions file (in either .bed or .bed.gz format).
    segmental-duplication-track regions file (in either .bed or .bed.gz format).
    contig-ploidy-priors_contig_ploidy_priors.tsv 

  • jfarrell

    The link below, taken from the tutorial above, is broken. Has the tutorial been updated to match the latest WDL pipeline?

    ftp://ftp.broadinstitute.org/tutorials/dataset

     

    Download tutorial_11684.tar.gz either from the GoogleDrive or from the FTP site. The bundle includes data for Notebook #11685 and Notebook #11686. To access the ftp site, leave the password field blank. If the GoogleDrive link is broken, please let us know. The tutorial also requires the GRCh38 reference FASTA, dictionary and index. These are available from the GATK Resource Bundle. The example data is from the 1000 Genomes project Phase 3 aligned to GRCh38.

