Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

(How to) Map and clean up short read sequence data efficiently


  • Dear Team,

    I have a WES Illumina data set for variant calling. Both the forward and reverse files show adapter read-through at the 3' end, as seen in the attached FastQC reports. If I do not trim those adapter sequences (e.g. with a trimmer such as Trimmomatic) and only run MarkIlluminaAdapters, would that be enough, since this step marks those bases with the XT tag so that they are later disregarded in the alignment? Am I correct?

    I have FASTQ files (R1 and R2). What I did was run FastqToSam with read group (RG) information to create a uBAM, followed by MarkIlluminaAdapters -> SamToFastq -> BWA-MEM -> MergeBamAlignment.

    Thank you

    Best Regards



    R1 file

    R2 file 
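
For readers following the same chain of tools, the workflow described in this comment can be sketched as below. This is a minimal illustration that only assembles argument lists and executes nothing. The file names, read group values, `picard.jar` location, and the helper name `build_pipeline` are all hypothetical. The `CLIPPING_ATTRIBUTE=XT` and `CLIPPING_ACTION=2` arguments on SamToFastq are the setting that makes the XT-tagged adapter bases count for little during alignment (their base qualities are set to 2) instead of hard-trimming them. Check the option names against the Picard and BWA versions you have installed.

```python
from typing import Dict, List


def build_pipeline(fq1: str, fq2: str, ref: str, sample: str,
                   picard_jar: str = "picard.jar") -> Dict[str, List[str]]:
    """Return one argument list per step, in run order. Nothing is executed."""
    picard = ["java", "-jar", picard_jar]
    ubam = f"{sample}.unmapped.bam"          # hypothetical file names throughout
    marked = f"{sample}.markadapters.bam"
    interleaved = f"{sample}.interleaved.fq"
    aligned = f"{sample}.aligned.sam"
    return {
        # 1. Paired FASTQ -> unmapped BAM carrying read group information.
        "FastqToSam": picard + [
            "FastqToSam", f"FASTQ={fq1}", f"FASTQ2={fq2}", f"OUTPUT={ubam}",
            f"READ_GROUP_NAME={sample}_rg1", f"SAMPLE_NAME={sample}",
            f"LIBRARY_NAME={sample}_lib1", "PLATFORM=illumina",
        ],
        # 2. Tag adapter read-through with XT; no bases are removed.
        "MarkIlluminaAdapters": picard + [
            "MarkIlluminaAdapters", f"INPUT={ubam}", f"OUTPUT={marked}",
            f"METRICS={sample}.markadapters_metrics.txt",
        ],
        # 3. Back to interleaved FASTQ, setting XT-marked base qualities to 2.
        "SamToFastq": picard + [
            "SamToFastq", f"INPUT={marked}", f"FASTQ={interleaved}",
            "CLIPPING_ATTRIBUTE=XT", "CLIPPING_ACTION=2", "INTERLEAVE=true",
        ],
        # 4. Align; -p reads interleaved pairs. SAM goes to stdout, so the
        #    caller would redirect it into `aligned`.
        "bwa_mem": ["bwa", "mem", "-M", "-p", ref, interleaved],
        # 5. Merge alignments with the uBAM to restore tags and read groups.
        "MergeBamAlignment": picard + [
            "MergeBamAlignment", f"REFERENCE_SEQUENCE={ref}",
            f"UNMAPPED_BAM={ubam}", f"ALIGNED_BAM={aligned}",
            f"OUTPUT={sample}.merged.bam", "CLIP_ADAPTERS=false",
        ],
    }


if __name__ == "__main__":
    for name, argv in build_pipeline("R1.fastq", "R2.fastq", "ref.fasta", "sampleA").items():
        print(name, "->", " ".join(argv))
```

Keeping each step as a plain argument list makes it easy to inspect the exact command before handing it to `subprocess.run` or a workflow manager.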





  • Mike Keehan

    Dear GATK team,

    I wonder if you could help me. I am following these best practices to produce alignments intended for structural variant (SV) calling. The recommended workflow suggests changing PRIMARY_ALIGNMENT_STRATEGY from the default of BestMapq to MostDistant. For SV calling we are essentially trying to work out the most probable insert size, so surely we would want to use the pair that gave the best alignments, i.e. BestMapq? Could you further advise on the merits of each PRIMARY_ALIGNMENT_STRATEGY setting with regard to SV calling, as many users might follow the tutorial to claim "best practice" adherence?
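
To make the question concrete, here is a toy sketch of how the two strategies can disagree on the same read pair. It is not Picard's implementation. It assumes, as a simplification, that BestMapq prefers the pair of alignments with the highest combined MAPQ and that MostDistant prefers the same-contig pair spanning the largest distance, falling back to the best MAPQ per end when no same-contig pair exists. The `Aln` type, both function names, and the coordinates are hypothetical.

```python
from itertools import product
from typing import List, NamedTuple, Tuple


class Aln(NamedTuple):
    contig: str
    start: int  # leftmost aligned position
    end: int    # rightmost aligned position
    mapq: int


Pair = Tuple[Aln, Aln]


def best_mapq_pair(end1: List[Aln], end2: List[Aln]) -> Pair:
    """Toy BestMapq: choose the pair with the highest combined MAPQ."""
    return max(product(end1, end2), key=lambda p: p[0].mapq + p[1].mapq)


def most_distant_pair(end1: List[Aln], end2: List[Aln]) -> Pair:
    """Toy MostDistant: choose the same-contig pair covering the largest span."""
    same_contig = [(a, b) for a, b in product(end1, end2) if a.contig == b.contig]
    if not same_contig:
        # Assumed fallback: best MAPQ for each end independently.
        return (max(end1, key=lambda a: a.mapq), max(end2, key=lambda a: a.mapq))
    return max(same_contig,
               key=lambda p: max(p[0].end, p[1].end) - min(p[0].start, p[1].start))


if __name__ == "__main__":
    # Each end has one confident alignment and one weaker, more distant one.
    end1 = [Aln("chr1", 100, 200, 60), Aln("chr1", 5000, 5100, 20)]
    end2 = [Aln("chr1", 300, 400, 60), Aln("chr1", 9000, 9100, 10)]
    print("BestMapq   :", best_mapq_pair(end1, end2))
    print("MostDistant:", most_distant_pair(end1, end2))
```

In this made-up case the MAPQ-based choice keeps the two confident, nearby alignments, while the distance-based choice pairs a confident alignment with a weak, far-away one. That trade-off between alignment confidence and implied insert size is what the question is about.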


