Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

No downsampling with max-reads-per-alignment-start

3 comments

  • Pamela Bretscher

    Hi Liang Ye,

    I believe it is expected to see no downsampling when the --max-reads-per-alignment-start argument is set as high as 1000. The argument caps the number of reads that start at each position, not the total depth at a site. For example, if reads are 150 bases long, as many as 150,000 reads could overlap a single site: if 1000 reads can start at position 500, another 1000 can start at position 499, another 1000 at 498, and so on for all 150 start positions whose reads cover that site. I hope this helps explain the results you are seeing. Please let me know if this does not make sense.

    Kind regards,

    Pamela

  • Liang Ye

    Thanks Pamela! I thought about that too, but I didn't realize tagmentation works so well for a 1.5 kb amplicon. It looks like there are start sites all over the amplicon, though the number of reads at each site varies a lot.

  • Pamela Bretscher

    Hi Liang Ye,

    Yes, I think your results are expected given your data and the arguments used. If you would like to remove reads to reach a specific coverage, you can use DownsampleSam, or you could try specifying a lower --max-reads-per-alignment-start value.

    Kind regards,

    Pamela

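The arithmetic in the first reply can be sketched in a few lines of Python. This is only a toy model of a per-start-position cap, not GATK's actual downsampling code: the function names (`max_overlapping_reads`, `cap_reads_per_start`, `depth_at`) are hypothetical, and it assumes ungapped reads that all have the same length.

```python
import random


def max_overlapping_reads(read_length, max_reads_per_start):
    """Upper bound on depth at one site under a per-start-position cap.

    A read of length L covers a site only if it starts at one of the L
    positions ending at that site, and each start keeps at most the cap.
    """
    return read_length * max_reads_per_start


def cap_reads_per_start(start_positions, max_reads_per_start, seed=0):
    """Return indices of reads kept when each start position is capped.

    Toy model (an assumption, not GATK's implementation): reads sharing a
    start position are randomly thinned down to the cap.
    """
    rng = random.Random(seed)
    by_start = {}
    for index, start in enumerate(start_positions):
        by_start.setdefault(start, []).append(index)

    kept = []
    for indices in by_start.values():
        if len(indices) > max_reads_per_start:
            indices = rng.sample(indices, max_reads_per_start)
        kept.extend(indices)
    return sorted(kept)


def depth_at(site, start_positions, read_length):
    """Number of reads covering `site`, given their start positions."""
    return sum(1 for s in start_positions if s <= site < s + read_length)


# 150-base reads with a cap of 1000 reads per start position.
bound = max_overlapping_reads(150, 1000)

# Made-up example: 400 reads at each of 150 consecutive start positions.
starts = [pos for pos in range(351, 501) for _ in range(400)]
kept_high_cap = cap_reads_per_start(starts, 1000)  # cap never reached
kept_low_cap = cap_reads_per_start(starts, 100)    # every start is thinned
depth_before = depth_at(500, starts, 150)
depth_after = depth_at(500, [starts[i] for i in kept_low_cap], 150)
```

In this made-up input no start position holds more than 1000 reads, so a cap of 1000 removes nothing even though the depth at position 500 is in the tens of thousands. Lowering the cap to 100 thins every start position and reduces the depth accordingly.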
