I am very new to GATK and not sure where to begin for my analysis.
I sequenced targeted regions of 55 different potato varieties (tetraploid, highly heterozygous) using 15 small PCR amplicons. These amplicons are only 180 bp long, so one Illumina read covers almost the entire amplicon.
These amplicons were selected because they contain multiple SNPs which together can discriminate between all or most alleles present in potato.
Some alleles will be shared among many of my samples, while others will be rarer. Some samples will have four different alleles for a given amplicon, while others will have just two (perhaps in a 1:3 ratio).
I have aligned my data against a reference sequence using NGEN from DNASTAR, which gave me 55 BAM files (one for each sample). I am able to generate SNP tables using DNASTAR's Arraystar, but haplotype calling is not possible there.
My question is: what literature or YouTube tutorials would you recommend for getting familiar with this kind of analysis, probably using HaplotypeCaller?
Any suggestions would be much appreciated.