GATK GRCh38 Reference
I am trying to find out more about the GATK GRCh38 reference available from AWS-iGenomes, as I intend to use it in Sarek, but much of the information about it appears to be inaccessible or removed. What are the main changes in GATK's version of the GRCh38 genome? From what I understand, you perform genome masking, and in several other posts, such as https://gatk.broadinstitute.org/hc/en-us/articles/4410456501915-Functional-equivalence-in-DRAGEN-GATK, you state that you use the Illumina masked reference. Is GATK GRCh38 identical to the Illumina masked reference?
-
Hi Yasir K
We do not currently support AWS platforms, so we cannot help with the availability of resources within AWS. Our original hg38 reference contained all unlocalized contigs, alt contigs, and HLA sequences in full, and our recommendation was to map against it with bwa mem using alt-aware mapping. However, we recognized that alt contig handling is not optimal in most cases, so it is not practical to continue using that reference. Our DRAGEN-GATK functional equivalence (FE) workflows recommend the masked hg38 reference from Illumina. If DRAGMAP is the mapper, the reference can be masked during the index creation step; if BWA is the mapper, the reference can be masked with other tools before the index is created.
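For the BWA route, masking before indexing simply means replacing the bases inside the mask regions with N in the FASTA, then building the index from the masked file. One existing tool that does this is `bedtools maskfasta`. The sketch below shows the same idea in plain Python so the operation is explicit. It assumes the mask regions are given as a BED file (0-based, half-open intervals) and that each contig fits in memory; the function names and the toy data are made up for illustration and are not part of any GATK or Illumina tooling.

```python
from collections import defaultdict


def parse_bed(lines):
    """Collect 0-based, half-open BED intervals per contig name."""
    intervals = defaultdict(list)
    for line in lines:
        line = line.strip()
        # Skip blank lines and common BED header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        intervals[chrom].append((int(start), int(end)))
    return intervals


def mask_sequence(seq, intervals):
    """Return seq with every base inside the intervals replaced by 'N'."""
    bases = bytearray(seq, "ascii")
    for start, end in intervals:
        # Clip each interval to the sequence bounds.
        start = max(start, 0)
        end = min(end, len(bases))
        if start < end:
            bases[start:end] = b"N" * (end - start)
    return bases.decode("ascii")


def mask_fasta(fasta_lines, intervals):
    """Yield (header, masked_sequence) for each record in a FASTA stream."""
    header, contig, chunks = None, None, []
    for line in fasta_lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if header is not None:
                yield header, mask_sequence("".join(chunks), intervals.get(contig, []))
            header = line[1:]
            # The contig name is the first whitespace-delimited token of the header.
            contig = header.split()[0]
            chunks = []
        elif line:
            chunks.append(line)
    if header is not None:
        yield header, mask_sequence("".join(chunks), intervals.get(contig, []))


if __name__ == "__main__":
    # Toy data, not real hg38 coordinates.
    fasta = [">chr1 test", "ACGTACGT", "ACGT", ">chr2", "GGGGCCCC"]
    bed = ["chr1\t2\t6", "chr2\t0\t3"]
    for header, seq in mask_fasta(fasta, parse_bed(bed)):
        print(f">{header}\n{seq}")
```

The masked FASTA written this way would then be the input to `bwa index`. For a full-size genome, an established tool such as `bedtools maskfasta` is likely the more practical choice.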
You may be able to find the FASTA sequence and the masking regions at the link below.
I hope this helps.
Regards.