Dear GATK Community,
I am aware of an earlier post on adapter trimming (Do I need to perform fastqc and adapters trimming before gatk pipeline?). However, most adapter-trimming tools appear to accept only FASTQ files as input, not other formats such as uBAM.
Therefore, I was hoping for some help with the following questions:
1) How can permanent adapter trimming be approached if one is following GATK Best Practices with uBAM as the initial file format?
I understand one can hard clip adapter sequences with SamToFastq prior to alignment. However, after alignment, MergeBamAlignment converts all hard clips to soft clips, so the adapter sequences would still be present in the data during downstream preprocessing and variant discovery.
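To make clear what I mean by hard clipping before alignment, here is a minimal Python sketch of the idea, not the actual Picard implementation. It assumes each read carries an XT value giving the 1-based position where the adapter starts (as I understand MarkIlluminaAdapters to record it). The `Read` type and the function names are hypothetical and only serve as illustration.

```python
from typing import NamedTuple, Optional


class Read(NamedTuple):
    """Hypothetical, simplified stand-in for an unaligned read from a uBAM."""
    name: str
    bases: str
    quals: str
    xt: Optional[int] = None  # assumption: 1-based start position of the adapter


def hard_clip_at_xt(read: Read) -> Read:
    """Drop bases and qualities from the adapter start onward (hard clip)."""
    if read.xt is None:
        return read  # no adapter marked, nothing to clip
    keep = max(read.xt - 1, 0)  # convert 1-based start to a slice length
    return read._replace(bases=read.bases[:keep], quals=read.quals[:keep], xt=None)


def to_fastq(read: Read) -> str:
    """Format a read as a four-line FASTQ record."""
    return f"@{read.name}\n{read.bases}\n+\n{read.quals}\n"


if __name__ == "__main__":
    raw = Read("r1", "ACGTACGTAGATCGGAAG", "I" * 18, xt=10)
    print(to_fastq(hard_clip_at_xt(raw)), end="")
```

In this sketch the adapter bases are removed from the FASTQ record that goes to the aligner, but the original uBAM still contains them, which is why they reappear (soft-clipped) once the aligned reads are merged back with the unaligned reads.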
2) If it is not possible to generate a clean, mapped BAM with hard-clipped adapters (see "(How to) Map and clean up short read sequence data efficiently"), what effect would adapter sequences in the merged BAM have on downstream steps, such as those described in "Data pre-processing for variant discovery" and "Somatic short variant discovery (SNVs + Indels)"?
Thank you for your time and help.