Filter out reads that are over-soft-clipped
Category Read Filters
Overview
Filter out reads where the number of bases without soft-clips (M, I, X, and = CIGAR operators) is lower than a threshold.
This filter is intended to remove reads that potentially come from foreign organisms. When sequencing human DNA, we have found cases of contamination by bacterial organisms; the symptom of such contamination is a class of reads with only a small number of aligned bases and many soft-clipped bases.
Note: Consecutive soft-clipped blocks are treated as a single block. For example, 1S2S10M1S2S is treated as 3S10M3S.
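The note above can be illustrated with a minimal Python sketch. This is not the GATK implementation; the helper names `parse_cigar`, `merge_soft_clips`, and `aligned_bases` are hypothetical and exist only for this example. It assumes a well-formed CIGAR string, collapses consecutive S elements into one, and counts the bases covered by the M, I, X, and = operators.

```python
import re

# One CIGAR element: a length followed by a single operator character.
CIGAR_TOKEN = re.compile(r"(\d+)([MIDNSHP=X])")


def parse_cigar(cigar):
    """Split a CIGAR string into (length, operator) pairs."""
    return [(int(n), op) for n, op in CIGAR_TOKEN.findall(cigar)]


def merge_soft_clips(cigar):
    """Collapse each run of consecutive S elements into a single S element."""
    merged = []
    for length, op in parse_cigar(cigar):
        if op == "S" and merged and merged[-1][1] == "S":
            # Extend the previous soft-clipped block instead of starting a new one.
            merged[-1] = (merged[-1][0] + length, "S")
        else:
            merged.append((length, op))
    return "".join(f"{n}{op}" for n, op in merged)


def aligned_bases(cigar):
    """Count bases covered by the M, I, X, and = operators."""
    return sum(n for n, op in parse_cigar(cigar) if op in "MIX=")


print(merge_soft_clips("1S2S10M1S2S"))  # 3S10M3S
print(aligned_bases("1S2S10M1S2S"))     # 10
```

Merging changes only how many soft-clipped blocks are counted; the number of aligned bases is the same before and after.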
OverclippedReadFilter specific arguments
This table summarizes the command-line arguments that are specific to this tool. For more details on each argument, see the list below the table.
|Argument name(s)||Default value||Summary|
|Optional Tool Arguments|
||false||Allow a read to be filtered out based on having only 1 soft-clipped block. By default, both ends must have a soft-clipped block; setting this flag requires only 1 soft-clipped block.|
|--filter-too-short||30||Minimum number of aligned bases|
Arguments in this list are specific to this tool. Keep in mind that other arguments are available that are shared with other tools (e.g. command-line GATK arguments); see Inherited arguments above.
Allow a read to be filtered out based on having only 1 soft-clipped block. By default, both ends must have a soft-clipped block; setting this flag requires only 1 soft-clipped block.
--filter-too-short / NA
Minimum number of aligned bases
int 30 [ [ -∞ ∞ ] ]
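How the two arguments interact can be sketched in Python. This is an illustration based only on the descriptions above, not the GATK source: the function name `is_overclipped` and its parameter names are hypothetical, and counting soft-clipped blocks is used as a stand-in for "both ends must have a soft-clipped block", on the assumption that soft clips appear only at the ends of a read.

```python
import re


def is_overclipped(cigar, filter_too_short=30, dont_require_both_ends=False):
    """Return True if a read with this CIGAR would be filtered out (sketch)."""
    aligned = 0
    soft_clip_blocks = 0
    previous_op = None
    for length, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar):
        if op == "S":
            # Consecutive soft-clipped blocks count as a single block.
            if previous_op != "S":
                soft_clip_blocks += 1
        elif op in "MIX=":
            aligned += int(length)
        previous_op = op
    # Assumption: two blocks means both ends are clipped; the flag lowers this to one.
    required_blocks = 1 if dont_require_both_ends else 2
    return aligned < filter_too_short and soft_clip_blocks >= required_blocks


print(is_overclipped("40S20M40S"))                            # True
print(is_overclipped("40S20M"))                               # False
print(is_overclipped("40S20M", dont_require_both_ends=True))  # True
print(is_overclipped("10S80M10S"))                            # False
```

In this sketch a read is removed only when both conditions hold: too few aligned bases and enough soft-clipped blocks. A short read with no soft clipping at all is therefore kept.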
GATK version 18.104.22.168 built at 25-15-2019 03:15:29.