How do I use different intervals with GenomicsDBImport?
Hi,
I have three exome callsets produced using different capture kits with different interval lists (say kit-1, kit-2 and kit-3), and I would like to check whether the commands below are still consistent with the best practices:
gatk --java-options "-Xmx4g -Xms4g" GenomicsDBImport \
--genomicsdb-workspace-path myDatabase \
-L kit-1_intervals.bed \
-L kit-2_intervals.bed \
--merge-input-intervals \
-V sample-1_kit-1.g.vcf.gz \
-V sample-2_kit-1.g.vcf.gz \
-V sample-1_kit-2.g.vcf.gz \
-V sample-2_kit-2.g.vcf.gz
and
gatk --java-options "-Xmx4g -Xms4g" GenomicsDBImport \
-V sample-1_kit-3.g.vcf.gz \
-V sample-2_kit-3.g.vcf.gz \
-L kit-3_intervals.bed \
--merge-input-intervals \
--genomicsdb-update-workspace-path myDatabase
Thanks for your help.
Regards
-
Ahmed S. Chakroun, the --genomicsdb-update-workspace-path option can only be used to add more samples, not more intervals. You must use the same intervals that were originally used. More information can be found in the Tool Docs: https://gatk.broadinstitute.org/hc/en-us/articles/360057439331-GenomicsDBImport
Best,
Genevieve
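For illustration, an incremental import consistent with the answer above might look like the sketch below. It assumes the workspace myDatabase was already created by the first command, and it leaves out -L because, as I understand the Tool Docs, an update reuses the intervals of the original import. The RUN variable is a hypothetical dry-run switch added for this example only; it is not a GATK option.

```shell
#!/bin/sh
# Sketch: add the kit-3 samples to an existing GenomicsDB workspace.
# Assumption: myDatabase was created earlier with the kit-1/kit-2 intervals.
workspace=myDatabase

# Build the command as positional parameters. No -L is passed here:
# the update is expected to reuse the intervals stored in the workspace.
set -- gatk --java-options "-Xmx4g -Xms4g" GenomicsDBImport \
    -V sample-1_kit-3.g.vcf.gz \
    -V sample-2_kit-3.g.vcf.gz \
    --genomicsdb-update-workspace-path "$workspace"

# Dry run by default: print the command. Set RUN=1 to actually execute it.
if [ "${RUN:-0}" = "1" ]; then
    "$@"
else
    echo "$@"
fi
```

Note that regions covered only by kit-3 would not be imported this way, since the workspace keeps the intervals it was created with.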
-
So it's not recommended to mix exome data captured using different kits (say Agilent SureSelect V6 and V7) within the same GenomicsDB, is it?
Thank you very much for your help.
Kind regards,
Ahmed
-
It depends on your research. Here is more information about how GATK uses intervals. The GATK engine will merge overlapping intervals.
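To make "merging overlapping intervals" concrete, here is a minimal sketch that computes the union of BED records with sort and awk. It is not GATK's implementation, only an illustration of the idea. The function name merge_bed is made up for this example, the input is assumed to be plain three-column BED without header or track lines, and the sketch also merges intervals that merely touch.

```shell
#!/bin/sh
# merge_bed: read BED records (chrom, start, end) on stdin and print
# the merged union, one tab-separated interval per line.
merge_bed() {
    LC_ALL=C sort -k1,1 -k2,2n | awk 'BEGIN { OFS = "\t" }
        # First record: open the current interval.
        NR == 1 { chrom = $1; start = $2 + 0; end = $3 + 0; next }
        # Same contig and start within (or touching) the current interval:
        # extend the current interval if this record reaches further.
        $1 == chrom && $2 + 0 <= end { if ($3 + 0 > end) end = $3 + 0; next }
        # Otherwise emit the current interval and open a new one.
        { print chrom, start, end; chrom = $1; start = $2 + 0; end = $3 + 0 }
        END { if (NR > 0) print chrom, start, end }'
}

# Small demo: the two overlapping chr1 records collapse into one interval.
printf 'chr1\t100\t200\nchr1\t150\t300\nchr2\t50\t60\n' | merge_bed
```

One possible use, if you want a single database for all kits, is to build the union of the interval lists before the initial import, for example `cat kit-1_intervals.bed kit-2_intervals.bed kit-3_intervals.bed | merge_bed > all_kits.bed`. Whether the union or the intersection of the kits is the right choice depends on your analysis.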