PlotDenoisedCopyRatios produces an empty plot for non-human samples
REQUIRED for all errors and issues:
a) GATK version used: docker image spacecade7/tutorial_11682_11683:gatk4.0.1.1
b) Exact command used:
c) Entire program log:
I am running the GATK somatic CNV pipeline on Drosophila samples. The plot created by PlotDenoisedCopyRatios turned out to be empty.
When tracing back through the files from the previous steps, I noticed that DFG2.standardizedCR.tsv and DFG2.denoisedCR.tsv have entries like
CONTIG START END LOG2_COPY_RATIO
2L 1 23513712 -15.784881
2R 1 25286936 -15.237691
3L 1 28110227 -15.292732
3R 1 32079331 -15.153327
4 1 1348131 -18.814539
X 1 23542271 -16.758953
Y 1 3667352 13.513223
whereas in the tutorial, https://gatk.broadinstitute.org/hc/en-us/articles/360035531092--How-to-part-I-Sensitively-detect-copy-ratio-alterations-and-allelic-segments, the hcc1143_T_clean.denoisedCR.tsv file contains results such as
CONTIG START END LOG2_COPY_RATIO
chr1 68839 70260 1.862435
chr1 925690 926265 -0.133840
chr1 929903 930588 0.114137
chr1 930787 931341 0.031698
chr1 935520 936148 0.127558
chr1 938788 939202 0.652190
chr1 939203 939712 -0.204635
chr1 940892 941558 -0.231103
chr1 942524 943155 0.321343
meaning the calculation should break each chromosome into smaller intervals rather than analyzing it as a whole.
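One quick way to confirm this symptom is to count how many intervals each contig has in the copy-ratio file. The following is only a minimal sketch: the function name count_bins_per_contig is made up for illustration, and it assumes the file is whitespace-delimited with optional SAM-style header lines starting with "@" followed by a CONTIG column header row.

```shell
# Count intervals per contig in a GATK copy-ratio TSV.
# Assumption: header lines start with "@" and the column header row
# begins with "CONTIG"; both are skipped.
count_bins_per_contig() {
  awk '!/^@/ && $1 != "CONTIG" { n[$1]++ }
       END { for (c in n) print c "\t" n[c] }' "$1" | sort
}
```

Calling count_bins_per_contig DFG2.denoisedCR.tsv on a file like mine would report a count of 1 for every contig, whereas a correctly binned file should show many intervals per contig.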
Could you give me some idea of what could be wrong with the analysis?
Prior to this step, I followed
-
Hi Yuwei Bao
Can you provide us with the first couple of intervals from the CollectFragmentCounts output?
Can you also post your command line for each step so that we can dig a bit deeper?
Regards.
-
Hi Gökalp:
Thank you very much for your response! I got it figured out. The step I did incorrectly was in the first step PreprocessIntervals. I want to share it here if others run into the same issue.
$gatk_path PreprocessIntervals \
-R $REF \
--bin-length 0 \
--interval-merging-rule OVERLAPPING_ONLY \
-O $out/$SAMPLE.preprocessed.interval_list
Because I am working with a non-human genome and don't have an interval list (-L), keeping --bin-length 0 meant that the genome was not broken into bins, so each contig became a single interval. I was able to get PlotDenoisedCopyRatios to work after setting --bin-length to a nonzero value.
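For anyone who wants a starting point, here is a sketch of the same command with a nonzero bin length. The wrapper function name preprocess_intervals_binned and the fallback value of 1000 bp are assumptions for illustration only; choose a bin length that suits your genome and coverage.

```shell
# Sketch: PreprocessIntervals with a nonzero bin length so each contig is
# split into fixed-size bins instead of being kept as one interval.
# Arguments: $1 = reference FASTA, $2 = output interval list,
#            $3 = bin length in bp (1000 is an assumed example value).
preprocess_intervals_binned() {
  "${gatk_path:-gatk}" PreprocessIntervals \
    -R "$1" \
    --bin-length "${3:-1000}" \
    --interval-merging-rule OVERLAPPING_ONLY \
    -O "$2"
}
```

With the variables from my command above, this would be called as preprocess_intervals_binned "$REF" "$out/$SAMPLE.preprocessed.interval_list" 1000.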
During the exploration, I also ran into the error java.lang.OutOfMemoryError: GC overhead limit exceeded
which can be fixed by increasing heap size such as
--java-options "-Xmx100g"
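Note that --java-options goes before the tool name on the gatk command line. A minimal sketch of a helper that does this is below; the function name gatk_with_heap is hypothetical, and the heap size you need depends on your data and machine.

```shell
# Sketch: run any GATK tool with a custom maximum Java heap size.
# Arguments: $1 = heap size (e.g. 100g), remaining = tool name and its arguments.
gatk_with_heap() {
  heap="$1"
  shift
  "${gatk_path:-gatk}" --java-options "-Xmx${heap}" "$@"
}
```

For example, gatk_with_heap 100g PreprocessIntervals -R "$REF" ... runs the tool with -Xmx100g.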
Thanks for the great tool and help!
Yuwei