How can I remove chr_Un* from somatic CNV plots?
Can you please provide
a) GATK version used - gatk 4.1.6.0
(Snapshot and seg file attached.)
Hello,
How can I remove the chr_Un* regions from my copy number plots?
My modelFinal segments file doesn't contain any chr_Un contigs.
`
sa@sa-mbp ~ % less 8c444e42-2836-489c-8d01-aa8bb6842ec1_CNVSomaticPairWorkflow_8fc68c47-e081-4fde-ba35-69594b5a09f9_call-ModelSegmentsTumor_out_353.hg38.modelFinal.seg | grep -v "^@" | head
CONTIG START END NUM_POINTS_COPY_RATIO NUM_POINTS_ALLELE_FRACTION LOG2_COPY_RATIO_POSTERIOR_10 LOG2_COPY_RATIO_POSTERIOR_50 LOG2_COPY_RATIO_POSTERIOR_90 MINOR_ALLELE_FRACTION_POSTERIOR_10 MINOR_ALLELE_FRACTION_POSTERIOR_50 MINOR_ALLELE_FRACTION_POSTERIOR_90
chr1 14371 248937297 28109 2392 0.025404 0.035754 0.042479 0.489942 0.496955 0.499533
chr2 667105 15947289 839 95 0.029410 0.037730 0.046698 0.468089 0.488506 0.497851
chr2 16549224 89245753 6359 523 -0.230558 -0.227346 -0.221495 0.384030 0.393627 0.400412
chr2 89268176 89268698 1 0 -29.904706 -29.622139 -29.383854 NaN NaN NaN
chr2 89851557 169206858 5459 339 -0.223373 -0.219118 -0.213770 0.382595 0.391044 0.399919
chr2 169206859 241508948 7952 645 0.035662 0.047521 0.054743 0.487259 0.492968 0.498885
chr3 2238744 89482096 8140 595 -0.241909 -0.231896 -0.229027 0.383767 0.391069 0.400615
chr3 90202096 198177415 8257 782 0.039150 0.045634 0.050130 0.476704 0.493682 0.499261
chr4 1071379 8962092 1229 92 -0.194266 -0.182421 -0.174564 0.436532 0.443561 0.457972
sa@sa-mbp ~ % less 8c444e42-2836-489c-8d01-aa8bb6842ec1_CNVSomaticPairWorkflow_8fc68c47-e081-4fde-ba35-69594b5a09f9_call-ModelSegmentsTumor_out_353.hg38.modelFinal.seg | grep -v "^@" | tail
chr19 280762 47981260 11936 664 0.040364 0.047642 0.056114 0.474418 0.494501 0.498649
chr19 47991630 58370149 3301 234 0.255312 0.261823 0.270161 0.408744 0.419721 0.428110
chr20 478289 48638171 5009 370 0.023828 0.033652 0.043369 0.482503 0.495500 0.499050
chr20 48639522 64328210 1732 139 0.212166 0.222440 0.230756 0.414498 0.421943 0.429038
chr21 9026773 40369500 1750 168 0.048497 0.059627 0.072271 0.475200 0.491640 0.499016
chr21 40692425 41508499 88 8 0.895854 0.920153 0.938943 0.288503 0.301085 0.323544
chr21 41712533 46267397 854 82 0.112577 0.122776 0.138301 0.459592 0.475863 0.496405
chr22 15527854 50799370 5399 340 0.054828 0.062518 0.070282 0.482935 0.492056 0.498594
chrX 283856 156028226 8918 344 0.043786 0.048812 0.054462 0.474132 0.488578 0.497314
chrY 11332178 11396756 3 0 -0.281178 -0.143497 -0.004919 NaN NaN NaN
`
b) Exact GATK commands used
`Running: java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -Xmx6000m -jar /root/gatk.jar PlotModeledSegments --denoised-copy-ratios /cromwell_root/fc-ac4624cb-a8fc-49a2-b071-d3a0ae799418/8c444e42-2836-489c-8d01-aa8bb6842ec1/CNVSomaticPairWorkflow/8fc68c47-e081-4fde-ba35-69594b5a09f9/call-DenoiseReadCountsTumor/353.hg38.denoisedCR.tsv --allelic-counts /cromwell_root/fc-ac4624cb-a8fc-49a2-b071-d3a0ae799418/8c444e42-2836-489c-8d01-aa8bb6842ec1/CNVSomaticPairWorkflow/8fc68c47-e081-4fde-ba35-69594b5a09f9/call-ModelSegmentsTumor/out/353.hg38.hets.tsv --segments /cromwell_root/fc-ac4624cb-a8fc-49a2-b071-d3a0ae799418/8c444e42-2836-489c-8d01-aa8bb6842ec1/CNVSomaticPairWorkflow/8fc68c47-e081-4fde-ba35-69594b5a09f9/call-ModelSegmentsTumor/out/353.hg38.modelFinal.seg --sequence-dictionary /cromwell_root/gcp-public-data--broad-references/hg38/v0/Homo_sapiens_assembly38.dict --minimum-contig-length 1000000 --output out --output-prefix 353.hg38`
c) The entire error log if applicable
`
2020/07/28 02:33:29 Starting container setup. 2020/07/28 02:33:34 Done container setup. 2020/07/28 02:33:35 Starting localization. 2020/07/28 02:33:42 Localization script execution started... 2020/07/28 02:33:42 Localizing input gs://gcp-public-data--broad-references/hg38/v0/Homo_sapiens_assembly38.dict -> /cromwell_root/gcp-public-data--broad-references/hg38/v0/Homo_sapiens_assembly38.dict 2020/07/28 02:33:43 Localizing input gs://fc-ac4624cb-a8fc-49a2-b071-d3a0ae799418/8c444e42-2836-489c-8d01-aa8bb6842ec1/CNVSomaticPairWorkflow/8fc68c47-e081-4fde-ba35-69594b5a09f9/call-PlotModeledSegmentsTumor/script -> /cromwell_root/script 2020/07/28 02:33:44 Localizing input gs://fc-ac4624cb-a8fc-49a2-b071-d3a0ae799418/8c444e42-2836-489c-8d01-aa8bb6842ec1/CNVSomaticPairWorkflow/8fc68c47-e081-4fde-ba35-69594b5a09f9/call-DenoiseReadCountsTumor/353.hg38.denoisedCR.tsv -> /cromwell_root/fc-ac4624cb-a8fc-49a2-b071-d3a0ae799418/8c444e42-2836-489c-8d01-aa8bb6842ec1/CNVSomaticPairWorkflow/8fc68c47-e081-4fde-ba35-69594b5a09f9/call-DenoiseReadCountsTumor/353.hg38.denoisedCR.tsv 2020/07/28 02:33:45 Localizing input gs://fc-ac4624cb-a8fc-49a2-b071-d3a0ae799418/8c444e42-2836-489c-8d01-aa8bb6842ec1/CNVSomaticPairWorkflow/8fc68c47-e081-4fde-ba35-69594b5a09f9/call-ModelSegmentsTumor/out/353.hg38.hets.tsv -> /cromwell_root/fc-ac4624cb-a8fc-49a2-b071-d3a0ae799418/8c444e42-2836-489c-8d01-aa8bb6842ec1/CNVSomaticPairWorkflow/8fc68c47-e081-4fde-ba35-69594b5a09f9/call-ModelSegmentsTumor/out/353.hg38.hets.tsv 2020/07/28 02:33:47 Localizing input gs://fc-ac4624cb-a8fc-49a2-b071-d3a0ae799418/8c444e42-2836-489c-8d01-aa8bb6842ec1/CNVSomaticPairWorkflow/8fc68c47-e081-4fde-ba35-69594b5a09f9/call-ModelSegmentsTumor/out/353.hg38.modelFinal.seg -> /cromwell_root/fc-ac4624cb-a8fc-49a2-b071-d3a0ae799418/8c444e42-2836-489c-8d01-aa8bb6842ec1/CNVSomaticPairWorkflow/8fc68c47-e081-4fde-ba35-69594b5a09f9/call-ModelSegmentsTumor/out/353.hg38.modelFinal.seg Copying 
gs://fc-ac4624cb-a8fc-49a2-b071-d3a0ae799418/8c444e42-2836-489c-8d01-aa8bb6842ec1/CNVSomaticPairWorkflow/8fc68c47-e081-4fde-ba35-69594b5a09f9/call-ModelSegmentsTumor/out/353.hg38.modelFinal.seg... / [0/1 files][ 0.0 B/590.2 KiB] 0% Done / [1/1 files][590.2 KiB/590.2 KiB] 100% Done Operation completed over 1 objects/590.2 KiB. 2020/07/28 02:33:49 Localization script execution complete. 2020/07/28 02:33:52 Done localization. 2020/07/28 02:33:53 Running user action: docker run -v /mnt/local-disk:/cromwell_root --entrypoint= us.gcr.io/broad-gatk/gatk@sha256:2c0e2ba20c9beb58842ba2149efc29059bc52a5178ce05debf0f38238c0bde86 /bin/bash /cromwell_root/script Picked up _JAVA_OPTIONS: -Djava.io.tmpdir=/cromwell_root/tmp.fb074081 02:33:57.924 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/gatk/gatk-package-4.1.6.0-local.jar!/com/intel/gkl/native/libgkl_compression.so 02:33:58.235 INFO PlotModeledSegments - ------------------------------------------------------------ 02:33:58.236 INFO PlotModeledSegments - The Genome Analysis Toolkit (GATK) v4.1.6.0 02:33:58.236 INFO PlotModeledSegments - For support and documentation go to https://software.broadinstitute.org/gatk/ 02:33:58.237 INFO PlotModeledSegments - Executing as root@ca5cc1d6b4e5 on Linux v4.19.112+ amd64 02:33:58.237 INFO PlotModeledSegments - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_212-8u212-b03-0ubuntu1.16.04.1-b03 02:33:58.237 INFO PlotModeledSegments - Start Date/Time: July 28, 2020 2:33:57 AM UTC 02:33:58.237 INFO PlotModeledSegments - ------------------------------------------------------------ 02:33:58.237 INFO PlotModeledSegments - ------------------------------------------------------------ 02:33:58.240 INFO PlotModeledSegments - HTSJDK Version: 2.21.2 02:33:58.240 INFO PlotModeledSegments - Picard Version: 2.21.9 02:33:58.240 INFO PlotModeledSegments - HTSJDK Defaults.COMPRESSION_LEVEL : 2 02:33:58.240 INFO PlotModeledSegments - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false 
02:33:58.245 INFO PlotModeledSegments - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true 02:33:58.246 INFO PlotModeledSegments - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false 02:33:58.246 INFO PlotModeledSegments - Deflater: IntelDeflater 02:33:58.246 INFO PlotModeledSegments - Inflater: IntelInflater 02:33:58.246 INFO PlotModeledSegments - GCS max retries/reopens: 20 02:33:58.246 INFO PlotModeledSegments - Requester pays: disabled 02:33:58.246 INFO PlotModeledSegments - Initializing engine 02:33:58.247 INFO PlotModeledSegments - Done initializing engine 02:33:58.265 INFO PlotModeledSegments - Reading and validating input files... 02:34:00.585 INFO PlotModeledSegments - Contigs above length threshold: {chr1=248956422, chr2=242193529, chr3=198295559, chr4=190214555, chr5=181538259, chr6=170805979, chr7=159345973, chr8=145138636, chr9=138394717, chr10=133797422, chr11=135086622, chr12=133275309, chr13=114364328, chr14=107043718, chr15=101991189, chr16=90338345, chr17=83257441, chr18=80373285, chr19=58617616, chr20=64444167, chr21=46709983, chr22=50818468, chrX=156040895, chrY=57227415, chr16_KI270728v1_random=1872759, chr5_GL339449v2_alt=1612928, chr6_GL000250v2_alt=4672374, chr7_KI270803v1_alt=1111570, chr14_KI270847v1_alt=1511111, chr14_KI270846v1_alt=1351393, chr16_KI270853v1_alt=2659700, chr17_KI270857v1_alt=2877074, chr17_GL000258v2_alt=1821992, chr5_KI270897v1_alt=1144418, chr6_GL000251v2_alt=4795265, chr15_KI270905v1_alt=5161414, chr17_KI270908v1_alt=1423190, chr6_GL000252v2_alt=4604811, chr19_GL949748v2_alt=1064304, chr6_GL000253v2_alt=4677643, chr19_GL949749v2_alt=1091841, chr6_GL000254v2_alt=4827813, chr19_GL949750v2_alt=1066390, chr6_GL000255v2_alt=4606388, chr19_GL949751v2_alt=1002683, chr6_GL000256v2_alt=4929269, chr19_KI270938v1_alt=1066800} 02:34:00.763 INFO PlotModeledSegments - Writing plot to /cromwell_root/out/353.hg38.modeled.png... 02:34:02.224 INFO PlotModeledSegments - PlotModeledSegments complete. 
02:34:02.224 INFO PlotModeledSegments - Shutting down engine [July 28, 2020 2:34:02 AM UTC] org.broadinstitute.hellbender.tools.copynumber.plotting.PlotModeledSegments done. Elapsed time: 0.07 minutes. Runtime.totalMemory()=311951360 Using GATK jar /root/gatk.jar defined in environment variable GATK_LOCAL_JAR Running: java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -Xmx6000m -jar /root/gatk.jar PlotModeledSegments --denoised-copy-ratios /cromwell_root/fc-ac4624cb-a8fc-49a2-b071-d3a0ae799418/8c444e42-2836-489c-8d01-aa8bb6842ec1/CNVSomaticPairWorkflow/8fc68c47-e081-4fde-ba35-69594b5a09f9/call-DenoiseReadCountsTumor/353.hg38.denoisedCR.tsv --allelic-counts /cromwell_root/fc-ac4624cb-a8fc-49a2-b071-d3a0ae799418/8c444e42-2836-489c-8d01-aa8bb6842ec1/CNVSomaticPairWorkflow/8fc68c47-e081-4fde-ba35-69594b5a09f9/call-ModelSegmentsTumor/out/353.hg38.hets.tsv --segments /cromwell_root/fc-ac4624cb-a8fc-49a2-b071-d3a0ae799418/8c444e42-2836-489c-8d01-aa8bb6842ec1/CNVSomaticPairWorkflow/8fc68c47-e081-4fde-ba35-69594b5a09f9/call-ModelSegmentsTumor/out/353.hg38.modelFinal.seg --sequence-dictionary /cromwell_root/gcp-public-data--broad-references/hg38/v0/Homo_sapiens_assembly38.dict --minimum-contig-length 1000000 --output out --output-prefix 353.hg38 2020/07/28 02:34:03 Starting delocalization. 2020/07/28 02:34:04 Delocalization script execution started... 
2020/07/28 02:34:04 Delocalizing output /cromwell_root/memory_retry_rc -> gs://fc-ac4624cb-a8fc-49a2-b071-d3a0ae799418/8c444e42-2836-489c-8d01-aa8bb6842ec1/CNVSomaticPairWorkflow/8fc68c47-e081-4fde-ba35-69594b5a09f9/call-PlotModeledSegmentsTumor/memory_retry_rc 2020/07/28 02:34:04 Delocalizing output /cromwell_root/rc -> gs://fc-ac4624cb-a8fc-49a2-b071-d3a0ae799418/8c444e42-2836-489c-8d01-aa8bb6842ec1/CNVSomaticPairWorkflow/8fc68c47-e081-4fde-ba35-69594b5a09f9/call-PlotModeledSegmentsTumor/rc 2020/07/28 02:34:06 Delocalizing output /cromwell_root/stdout -> gs://fc-ac4624cb-a8fc-49a2-b071-d3a0ae799418/8c444e42-2836-489c-8d01-aa8bb6842ec1/CNVSomaticPairWorkflow/8fc68c47-e081-4fde-ba35-69594b5a09f9/call-PlotModeledSegmentsTumor/stdout 2020/07/28 02:34:07 Delocalizing output /cromwell_root/stderr -> gs://fc-ac4624cb-a8fc-49a2-b071-d3a0ae799418/8c444e42-2836-489c-8d01-aa8bb6842ec1/CNVSomaticPairWorkflow/8fc68c47-e081-4fde-ba35-69594b5a09f9/call-PlotModeledSegmentsTumor/stderr 2020/07/28 02:34:08 Delocalizing output /cromwell_root/out/353.hg38.modeled.png -> gs://fc-ac4624cb-a8fc-49a2-b071-d3a0ae799418/8c444e42-2836-489c-8d01-aa8bb6842ec1/CNVSomaticPairWorkflow/8fc68c47-e081-4fde-ba35-69594b5a09f9/call-PlotModeledSegmentsTumor/out/353.hg38.modeled.png 2020/07/28 02:34:09 Delocalization script execution complete. 2020/07/28 02:34:10 Done delocalization.
`
Thanks
Sam
-
Hi sahuno, I am looking at this tutorial and found a note that may help you with this:
Section 8: Plot modeled copy ratio and allelic fraction segments with PlotModeledSegments
Comments on select parameters
- The tutorial provides the --sequence-dictionary that matches the GRCh38 reference used in mapping.
- To omit alternate and decoy contigs from the plots, the tutorial adjusts the --minimum-contig-length from the default value of 1,000,000 to 46,709,983, the length of the smallest of GRCh38's primary assembly contigs.
Also, below that note are options for interactively visualizing the data, since the -L option for intervals is not available.
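To see why the threshold matters, here is a minimal Python sketch (not part of GATK) that reads contig lengths from sequence-dictionary text and reports which contigs would pass a given --minimum-contig-length. The helper names, the regex for primary contigs, and the `>=` comparison are assumptions made for illustration; the sample lengths are copied from the "Contigs above length threshold" line in the log above.

```python
import re

# Assumption: primary assembly contigs are named chr1..chr22, chrX, chrY.
PRIMARY = re.compile(r"^chr(\d{1,2}|X|Y)$")

# A few @SQ lines with lengths taken from the PlotModeledSegments log above.
SAMPLE_DICT = "\n".join([
    "@HD\tVN:1.6",
    "@SQ\tSN:chr21\tLN:46709983",
    "@SQ\tSN:chr22\tLN:50818468",
    "@SQ\tSN:chrY\tLN:57227415",
    "@SQ\tSN:chr16_KI270728v1_random\tLN:1872759",
    "@SQ\tSN:chr6_GL000256v2_alt\tLN:4929269",
])


def read_contig_lengths(dict_text):
    """Parse the @SQ lines of a sequence dictionary into {name: length}."""
    lengths = {}
    for line in dict_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        # Each field looks like TAG:VALUE; split only on the first colon.
        fields = dict(f.split(":", 1) for f in line.split("\t")[1:] if ":" in f)
        lengths[fields["SN"]] = int(fields["LN"])
    return lengths


def smallest_primary_length(lengths):
    """Length of the shortest contig whose name matches PRIMARY."""
    return min(n for name, n in lengths.items() if PRIMARY.match(name))


def contigs_passing(lengths, minimum_contig_length):
    """Contigs at least as long as the threshold (assumed >= comparison)."""
    return [name for name, n in lengths.items() if n >= minimum_contig_length]


lengths = read_contig_lengths(SAMPLE_DICT)
threshold = smallest_primary_length(lengths)
print(threshold)
print(contigs_passing(lengths, 1_000_000))
print(contigs_passing(lengths, threshold))
```

With the default of 1,000,000 the random and alt contigs in this sample still pass, whereas the chr21 length keeps only the primary contigs. That value can then be supplied as --minimum-contig-length in place of the 1000000 seen in the posted command.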
-
Genevieve Brandt (she/her), thanks!
That was helpful!
SA