Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

SelectVariants v4.1.6.0 doesn't select the variants as expected

5 comments

  • Bhanu Gandham

    Hi ABours,

    I am looking into this and will get back to you soon.

  • Bhanu Gandham

    Hi ABours,

    Can you please share a few VCF variant records of type 5 (spanning deletion + SNP)? Also, please share the exact command you used to generate that VCF, i.e. SelectVariants with --select-type-to-include SNP. This will help us troubleshoot the issue.

  • ABours

    Hi Bhanu,

    Thank you for looking into this.

    The command I used was (gatk4 here is v4.1.6.0):

    java -Xms24G -Xmx24G -jar ${gatk4} SelectVariants -R ${ref} -V final_v4.1.6.0.vcf.gz -select-type SNP -O final_snp_v4.1.6.0.vcf.gz

    And these are a few of those type 5 sites from the subsetted VCF:

    chr_1 3632 . A C,* 55.53 . AC=1,7;AF=0.050,0.350;AN=20;BaseQRankSum=0.060;DP=233;ExcessHet=7.9825;FS=2.137;InbreedingCoeff=-0.0586;MLEAC=2,12;MLEAF=0.100,0.600;MQ=51.08;MQRankSum=-1.981e+00;QD=0.87;ReadPosRankSum=-1.465e+00;SOR=0.809 GT:AD:DP:GQ:PGT:PID:PL:PS 2|2:0,0,3:3:9:1|1:3628_CGGGATGGGACAGAGATCCCT_C:135,135,135,9,9,0:3628 ./.:11,0,0:11:.:.:.:0,0,0,0,0,0 ./.:20,0,0:20:.:.:.:0,0,0,0,0,0 0|1:5,2,0:7:55:0|1:3632_A_C:55,0,204,70,210,280:3632 0|2:5,0,9:14:99:0|1:3628_CGGGATGGGACAGAGATCCCT_C:348,363,573,0,210,182:3628 ./.:22,0,0:22:.:.:.:0,0,0,0,0,0 ./.:12,0,0:12:.:.:.:0,0,0,0,0,0 0/0:17,0,0:17:0:.:.:0,0,155,0,155,155 ./.:0,0,0:0:.:.:.:0,0,0,0,0,0 0/0:5,0,0:5:0:.:.:0,0,117,0,117,117 ./.:16,0,0:16:.:.:.:0,0,0,0,0,0 0|2:2,0,7:9:62:0|1:3628_CGGGATGGGACAGAGATCCCT_C:289,295,379,0,84,62:3628 ./.:13,0,0:13:.:.:.:0,0,0,0,0,0 0|2:3,0,3:6:95:0|1:3629_GGGATGGGA_G:95,104,230,0,126,117:3629 ./.:28,0,0:28:.:.:.:0,0,0,0,0,0 0/0:6,0,0:6:15:.:.:0,15,225,15,225,225 0|2:8,0,5:13:99:0|1:3628_CGGGATGGGACAGAGATCCCT_C:188,212,511,0,300,283:3628 0|2:3,0,9:12:80:0|1:3628_CGGGATGGGACAGAGATCCCT_C:370,379,487,0,108,80:3628 ./.:11,0,0:11:.:.:.:0,0,0,0,0,0
    chr_1 3633 . T *,C,A 1545.11 . AC=7,16,1;AF=0.250,0.571,0.036;AN=28;BaseQRankSum=1.15;DP=164;ExcessHet=4.0770;FS=0.000;InbreedingCoeff=0.1758;MLEAC=9,19,1;MLEAF=0.321,0.679,0.036;MQ=36.05;MQRankSum=0.00;QD=16.79;ReadPosRankSum=1.15;SOR=0.774 GT:AD:DP:GQ:PGT:PID:PL:PS 1|1:0,3,0,0:3:9:1|1:3628_CGGGATGGGACAGAGATCCCT_C:135,9,0,135,9,135,135,9,135,135:3628 ./.:11,0,0,0:11:.:.:.:0,0,0,0,0,0,0,0,0,0 2|2:0,0,2,0:2:6:1|1:3633_T_C:90,90,90,6,6,0,90,90,6,90:3633 0|3:6,0,0,2:8:52:0|1:3632_A_C:52,70,303,70,303,303,0,233,233,227:3632 1/2:0,9,5,0:14:99:.:.:553,184,183,365,0,347,554,199,364,563 2|2:0,0,3,0:3:9:1|1:3633_T_C:119,119,119,9,9,0,119,119,9,119:3633 2|2:0,0,1,0:1:3:1|1:3633_T_C:42,42,42,3,3,0,42,42,3,42:3633 2|2:0,0,8,0:8:24:1|1:3633_T_C:352,352,352,24,24,0,352,352,24,352:3633 ./.:0,0,0,0:0:.:.:.:0,0,0,0,0,0,0,0,0,0 0|2:1,0,3,0:4:20:0|1:3633_T_C:108,111,139,0,29,20,111,139,29,139:3633 2/2:0,0,4,0:4:13:.:.:181,181,181,13,13,0,181,181,13,181 0|1:2,7,0,0:9:62:0|1:3628_CGGGATGGGACAGAGATCCCT_C:289,0,62,295,84,379,295,84,379,379:3628 ./.:13,0,0,0:13:.:.:.:0,0,0,0,0,0,0,0,0,0 1/2:0,3,3,0:6:96:.:.:196,99,116,104,0,96,202,114,105,214 ./.:28,0,0,0:28:.:.:.:0,0,0,0,0,0,0,0,0,0 2/2:0,0,6,0:6:18:.:.:245,245,245,18,18,0,245,245,18,245 0|1:8,5,0,0:13:99:0|1:3628_CGGGATGGGACAGAGATCCCT_C:188,0,283,212,300,511,212,300,511,511:3628 1/2:1,9,2,0:12:56:.:.:434,59,56,353,0,373,437,84,378,462 ./.:11,0,0,0:11:.:.:.:0,0,0,0,0,0,0,0,0,0
    chr_1 3736 . G A,* 232.35 . AC=2,1;AF=0.053,0.026;AN=38;BaseQRankSum=1.05;DP=229;ExcessHet=3.3995;FS=2.781;InbreedingCoeff=-0.1209;MLEAC=2,1;MLEAF=0.053,0.026;MQ=34.01;MQRankSum=1.56;QD=5.16;ReadPosRankSum=1.25;SOR=1.312 GT:AD:DP:GQ:PGT:PID:PL:PS 0/0:3,0,0:3:9:.:.:0,9,74,9,74,74 0/0:10,0,0:10:30:.:.:0,30,327,30,327,327 0/0:6,0,0:6:18:.:.:0,18,241,18,241,241 0/0:25,0,0:25:66:.:.:0,66,990,66,990,990 0|1:7,3,2:12:99:0|1:3628_CGGGATGGGACAGAGATCCCT_C:105,0,270,126,279,405:3628 0/0:5,0,0:5:15:.:.:0,15,140,15,140,140 0/0:5,0,0:5:15:.:.:0,15,150,15,150,150 0|2:15,1,6:22:99:0|1:3633_T_C:203,248,841,0,592,574:3633 0/0:9,0,0:9:27:.:.:0,27,310,27,310,310 0/0:2,0,0:2:6:.:.:0,6,37,6,37,37 0/0:15,0,0:15:42:.:.:0,42,630,42,630,630 0|1:7,4,0:11:99:0|1:3628_CGGGATGGGACAGAGATCCCT_C:147,0,267,168,279,447:3628 0/0:3,0,0:3:9:.:.:0,9,92,9,92,92 0/0:9,0,0:9:27:.:.:0,27,319,27,319,319 0/0:39,0,0:39:80:.:.:0,80,1355,80,1355,1355 0/0:10,0,0:10:0:.:.:0,0,151,0,151,151 0/0:17,0,0:17:10:.:.:0,10,534,10,534,534 0/0:9,0,0:9:24:.:.:0,24,360,24,360,360 0/0:16,0,0:16:29:.:.:0,29,536,29,536,536

    If you want the numbers: the output file of my command contains ~30.3 million SNPs, and 2.5% of these are type 5 sites.

    I hope this helps,
  • Bhanu Gandham

    Hi ABours,

    Looking into the code, it seems the tool is doing what it's supposed to. To clarify, when we say MIXED we mean SNP/Indel + Symbolic. Symbolic variation is represented like this: <*>.

    SelectVariants actually ignores spanning deletions, which are represented by *.

    I agree this is confusing, but I hope this explanation helps.
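
The behaviour described above can be illustrated with a short Python sketch. It is a simplified model written for this thread, not the actual GATK or htsjdk implementation: the names `classify_site` and `is_symbolic` are hypothetical, and treating any combination of different allele types as MIXED is an assumption that goes slightly beyond the "SNP/Indel + Symbolic" wording. What it shows is that once `*` is skipped, a site such as `A -> C,*` is classified as a plain SNP and therefore passes a SNP-only selection.

```python
from __future__ import annotations

# Simplified model of the behaviour described in this thread.
# Hypothetical names; this is NOT GATK's actual code.
SPANNING_DELETION = "*"


def is_symbolic(allele: str) -> bool:
    """Symbolic alleles are written in angle brackets, e.g. <*>."""
    return allele.startswith("<") and allele.endswith(">")


def classify_site(ref: str, alts: list[str]) -> str:
    """Classify a site by its ALT alleles, skipping spanning deletions."""
    kinds = set()
    for alt in alts:
        if alt == SPANNING_DELETION:
            continue  # '*' is ignored, as explained in the reply above
        if is_symbolic(alt):
            kinds.add("SYMBOLIC")
        elif len(alt) == len(ref):
            kinds.add("SNP" if len(ref) == 1 else "MNP")
        else:
            kinds.add("INDEL")
    if not kinds:
        return "NO_VARIATION"  # only '*' (or nothing) in ALT
    if len(kinds) == 1:
        return next(iter(kinds))
    return "MIXED"  # assumption: any mix of different types


# REF/ALT pairs taken from the records posted earlier in the thread
print(classify_site("A", ["C", "*"]))       # SNP
print(classify_site("T", ["*", "C", "A"]))  # SNP
print(classify_site("A", ["C", "<*>"]))     # MIXED
```

Under this model, the three example records (positions 3632, 3633 and 3736) all come out as SNP, which matches what was observed in the subsetted file.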

  • ABours

    Hi Bhanu,

    OK, it's good to know that the tool does what the code says. The confusion remains, but I can work with it.

    I am mainly questioning whether this behaviour matches what a user would expect. Of course, I realize that I'll have to work with what I've got (at least for the moment). I mainly just wanted to provide some feedback and a suggestion on one of the tools in your very nice toolset.

    Thanks for checking this, and best,
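
To gauge how many records are affected, or to set them aside for separate handling, one option is to scan the SelectVariants output for records whose ALT column contains `*`. The following is a minimal sketch, assuming an uncompressed or gzip-compressed VCF. The function names are hypothetical and not part of GATK. It splits on any whitespace so that it also works on the space-separated records pasted in this thread; real VCF files are tab-delimited.

```python
import gzip
import sys


def open_vcf(path):
    """Open a plain or gzip-compressed VCF in text mode."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path, "rt")


def has_spanning_deletion(vcf_line):
    """Return True if the ALT column (5th field) lists the '*' allele."""
    fields = vcf_line.split(None, 5)  # only the first five columns are needed
    return "*" in fields[4].split(",")


def count_spanning_deletion_sites(lines):
    """Return (total records, records with '*' in ALT), skipping headers."""
    total = flagged = 0
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        total += 1
        if has_spanning_deletion(line):
            flagged += 1
    return total, flagged


if __name__ == "__main__" and len(sys.argv) > 1:
    # Example: python count_spandel.py final_snp_v4.1.6.0.vcf.gz
    with open_vcf(sys.argv[1]) as handle:
        total, flagged = count_spanning_deletion_sites(handle)
    share = 100.0 * flagged / total if total else 0.0
    print(f"{flagged} of {total} records ({share:.1f}%) have '*' in ALT")
```

On a file that has already been restricted to SNPs, the flagged records correspond to the "type 5" (spanning deletion + SNP) sites discussed above. The same `has_spanning_deletion` check could be used to write those records to a separate file instead of counting them.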

     
