Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

ASEReadCounter Only Outputting the First Chromosome

Answered

3 comments

  • Leslie Youtsey

    I couldn't add this to my post because I had reached the maximum number of characters.

    d) File output (head and tail of one file) 

    contig position variantID refAllele altAllele refCount altCount totalCount lowMAPQDepth lowBaseQDepth rawDepth otherBases improperPairs
    NC_059305.1 88554 . C T 1 2 3 0 0
    NC_059305.1 306126 . G A 6 3 9 0 0
    NC_059305.1 306130 . C T 5 3 8 0 0
    NC_059305.1 318568 . C T 3 5 8 0 17 0 9
    NC_059305.1 379937 . C T 3 2 5 0 0
    NC_059305.1 1098226 . G T 1 2 3 0 0
    NC_059305.1 1098246 . A G 1 2 3 0 0
    NC_059305.1 1100651 . A G 1 3 4 0 0
    NC_059305.1 1129066 . C T 3 3 6 0 1
    ...
    NC_059305.1 2903861 . A G 1 1 2 0 1
    NC_059305.1 2903985 . A G 2 3 5 0 1
    NC_059305.1 2904143 . T C 8 4 12 0 13 0 1
    NC_059305.1 2904663 . A G 1 4 5 0 3
    NC_059305.1 2904681 . A G 1 4 5 0 3
    NC_059305.1 2904730 . T C 5 4 9 0 10 0 1
    NC_059305.1 2904743 . A G 5 4 9 0 10 0 1
    NC_059305.1 2904792 . C G 4 4 8 0 12 0 4
    NC_059305.1 2904796 . C T 4 4 8 0 12 0 4
    NC_059305.1 2904797 . A T 4 4 8 0 12 0 4
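Checking the head and tail shows only the first and last rows. A quick way to confirm that the whole table covers a single contig is to tally data rows per contig. The following is a minimal sketch, assuming the output is a whitespace-delimited table with one header line and the contig name in the first column. The function name `count_rows_per_contig` and the file name in the usage line are hypothetical.

```shell
# Tally data rows per contig in an ASEReadCounter-style output table.
# Assumption: line 1 is the header and column 1 is the contig name.
count_rows_per_contig() {
    # Skip the header (NR > 1) and blank lines (NF > 0), count rows by
    # the first column, then print "contig<TAB>count" sorted by contig.
    awk 'NR > 1 && NF > 0 { n[$1]++ } END { for (c in n) print c "\t" n[c] }' "$1" | sort
}

# Usage (hypothetical file name):
# count_rows_per_contig sample.ase.table
```

If this prints a single line, every row in the file belongs to the same contig, which matches the behavior described in the thread title.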

  • Anthony DiCi

    Hi Leslie Youtsey,

    Thank you again for writing to the GATK forum! Let’s see how we can solve this.

    I found a seemingly identical issue that another user ran into last year. They were also getting the “more than one variant context at position” error message when running ASEReadCounter. At the bottom of this GitHub thread, bw2 posted a potential solution.

    The error seems to be caused by the input VCF “having two SNPs on separate rows but with the same position… or by an INDEL that overlaps a SNP”.

    They suggest first filtering out indels and multiallelic sites before running ASEReadCounter. Since it seems to have led them to success, I suggest first trying this out.

    The command they included to filter out the indels and multiallelic sites is as follows:

    gatk SelectVariants -R hg38.fa --variant dna_variants.vcf.bgz --restrict-alleles-to BIALLELIC -select 'vc.getHetCount()==1' --select-type-to-include SNP -O dna_variants.selected.vcf.bgz

    bcftools norm --rm-dup all dna_variants.selected.vcf.bgz | bgzip > out.vcf.gz
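To check whether the input VCF actually contains the first condition described above (two records at the same position), the positions can be counted before and after filtering. The following is a minimal sketch, assuming uncompressed VCF text arrives on standard input. The function name `list_duplicate_positions` is hypothetical, and the check does not detect an indel that overlaps a SNP at a different start position.

```shell
# List CHROM/POS pairs that appear on more than one record of a VCF.
# Reads uncompressed VCF text on stdin; lines starting with '#' are skipped.
list_duplicate_positions() {
    # Count records per (CHROM, POS) and print "CHROM<TAB>POS<TAB>count"
    # for every pair seen more than once, sorted for stable output.
    awk -F '\t' '!/^#/ { n[$1 "\t" $2]++ } END { for (k in n) if (n[k] > 1) print k "\t" n[k] }' | sort
}

# Usage (file name taken from the command above; bgzip output is gzip-compatible):
# gzip -dc < dna_variants.vcf.bgz | list_duplicate_positions
```

Empty output means no two records share a position. Running the same check on the filtered file is a simple way to confirm that the filtering removed the duplicates.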

    I hope this helps! Please let me know if this leads you to success. If you have any questions in the meantime, please do not hesitate to reach back out.

    Best,
    Anthony

  • Anthony DiCi

    Hi Leslie Youtsey,

    We haven’t heard from you in a while, so we will be closing out your ticket in our system. If you still need assistance, just reply to this thread and we’ll create a follow-up ticket to pick up where we left off.

    Thank you again for contributing to our GATK forum!

    Best,

    Anthony
