Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

How to stop FastaAlternateReferenceMaker from renaming my chromosomes?

6 comments

  • David Rinker

    Just following up. I still have not found an option to stop this behavior.

    The workaround I implemented is simple enough:

    sed -i "s/^>1/>chr${CHR}/" chr${CHR}${INDIV}.fa

    However, note that the dictionary and index that GATK writes alongside the output will then need to be deleted and rebuilt.

    I would suggest that this is not ideal default behavior, and that carrying over the chromosome labels from the input files would be more expected and useful.
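
For readers who want the whole workaround in one place, here is a minimal sketch of the rename followed by the index and dictionary rebuild. It is not an official GATK recipe. The values of CHR and INDIV, the toy FASTA, and the assumption that the header starts with ">1" (optionally followed by a space and more text) are all placeholders for illustration. The rebuild steps assume `samtools faidx` and `gatk CreateSequenceDictionary` are on your PATH, and that the dictionary sits next to the FASTA with a `.dict` extension.

```shell
#!/usr/bin/env bash
# Sketch only: rename the single ">1" record, then rebuild .fai and .dict.
cd "$(mktemp -d)" || exit 1

# Placeholder values; substitute your own chromosome and sample suffix.
CHR=12
INDIV=_sample
fasta="chr${CHR}${INDIV}.fa"

# Toy stand-in for a FastaAlternateReferenceMaker output file (assumed header shape).
printf '>1 chr12:1-8\nACGTACGT\n' > "$fasta"

# Replace only the leading numeric name on the header line; keep anything after it.
sed -i.bak -E "s/^>1([[:space:]]|\$)/>chr${CHR}\1/" "$fasta"

# Remove the stale index and dictionary written for the old record name.
rm -f "${fasta}.fai" "${fasta%.fa}.dict"

# Rebuild them if the tools are installed (silently skipped otherwise).
if command -v samtools >/dev/null 2>&1; then
    samtools faidx "$fasta"
fi
if command -v gatk >/dev/null 2>&1; then
    gatk CreateSequenceDictionary -R "$fasta"
fi

# Show the renamed header.
head -n 1 "$fasta"
```

Anchoring the pattern to the start of the line and requiring whitespace or end of line after the "1" keeps the substitution from touching a record named, say, ">10" if the file ever holds more than one sequence.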

  • Genevieve Brandt

    Hi David Rinker, could you provide the GATK version, the complete command, and a more specific example of the problem you are seeing?

  • David Rinker

    Sure. It's a simple command that I am running on a per-chromosome basis:

    java -jar /bin/gatk-4.1.7.0/gatk-package-4.1.7.0-local.jar FastaAlternateReferenceMaker \
    -R /data/dna/human/hg19/chr${CHR}.fa \
    -V chr${CHR}${INDIV}.vcf.gz \
    -O chr${CHR}${INDIV}_hg19_full.fa

    In the output FASTA, the input chromosome name (e.g. >chr12) is always replaced by ">1".

     

  • Genevieve Brandt

    Hi David Rinker, thank you. According to the tool documentation, this is expected behavior: the sequences are named numerically in order to account for disjoint intervals. It looks like the best way around it is to rename the sequences after running FastaAlternateReferenceMaker. Sorry I was not able to provide an easier solution.

    We cannot guarantee any changes, but I'll make note of this as a feature request for the GATK team to consider in the future.
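
If the output holds several numerically named records rather than one, the same post-hoc renaming can be done by position. The sketch below is an assumption-laden illustration, not something from the tool documentation: `alt.fa`, `names.txt`, and `alt.renamed.fa` are hypothetical file names, and it assumes there is exactly one desired name per output record, listed in the same order as the records appear. When disjoint intervals are used, records may not map one-to-one onto contigs, so check the list by hand first.

```shell
#!/usr/bin/env bash
# Sketch only: map the Nth numeric header to the Nth name in a list.
cd "$(mktemp -d)" || exit 1

# Toy output with numerically named records (stand-in for real tool output).
printf '>1\nACGT\n>2\nGGCC\n>3\nTTAA\n' > alt.fa

# Hypothetical name list: one desired name per record, in output order.
printf 'chr1\nchr2\nchrX\n' > names.txt

# First pass (NR == FNR) loads the names; second pass swaps each header's
# leading number N for the Nth name and leaves sequence lines untouched.
awk 'NR == FNR { name[FNR] = $1; next }
     /^>/ {
         n = substr($1, 2) + 0          # numeric index that follows ">"
         if (n in name) { sub(/^>[0-9]+/, ">" name[n]) }
     }
     { print }' names.txt alt.fa > alt.renamed.fa

# List the renamed headers.
grep '^>' alt.renamed.fa
```

As with the single-record case, any existing .fai and .dict files describe the old names and would need to be rebuilt for the renamed file.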

  • David Rinker

    Thank you. Once I figured out I needed to rename them manually AND THEN rebuild the fai and dict files, everything fell into place.

  • Genevieve Brandt

    David Rinker thanks for posting your solution! I am sure other GATK users will find this helpful in the future.

