Could I use BAM files from HISAT2 to call SNPs/indels?
-
Official comment
Hi Qun,
Yes, as long as the output from HISAT2 is a valid BAM file. Our Best Practices recommendation uses BWA, but you should be able to use HISAT2 if you prefer it. If you run into issues when you try to use these BAM files, let us know and we can help troubleshoot.
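One way to check that a BAM is valid before variant calling is the Picard ValidateSamFile tool bundled with GATK4. The sketch below is only an illustration: the function name `validate_bam` and the file name in the usage comment are hypothetical, and it assumes `gatk` is on your PATH.

```shell
# Minimal sketch: summarize validation errors and warnings for one BAM.
# validate_bam is a hypothetical helper name, not part of GATK.
validate_bam() {
    # SUMMARY mode prints a table of error/warning counts instead of
    # listing every problematic record.
    gatk ValidateSamFile -I "$1" --MODE SUMMARY
}

# Usage (hypothetical file name):
#   validate_bam mySample.bam
```

If the summary reports errors, they usually need to be fixed before the BAM is passed to downstream GATK tools.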
Best,
Genevieve
-
Hi Qun,
The GATK support team is focused on resolving questions about GATK tool-specific errors and abnormal results from the tools. For all other questions, such as this one, we are building a backlog to work through when we have the capacity.
Please continue to post your questions because we will be mining them for improvements to documentation, resources, and tools.
We cannot guarantee a reply; however, we ask other community members to help out if they know the answer.
For context, check out our support policy.
-
Hi, I'm struggling to run SplitNCigarReads. I used HISAT2 for my RNA-seq mapping, so I have eight .ht2 index files, and they are not in FASTA format. Do you have any suggestions for solving this reference index error?
- gatk SplitNCigarReads -R Homo_sapiens.GRCh38.dna.primary_assembly.fa -I mySampleB68dupli.bam -O mySample68snc.bam
I used this command.
-
Hello, Ekin. Your BAM should work regardless of what method you used to create it, assuming it was properly converted.
If you are receiving an index reference error, please post your error log and the version of GATK you are using.
-
First of all, thank you for the fast reply.
gatk SplitNCigarReads \
-R Homo_sapiens.GRCh38.dna.primary_assembly.fa \
-I mySample68dupli.bam \
-O mySample68snc.bam
A USER ERROR has occurred: Fasta index file file:///mnt/e/thesis/data/Homo_sapiens.GRCh38.dna.primary_assembly.fa.fai for reference file:///mnt/e/thesis/data/Homo_sapiens.GRCh38.dna.primary_assembly.fa does not exist.
You can see the command I used and the error that follows.
I checked my GATK version, and it is v4.2.2.0.
I am pretty sure that my reference genome resides in the directory I'm specifying, and until now I hadn't encountered an error like this. I thought it might be due to the index format that HISAT2 uses. I have tried many options to solve this, and now I am stuck.
I hope we can solve this. Thank you.
-
Hi Ekin, it seems like GATK cannot find your reference file and/or index file. You'll encounter that "A USER ERROR has occurred" message if the files are missing, or if they are not formatted correctly.
To troubleshoot this, the first thing you need to do is make sure that the files are accessible to GATK. Check that your *.fa and *.fai files are, indeed, located within the file:///mnt/... directory that you are telling SplitNCigarReads to look in. Know that GATK requires your .fa references to be co-located with the appropriate *.dict and *.fai files. Kind of a head-slapping error, but it's always worth a sanity check.
Next, make sure that your file:///mnt/... directory is actually accessible to GATK at whatever location you are running your scripts. For example, you might run into problems if your reference files are located on your local machine, but your scripts are running on your institute's server. If this is the case, you'll need to copy your files to the location where they can be accessed by the script while it is running. You may have already tried something like this, but try to `echo` out the names for all $files within that particular directory from within your script.
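The existence and accessibility checks above can be sketched as a small shell function. This is a rough illustration, not an official GATK utility: `check_reference` is a hypothetical name, and it assumes the usual naming convention in which the index is `<reference>.fai` and the dictionary replaces the FASTA extension with `.dict`.

```shell
# Minimal sketch: report whether a FASTA and its companion files are readable
# from the machine where the script is running.
# check_reference is a hypothetical helper name.
check_reference() {
    ref="$1"
    status=0
    # Assumed naming: ref.fa -> ref.fa.fai and ref.dict
    for f in "$ref" "$ref.fai" "${ref%.*}.dict"; do
        if [ -r "$f" ]; then
            echo "found:   $f"
        else
            echo "missing: $f"
            status=1
        fi
    done
    return "$status"
}

# Usage:
#   check_reference Homo_sapiens.GRCh38.dna.primary_assembly.fa
```

Run it from inside the same script or session that calls GATK, so it sees the same file system that GATK will.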
Finally, if you are sure that (1) the files exist, and (2) the files are accessible to GATK, then the issue is most likely the content of the files themselves, as you suggested. We would recommend that you use samtools to regenerate these files before running them with SplitNCigarReads to see if they are, indeed, mis-formatted.
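As a sketch of that regeneration step, assuming `samtools` and `gatk` are both on your PATH: `samtools faidx` writes the `.fai` index next to the FASTA, and GATK's bundled `CreateSequenceDictionary` writes the `.dict` file. The function name `build_reference_indexes` is hypothetical.

```shell
# Minimal sketch: rebuild the FASTA index and sequence dictionary.
# build_reference_indexes is a hypothetical helper name.
build_reference_indexes() {
    ref="$1"
    # Writes <ref>.fai alongside the FASTA.
    samtools faidx "$ref" || return 1
    # Writes the sequence dictionary; with no output argument, the FASTA
    # extension is replaced with .dict.
    gatk CreateSequenceDictionary -R "$ref"
}

# Usage:
#   build_reference_indexes Homo_sapiens.GRCh38.dna.primary_assembly.fa
```

If an old `.dict` is already present, you may need to move or delete it first, since the dictionary tool may refuse to overwrite an existing file.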
For more information on references and how to generate them for GATK, please take a look at the following article on FASTA reference formats available in our GATK Glossary.
-
That's incredible. As you said, I included the .dict file, and the reference-related error has now disappeared. Thank you for your kind response; it was really helpful, and I hope it will help someone else as well.
-
Fantastic, I'm glad it was an easy fix. Best of luck with your project!
-
Is it pronounced Gat Kay or Gee Ay Tee Kay?