MarkDuplicates analysis of large wheat chromosomes
I am working with the wheat genome and I am seeing the following warning when marking duplicates:
WARNING 2021-11-11 12:37:03 BAMRecordCodec Reference length is too large for BAM bin field.
WARNING 2021-11-11 12:37:03 BAMRecordCodec Reads on references longer than 536870912bp will have bin set to 0.
Is there a workaround for marking duplicates on the large wheat chromosomes?
Regards,
John.
-
Hi John,
Could you share your entire command, stack trace, and GATK version? We did have a somewhat similar issue come up with the wheat genome and MarkDuplicates, but I don't have enough information about your case to know if there is a good workaround.
Best,
Genevieve
-
Hi Genevieve,
We also have the same problem with the durum wheat genome.
Warning messages:
WARNING 2023-02-23 09:22:48 BAMRecordCodec Reference length is too large for BAM bin field.
WARNING 2023-02-23 09:22:48 BAMRecordCodec Reads on references longer than 536870912bp will have bin set to 0.
The command was:
gatk MarkDuplicates -I sample.sorted.bam -M sample_dedup_metrics.txt -O sample_sorted_dedup.bam
GATK version:
The Genome Analysis Toolkit (GATK) v4.3.0.0
HTSJDK Version: 3.0.1
Picard Version: 2.27.5
Thank you in advance,
Miriam
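Note on the warning: the 536870912 bp threshold it quotes equals 2^29. To see which contigs trigger it, one option is to compare the LN values in the BAM header against that number. The following is only a rough sketch, not a fix for the warning itself. It assumes the header text has already been obtained as a string (for example from `samtools view -H sample.sorted.bam`); the function name `contigs_over_limit` and the example header, including its contig names and lengths, are made up for illustration.

```python
# Threshold quoted in the warning message: 536870912 = 2**29.
BAM_BIN_LIMIT = 2 ** 29


def contigs_over_limit(header_text, limit=BAM_BIN_LIMIT):
    """Return (name, length) for each @SQ line whose LN value exceeds limit."""
    over = []
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        # Header fields are tab-separated TAG:VALUE pairs, e.g. SN:chr1A, LN:594102056.
        fields = dict(f.split(":", 1) for f in line.split("\t")[1:] if ":" in f)
        length = int(fields["LN"])
        if length > limit:
            over.append((fields["SN"], length))
    return over


# Hypothetical header for illustration only; names and lengths are invented.
example_header = (
    "@HD\tVN:1.6\tSO:coordinate\n"
    "@SQ\tSN:chrA\tLN:600000000\n"
    "@SQ\tSN:chrB\tLN:400000000\n"
)
print(contigs_over_limit(example_header))
```

Any contig the function reports is longer than the limit named in the warning, so reads mapped to it are the ones that would have their bin set to 0.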