Research on YOLOv10-Mamba-Based Object Detection Algorithm
DOI: https://doi.org/10.56028/aetr.14.1.940.2025

Keywords: Unified convolutional neural network, YOLOv10-Mamba, spatial model.

Abstract
To address the dual challenges of limited local receptive fields in traditional convolutional neural networks and the high computational complexity of Transformer-based models in real-time object detection, this study proposes YOLOv10-Mamba, an enhanced object detection algorithm that integrates State Space Models (SSMs) with a dual-branch detection architecture. Building on Mamba-YOLO's strength in global feature modeling, the algorithm systematically reconstructs the detection head module: a one-to-many (o2m) and one-to-one (o2o) dual-branch detection head, adopted from YOLOv10, establishes a dynamic label assignment strategy and an NMS-free (post-processing-free) detection paradigm. In addition, the detection head network is restructured with depth-wise separable convolutions to effectively compress model complexity. Specifically, the o2m branch applies a dense supervision strategy to enhance feature discriminability, while the o2o branch achieves end-to-end prediction via optimal transport theory; a dynamic gradient coordination strategy synergistically optimizes the supervision signals between the two branches. Experimental results show that the improved algorithm achieves 66.9% mAP on the COCO dataset.
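As a rough illustration of why depth-wise separable convolutions compress the detection head, the parameter counts of a standard convolution and its depth-wise separable counterpart can be compared. This is a minimal sketch; the layer sizes below (256 channels, 3×3 kernel) are hypothetical and not taken from the paper:

```python
def conv_params(c_in, c_out, k):
    # standard convolution: one k x k filter per (input channel, output channel) pair
    return c_in * c_out * k * k

def dw_separable_params(c_in, c_out, k):
    # depth-wise stage: one k x k filter per input channel (spatial filtering)
    # point-wise stage: a 1 x 1 convolution that mixes channels
    return c_in * k * k + c_in * c_out

# hypothetical detection-head layer: 256 -> 256 channels, 3 x 3 kernel
std = conv_params(256, 256, 3)          # 589,824 parameters
dws = dw_separable_params(256, 256, 3)  # 67,840 parameters
print(std, dws, round(std / dws, 1))    # roughly an 8.7x reduction
```

For this hypothetical layer, the separable form uses roughly one-ninth of the parameters, which is consistent with the paper's claim that restructuring the head with depth-wise separable convolutions compresses model complexity.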