YOLO-MS: Rethinking Multi-Scale Representation Learning for Real-Time Object Detection

Chen, Yuming; Yuan, Xinbin; Wang, Jiabao; Wu, Ruiqi; Li, Xiang; Hou, Qibin; Cheng, Ming‐Ming

doi:10.1109/tpami.2025.3538473

Article. IEEE Transactions on Pattern Analysis and Machine Intelligence. Feb 4, 2025. Closed access.

Nankai University

Indexed in: Crossref, PubMed

Abstract

We aim to provide the object detection community with an efficient and performant object detector, termed YOLO-MS. The core design is based on a series of investigations into how multi-branch features of the basic block and convolutions with different kernel sizes affect the detection performance of objects at different scales. The outcome is a new strategy that can significantly enhance multi-scale feature representations of real-time object detectors. To verify the effectiveness of our work, we train our YOLO-MS on the MS COCO dataset from scratch, without relying on any other large-scale datasets, such as ImageNet, or on pre-trained weights. Without bells and whistles, our YOLO-MS outperforms the recent…

Citation impact

Total citations: 150

FWCI: 146.48
Percentile: 100%
References: 79

Authors

7 authors

Yuming Chen (corresponding author), Nankai University
Xinbin Yuan, Nankai University
Jiabao Wang, Nankai University
Ruiqi Wu, Nankai University
Xiang Li, Nankai University

Topics & keywords

Keywords

Artificial intelligence
Object detection
Computer science
Scale (ratio)
Representation (politics)
Object (grammar)
Computer vision
Cognitive neuroscience of visual object recognition