Knowledge Integration-Guided Dual-Branch Fusion for Enhanced Open-Vocabulary Object Detection

CLC number: TP391.41  Document code: A
doi: 10.37188/OPE.20253318.2929  CSTR: 32169.14.OPE.20253318.2929

Abstract: To address the weak understanding of new class concepts, label confusion, and insufficient detection performance on new classes in open-set scenarios, this paper proposes a Knowledge Integration-guided Dual-Branch Fusion Open-Vocabulary Object Detection (KI-DBFOVD) method. First, a Knowledge Integration (KI) module is designed, in which pseudo-labels generated by a Vision-Language Model are embedded into the detector so that it learns new class concepts. Next, a Label Match (LM) module refines the label matching process through multi-level threshold adjustment and independent matching of base and new classes, thereby alleviating label confusion between base and new classes during detection. Finally, a Dual-Branch Fusion (DBF) module is constructed by fusing the traditional visual branch and the vision-language branch via geometric averaging. This fusion maintains the detection accuracy on base classes while detecting and localizing new class objects more effectively, which improves the overall detection performance of KI-DBFOVD. Experimental results show that the method achieves a detection accuracy of 38.6% for new classes on the COCO dataset and 25.4% on the more challenging LVIS dataset, which contains a larger number of categories. These results outperform several mainstream methods and indicate that the approach is better suited to different open-set scenarios.

Key words: open-vocabulary object detection; knowledge integration; label match; dual-branch fusion
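
The geometric-average fusion named in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formula, so the weighted form below, the function name `fuse_scores`, and the `weight` parameter are assumptions made for illustration only; they are not taken from the paper.

```python
def fuse_scores(visual_scores, vl_scores, weight=0.5):
    """Weighted geometric mean of per-class scores from two branches.

    visual_scores: class probabilities from the visual (detector) branch.
    vl_scores:     class probabilities from the vision-language branch.
    weight:        hypothetical exponent on the vision-language branch;
                   0.5 gives the plain (unweighted) geometric mean.
    """
    if len(visual_scores) != len(vl_scores):
        raise ValueError("both branches must score the same set of classes")
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    # fused = visual^(1 - weight) * vl^weight, computed class by class
    return [
        (v ** (1.0 - weight)) * (t ** weight)
        for v, t in zip(visual_scores, vl_scores)
    ]


if __name__ == "__main__":
    # Two classes: the visual branch is confident on the first,
    # the vision-language branch on the second.
    print(fuse_scores([0.9, 0.1], [0.4, 0.9]))
```

A geometric mean stays low unless both branches assign a reasonably high score, so a class that only one branch supports is damped rather than averaged up. A larger `weight` would shift trust toward the vision-language branch, which is one plausible way to treat new classes differently from base classes.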

1 Introduction

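Object detection is a fundamental task in computer vision and has been widely applied in fields such as environmental resource monitoring [1], traffic safety surveillance [2], and industrial product inspection [3].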