Design and Analysis of a Condition-Aware Cross-Modal Similarity Model


CLC number: TP391.4; TP18    Document code: A    Article ID: 2096-4706(2026)04-0039-05

Design and Analysis of Condition-Aware Cross-Modal Similarity Model

QIAN Xinqiao (School of Artificial Intelligence, Jiangsu Vocational College of Business, Nantong 226011, China)

Abstract: Cross-modal similarity computation is one of the core tasks in multimodal learning. In multimodal image similarity computation, existing methods generally adopt a static mapping mechanism between text and images, which makes it difficult to dynamically adjust feature similarity weights according to the semantic conditions specified by users. To address this challenge, this paper proposes a Condition-Aware Cross-Modal Similarity (CACMS) model, which can dynamically adjust image feature values according to different semantic conditions, thereby realizing condition-controlled image similarity computation. The key innovations of the model include two points. Firstly, a dynamic gated fusion module is designed that decouples conditions into shared attributes such as pose and unique attributes such as category, and further generates condition-based feature gating vectors. Secondly, an adversarial decoupling contrastive learning strategy is proposed to further optimize the dynamic reorganization of the feature space.

Keywords: cross-modal learning; conditional awareness; dynamic image similarity; feature decoupling; contrastive learning
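
To make the idea of condition-based feature gating concrete, the following is a minimal sketch in plain Python. It is not the CACMS implementation, whose details are not given in the abstract. It assumes a toy setting in which a condition embedding is mapped through a hand-set weight matrix and a sigmoid to a per-dimension gate, and the gate rescales two image feature vectors before cosine similarity is computed. All names (gate_from_condition, conditional_similarity, GATE_WEIGHTS) and all numeric values are hypothetical and chosen only for illustration.

```python
import math


def sigmoid(x):
    """Logistic function, used to squash gate logits into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gate_from_condition(cond_embedding, weights):
    """Map a condition embedding to one gate value per image feature dimension.

    `weights` is a hypothetical matrix with one row per feature dimension;
    in a trained model these values would be learned, here they are hand-set.
    """
    return [
        sigmoid(sum(w * c for w, c in zip(row, cond_embedding)))
        for row in weights
    ]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def conditional_similarity(img_a, img_b, gate):
    """Rescale both feature vectors by the gate, then compare them."""
    gated_a = [g * x for g, x in zip(gate, img_a)]
    gated_b = [g * x for g, x in zip(gate, img_b)]
    return cosine(gated_a, gated_b)


# Toy assumption: dimensions 0-1 encode pose, dimensions 2-3 encode category.
GATE_WEIGHTS = [[8.0, -8.0], [8.0, -8.0], [-8.0, 8.0], [-8.0, 8.0]]
POSE_CONDITION = [1.0, 0.0]       # hypothetical embedding of the condition "pose"
CATEGORY_CONDITION = [0.0, 1.0]   # hypothetical embedding of the condition "category"

# Two images with the same pose features but different category features.
image_a = [1.0, 0.0, 1.0, 0.0]
image_b = [1.0, 0.0, 0.0, 1.0]

sim_pose = conditional_similarity(
    image_a, image_b, gate_from_condition(POSE_CONDITION, GATE_WEIGHTS))
sim_category = conditional_similarity(
    image_a, image_b, gate_from_condition(CATEGORY_CONDITION, GATE_WEIGHTS))
```

Under these assumptions the same image pair scores close to 1 when the condition is pose and close to 0 when the condition is category, which is the behavior the abstract describes as condition-controlled similarity. A static mapping would return a single score for the pair regardless of the condition.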

0 Introduction

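Multimodal learning and cross-modal retrieval are core research directions in artificial intelligence. They aim to break through the information barriers of a single modality so that different types of data, such as text, images, and speech, can be understood in terms of one another and related to one another.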
