GTransFusion: A Fusion Method Combining Transformer-Based Multimodal Representation Learning with Graph Structure Alignment

CLC number: TP242.6 Document code: A Article ID: 2096-4706(2026)04-0049-07
GTransFusion: Fusion Method of Multimodal Representation Learning and Graph Structure Alignment Based on Transformer
ZHANG Xian, PANG Hui, LIU Jiajun (School of Information Engineering, Hebei University of Architecture, Zhangjiakou 075000, China)
Abstract: With the emergence of multi-source medical data such as high-throughput genome sequencing and high-resolution digital pathological images, multimodal biological modeling has become the key to artificial intelligence-assisted pathological diagnosis. This study proposes a new multimodal representation learning method, GTransFusion, to jointly analyze pathological Whole Slide Images and omics data, so as to improve the diagnostic accuracy of various cancers. The method maps data from different modalities into a unified sequence representation through a Transformer-based joint representation learning module, explicitly models modality type encoding in the process, and realizes dynamic modality weighting by virtue of the self-attention mechanism. Meanwhile, the method constructs a cross-modal feature alignment graph structure, utilizes a Graph Neural Network to capture inter-modal associations and common information, and feeds the result back to the Transformer representation learning to realize cross-modal feature alignment and relationship modeling. Experiments on multiple tumor datasets show that the proposed method is significantly superior to the comparison methods on survival prediction performance indicators, which verifies the effectiveness of multimodal joint representation and graph structure alignment.
Keywords: multimodal fusion; Transformer; heterogeneous graph; joint representation learning
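The abstract gives no architectural details, so the following is only a minimal sketch of the general idea it describes: tokens from two modalities are tagged with a modality type vector, concatenated into one sequence, and passed through self-attention, whose weights then indicate how much each modality is attended to. Everything specific here is an assumption made for illustration: the names `build_sequence`, `self_attention`, and `modality_mass`, the toy dimensions, the random inputs, and the simplification Q = K = V with no learned projections. It is not the authors' implementation.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def build_sequence(tokens_by_modality, type_embeddings):
    """Concatenate tokens of all modalities, adding a per-modality type vector."""
    sequence, labels = [], []
    for modality, tokens in tokens_by_modality.items():
        type_vec = type_embeddings[modality]
        for tok in tokens:
            sequence.append([t + e for t, e in zip(tok, type_vec)])
            labels.append(modality)
    return sequence, labels


def self_attention(sequence):
    """Single-head scaled dot-product attention.

    Assumption: Q = K = V = sequence (no learned projections), for brevity.
    """
    d = len(sequence[0])
    n = len(sequence)
    outputs, weights = [], []
    for q in sequence:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in sequence]
        w = softmax(scores)
        weights.append(w)
        outputs.append([sum(w[j] * sequence[j][i] for j in range(n)) for i in range(d)])
    return outputs, weights


def modality_mass(weights, labels):
    """Average attention mass that each modality receives over all queries."""
    mass = {m: 0.0 for m in set(labels)}
    for row in weights:
        for w, m in zip(row, labels):
            mass[m] += w
    return {m: v / len(weights) for m, v in mass.items()}


# Toy inputs (hypothetical): 3 image-patch tokens and 2 omics tokens, dimension 4.
rng = random.Random(0)
dim = 4
tokens = {
    "wsi": [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(3)],
    "omics": [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(2)],
}
type_emb = {m: [rng.gauss(0, 1) for _ in range(dim)] for m in tokens}

seq, labels = build_sequence(tokens, type_emb)
fused, attn = self_attention(seq)
mass = modality_mass(attn, labels)
```

In this sketch `mass` maps each modality name to the share of attention it receives, and the shares sum to 1. In a trained model the type vectors and the query, key, and value projections would be learned, so the weighting would adapt to the input rather than reflect random initialization.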
0 Introduction
Pathology is the cornerstone of modern medicine and plays an important role in cancer diagnosis and treatment planning.