Search Intent Recognition Based on Large Model Label Distillation

CLC number: TP391.3; TP18  Document code: A  Article ID: 2096-4706(2026)03-0040-05
Abstract: In search engines, accurately recognizing the intent of user queries is crucial for improving the search experience. Search intent recognition is a short text classification task. Traditional methods rely on massive manually labeled data, which implies high costs and difficulty in adapting to the rapid emergence of new intents. This paper proposes a search intent recognition method based on large model label distillation. It utilizes the powerful semantic understanding capabilities of Large Language Models (such as GPT-4o, DeepSeek-R1, and Spark X1) to generate high-quality intent labels for unlabeled query instructions and construct training datasets. Furthermore, through Knowledge Distillation technology, the knowledge of large models is transferred to lightweight pre-trained models (such as ERNIE 3.0 and BERT) for fine-tuning. Experimental results show that this method significantly improves model performance on a Chinese dataset with a scale of 136,000, and effectively enhances intent recognition efficiency while reducing labeling costs.
Keywords: intent recognition; text classification; label distillation; large model; pre-trained model
0 Introduction
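In search engines, accurately identifying the search intent behind a user query is a key step in improving information matching precision and user experience.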