A Review of Research on Chinese Text Toxicity Detection for Social Media

CLC number: TP391.1 Document code: A Article ID: 1001-3695(2026)01-002-0011-12
doi:10.19734/j.issn.1001-3695.2025.05.0173
Survey of Chinese toxic text detection in social media contexts
Sun Lianyi, Xu Jingwen† (ScholofburitfodellcialtigatceUsfn)
Abstract: With the rapid development of social media, a large volume of toxic content containing discrimination, abuse, and hate speech has emerged online, posing serious threats to the information environment and social stability. This survey investigated the current status and technical advances in Chinese toxic text detection. It provided a comprehensive review of task definitions, evaluation metrics, dataset resources, modeling approaches, typical applications, and major challenges. The study focused on rule-based methods, feature engineering, deep learning, and pretrained language models, and further discussed advanced directions such as multi-task learning, cross-lingual transfer, and large model distillation. By comparing the performance and applicability of different techniques, it summarized key obstacles in data scarcity, implicit expressions, model bias, and multimodal modeling, and outlined potential research directions. This review offered theoretical insights and technical references for the standardization and practical implementation of toxic text detection in Chinese contexts.
Key words: toxicity detection; natural language processing; deep learning; data augmentation
0 Introduction
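With the rapid development of the Internet, social media platforms such as Weibo, Twitter, and online forums generate a large volume of user-generated content every day, such as Weibo comments, Twitter comments, and forum comments posted by users.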