Hierarchical clustering algorithm based on minimum spanning tree and statistical features

CLC number: TP181 Document code: A DOI: 10.7535/hbkd.2026yx01006

Abstract: To address the limitations of the Chameleon algorithm in terms of parameter sensitivity, noise robustness, and computational efficiency, this study proposed a statistical-MST integrated hierarchical clustering algorithm (SHCA) based on the minimum spanning tree and statistical features. The minimum spanning tree was used to construct a sparse graph, eliminating manual parameter intervention, and the global optimality of the minimum spanning tree was used to avoid false cross-cluster connections. A dynamic statistical merging strategy was designed to filter noise in combination with a local distance threshold, and the subclusters were merged iteratively through an inter-cluster connectivity test to ensure intra-cluster compactness and inter-cluster separation. Experiments were conducted on 20 synthetic datasets and 1 real-world dataset. The results show that the proposed SHCA algorithm outperforms existing methods in clustering performance; in cases where performance degradation is observed on certain datasets, the analysis reveals that manifold overlap is the primary contributing factor. Overall, SHCA significantly enhances clustering accuracy and result stability, providing a reference for subsequent research on clustering of large-scale and complex manifold data.

Keywords: artificial intelligence theory; clustering; hierarchical clustering algorithm; minimum spanning tree; dynamic statistical merging strategy

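Generally speaking, clustering is the process of partitioning an unlabeled dataset into several categories such that data within the same category are highly similar and data in different categories are dissimilar. It is an unsupervised machine learning classification method [1-2].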
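To make the idea of clustering on a minimum spanning tree concrete, the following is a minimal sketch of a generic MST-based partition. It is not the SHCA algorithm from the abstract. It builds the MST with Prim's algorithm, removes edges whose length exceeds a simple global statistical threshold (mean plus `k` standard deviations of the MST edge lengths), and returns the connected components as clusters. The function names, the parameter `k`, and the thresholding rule are all assumptions chosen for illustration; the paper's dynamic statistical merging strategy and inter-cluster connectivity test are not reproduced here.

```python
import math
import statistics


def mst_edges(points):
    """Prim's algorithm on the complete Euclidean graph.

    Returns a list of (weight, i, j) tuples, one per MST edge.
    """
    n = len(points)
    if n < 2:
        return []
    in_tree = [False] * n
    best = [math.inf] * n   # cheapest known distance from the tree to each point
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the point outside the tree that is closest to it.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((best[u], parent[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
                    parent[v] = u
    return edges


def mst_threshold_clusters(points, k=1.5):
    """Illustrative rule (an assumption, not the paper's method):
    cut MST edges longer than mean + k * std, then label components."""
    n = len(points)
    edges = mst_edges(points)
    if not edges:
        return [0] * n
    weights = [w for w, _, _ in edges]
    threshold = statistics.mean(weights) + k * statistics.pstdev(weights)

    # Union-find over the edges that survive the cut.
    root = list(range(n))

    def find(x):
        while root[x] != x:
            root[x] = root[root[x]]  # path halving
            x = root[x]
        return x

    for w, i, j in edges:
        if w <= threshold:
            root[find(i)] = find(j)

    # Relabel components as 0, 1, 2, ... in order of first appearance.
    ids = {}
    return [ids.setdefault(find(i), len(ids)) for i in range(n)]
```

For example, calling `mst_threshold_clusters` on two well-separated groups of 2-D points should place each group in its own cluster, because the single long MST edge bridging the groups exceeds the threshold. A single global threshold like this is sensitive to density differences between clusters, which is one reason a method might prefer local distance thresholds as the abstract describes.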
