Database cardinality estimation method based on adaptive Gaussian mixture model

Li Hao, Liu Mengchi, Zou Ruiji, Liu Mingkai (School of Computer Science, South China Normal University, Guangzhou 510000, China)
Abstract: Cardinality estimation is a critical component of database query optimization, where its accuracy directly impacts the execution efficiency of query plans. Deep autoregressive model-based cardinality estimators have demonstrated remarkable accuracy in prior studies. However, they struggle to capture data distribution patterns when handling large-domain continuous attributes, which leads to significant performance degradation. To address these challenges, this paper proposed a novel cardinality estimator based on an adaptive Gaussian mixture model, called AGCard. It first dynamically adjusted the number and parameters of Gaussian components to adaptively fit the data distribution of continuous attributes, thereby reducing the domain scale. Subsequently, AGCard employed a bias correction algorithm to compensate for the estimation deviations introduced by the progressive sampling process while avoiding additional computational overhead. Extensive experiments on three real-world datasets (including WISDM) demonstrate that the proposed method outperforms existing mainstream baselines in terms of estimation accuracy, inference latency, and storage overhead. The results confirm the effectiveness of the adaptive Gaussian mixture model and the bias correction algorithm.
Key words: query optimization; cardinality estimation; adaptive Gaussian mixture model; autoregressive model; bias correction
CLC number: TP311.13  Document code: A  Article ID: 1001-3695(2026)04-024-1171-09  doi: 10.19734/j.issn.1001-3695.2025.07.0292
0 Introduction
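Cardinality estimation, a key step in query optimization for database management systems, estimates the number of tuples that satisfy a query before the system executes it, and is crucial to improving database system performance.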
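To make the notion of cardinality estimation concrete, the following minimal sketch compares the true cardinality of a range predicate with an estimate derived from a single Gaussian fitted to the column. It is an illustration only and is not AGCard: the column data, the function names, and the use of one Gaussian instead of an adaptive mixture are all assumptions made for this example. The q-error shown is a metric commonly used for cardinality estimators and is likewise an assumption here, not something the text above specifies.

```python
from statistics import NormalDist


def true_cardinality(column, lo, hi):
    """Count the tuples whose value lies in [lo, hi] by scanning the column."""
    return sum(1 for v in column if lo <= v <= hi)


def estimated_cardinality(column, lo, hi):
    """Estimate the same count from one Gaussian fitted to the column.

    Assumption: a single Gaussian stands in for the learned distribution;
    the paper's method uses an adaptive mixture, which is not reproduced here.
    """
    dist = NormalDist.from_samples(column)
    selectivity = dist.cdf(hi) - dist.cdf(lo)  # estimated fraction of rows in range
    return selectivity * len(column)


def q_error(estimate, truth):
    """Ratio between estimate and truth, always >= 1 (both clamped to >= 1)."""
    estimate, truth = max(estimate, 1.0), max(truth, 1.0)
    return max(estimate / truth, truth / estimate)


# Hypothetical column holding the values 1.0 .. 100.0
column = [float(v) for v in range(1, 101)]
truth = true_cardinality(column, 20.0, 40.0)
estimate = estimated_cardinality(column, 20.0, 40.0)
error = q_error(estimate, truth)
```

A query optimizer would rely on a value like `estimate` when choosing a plan, because `truth` is only known after the query has run; a q-error close to 1 indicates that the fitted distribution describes the column well for that predicate.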