An Enhanced Stream Clustering Algorithm Based on Affinity Propagation
Published: 2018-05-13 09:35
Topic: stream clustering + affinity propagation; Source: Journal of Xi'an Jiaotong University, 2017, Issue 03
[Abstract]: Existing stream clustering algorithms cannot effectively detect and handle outliers in data streams, and incremental stream clustering is inefficient. To address these problems, an enhanced affinity propagation stream clustering algorithm with density-based anomaly detection and deletion is proposed. Building on the affinity propagation stream clustering algorithm, the proposed algorithm introduces an anomaly detection and deletion mechanism to reduce the impact of outliers on clustering accuracy and clustering efficiency. Affinity propagation clustering is used to cluster the online data stream while detecting data drift, that is, changes in the distribution of the data stream over time. The density-based local outlier factor (LOF) technique is applied to detect and delete anomalies in the reservoir data, and the dynamic data stream model is rebuilt by re-clustering the current clusters together with the processed reservoir data. Experiments on real network data (KDD'99) show that the proposed algorithm not only reduces the number of re-clustering operations needed to rebuild the dynamic model, which improves clustering efficiency, but also outperforms the traditional affinity propagation stream clustering algorithm on all three evaluation criteria considered: clustering accuracy, purity, and entropy.
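The reservoir-cleaning step described in the abstract can be illustrated with a minimal sketch. It covers only the LOF filtering of reservoir data, not the affinity propagation clustering or the drift detection. The function names, the neighbourhood size `k=3`, and the cutoff `threshold=1.5` are assumptions chosen for illustration and are not taken from the paper.

```python
import math


def knn(points, i, k):
    """Return the k nearest neighbours of points[i] as (distance, index) pairs."""
    dists = sorted(
        (math.dist(points[i], points[j]), j)
        for j in range(len(points)) if j != i
    )
    return dists[:k]


def lof_scores(points, k=3):
    """Brute-force local outlier factor for every point (O(n^2) distances)."""
    n = len(points)
    neigh = [knn(points, i, k) for i in range(n)]
    k_dist = [nb[-1][0] for nb in neigh]  # distance to the k-th neighbour

    # Local reachability density: inverse of the mean reachability distance.
    lrd = []
    for i in range(n):
        reach = [max(k_dist[j], d) for d, j in neigh[i]]
        mean_reach = sum(reach) / len(reach)
        lrd.append(1.0 / max(mean_reach, 1e-12))  # guard against duplicates

    # LOF: mean neighbour density divided by the point's own density.
    return [
        sum(lrd[j] for _, j in neigh[i]) / (len(neigh[i]) * lrd[i])
        for i in range(n)
    ]


def filter_reservoir(reservoir, k=3, threshold=1.5):
    """Drop reservoir points whose LOF exceeds the (assumed) threshold."""
    scores = lof_scores(reservoir, k)
    return [p for p, s in zip(reservoir, scores) if s <= threshold]


# Hypothetical reservoir: six points in a tight group plus one distant point.
reservoir = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (0, 0.5), (10, 10)]
scores = lof_scores(reservoir)
cleaned = filter_reservoir(reservoir)
```

In the scheme the abstract outlines, the cleaned reservoir would then be re-clustered together with the current clusters to rebuild the model. Points inside the dense group score close to 1, while the isolated point scores far higher and is removed.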
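[Affiliations]: School of Software Engineering, Xi'an Jiaotong University; School of Electronic and Information Engineering, Xi'an Jiaotong University
[Funding]: National Natural Science Foundation of China (61371087, 61531013); National "863 Program" of China (2015AA015702)
[Classification number]: TP311.13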
Article ID: 1882612
Article link: http://www.sikaile.net/kejilunwen/ruanjiangongchenglunwen/1882612.html