基于聚類和SOM的復雜網絡中社團挖掘算法的研究
本文關鍵詞: 最短路徑 中介系數 相似度 凝聚系數 特征屬性 自組織競爭神經網絡 出處:《江西理工大學》2017年碩士論文 論文類型:學位論文
【摘要】:隨著計算機技術的迅猛發(fā)展,收集并處理規(guī)模龐大且種類繁多的實際網絡數據成為滿足物質與文化需求的必要途徑,網絡科學也隨之扮演著愈來愈重要的角色。與人們生活緊密相關的網絡,如社會網,生物網,信息網,交通運輸網等,這些網絡之間相互交錯關聯。揭示網絡中共性的問題以及解決這些問題的普適方法便成為了網絡研究的一個重點,而這些網絡可以歸納于復雜網絡的范疇。挖掘出其中隱藏的社團結構,對病毒傳播的預防、輿情的控制、以及未知生物功能的預測均起到至關重要的作用。本文針對復雜網絡中社團結構的挖掘所做工作如下:(1)綜述了復雜網絡的研究現狀、相關定義、性質及模型,分析了社團結構的層次劃分,敘述了研究社團結構的意義,總結了典型的社團結構劃分算法的優(yōu)缺點,論述了利用聚類的算法思想以及自組織競爭神經網絡(簡稱SOM)的相關知識對社團結構進行挖掘。(2)提出了基于最短路徑特征的社團挖掘算法(Community Discovery Algorithm Based on Shortest Path Feature,SPCDA)。基于最短路徑的特征,由其數目的特征計算每個節(jié)點的中介系數從而獲取社團中心,據其長度的特征計算節(jié)點之間的相似度值。約定一種閾值作為劃分規(guī)則,該閾值最終由所有節(jié)點的平均相似度值確定。如此以來構成類似于聚類的模型,最后按照劃分規(guī)則將每個節(jié)點(不包括社團中心的節(jié)點)分別與閾值進行比較,取超過閾值的節(jié)點劃分聚類,據此過程不斷迭代,直至劃分完成。將該算法應用于經典的復雜網絡實驗仿真平臺,并與典型的GN算法和LPA算法進行比較分析,結果證實SPCDA算法能夠快速、準確的挖掘隱藏的社團結構。(3)提出了基于自組織競爭神經網絡的多特征社團挖掘算法(Multi-Feature Community Discovery Algorithm Based on Self-Organizing Competitive Neural Network,SOMCD A)。考慮網絡的拓撲結構兼顧節(jié)點特征屬性,將聚類思想與SOM相結合。提出的算法基于節(jié)點的影響力,結合節(jié)點的度及其相鄰節(jié)點之間的連接邊數來計算每個節(jié)點的凝聚系數,從凝聚系數值較大的節(jié)點中提取出特征節(jié)點,并將這些有代表性的特征節(jié)點作為樣本節(jié)點。然后針對樣本節(jié)點的多特征屬性信息用SOM對其進行訓練,再將非樣本節(jié)點提供給經過訓練的SOM。依據SOM的結構存儲模式的特征,競爭網絡就會做出識別,從而實現社團劃分的目的。最后根據每次仿真所取的競爭層神經元個數的不同,采用模塊度函數來確定最佳社團結構。
[Abstract]:With the rapid development of computer technology, the collection and processing of a large and diverse range of actual network data has become a necessary way to meet the material and cultural needs. Network science also plays a more and more important role. Networks closely related to people's lives, such as social networks, biological networks, information networks, transportation networks and so on. These networks are interlaced with each other. Revealing the common problems in the network and the universal methods to solve these problems have become a focus of the network research. These networks can be summed up in the category of complex networks, mining out the hidden community structure, the prevention of virus transmission, the control of public opinion. The prediction of unknown biological functions plays an important role. In this paper, the research status and definitions of complex networks are summarized as follows: 1) for the mining of community structures in complex networks. Properties and models, analysis of the hierarchy of community structure, the significance of the study of community structure, summed up the advantages and disadvantages of typical community structure division algorithm. This paper discusses how to mine the community structure by using the idea of clustering and the knowledge of self-organizing competitive neural network (SOM). The algorithm based on the shortest path feature is proposed. Community Discovery Algorithm Based on Shortest Path Feature. Based on the features of the shortest path, the mediation coefficient of each node is calculated from the number of features to obtain the community center. The similarity value between nodes is calculated according to its length feature. A threshold value is agreed as a partition rule, which is determined by the average similarity value of all nodes. Thus, a clustering model is constructed. Finally, each node (not including the node in the community center) is compared with the threshold value according to the partition rule, and the node that exceeds the threshold value is taken to divide and cluster, according to which the process is iterated. The algorithm is applied to the classical simulation platform of complex network experiments and compared with the typical GN algorithm and LPA algorithm. The results show that the SPCDA algorithm can be fast. In this paper, we propose a multi-feature association mining algorithm based on self-organizing competitive neural network. Multi-Feature Community Discovery Algorithm Based on Self-Organizing. Competitive Neural Network. Considering the topological structure of the network and the characteristic attributes of the nodes, the clustering idea is combined with the SOM. The proposed algorithm is based on the influence of the nodes. The coacervation coefficient of each node is calculated by combining the degree of nodes and the number of connecting edges between adjacent nodes, and the feature nodes are extracted from the nodes with larger coacervation coefficient. These representative feature nodes are taken as sample nodes, and then the multi-feature attribute information of the sample nodes is trained with SOM. Then the non-sample nodes are provided to the trained SOM. According to the characteristics of the structural storage mode of SOM, the competitive network will be identified. Finally, according to the different number of competition layer neurons in each simulation, the module degree function is used to determine the optimal community structure.
【學位授予單位】:江西理工大學
【學位級別】:碩士
【學位授予年份】:2017
【分類號】:O157.5;TP301.6
【參考文獻】
相關期刊論文 前10條
1 劉文奇;;復雜網絡上的公共數據演化博弈與數據質量控制[J];中國科學:信息科學;2016年11期
2 賀臘容;黃創(chuàng)霞;文鳳華;楊曉光;;基于復雜網絡的滬深300股票重要節(jié)點的評估和分析[J];經濟數學;2016年03期
3 杜斐;黃宏偉;張東明;張帆;;上海軌道交通網絡的復雜網絡特性及魯棒性研究[J];武漢大學學報(工學版);2016年05期
4 李凡;郭瑞軍;;城市主干道路網的復雜網絡特性[J];山東理工大學學報(自然科學版);2016年06期
5 王韶;劉沛錚;董光德;張煜成;;基于復雜網絡理論計及校正控制的電力系統(tǒng)連鎖故障模型[J];電力自動化設備;2016年09期
6 喬煌煌;馬丁;沈沉;王雅婷;郭小江;張一馳;;基于復雜網絡理論的直流受端系統(tǒng)暫態(tài)電壓控制選點方法[J];電網技術;2016年10期
7 劉云;陳昌凱;;隨機網絡中鄰居節(jié)點發(fā)現算法優(yōu)化研究[J];計算機工程;2016年08期
8 安子強;;復雜網絡隱蔽信道資源快速調度方法[J];計算機仿真;2016年08期
9 王新贈;李剛;常正波;;社會信息網絡結構和規(guī)模演變研究[J];數學建模及其應用;2016年02期
10 李曉軍;劉懷亮;杜坤;;一種基于復雜網絡模型的作者身份識別方法[J];圖書情報工作;2015年18期
相關會議論文 前2條
1 劉s,
本文編號:1493237
本文鏈接:http://www.sikaile.net/kejilunwen/ruanjiangongchenglunwen/1493237.html