Research on Community Detection and Node Evaluation Algorithms in Social Networks

Published: 2018-03-06 09:53

Topic: social networks　Focus: community detection　Source: Jilin University, 2014 master's thesis　Type: degree thesis

[Abstract]: With the growth and spread of the Internet, online social networks have reached into every corner of daily life and brought people closer together. Studying social networks helps us understand their structural characteristics and the laws governing their evolution, so that they can better serve people. Two problems are especially active in network research: community detection and the evaluation of node importance. Both matter greatly for understanding the structure and properties of complex social networks. Community detection divides a large network into finer-grained communities and reveals groups whose members are closely connected, while node evaluation assesses the importance of nodes from different perspectives and identifies the important ones.

Most current community detection algorithms are based on graph partitioning or hierarchical clustering. Although they identify communities effectively in most cases, they all require the number or the size of the communities to be specified in advance, which is clearly unreasonable. A genetic algorithm, as a method of searching for an optimal solution, can determine the number of communities automatically without prior information and can detect communities efficiently and accurately. However, the traditional population initialization method considers only adjacency information and does not take full account of the network topology, so the initial population is of poor quality, which slows the convergence of the algorithm.

Many node evaluation algorithms also exist: some rely on the local features of a node, others on the topology of the whole network. PageRank, the algorithm used by search engines to evaluate the importance of web pages, is also widely applied to node evaluation in social networks. Traditional PageRank, however, distributes weight uniformly. This is unreasonable in a social network, because a social network reflects relationships between users, and these relationships differ in closeness and should not be treated equally.

Based on this analysis, this thesis addresses the shortcomings of the genetic algorithm for community detection and of the PageRank algorithm for node evaluation. The main work is as follows. First, the population initialization method of the genetic algorithm for community detection is improved. Drawing on the characteristics of social networks, the thesis defines the features of information propagation in a network; then, based on the characteristics of social networks and of information propagation, it proposes the k-path initialization method, which makes full use of the network topology, and gives the computation procedure of the genetic algorithm based on k-path initialization. Second, based on the degree of closeness between nodes, the thesis introduces the concept of recognition degree between nodes. To address the unreasonable uniform weight distribution of PageRank in node evaluation, it proposes distributing weight according to the recognition degree between nodes, and presents the resulting improved node evaluation algorithm, ARank. Finally, the improved genetic algorithm and the improved PageRank algorithm are validated on data sets. The experimental results show that the improved genetic algorithm converges faster than the traditional algorithm, and that the improved PageRank algorithm evaluates nodes more reasonably than the traditional approach.
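The contrast between uniform and relationship-aware weight distribution can be illustrated with a small sketch. The abstract does not define the recognition degree or the ARank update rule, so the code below is not the thesis's algorithm: it is a generic weighted PageRank in which each node splits its rank among its out-neighbours in proportion to an edge weight. The function name `weighted_pagerank`, the dictionary-based graph format, and the example weights are all assumptions made for illustration.

```python
from typing import Dict, Hashable

# Hypothetical graph format: graph[v][u] = weight of the edge v -> u.
# Every node must appear as a key, even if it has no outgoing edges.
Graph = Dict[Hashable, Dict[Hashable, float]]


def weighted_pagerank(graph: Graph, damping: float = 0.85,
                      max_iter: int = 200, tol: float = 1e-12) -> Dict[Hashable, float]:
    """Power iteration in which a node's rank is split by edge weight.

    With all weights equal this reduces to ordinary PageRank
    (uniform distribution over out-links).
    """
    nodes = list(graph)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    out_total = {v: sum(graph[v].values()) for v in nodes}

    for _ in range(max_iter):
        # Rank held by nodes without outgoing weight is spread evenly.
        dangling = sum(rank[v] for v in nodes if out_total[v] == 0)
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for v in nodes:
            if out_total[v] == 0:
                continue
            for u, w in graph[v].items():
                # Share of v's rank passed to u is proportional to w.
                new_rank[u] += damping * rank[v] * w / out_total[v]
        delta = sum(abs(new_rank[v] - rank[v]) for v in nodes)
        rank = new_rank
        if delta < tol:
            break
    return rank


# Same topology, two weightings (the weights are made-up illustration values).
uniform_graph: Graph = {"a": {"b": 1.0, "c": 1.0}, "b": {"a": 1.0}, "c": {"a": 1.0}}
closer_to_b: Graph = {"a": {"b": 3.0, "c": 1.0}, "b": {"a": 1.0}, "c": {"a": 1.0}}

uniform = weighted_pagerank(uniform_graph)
weighted = weighted_pagerank(closer_to_b)
print("uniform :", uniform)
print("weighted:", weighted)
```

With equal weights, `b` and `c` receive the same score; when `a` is assumed to be closer to `b`, `b` ranks above `c` even though the link structure is unchanged. Any measure of closeness between users could be substituted for the edge weights in this sketch.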
[Degree-granting institution]: Jilin University
[Degree level]: Master's
[Year conferred]: 2014
[Classification number]: TP301.6

Article ID: 1574351

Article link: http://www.sikaile.net/kejilunwen/sousuoyinqinglunwen/1574351.html
