Research on Algorithms for Discovering Special Persons in Email Communities
[Abstract]: With the arrival of the information age, email has become a universal means of transmitting information. An email network is formed by people's communication behavior and contains rich social information about its users, so social network analysis (SNA) has great potential for mining the social relations in email networks. The main work of this paper is to mine special persons in the email network. Two kinds of special persons are studied: spam senders and key leaders. The spam sender discovery algorithm is based mainly on a spam community mining algorithm. The email network is modeled as a directed weighted topology, which better reflects how information actually flows in the network. Based on the characteristics of spam senders, a strip-first, then-merge approach is adopted: using a mean density function, Dijkstra's algorithm, and other evaluation functions, the spam senders are identified. Key leaders in the email network are found through link analysis. On a directed graph, the PageRank algorithm is first used to compute the importance of each node from its sending and receiving relationships; the nodes are ranked by importance and the set is expanded. The initial seed set is filtered by computing similarity, and the detection and removal of one-way malicious link nodes is improved by adding the bidirectional connection degree of a node as the criterion for eliminating such nodes. The filtered node set is then used as the input to the EHITS algorithm. With the PageRank value of a node taken as its importance, EHITS computes the authority value and hub value of each node. The nodes with high authority values are the key leaders being sought.
Finally, the algorithm is compared with degree centrality, betweenness centrality, and the PageRank algorithm, and a confusion degree is defined as the evaluation index for assessing its validity and superiority.
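The link-analysis idea in the abstract can be illustrated with a minimal sketch. It is not the thesis's algorithm: it runs plain PageRank and plain HITS (not EHITS) on a tiny directed mail graph, and it omits the seed-set expansion, the similarity filtering, and the bidirectional connection degree step. The sample senders and recipients (`a`, `b`, `c`, `boss`) and the function names `pagerank` and `hits` are assumptions made for illustration only.

```python
from math import sqrt

# Hypothetical mail graph: each pair is (sender, recipient).
edges = {("a", "boss"), ("b", "boss"), ("c", "boss"), ("boss", "a")}
nodes = sorted({n for edge in edges for n in edge})


def pagerank(edges, nodes, damping=0.85, iters=100):
    """Plain power-iteration PageRank on a directed graph."""
    out = {n: [r for s, r in edges if s == n] for n in nodes}
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iters):
        # Rank held by nodes without outgoing mail is spread uniformly.
        dangling = sum(rank[n] for n in nodes if not out[n])
        base = (1 - damping + damping * dangling) / len(nodes)
        new = {n: base for n in nodes}
        for s in nodes:
            for r in out[s]:
                new[r] += damping * rank[s] / len(out[s])
        rank = new
    return rank


def hits(edges, nodes, iters=100):
    """Plain HITS: returns (authority, hub) scores, L2-normalised."""
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(iters):
        # Authority of n: sum of hub scores of nodes that mail n.
        auth = {n: sum(hub[s] for s, r in edges if r == n) for n in nodes}
        norm = sqrt(sum(v * v for v in auth.values())) or 1.0
        auth = {n: v / norm for n, v in auth.items()}
        # Hub of n: sum of authority scores of nodes that n mails.
        hub = {n: sum(auth[r] for s, r in edges if s == n) for n in nodes}
        norm = sqrt(sum(v * v for v in hub.values())) or 1.0
        hub = {n: v / norm for n, v in hub.items()}
    return auth, hub


rank = pagerank(edges, nodes)
auth, hub = hits(edges, nodes)
top_by_rank = max(rank, key=rank.get)
top_by_authority = max(auth, key=auth.get)
```

In this toy graph the node that receives mail from every other node ends up with both the highest PageRank and the highest authority score, which is the intuition behind reading a high authority value as a sign of a key leader.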
[Degree-granting institution]: Jilin University
[Degree level]: Master's
[Year degree awarded]: 2014
[Classification number]: TP393.08; TP393.098