
Node Selection Strategies for a Hot-Event Detection Subnetwork in Sina Weibo

Published: 2018-10-26 08:18
[Abstract]: Microblogging has become a popular means of exchanging and spreading information, and a large number of social events propagate through Weibo, so detecting hot events on Weibo is becoming increasingly important. Hot-event detection on Weibo nevertheless faces considerable challenges. The user base is huge and relatively active, so a very large number of posts can be produced within a short time, and processing them requires substantial computing power.

This thesis implements a hot-event detection system on Sina Weibo, the largest microblogging service provider in China. Because processing in real time every post produced in a given period is not economically feasible, the thesis adopts the strategy of monitoring the posts of only a small fraction of Sina Weibo users, so that hot events can be detected with limited resources. The main objective is to provide node selection algorithms for systems that detect hot events by monitoring the nodes of a subnetwork. The thesis first introduces the concept of hot-event coverage and proposes a subnetwork node selection algorithm aimed at covering all sample hot events. After studying this algorithm and its shortcomings, the thesis introduces the concept of a node's hot-event participation probability and, based on it, proposes a probabilistic algorithm for selecting subnetwork nodes. Finally, considering that the cost of monitoring posts differs from node to node, the thesis introduces the concept of node cost and combines it with the participation probability to propose an optimization algorithm. A total of 525 hot events were collected, of which 294 serve as the training set and 231 as the test set, and the three proposed node selection algorithms were each applied to this data set. The results show that, compared with the other algorithms, the optimization algorithm detects more hot events at a lower system cost, with a hot-event detection rate of 70%.
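The abstract names three ideas (coverage of sample events, per-node participation probability, and per-node monitoring cost) without giving the algorithms themselves. The Python sketch below is only an illustration of how such ideas are commonly turned into selection rules, not the thesis's method. It assumes a greedy set cover for the coverage step, a participation probability defined as the fraction of training events a user took part in, and a budgeted ranking by probability per unit cost. All function names, the `budget` parameter, and the toy data are hypothetical.

```python
from typing import Dict, List, Set


def greedy_cover(participation: Dict[str, Set[str]]) -> List[str]:
    """Greedy set cover: pick users until every sample event is covered."""
    uncovered = set().union(*participation.values())
    chosen: List[str] = []
    while uncovered:
        # Take the user covering the most still-uncovered events
        # (ties go to the alphabetically first user id).
        best = max(sorted(participation),
                   key=lambda u: len(participation[u] & uncovered))
        chosen.append(best)
        uncovered -= participation[best]
    return chosen


def participation_probability(participation: Dict[str, Set[str]],
                              n_events: int) -> Dict[str, float]:
    """Assumed definition: share of training events each user joined."""
    return {u: len(events) / n_events for u, events in participation.items()}


def select_by_ratio(prob: Dict[str, float], cost: Dict[str, float],
                    budget: float) -> List[str]:
    """Rank users by probability per unit cost; add each one that still fits."""
    chosen: List[str] = []
    spent = 0.0
    for user in sorted(prob, key=lambda u: (-prob[u] / cost[u], u)):
        if spent + cost[user] <= budget:
            chosen.append(user)
            spent += cost[user]
    return chosen


def detection_rate(chosen: List[str], test_participation: Dict[str, Set[str]],
                   n_test_events: int) -> float:
    """Fraction of test events in which at least one monitored user took part."""
    detected: Set[str] = set()
    for user in chosen:
        detected |= test_participation.get(user, set())
    return len(detected) / n_test_events


# Toy data (hypothetical): which users took part in which training events.
train = {"u1": {"e1", "e2", "e3"}, "u2": {"e3", "e4"}, "u3": {"e4"}, "u4": {"e1"}}
cost = {"u1": 3.0, "u2": 1.0, "u3": 1.0, "u4": 0.5}  # assumed monitoring costs
test = {"u1": {"t1"}, "u2": {"t1", "t2"}, "u4": {"t3"}}  # 4 test events in total

cover_nodes = greedy_cover(train)
budget_nodes = select_by_ratio(participation_probability(train, 4), cost, budget=2.5)
print(cover_nodes, budget_nodes, detection_rate(budget_nodes, test, 4))
```

In this toy setting the coverage-only rule picks the expensive user `u1`, while the cost-aware rule spends the same kind of budget on several cheaper users. That is the trade-off the abstract attributes to its optimization algorithm, although the real formulation may differ.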
[Degree-granting institution]: Shanghai Jiao Tong University
[Degree level]: Master's
[Year degree awarded]: 2014
[Classification number]: TP393.092




Article ID: 2295127


Article link: http://www.sikaile.net/guanlilunwen/ydhl/2295127.html

