Research on Key Technologies of an Internet Public Opinion Information Collection and Analysis System

Published: 2018-04-14 20:04

  Topics: public opinion + web crawler. Source: Tianjin University, master's thesis, 2012


[Abstract]: As the Internet environment grows more complex, online public opinion has come to have a major impact on social stability and on the many people who go online. Online public opinion arises across a wide range and spreads quickly, and its flashpoints are hard to detect and control, which makes collecting and analyzing public opinion information on the Internet very important.

This thesis analyzes in depth the requirements of a system for collecting public opinion information from the Internet. It then combines network topology, keyword-based web page content filtering, and breadth-first search to design and implement a vertical search engine crawler oriented toward public opinion collection. Word segmentation and topic-word extraction are used to identify hot public opinion topics. The system detects and gives early warning of sudden public opinion events and of sensitive topics involving content security, and it automatically identifies sudden public opinion events in the local region. The thesis also designs and implements an algorithm for the semi-automatic generation of public opinion reports: retrieved results are ranked by indicators including keyword frequency and weight, page category, page content warnings, and page popularity, and a public opinion briefing is then generated semi-automatically.

The system effectively collects public opinion information from news sites, forums, blogs, and post bars (tieba). It supports statistical analysis and topic analysis of the collected results, and it outputs public opinion reports semi-automatically.
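The abstract does not give the crawler's implementation, so the following is only a minimal sketch of how breadth-first search and keyword-based content filtering might fit together. Everything in it is an assumption for illustration: the in-memory `PAGES` graph stands in for real HTTP fetching, and the `KEYWORDS` weights, the `keyword_score` function, and the `crawl` function are hypothetical names, not the thesis's design. Whitespace splitting is used in place of the word segmentation step the abstract mentions.

```python
from collections import deque

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
# A real crawler would fetch and parse pages over HTTP instead.
PAGES = {
    "site/home": ("daily news portal", ["site/a", "site/b"]),
    "site/a": ("flood warning issued flood damage reported", ["site/c"]),
    "site/b": ("sports results and scores", ["site/d"]),
    "site/c": ("protest over flood relief", []),
    "site/d": ("flood relief concert", []),
}

# Assumed keyword weights; the abstract lists neither keywords nor weights.
KEYWORDS = {"flood": 2.0, "protest": 3.0}


def keyword_score(text, keywords):
    """Return the sum of weight * occurrence count over all keywords."""
    words = text.lower().split()
    return sum(weight * words.count(word) for word, weight in keywords.items())


def crawl(seed, pages, keywords, max_pages=100):
    """Visit pages breadth-first from seed and keep those matching a keyword.

    Links are followed from every visited page, matching or not, so that
    relevant pages behind an irrelevant hub page can still be reached.
    """
    queue = deque([seed])
    seen = {seed}          # URLs already queued, to avoid revisiting
    hits = []              # (url, score) for pages that pass the filter
    visited = 0
    while queue and visited < max_pages:
        url = queue.popleft()      # FIFO order gives breadth-first traversal
        visited += 1
        text, links = pages.get(url, ("", []))
        score = keyword_score(text, keywords)
        if score > 0:              # keyword-based content filter
            hits.append((url, score))
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return hits


if __name__ == "__main__":
    results = crawl("site/home", PAGES, KEYWORDS)
    # Rank the filtered pages by score, highest first.
    for url, score in sorted(results, key=lambda hit: hit[1], reverse=True):
        print(url, score)
```

Sorting the filtered pages by score is a stand-in for only one of the ranking indicators the abstract names (keyword frequency and weight). Page category, content warnings, and page popularity would need additional data that this sketch does not model.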
[Degree-granting institution]: Tianjin University
[Degree level]: Master's
[Year degree granted]: 2012
[Classification number]: TP393.09





