Research on Content-Based Public Opinion Prediction for Sina Weibo

Published: 2018-08-18 09:33
[Abstract]: With the rapid development of the Internet, online platforms have become an important channel through which people obtain information and express opinions. Sina Weibo, with its short-form posts and simple means of expression, has attracted a large user base. It now has more than 200 million monthly active users and tens of millions of daily active users, who publish a large volume of posts around the clock and actively repost, comment, and like. While Weibo makes information dissemination and the discussion of trending topics more convenient, it also creates conditions for false information to spread; negative and false information not only disrupts the online environment but also harms society. Because the volume of data on the platform is enormous, relying on manual operation and management alone yields limited information and consumes substantial manpower and resources. A public opinion monitoring system can detect trending events promptly and turn the whole monitoring process into an automated, platform-based workflow that runs efficiently.

This thesis applies text mining techniques to classify and cluster massive numbers of Weibo posts. In the text vectorization stage, a distributed chi-square feature extraction method reduces dimensionality, and TF-IDF values serve as term weights. A support vector machine is used for classification and k-means for clustering. Events are formed on the basis of the classification and clustering results, and the heat of each event is computed from the total numbers of reposts, comments, and likes of its posts. The result is monitoring data for trending events, and the system also supports analysis and display of historical event data. Building on earlier public opinion research, the thesis implements a content-based public opinion monitoring system and assigns posts to categories before clustering them into events, which broadens the coverage of monitored events and enriches their content.
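The vectorization and heat-scoring steps named in the abstract can be illustrated with a minimal single-machine sketch. The abstract gives no formulas, so everything here is an assumption made for illustration: the function names (`chi_square`, `select_features`, `tfidf_vector`, `event_heat`), the toy posts, the 2x2 contingency form of the chi-square statistic, the `tf * log(N / df)` weighting, and the equal default weights for reposts, comments, and likes. The thesis describes a distributed chi-square step, which this sketch does not attempt, and the SVM and k-means stages are omitted.

```python
import math
from collections import Counter


def chi_square(docs, labels, term, target):
    """Chi-square statistic of `term` against class `target` (2x2 table)."""
    a = b = c = d = 0
    for tokens, label in zip(docs, labels):
        has_term = term in tokens
        in_class = label == target
        if has_term and in_class:
            a += 1          # term present, document in class
        elif has_term:
            b += 1          # term present, document outside class
        elif in_class:
            c += 1          # term absent, document in class
        else:
            d += 1          # term absent, document outside class
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom


def select_features(docs, labels, k):
    """Keep the k terms with the highest chi-square score over all classes."""
    vocab = sorted({t for tokens in docs for t in tokens})
    classes = sorted(set(labels))
    scores = {
        t: max(chi_square(docs, labels, t, c) for c in classes) for t in vocab
    }
    # Ties are broken alphabetically so the result is deterministic.
    return sorted(vocab, key=lambda t: (-scores[t], t))[:k]


def tfidf_vector(tokens, docs, features):
    """TF-IDF weights of one tokenized post, restricted to selected features."""
    counts = Counter(tokens)
    n_docs = len(docs)
    vec = []
    for term in features:
        df = sum(1 for doc in docs if term in doc)
        idf = math.log(n_docs / df) if df else 0.0
        tf = counts[term] / len(tokens) if tokens else 0.0
        vec.append(tf * idf)
    return vec


def event_heat(posts, w_repost=1.0, w_comment=1.0, w_like=1.0):
    """Weighted sum of interaction counts over all posts in one event.

    The weights are hypothetical; the abstract only says heat is computed
    from repost, comment, and like totals.
    """
    return sum(
        w_repost * p["reposts"] + w_comment * p["comments"] + w_like * p["likes"]
        for p in posts
    )


# Toy, already-tokenized posts with hand-assigned categories (illustrative only).
docs = [
    ["earthquake", "rescue", "team"],
    ["earthquake", "relief", "fund"],
    ["film", "premiere", "star"],
    ["film", "star", "award"],
]
labels = ["disaster", "disaster", "entertainment", "entertainment"]

features = select_features(docs, labels, k=3)
vector = tfidf_vector(docs[0], docs, features)
heat = event_heat([
    {"reposts": 10, "comments": 5, "likes": 20},
    {"reposts": 3, "comments": 1, "likes": 6},
])
```

In a fuller pipeline, vectors like `vector` would feed a classifier and then a per-category clustering step, and `event_heat` would be applied to the posts grouped into each resulting cluster.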
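[Degree-granting institution]: Capital University of Economics and Business
[Degree level]: Master's
[Year degree conferred]: 2017
[Classification number]: G206; C912.63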
