Research and Implementation of Weibo User Sentiment Classification Based on Hadoop
[Abstract]: With the growth and popularity of new social networking services such as Weibo, people can express their views and feelings more flexibly, freely, and quickly. Sentiment classification of Weibo posts is therefore increasingly important for understanding how users react to policies, products, and trending public-opinion topics, and it provides valuable decision support for users themselves, enterprises, and the government. When sentiment classification is carried out on Weibo's massive data sets, the limited scalability of traditional sentiment classification algorithms becomes the system bottleneck. This thesis therefore first studies Hadoop, a core technology of cloud computing platforms, and analyzes the feasibility of implementing sentiment classification on it. On this basis, and in light of the emotional characteristics of Weibo text, the thesis builds a sentiment corpus through a combination of automatic and manual construction, improves the feature extraction algorithm based on Weibo sentiment elements and semantics, and designs a distributed feature extraction algorithm on Hadoop together with a scalable, autonomous Weibo sentiment classification model. For the classification step in this model, a naive Bayes sentiment classification algorithm based on Hadoop is designed and implemented. Test results show that classifying massive Weibo data with the Hadoop-based naive Bayes sentiment classification model achieves good efficiency and high scalability.
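To make the idea of a Hadoop-based naive Bayes classifier concrete, the following is a minimal sketch of multinomial naive Bayes whose training step is written in a map/reduce style. It is not the thesis's algorithm, which the abstract does not spell out: the function names (`map_phase`, `reduce_phase`, `train`, `classify`), the key layout, the add-one (Laplace) smoothing, and the toy English tokens are all assumptions made for illustration, and the map and reduce steps run in a single Python process rather than on a Hadoop cluster.

```python
import math
from collections import defaultdict
from itertools import chain


def map_phase(doc):
    """Mapper (hypothetical): emit one count per document and one per token."""
    label, tokens = doc
    yield ("doc", label, None), 1
    for tok in tokens:
        yield ("tok", label, tok), 1


def reduce_phase(pairs):
    """Reducer (hypothetical): sum the values that share a key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def train(docs):
    """Build per-label document counts, per-label token counts, and the vocabulary."""
    counts = reduce_phase(chain.from_iterable(map_phase(d) for d in docs))
    doc_counts = defaultdict(int)
    token_counts = defaultdict(lambda: defaultdict(int))
    vocab = set()
    for (kind, label, tok), n in counts.items():
        if kind == "doc":
            doc_counts[label] = n
        else:
            token_counts[label][tok] = n
            vocab.add(tok)
    return doc_counts, token_counts, vocab


def classify(model, tokens):
    """Return the label with the highest log-posterior under add-one smoothing."""
    doc_counts, token_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in doc_counts.items():
        label_total = sum(token_counts[label].values())
        score = math.log(n_docs / total_docs)  # log prior
        for tok in tokens:
            seen = token_counts[label].get(tok, 0)
            score += math.log((seen + 1) / (label_total + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy, already-tokenized training data (made up for this sketch).
TRAIN = [
    ("pos", ["good", "happy", "love"]),
    ("pos", ["great", "happy"]),
    ("neg", ["bad", "sad", "hate"]),
    ("neg", ["terrible", "sad"]),
]
model = train(TRAIN)
print(classify(model, ["happy", "love"]))
```

The counting step is the part that distributes naturally: each mapper only needs its own shard of posts, and the reducer only sums integers per key, so the same structure could in principle be moved onto a real MapReduce job. Chinese word segmentation and the sentiment-specific feature extraction described in the abstract are outside this sketch.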
[Degree-granting institution]: Xidian University
[Degree level]: Master's
[Year degree awarded]: 2014
[Classification number]: TP393.092; TP391.1