A Real-Time Analysis and Evaluation System for the Potential Needs of Weibo Users Based on the Storm Framework
[Abstract]: With the growth and spread of the Internet, Weibo, an open platform for exchanging and sharing information, generates data on the scale of hundreds of millions of items every day. Mining and analyzing users' potential purchase behavior from this massive data can yield great economic value for enterprises. However, current research and analysis methods have two shortcomings: real-time analysis of Weibo is insufficient, so analysis results lag behind the data; and Weibo analysis is not targeted enough, so the value of specific user groups has not been fully exploited. To address these problems in mining the potential purchase behavior of Weibo users, this paper designs and implements an efficient, real-time Weibo user behavior analysis and evaluation system based on Storm. The work consists of two parts. First, the problem of uneven task distribution in Storm's existing scheduling strategies is identified and verified experimentally, and an adaptive scheduling model based on CPU weights is proposed to address the low efficiency caused by latency between internal nodes and by message locality. Second, the real-time analysis system is designed and implemented. The data source module obtains Weibo data through a crawler and the Sina API; the data access module builds a Kafka cluster to address data-flow delay; the data analysis module implements Storm's Spout and Bolt interfaces, segments the text using Chinese word segmentation, and analyzes it with K-means; the data storage module saves the data in HBase; and the data display module is implemented with SpringMVC and ECharts.
The experimental results show that the improved scheduling strategy clearly outperforms Storm's existing scheduling strategies, particularly on CPU-intensive tasks, where performance improves by about 50%. The real-time analysis system can analyze users' potential purchase behavior in real time, and enterprises can carry out related research and marketing based on the behavior characteristics it identifies.
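The abstract does not describe the internals of the CPU-weight-based scheduling model. Purely as an illustration of the general idea, the Python sketch below compares an even round-robin placement with a greedy placement that sends each task to the node whose load, relative to its CPU weight, would be lowest. Every name here (`Node`, `cpu_weight`, `assign_round_robin`, `assign_cpu_weighted`, `makespan`) and every number is a hypothetical assumption; none of it is Storm's API or the thesis's actual algorithm.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A worker node in a hypothetical cluster (not a Storm class)."""
    name: str
    cpu_weight: float              # assumed relative CPU capacity; higher is faster
    load: float = 0.0              # total CPU cost of the tasks placed so far
    tasks: list = field(default_factory=list)


def assign_round_robin(task_costs, nodes):
    """Place tasks evenly by count, ignoring CPU capacity."""
    for i, cost in enumerate(task_costs):
        node = nodes[i % len(nodes)]
        node.tasks.append(i)
        node.load += cost
    return nodes


def assign_cpu_weighted(task_costs, nodes):
    """Greedy placement: the most expensive tasks go first, each to the node
    whose load-to-weight ratio would be smallest after receiving the task."""
    order = sorted(range(len(task_costs)),
                   key=lambda i: task_costs[i], reverse=True)
    for i in order:
        cost = task_costs[i]
        node = min(nodes, key=lambda n: (n.load + cost) / n.cpu_weight)
        node.tasks.append(i)
        node.load += cost
    return nodes


def makespan(nodes):
    """Relative finishing time of the slowest node (load divided by weight)."""
    return max(n.load / n.cpu_weight for n in nodes)


def make_cluster():
    # Assumed example cluster: two small nodes and one node with 4x the CPU.
    return [Node("a", 1.0), Node("b", 1.0), Node("c", 4.0)]


if __name__ == "__main__":
    costs = [4.0] * 6  # six CPU-bound tasks of equal, made-up cost
    rr = assign_round_robin(costs, make_cluster())
    cw = assign_cpu_weighted(costs, make_cluster())
    print("round-robin makespan: ", makespan(rr))
    print("cpu-weighted makespan:", makespan(cw))
```

In this toy setup the even placement leaves the small nodes as the bottleneck while the large node sits mostly idle, which is the kind of uneven task distribution the abstract refers to. A real scheduler would also have to account for inter-node latency and message locality, which this sketch ignores.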
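[Degree-granting institution]: Beijing University of Posts and Telecommunications
[Degree level]: Master's
[Year degree granted]: 2016
[Classification number]: TP393.092; TP311.13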