Research on Hadoop-Based User Browsing Path Mining

Published: 2018-03-01 11:48

  Keywords: Web logs; preferred browsing paths; MapReduce; Hadoop. Source: Hunan University of Technology, master's thesis, 2015. Type: degree thesis


[Abstract]: The data explosion that has accompanied the growth of the Internet has left Web servers holding large volumes of log data, and mining valuable information from massive Web logs has become an active research topic. Analyzing and mining Web logs effectively reveals users' preferred browsing paths, which can inform the optimization of a website's topology and give enterprises a basis for better marketing strategies. This thesis studies Hadoop-based user browsing path mining. The work covers three aspects.

1. A preferred browsing path mining algorithm based on trusted interest degree (MUPCDI) is proposed and implemented. Taking into account how interested users are in a page while they browse it, the thesis introduces the concept of page interest degree. Combining the factors behind a user's path selection, the placement of a page, the reasons other pages link to it, and a weighted measure corrected by the website's topology graph, it introduces the concept of trusted selection degree. The trusted selection degree and the page interest degree are then combined into a single trusted interest degree metric, on which MUPCDI is built.

2. A MapReduce-based version of the trusted interest degree algorithm is proposed and implemented. It runs on a distributed Hadoop cluster and can analyze and mine preferred browsing paths from massive numbers of users.

3. On a target dataset, MUPCDI is used for a comparative analysis of the threshold, accuracy, and effectiveness of the trusted interest degree algorithm, and the MapReduce-based version is used for a comparative analysis of the efficiency of the distributed platform.

The results indicate that the proposed trusted interest degree algorithm mines preferred browsing paths more accurately and effectively, and that on large Web log datasets the MapReduce-based version running in a distributed environment is far more efficient than the same mining on a single machine.
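The abstract does not give the formulas for page interest degree or trusted selection degree, so the following is only a minimal sketch of the general map/shuffle/reduce shape such a job could take, written in plain Python rather than against the Hadoop API. Every name here (`map_session`, `reduce_transition`, `run_job`, `preferred_hops`) and both stand-in measures (transition count in place of selection degree, mean dwell time in place of interest degree) are assumptions made for illustration, not the thesis's MUPCDI definitions.

```python
from collections import defaultdict

# A session is a list of (page, dwell_seconds) pairs in visit order.
# Both the data layout and the two measures below are illustrative assumptions.


def map_session(session):
    """Map step: emit ((from_page, to_page), dwell_on_to_page) per consecutive hop."""
    for (src, _), (dst, dwell) in zip(session, session[1:]):
        yield (src, dst), dwell


def reduce_transition(dwell_values):
    """Reduce step: return (hop count, mean dwell time) for one transition key."""
    values = list(dwell_values)
    return len(values), sum(values) / len(values)


def run_job(sessions):
    """Simulate map, shuffle (group by key), and reduce on a single machine."""
    grouped = defaultdict(list)
    for session in sessions:
        for key, value in map_session(session):
            grouped[key].append(value)
    return {key: reduce_transition(values) for key, values in grouped.items()}


def preferred_hops(stats, min_count, min_dwell):
    """Keep hops that pass both hypothetical thresholds, in sorted order."""
    return sorted(
        key
        for key, (count, mean_dwell) in stats.items()
        if count >= min_count and mean_dwell >= min_dwell
    )


sessions = [
    [("A", 5), ("B", 30), ("C", 10)],
    [("A", 3), ("B", 50)],
    [("A", 4), ("C", 2)],
]
stats = run_job(sessions)
print(preferred_hops(stats, min_count=2, min_dwell=20))
```

On a real cluster the grouping done inside `run_job` would be handled by the framework's shuffle phase, and the map and reduce functions would run on separate nodes; the thresholds correspond loosely to the threshold analysis mentioned in the third item above.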
[Degree-granting institution]: Hunan University of Technology
[Degree level]: Master's
[Year conferred]: 2015
[Classification number]: TP311.13
