Research on Collaborative Filtering Recommendation Algorithms for Sparse Data

Published: 2018-03-19 01:32

  Topic: recommender systems. Focus: data sparsity. Source: Jilin University, master's thesis, 2017. Document type: degree thesis.


[Abstract]: With the rapid development of the Internet and e-commerce, the volume of information online has grown explosively, giving rise to "information overload". Personalized recommendation technology helps users find the information they need quickly and accurately amid disorganized information, which alleviates the overload problem to some extent. Collaborative filtering, one of the most widely used personalized recommendation techniques, has achieved considerable success in practice. Real-world data, however, are usually very sparse, so collaborative filtering suffers from the data sparsity problem. The cold-start problem can be regarded as an extreme case of data sparsity, and this thesis treats it as part of the data sparsity problem. Data sparsity seriously degrades the recommendation quality of collaborative filtering algorithms. It arises because the numbers of users and items in a recommender system keep growing while each user rates only a few items, so the user rating matrix is inevitably sparse, and collaborative filtering depends heavily on that matrix.

To address data sparsity, researchers have proposed many methods that operate on the user rating matrix. They fall into two main categories: the first fills in the rating matrix to reduce its sparsity; the second decomposes the rating matrix, removing users and items that have little influence on similarity computation in order to reduce the matrix's dimensionality. In the second category, the removed information may well contain useful information about users, which hurts recommendation quality. This thesis therefore builds on the first category to tackle data sparsity in recommender systems. The specific work is as follows:

1) For the user cold-start problem, a collaborative filtering algorithm that fuses user features and item relationships (User-Item-Mix CF) is proposed. Traditional collaborative filtering algorithms ignore the relationships between items when computing the similarity between users, which makes the computed user similarity inaccurate. To address this, the thesis proposes a user similarity measure that incorporates item relationships (Item-Based User Sim), intended to improve the accuracy of user similarity computation. Building on this improved measure, user feature attributes are then added to the user similarity computation and fused with the item relationships through a dynamic balancing weight, yielding the User-Item-Mix CF algorithm. Finally, User-Item-Mix CF is compared with the mode method on the MovieLens dataset. The experimental results show that, for every number of new users tested, the mean absolute error (MAE) of User-Item-Mix CF is lower than that of the mode method.

2) For the data sparsity problem, a collaborative filtering algorithm based on user rating prediction (User-SP CF) is proposed. When computing the similarity between items, the algorithm uses Item-Based User Sim to compute the similarity between users and uses the resulting values to fill the unrated entries of the rating matrix, reducing its sparsity. It then finds the nearest-neighbor set of the target item in the filled matrix and completes the recommendation. Finally, User-SP CF is compared on the MovieLens dataset with the collaborative filtering algorithm based on item rating prediction and with item-based collaborative filtering. The experimental results show that, for every neighborhood size tested, the MAE of User-SP CF is lower than that of the other two algorithms.
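
The matrix-filling idea and the MAE metric described in the abstract can be illustrated with a minimal sketch. The abstract does not give the formula for Item-Based User Sim, so the code below substitutes plain cosine similarity over co-rated items as a stand-in; it is not the thesis's method. The toy ratings and all function names (`sparsity`, `cosine_sim`, `fill_matrix`, `mae`) are hypothetical and chosen only for illustration.

```python
import math

# Toy user-item rating matrix (hypothetical data); None marks an unrated entry.
ratings = [
    [5, 3, None, 1],
    [4, None, None, 1],
    [1, 1, None, 5],
    [None, 1, 5, 4],
]


def sparsity(matrix):
    """Return the fraction of entries that are unrated."""
    total = sum(len(row) for row in matrix)
    missing = sum(value is None for row in matrix for value in row)
    return missing / total


def cosine_sim(u, v):
    """Cosine similarity over co-rated items.

    Assumption: a stand-in for the thesis's Item-Based User Sim,
    whose formula the abstract does not state.
    """
    common = [(a, b) for a, b in zip(u, v) if a is not None and b is not None]
    if not common:
        return 0.0
    dot = sum(a * b for a, b in common)
    norm_u = math.sqrt(sum(a * a for a, _ in common))
    norm_v = math.sqrt(sum(b * b for _, b in common))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def fill_matrix(matrix):
    """Fill each unrated entry with a similarity-weighted mean of the
    ratings that other users gave to the same item."""
    filled = [row[:] for row in matrix]
    for u, row in enumerate(matrix):
        for i, value in enumerate(row):
            if value is not None:
                continue
            num = den = 0.0
            for v, other in enumerate(matrix):
                if v == u or other[i] is None:
                    continue
                weight = cosine_sim(row, other)
                num += weight * other[i]
                den += abs(weight)
            if den > 0:  # leave the entry empty if no similar user rated it
                filled[u][i] = num / den
    return filled


def mae(predicted, actual):
    """Mean absolute error between predicted and actual ratings."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


filled = fill_matrix(ratings)
print("sparsity before:", sparsity(ratings))
print("sparsity after: ", sparsity(filled))
```

In an evaluation like the one the abstract reports, `mae` would be computed on held-out ratings, comparing each algorithm's predictions with the ratings users actually gave; lower values indicate more accurate predictions.
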
[Degree-granting institution]: Jilin University
[Degree level]: Master's
[Year degree awarded]: 2017
[Classification number]: TP391.3

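[References]

Related journal articles (first 7)

1 Deng Ailin; Zhu Yangyong; Shi Bole; Collaborative filtering recommendation algorithm based on item rating prediction [J]; Journal of Software; 2003, No. 09

2 Zhang Guangwei; Li Deyi; Li Peng; Kang Jianchu; Chen Guisheng; Collaborative filtering recommendation algorithm based on the cloud model [J]; Journal of Software; 2007, No. 10

3 Xu Hailing; Wu Xiao; Li Xiaodong; Yan Baoping; Comparative study of Internet recommender systems [J]; Journal of Software; 2009, No. 02

4 Ma Hongwei; Zhang Guangwei; Li Peng; Survey of collaborative filtering recommendation algorithms [J]; Journal of Chinese Computer Systems; 2009, No. 07

5 Ji Xiaosheng; Liu Yanbing; Luo Laiming; Similarity measure based on user interest degree in collaborative filtering [J]; Journal of Computer Applications; 2010, No. 10

6 Zhang Yufang; Dai Jinlong; Xiong Zhongyang; Collaborative filtering algorithm that alleviates data sparsity through stepwise filling [J]; Application Research of Computers; 2013, No. 09

7 Meng Xiangwu; Liu Shudong; Zhang Yujie; Hu Xun; Research on social recommender systems [J]; Journal of Software; 2015, No. 06

Related doctoral dissertations (first 1)

1 Sun Xiaohua; Research on sparsity and cold-start problems in collaborative filtering systems [D]; Zhejiang University; 2005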

Article ID: 1632252

Article link: http://www.sikaile.net/jingjilunwen/dianzishangwulunwen/1632252.html

