A Collaborative Filtering Recommendation Algorithm Based on User Ratings and a Genetic Algorithm
Published: 2018-12-17 06:14
[Abstract]: With the rapid development of the Internet, people's lives have changed dramatically, yet finding the information one actually needs in such an enormous volume of data has become increasingly difficult. Recommender systems emerged in response, and they play an increasingly important role in mitigating the negative effects of information overload on many websites, where users typically express their preferences for items or services by rating them. Collaborative filtering is currently one of the most widely used recommendation techniques. It analyzes user interests, finds the users whose interests are similar to those of a target user, and combines those similar users' ratings of an item to predict how much the target user will like that item. Common similarity measures include cosine similarity and the Pearson correlation coefficient, but their formulas are relatively complex, so similarity computation consumes too much time during recommendation and lowers recommendation efficiency. This thesis proposes a new similarity measure based on a genetic algorithm and user rating information. First, a vector p_{x,y} with C-c+1 elements is defined (for example, C=5 and c=1 give 5 elements). Each element p_{x,y}(i) is the ratio a/b, where a is the number of items on which the ratings given by two users x and y differ by i, and b is the number of items rated by both users. Second, a weight vector q, also with C-c+1 elements, is defined. Each element q(i) lies in [-1, 1] and measures how important p_{x,y}(i) is when computing the similarity between two users. Together these two vectors make up the new similarity measure, and the optimal weight vector is obtained with a genetic algorithm.
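The construction of the two vectors described above can be sketched in Python as follows. This is a minimal illustration, not the thesis's implementation: the function names are hypothetical, the rating difference is assumed to be an absolute difference (so that i runs from 0 to C-c), and the way the two vectors are combined (a weighted sum divided by the number of elements) is an assumption, since the abstract only says that the two vectors together form the measure.

```python
from typing import Dict, List, Sequence


def difference_vector(ratings_x: Dict[str, int], ratings_y: Dict[str, int],
                      c: int = 1, C: int = 5) -> List[float]:
    """Build p_{x,y}: element i is a/b, where a counts co-rated items whose
    ratings differ by i and b is the number of co-rated items.

    Ratings are assumed to be integers in [c, C].
    """
    size = C - c + 1
    common = ratings_x.keys() & ratings_y.keys()  # items rated by both users
    if not common:
        return [0.0] * size  # no overlap: no evidence either way
    counts = [0] * size
    for item in common:
        # ASSUMPTION: "rating difference" means the absolute difference.
        counts[abs(ratings_x[item] - ratings_y[item])] += 1
    b = len(common)
    return [a / b for a in counts]


def weighted_similarity(p: Sequence[float], q: Sequence[float]) -> float:
    """Combine p_{x,y} with the weight vector q (each q(i) in [-1, 1]).

    ASSUMPTION: a normalized weighted sum; the abstract does not give the formula.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same number of elements")
    return sum(qi * pi for qi, pi in zip(q, p)) / len(p)


# Hypothetical ratings on a 1..5 scale.
user_x = {"a": 5, "b": 3, "c": 4, "d": 1}
user_y = {"a": 5, "b": 1, "c": 3, "e": 2}
p = difference_vector(user_x, user_y)   # co-rated items: a, b, c
q = [1.0, 0.5, 0.0, -0.5, -1.0]         # illustrative weights, not learned
similarity = weighted_similarity(p, q)
```

Because the measure only needs counts of rating differences and one short weighted sum per user pair, it avoids the means and square roots that Pearson correlation and cosine similarity require, which is consistent with the efficiency argument made in the abstract.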
Finally, the new similarity measure is evaluated experimentally on two datasets, FilmAffinity and Movielens. A recommendation model is built from the training set, and each individual q in the population is then used to make predictions on the training set. If the system MAE obtained for an individual q is below a given threshold, that individual is taken as the best individual and is applied to the test set for performance evaluation. The experimental comparison of performance metrics shows that, relative to traditional methods, the proposed method achieves some improvement in prediction and recommendation quality, as well as some improvement in recommendation efficiency.
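The search for the weight vector described above can be sketched as a small genetic algorithm. The abstract does not specify the population size, the selection, crossover, or mutation operators, or the threshold value, so everything below (truncation selection, one-point crossover, Gaussian mutation, elitism, the `evolve_weights` name and its parameters) is an assumption chosen for illustration. The `fitness` argument stands for "system MAE on the training set for individual q"; the toy error function in the usage part is only a stand-in for a real recommender.

```python
import random
from typing import Callable, List, Sequence, Tuple


def evolve_weights(fitness: Callable[[Sequence[float]], float], n_weights: int,
                   threshold: float, pop_size: int = 30,
                   max_generations: int = 200, mutation_rate: float = 0.1,
                   seed: int = 0) -> Tuple[List[float], float]:
    """Search for a weight vector q in [-1, 1]^n with a low fitness (error).

    Stops as soon as the best individual's error drops below `threshold`,
    or after `max_generations` generations.
    """
    if n_weights < 2 or pop_size < 4:
        raise ValueError("need n_weights >= 2 and pop_size >= 4")
    rng = random.Random(seed)
    population = [[rng.uniform(-1.0, 1.0) for _ in range(n_weights)]
                  for _ in range(pop_size)]
    best: List[float] = list(population[0])
    best_err = float("inf")
    for _ in range(max_generations):
        scored = sorted(((fitness(q), q) for q in population), key=lambda t: t[0])
        if scored[0][0] < best_err:
            best_err, best = scored[0][0], list(scored[0][1])
        if best_err < threshold:  # good enough: this is the "best individual"
            break
        parents = [q for _, q in scored[: pop_size // 2]]  # truncation selection
        children = [list(q) for q in parents[:2]]          # elitism: keep top two
        while len(children) < pop_size:
            mother, father = rng.sample(parents, 2)
            cut = rng.randrange(1, n_weights)               # one-point crossover
            child = mother[:cut] + father[cut:]
            for i in range(n_weights):
                if rng.random() < mutation_rate:
                    # Gaussian mutation, clipped back into [-1, 1].
                    child[i] = min(1.0, max(-1.0, child[i] + rng.gauss(0.0, 0.2)))
            children.append(child)
        population = children
    return best, best_err


# Usage with a toy stand-in for the system MAE (hypothetical target weights).
target = [1.0, 0.5, 0.0, -0.5, -1.0]


def toy_error(q: Sequence[float]) -> float:
    return sum(abs(a - b) for a, b in zip(q, target)) / len(target)


best_q, best_err = evolve_weights(toy_error, n_weights=5, threshold=0.05)
```

In a real experiment `fitness` would run the collaborative filtering predictor over the training ratings with the candidate q and return its MAE, and the returned individual would then be evaluated once on the held-out test set.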
[Degree-granting institution]: Hunan University
[Degree level]: Master's
[Year degree awarded]: 2016
[Classification number]: TP391.3
Document ID: 2383789
Article link: http://www.sikaile.net/kejilunwen/ruanjiangongchenglunwen/2383789.html