Research and Application of Customer-Oriented Catalog Segmentation Algorithms

Published: 2018-05-24 16:07

Topic: data mining + catalog segmentation; Source: South China University of Technology, 2013 master's thesis

[Abstract]: Data mining is now widely applied in microeconomics, and catalog segmentation is one of its important research directions. Earlier work on catalog segmentation judged the quality of a catalog mainly by the number of catalog products that were purchased. As the research has deepened and its practical applications have multiplied, the variant that aims to attract more customers to visit a physical or online store (customer-oriented catalog segmentation for short) has become increasingly important. Customer-oriented catalog segmentation is mainly applied to customer classification, product push, and customer relationship management. Building on an in-depth analysis of existing customer-oriented catalog segmentation algorithms, this thesis revises the scoring function that the Best-Product-fit (BPF) algorithm uses to select the "best product", and proposes a new customer-oriented catalog segmentation algorithm with two main steps: (i) find frequent itemsets in the historical transaction data; (ii) create suitable product catalogs and partition the customers according to them. Finally, a product profit measure is introduced into the customer-oriented catalog segmentation model, yielding a new generalized model; several key issues are studied in detail, including data normalization, database representation, and profit-weighted computation. The main contributions are as follows. (1) To address the imbalance between the two parts of the BPF scoring function at the interest degree given in the source as "t2", a dynamic weighting scheme is proposed that yields more balanced product scores. Experiments show that with the improved scoring function the algorithm covers more customers than with the traditional one. (2) Because the BPF algorithm cannot handle cases where the enterprise's revenue is negative, the concept of risk degree is introduced and a new scoring function is given, producing an algorithm suited to service-oriented enterprises. (3) Based on the relationship between customer interest degree and the minimum support used in frequent pattern mining, the idea of creating product catalogs from frequent patterns is put forward, and a catalog segmentation algorithm based on frequent pattern mining is proposed. (4) A profit measure is introduced into customer-oriented catalog segmentation, with two forms of profit weighting: direct and indirect. Experiments show that the model covers more high-value customers. (5) Using transaction data from a real e-commerce site, the improved customer-oriented catalog segmentation algorithm is compared with the classic BPF algorithm, and the two profit weighting schemes are compared with each other; indirect profit weighting performs better.
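The two-step idea in the abstract (mine frequent itemsets, then build catalogs and partition customers) can be illustrated with a minimal Python sketch. It is not the thesis's algorithm: the abstract gives no pseudocode, so everything here is an assumption made for illustration. In particular, the sketch assumes that a customer is "covered" by a catalog when at least `t` catalog products appear in that customer's purchase history, that there are `k` catalogs of at most `r` products each, and that catalogs are assembled greedily. Frequent itemsets are found by brute-force counting rather than by Apriori or FP-growth, and all function and parameter names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, size, min_support):
    """Count every itemset of exactly `size` items and keep those
    that occur in at least `min_support` transactions (brute force)."""
    counts = Counter()
    for items in transactions:
        for combo in combinations(sorted(items), size):
            counts[frozenset(combo)] += 1
    return {s: c for s, c in counts.items() if c >= min_support}


def covered(catalog, items, t):
    """Assumed coverage rule: at least t catalog products interest the customer."""
    return len(catalog & items) >= t


def build_catalogs(transactions, k, r, t, min_support):
    """Greedy sketch: grow up to k catalogs (at most r products each)
    by merging frequent t-itemsets that cover the most uncovered customers."""
    candidates = frequent_itemsets(transactions, t, min_support)
    uncovered = set(range(len(transactions)))
    catalogs, assignment = [], {}
    for _ in range(k):
        catalog = set()
        while True:
            # Coverage of the catalog as it stands; a merge must beat this.
            best_gain = sum(
                1 for i in uncovered if covered(catalog, transactions[i], t)
            )
            best = None
            for itemset in candidates:
                merged = catalog | itemset
                if merged == catalog or len(merged) > r:
                    continue
                gain = sum(
                    1 for i in uncovered if covered(merged, transactions[i], t)
                )
                if gain > best_gain:
                    best, best_gain = itemset, gain
            if best is None:
                break
            catalog |= best
        if not catalog:
            break  # no remaining itemset covers any uncovered customer
        newly = {i for i in uncovered if covered(catalog, transactions[i], t)}
        for i in sorted(newly):
            assignment[i] = len(catalogs)  # customer index -> catalog index
        uncovered -= newly
        catalogs.append(catalog)
    return catalogs, assignment


# Toy purchase histories, one set of products per customer (made-up data).
transactions = [
    {"milk", "bread"},
    {"milk", "bread", "eggs"},
    {"milk", "bread"},
    {"pen", "ink"},
    {"pen", "ink", "paper"},
    {"soap"},
]
catalogs, assignment = build_catalogs(transactions, k=2, r=2, t=2, min_support=2)
print([sorted(c) for c in catalogs])  # [['bread', 'milk'], ['ink', 'pen']]
print(assignment)  # {0: 0, 1: 0, 2: 0, 3: 1, 4: 1}
```

In this toy run the customer who bought only soap stays uncovered, because no frequent 2-itemset matches that history. Choosing the itemset size equal to `t` reflects the link the abstract draws between interest degree and minimum support: any customer whose history contains a frequent `t`-itemset is covered by a catalog that includes that itemset.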
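[Degree-granting institution]: South China University of Technology
[Degree level]: Master's
[Year degree granted]: 2013
[Classification number]: TP311.13; TP391.41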

Article ID: 1929712
