Research on User Attribute Prediction Based on Co-training
Published: 2018-06-10 06:59
Topic: user attributes + Co-training; Source: 《工程科学与技术》 (Advanced Engineering Sciences), 2017, Issue S2
[Abstract]: Existing research on predicting user attributes from third-party application data rarely considers how long an application is actually used in the foreground. Building on application usage frequency and usage duration, this paper therefore constructs an average foreground usage duration feature, which further characterizes a user's level of interest in an application. In addition, to make full use of the large amount of unlabeled data and to predict user attributes from multiple feature views, the paper adopts a Co-training framework consisting of two networks, each combining a stacked autoencoder with a neural network. In the experiments, the stacked autoencoder is first trained on unlabeled data to initialize the network parameters at a favorable starting point; the labeled data are then used to update the parameters with an accuracy-based gradient descent algorithm until convergence. Experimental results show that the proposed algorithm improves accuracy, recall, and F1 score.
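The abstract gives no formulas or pseudocode, so the following is only a minimal sketch of the two ideas it names. It assumes the average foreground usage duration is the total foreground time divided by the number of launches, and it swaps in a simple nearest-centroid classifier for the paper's stacked autoencoder plus neural network. All names (`avg_foreground_duration`, `CentroidClassifier`, `co_train`) are hypothetical.

```python
from statistics import mean


def avg_foreground_duration(total_seconds, launch_count):
    """Assumed feature: mean foreground time per launch of one app."""
    return total_seconds / launch_count if launch_count else 0.0


class CentroidClassifier:
    """Nearest-centroid stand-in for the paper's autoencoder + network."""

    def fit(self, rows, labels):
        self.centroids = {}
        for label in set(labels):
            members = [r for r, l in zip(rows, labels) if l == label]
            self.centroids[label] = [mean(col) for col in zip(*members)]
        return self

    def predict_with_margin(self, row):
        # Squared distance to each class centroid, nearest first.
        dists = sorted(
            (sum((a - b) ** 2 for a, b in zip(row, c)), label)
            for label, c in self.centroids.items()
        )
        # Confidence proxy: gap between the two nearest centroids.
        margin = dists[1][0] - dists[0][0] if len(dists) > 1 else 0.0
        return dists[0][1], margin


def co_train(view_a, view_b, labels, unl_a, unl_b, rounds=3, per_round=1):
    """Each view's classifier pseudo-labels its most confident unlabeled
    samples, which are then added to the shared labeled pool."""
    view_a, view_b, labels = list(view_a), list(view_b), list(labels)
    pool = list(range(len(unl_a)))  # indices of still-unlabeled samples
    for _ in range(rounds):
        if not pool:
            break
        clf_a = CentroidClassifier().fit(view_a, labels)
        clf_b = CentroidClassifier().fit(view_b, labels)
        for clf, unl in ((clf_a, unl_a), (clf_b, unl_b)):
            scored = sorted(
                ((clf.predict_with_margin(unl[i]), i) for i in pool),
                key=lambda item: item[0][1],
                reverse=True,
            )
            for (label, _), i in scored[:per_round]:
                view_a.append(unl_a[i])
                view_b.append(unl_b[i])
                labels.append(label)
                pool.remove(i)
    # Refit both views on the enlarged labeled set.
    return (CentroidClassifier().fit(view_a, labels),
            CentroidClassifier().fit(view_b, labels))
```

Here `view_a` and `view_b` stand for two feature views of the same users (for example, average foreground duration and usage frequency), given as lists of numeric rows. Each round moves the most confident pseudo-labeled users from the unlabeled pool into the labeled set used by both views.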
[Author affiliation]: College of Computer Science, Sichuan University
[Funding]: National Natural Science Foundation of China (61332066; 81373239)
[Classification number]: TP301.6
Article ID: 2002369
Article link: http://www.sikaile.net/kejilunwen/ruanjiangongchenglunwen/2002369.html