Feature Dimensionality Reduction Methods for Multi-Label Learning

Published: 2018-02-12 04:45

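Keywords: multi-label learning; feature dimensionality reduction; principal component analysis; non-negative matrix factorization; similarity matrix
Source: Minnan Normal University, 2017 master's thesis
Type: degree thesis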

[Abstract]: In multi-label learning, each sample carries multiple labels, and the labels are not independent of one another. Multi-label data are typically high-dimensional, which increases the complexity and difficulty of data mining. In recent years, processing multi-label data efficiently has become a popular research topic. Feature dimensionality reduction lowers the dimension of multi-label data, shrinks the data scale, and improves the performance of multi-label learning. This thesis proposes two feature dimensionality reduction algorithms for multi-label learning:

(1) A multi-label learning feature reduction algorithm based on principal component analysis (MLFR-PCA). First, the algorithm uses PCA to project the original data into a low-dimensional space, which makes the data denser and removes noise. Second, it treats all labels of the data as a whole and introduces sparse regression between labels and features. This links the label space to the feature space and yields the objective function for dimensionality reduction. The objective is then optimized with the ℓ2,1-norm, which reduces the dimension of the multi-label data.

(2) A multi-label learning feature reduction algorithm based on non-negative matrix factorization (MLFR-NMF). First, the algorithm builds the similarity matrix of the feature space from the product of the feature matrix and a non-negative matrix. Second, it treats all labels of the data as a whole and constructs the similarity matrix of the label space using existing methods. Third, it applies least squares between the feature-space and label-space similarity matrices. This links the label space to the feature space and yields the objective function for dimensionality reduction. Finally, the objective is optimized with the ℓ2-norm, which reduces the dimension of the multi-label data.

Both algorithms reduce the dimension of multi-label data directly, without first converting it to single-label data. This avoids the extra work of the conversion and the downstream problems caused by an inaccurate conversion. In addition, because all labels take part in constructing the objective function as a whole, label information is used effectively for dimensionality reduction without breaking the label structure. Experiments on real-world datasets show that both algorithms perform well.
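As a rough illustration of the ingredients named for the first algorithm (a PCA projection followed by ℓ2,1-regularized regression from features to labels), the following Python sketch may help. The abstract does not state the thesis's objective function or solver, so everything here is an assumption: the objective ||XW - Y||_F^2 + lam * ||W||_{2,1}, the iteratively reweighted least-squares solver, the ranking of features by row norm, the synthetic data, and the names `pca_project`, `l21_regression`, `top_features`, `lam`, `k`, and `m` are hypothetical and are not taken from MLFR-PCA itself.

```python
import numpy as np


def pca_project(X, k):
    """Project the rows of X onto the top-k principal components."""
    Xc = X - X.mean(axis=0)  # center each feature
    # Right singular vectors of the centered data are the principal axes.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T


def l21_regression(X, Y, lam=1.0, n_iter=50, eps=1e-8):
    """Approximately minimize ||XW - Y||_F^2 + lam * ||W||_{2,1}.

    Assumed solver: iteratively reweighted least squares, where
    D is diagonal with D_ii = 1 / (2 * ||w_i||_2) for row w_i of W.
    """
    d = X.shape[1]
    D = np.eye(d)
    for _ in range(n_iter):
        # For fixed D the minimizer solves (X^T X + lam * D) W = X^T Y.
        W = np.linalg.solve(X.T @ X + lam * D, X.T @ Y)
        row_norms = np.sqrt((W ** 2).sum(axis=1))
        # eps guards against division by zero for rows driven to zero.
        D = np.diag(1.0 / (2.0 * np.maximum(row_norms, eps)))
    return W


def top_features(W, m):
    """Indices of the m rows of W with the largest l2 norm."""
    scores = np.linalg.norm(W, axis=1)
    return np.argsort(-scores)[:m]


# Synthetic example (not data from the thesis): 200 samples, 10 features,
# 4 binary labels that depend only on the first 3 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
Y = (X[:, :3] @ rng.normal(size=(3, 4)) > 0).astype(float)

Z = pca_project(X, k=6)            # denser, lower-dimensional representation
W = l21_regression(Z, Y, lam=1.0)  # all labels regressed jointly
print(top_features(W, m=3))        # projected features ranked most relevant
```

Because the ℓ2,1 penalty sums the ℓ2 norms of the rows of W, it pushes whole rows toward zero. A feature is therefore kept or dropped for all labels at once, which matches the abstract's idea of treating the labels as a whole rather than one at a time.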
[Degree-granting institution]: Minnan Normal University
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TP181



Article ID: 1504808


Article link: http://www.sikaile.net/kejilunwen/zidonghuakongzhilunwen/1504808.html
