Research on Privacy Protection Methods for Mashup Datasets with Multiple Sensitive Attributes

Published: 2019-03-01 19:01

[Abstract]: Mashup is a Web-based data integration application that has attracted wide attention on the Internet. As a common form of data aggregation, it provides strong support for data exchange and sharing. Publishing aggregated data involves multiple data publishers, and join operations across their data sources often cause serious privacy disclosure and readily leak sensitive information between the publishers themselves. In addition, data aggregated from many sources inevitably contains a large number of attributes, and anonymizing such high-dimensional data easily distorts it excessively. Privacy protection for aggregated data publishing is therefore an important and challenging problem.

The PHDMashup algorithm was proposed to address this problem. It adopts the LKC-Privacy model and combines it with top-down specialization to protect privacy when aggregated data is published. However, data aggregation involves many data providers, so the number of attributes to be anonymized is necessarily large, and PHDMashup requires specializing every valid node of the generalization trees built for all attributes. This wastes time and space and imposes a heavy computational load. This thesis improves on PHDMashup and proposes the NPHDMashup algorithm, which raises execution efficiency by reducing the number of nodes that are specialized. Both algorithms also spend considerable time on the large volume of messages exchanged between data providers. To address this, the thesis proposes a further improved privacy-preserving aggregation algorithm, SPHDMashup, which introduces a Server as middleware. Data providers communicate directly with the Server, which greatly improves efficiency and substantially reduces the providers' workload.

Furthermore, resources gathered in Mashup mode are multi-source, heterogeneous, and structurally complex, and this heterogeneity affects how attributes shared among data providers are processed. The thesis proposes building mapping tables to convert each local data model to a common model, resolving semantic heterogeneity during aggregation. Finally, the proposed algorithms are evaluated experimentally and compared with the original algorithm to verify their advantages. Their remaining shortcomings are analyzed, and directions for future improvement are discussed.
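To make the privacy model named in the abstract more concrete, the sketch below checks a table against LKC-privacy as it is commonly formulated in the literature: for every combination of at most L quasi-identifier attributes, each group of records sharing the same values must contain at least K records, and the fraction of any sensitive value within the group must not exceed C. This formulation, the function name `satisfies_lkc`, the attribute names, and the sample records are all illustrative assumptions; the abstract does not give the thesis's own definitions or data.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical, already-generalized records (illustration only).
RECORDS = [
    {"job": "Professional", "age": "[30-60)", "disease": "HIV"},
    {"job": "Professional", "age": "[30-60)", "disease": "Flu"},
    {"job": "Professional", "age": "[30-60)", "disease": "Flu"},
    {"job": "Artist", "age": "[30-60)", "disease": "Flu"},
    {"job": "Artist", "age": "[30-60)", "disease": "Fever"},
]


def satisfies_lkc(records, qid_attrs, sensitive_attr, sensitive_values, L, K, C):
    """Check the assumed LKC condition on a list of dict records.

    L: maximum number of quasi-identifier attributes an adversary may know.
    K: minimum size of any group sharing the same quasi-identifier values.
    C: maximum fraction of a group that may carry one sensitive value.
    """
    max_size = min(L, len(qid_attrs))
    for size in range(1, max_size + 1):
        for attrs in combinations(qid_attrs, size):
            # Group the sensitive values by the chosen attribute values.
            groups = defaultdict(list)
            for rec in records:
                key = tuple(rec[a] for a in attrs)
                groups[key].append(rec[sensitive_attr])
            for values in groups.values():
                if len(values) < K:
                    return False  # group too small: identity linkage risk
                counts = Counter(values)
                for s in sensitive_values:
                    if counts[s] / len(values) > C:
                        return False  # sensitive value inferred too confidently
    return True


if __name__ == "__main__":
    print(satisfies_lkc(RECORDS, ["job", "age"], "disease", {"HIV"}, L=2, K=2, C=0.5))
```

Because only combinations of up to L attributes are checked, rather than all attributes at once, a high-dimensional table can satisfy the condition with less generalization than full k-anonymity would need. This is a plausible reason the model suits aggregated data with many attributes, though the abstract itself does not spell out that argument.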
[Degree-granting institution]: Harbin Engineering University
[Degree level]: Master's
[Year degree granted]: 2016
[Classification number]: TP309



Article ID: 2432712


Article link: http://www.sikaile.net/kejilunwen/ruanjiangongchenglunwen/2432712.html
