Research on Query Rewriting in Heterogeneous Data Source Integration Systems
Source: Harbin University of Commerce, 2017 master's thesis. Document type: degree thesis
Keywords: data integration; query rewriting; MiniCon; path optimization
[Abstract]: With the rapid development and widespread use of computer technology, data volumes have grown beyond what the word "large" can describe. Because different industries have different data requirements, data differ in storage method, data structure, and many other respects, which has produced a large number of heterogeneous data sources. This is not what users want: they usually expect to obtain the data they need by submitting a single query, and heterogeneous data source integration systems emerged to meet that need. Query rewriting plays a critical role in such systems. The integration system rewrites a query that the user poses against the global schema so that results can be retrieved from the heterogeneous data sources and returned to the user. Query rewriting is closely related to data integration, query optimization, and similar problems. This thesis studies the query rewriting problem in heterogeneous data source integration systems as follows.
First, the thesis studies three classical query rewriting algorithms in depth, namely the Bucket algorithm, the Inverse-Rules algorithm, and the MiniCon algorithm, and points out the shortcomings of each. It focuses on MiniCon and proposes an improved algorithm built on it: a path-optimization-based MiniCon algorithm. This algorithm adds a path optimization step to the traditional MiniCon algorithm. It compares the proportion of valid data in the relevant fields of the query views and optimizes the query path accordingly, with the goal of improving query efficiency.
Second, the thesis introduces three traditional data integration approaches: the federated database approach, the middleware approach, and the data warehouse approach. Building on the middleware architecture and incorporating JSON, it designs a heterogeneous data source integration framework with a stable three-tier structure consisting of a presentation layer, a middle layer, and a data source layer. The middle (mediator) layer is the core of the system; query generation, query rewriting, and related functions are all implemented there.
Finally, the thesis applies both the traditional MiniCon algorithm and the improved path-optimization-based MiniCon algorithm to the integration system designed above and, using data from the Henan Century Lianhua supermarket, compares the query speed of the two algorithms to demonstrate the correctness and superiority of the improved algorithm.
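The abstract does not spell out how the path optimization step is computed, so the following Python sketch is only one plausible reading of "comparing the proportion of valid data in the relevant fields of the query views". It assumes that a rewriting step has already produced a set of candidate views, that "valid" means present and non-empty, and that views with a higher valid-data ratio should be tried first. The names `CandidateView`, `valid_ratio`, and `order_by_valid_ratio` are hypothetical and do not come from the thesis.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence


@dataclass
class CandidateView:
    """A source view that a rewriting step found usable (hypothetical type)."""
    name: str
    rows: Sequence[Dict[str, object]]  # sample rows fetched from the source


def valid_ratio(view: CandidateView, field: str) -> float:
    """Fraction of rows whose `field` is present and non-empty.

    Assumption: a value counts as valid unless it is missing, None, or "".
    """
    if not view.rows:
        return 0.0
    valid = sum(1 for row in view.rows if row.get(field) not in (None, ""))
    return valid / len(view.rows)


def order_by_valid_ratio(views: Sequence[CandidateView], field: str) -> List[CandidateView]:
    """Order candidate views so those with the most valid data come first."""
    return sorted(views, key=lambda v: valid_ratio(v, field), reverse=True)


# Illustrative data only; not taken from the thesis's supermarket dataset.
views = [
    CandidateView("store_a", [{"price": 3.5}, {"price": None}, {"price": ""}, {}]),
    CandidateView("store_b", [{"price": 1.0}, {"price": 2.0}]),
    CandidateView("store_c", []),
]
ordered_names = [v.name for v in order_by_valid_ratio(views, "price")]
```

In this sketch the ordering is a heuristic layered on top of an existing rewriting, which matches the abstract's description of an extra step added to the traditional MiniCon algorithm; the actual criterion and data structures used in the thesis may differ.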
[Degree-granting institution]: Harbin University of Commerce
[Degree level]: Master's
[Year degree awarded]: 2017
[Classification number]: TP311.13
Article ID: 1344593
Article link: http://www.sikaile.net/shoufeilunwen/xixikjs/1344593.html