

Research on Scaling Mechanisms for Fault-Tolerant Distributed Storage Systems

Published: 2018-11-19 09:06
[Abstract]: Today's large-scale distributed storage systems rely on redundancy to keep data available. Redundancy is generated either by replication or by erasure coding. Because erasure codes provide the same fault tolerance as replication at a much lower storage overhead, they are being adopted by a growing number of storage systems. At the same time, the rapid growth of data and users' rising demands for capacity and performance mean that deployed storage systems frequently run short of storage capacity and bandwidth. When application demand exceeds what the system can deliver, storage resources must be added and part of the data migrated to the new devices to relieve the pressure; this operation is called storage system scaling. Studying scaling mechanisms for erasure-coded distributed storage systems is therefore of great importance to data storage in cloud storage and data center settings. This thesis studies scaling mechanisms for distributed storage systems along three dimensions: designing scaling algorithms for erasure-coded storage systems, scheduling user I/O requests against system I/O requests during online scaling, and optimizing user access performance after scaling. The main research content and contributions are as follows.

(1) Scaling of Cauchy Reed-Solomon (CRS) codes. As storage systems demand ever higher fault tolerance, scaling CRS codes, which tolerate an arbitrary number of failures, becomes increasingly important. CRS codes are mainly used in distributed storage systems built from many storage nodes connected by a network (for example, CleverSafe and OceanStore). Scaling requires migrating part of the data to the new storage devices and updating parity at the same time. The storage I/O and network bandwidth consumed by data migration and parity updates directly affect system performance during scaling. This thesis studies the scaling problem for CRS-coded distributed storage systems and designs a three-stage optimized scaling algorithm: the first stage designs the post-scaling encoding matrix, the second designs the data migration scheme used during scaling, and the third further optimizes migration by decoding part of the data from parity. Theoretical analysis shows that, relative to a basic scaling algorithm, the three stages progressively reduce the system I/O and network transfer bandwidth of CRS scaling. The algorithm was deployed in a real distributed file system and compared extensively with the basic scaling algorithm; the experiments confirm that it is effective and practical under both single-threaded and multi-threaded architectures.

(2) Online scaling. In real storage systems, most upper-layer user applications require 7x24 online service. When a storage system scales online, user I/O requests and migration I/O requests therefore compete with each other, which inevitably affects the response times of both user requests and migration. Existing scaling algorithms, however, rarely take user I/O requests into account in their design, so the response times of both degrade during online scaling. To address this, the thesis designs Popularity-based Online Scaling (POS), an online scaling optimization mechanism for existing scaling algorithms. POS exploits two characteristics of user access in real systems: data popularity and data locality. It divides the original storage space into multiple regions and records each region's popularity (measured mainly by access frequency), then changes the scaling order so that the most popular regions are migrated first. It further exploits data locality to serve user reads and writes better while reducing the impact of user access on migration performance. POS can be viewed as a plug-in layered on top of many existing scaling algorithms to improve online scaling performance. POS was deployed in a real disk simulator and compared extensively with FastScale, an existing RAID-0 scaling algorithm; the results confirm that POS significantly improves user and migration response times during online scaling relative to traditional scaling algorithms.

(3) Optimizing read and write performance after scaling. Storage scaling must balance performance during scaling against user read and write performance after scaling completes. On the one hand, the larger the system I/O overhead during scaling, the longer the scaling time window and the greater the impact on migration and user response times during scaling. On the other hand, once scaling completes the system must serve normal user reads and writes, so post-scaling access performance also matters. Existing scaling algorithms, however, focus mainly on minimizing the amount of data migrated during scaling and do not consider optimizing post-scaling read and write performance. Because scaling changes the system's data layout, the scaling process directly affects normal user access performance afterward. The thesis therefore starts from the scaling process itself and designs a better data migration method. Taking RAID-0 scaling as an example, it designs a new scaling algorithm, PostScale. PostScale migrates the minimum amount of data during scaling and, under that constraint, guarantees that consecutive data blocks are spread as widely as possible after scaling. This design shortens the scaling time window and lets post-scaling user reads and writes exploit the storage system's maximum concurrent access performance. Simulation experiments show that PostScale has advantages over both traditional RAID-0 scaling algorithms, round-robin and FastScale: it greatly shortens the scaling time window compared with round-robin and effectively improves post-scaling read and write response times compared with FastScale. PostScale can be further extended to RAID-5 scaling and to scaling of distributed storage systems based on Reed-Solomon codes, improving user access performance after scaling.
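The abstract describes POS as recording per-region access frequency and migrating the hottest regions first. The following is a minimal illustrative sketch of that ordering step only, not the thesis's POS implementation: the fixed region size, the plain access-count metric, and all names (`PopularityTracker`, `record_access`, `migration_order`) are assumptions made for the example.

```python
from collections import Counter


class PopularityTracker:
    """Counts accesses per fixed-size region of the logical block space.

    Hypothetical helper: region size and metric are assumptions.
    """

    def __init__(self, region_blocks, total_blocks):
        self.region_blocks = region_blocks
        # Ceiling division so a partial last region is still counted.
        self.num_regions = -(-total_blocks // region_blocks)
        self.hits = Counter()

    def record_access(self, block):
        """Attribute one user read or write to the region holding `block`."""
        self.hits[block // self.region_blocks] += 1

    def migration_order(self):
        """Region indices, hottest first; ties broken by region index."""
        return sorted(range(self.num_regions),
                      key=lambda r: (-self.hits[r], r))


if __name__ == "__main__":
    tracker = PopularityTracker(region_blocks=100, total_blocks=400)
    for block in [10, 20, 250, 260, 270, 399]:
        tracker.record_access(block)
    print(tracker.migration_order())  # [2, 0, 3, 1]
```

A real scheduler would also need to update the order as popularity shifts during migration; this sketch computes a one-shot ordering from the counts seen so far.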
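To make the migration-volume trade-off in RAID-0 scaling concrete, here is a small sketch comparing round-robin restriping with the theoretical minimum. It is not PostScale or FastScale; it assumes the usual round-robin layout (block b lives on disk b mod disk-count) and uses the fact that a balanced layout must place (new - old) / new of all blocks on the added disks, so at least that fraction has to move. The function names are hypothetical.

```python
from fractions import Fraction


def round_robin_moved_fraction(old_disks, new_disks, blocks):
    """Fraction of blocks that change disk when a round-robin striped
    array is restriped from old_disks to new_disks disks."""
    moved = sum(1 for b in range(blocks) if b % old_disks != b % new_disks)
    return Fraction(moved, blocks)


def minimal_moved_fraction(old_disks, new_disks):
    """Lower bound on migration: the added disks must receive their
    even share of the data, and nothing else strictly needs to move."""
    return Fraction(new_disks - old_disks, new_disks)


if __name__ == "__main__":
    # Grow a 4-disk array to 5 disks; 20 blocks covers one full layout cycle.
    print(round_robin_moved_fraction(4, 5, blocks=20))  # 4/5
    print(minimal_moved_fraction(4, 5))                 # 1/5
```

In this example round-robin restriping moves four times as much data as the lower bound, which is the gap that minimal-migration algorithms aim to close.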
[Degree-granting institution]: University of Science and Technology of China
[Degree level]: Doctorate
[Year degree granted]: 2016
[Classification number]: TP333

Citation: Wu Si (吳思). Research on Scaling Mechanisms for Fault-Tolerant Distributed Storage Systems [D]. University of Science and Technology of China, 2016.

Article ID: 2341817



Article link: http://www.sikaile.net/shoufeilunwen/xxkjbs/2341817.html

