Design and Optimization of a Cloud Storage System for Massive Numbers of Users

Published: 2018-11-08 18:05

[Abstract]: With the continuing advance of information technology and the spread of mobile networks, the volume of data held by individual users has grown rapidly, and demand for cloud storage services has risen sharply. Many well-known companies have invested in developing and operating personal cloud storage services, including Google, Microsoft, Dropbox, Lenovo, Kingsoft, Huawei, Baidu, and telecom operators. Most of these products take Hadoop HDFS as the underlying file system and customize it. Open-source Hadoop HDFS has become popular worldwide and a hot research topic thanks to its well-designed architecture, its high scalability, availability, reliability, and fault tolerance, its low cost, and its strong performance. HDFS nevertheless has many open shortcomings: the NameNode is a single-point bottleneck, small files are handled poorly, redundant files are not stored by reference, there is no load balancing at the user layer, resumable (breakpoint) file transfer is not supported, system security is weak, and there is no mechanism for encrypted data storage or for authorizing shared access.

These shortcomings can be addressed in two ways. The first is to modify the HDFS source code, improving it from the inside. The second is to add a service layer on top of HDFS, moving some functions out of HDFS and simplifying what HDFS itself has to do. The first approach requires major changes to HDFS: it involves a large amount of difficult engineering, it is not backward compatible with HDFS releases, and it cannot effectively solve the single-point bottleneck, resumable transfer, or file encryption and authorization problems. The second approach has fewer constraints, is easier, needs less engineering, and is compatible with different HDFS versions. More importantly, a cloud storage system built this way leaves ample room for improvement, can solve the problems listed above, and is highly extensible. This thesis therefore adopts the second approach. It analyzes the existing HDFS architecture and builds on it a cloud storage system architecture for massive numbers of users. The architecture offers optimized solutions to many of the HDFS problems and ensures data security and user privacy protection. The main contributions are as follows:

1. An HDFS-based cloud storage architecture for massive numbers of users is proposed and its advantages are analyzed: it effectively relieves the single-point bottleneck, strengthens system security and scalability, supports multiple access protocols, and is compatible with different HDFS versions.

2. A complete system security mechanism is proposed. First, a client login verification method that resists Trojan-infected environments strengthens the security of user accounts. Second, a scheme for encrypted file storage and tiered authorization management protects user data and makes it easy to grant and revoke file authorizations. Third, an access control policy suited to cloud storage services better secures access. Analysis shows that the mechanism improves system security and also provides user privacy protection and secure access control. Combined with SSL/TLS for encrypted communication, the overall security of the system is greatly enhanced.

3. A scheduling mechanism for load balancing among application servers is given, balancing access requests at the user layer. Application servers can join or leave the cluster at any time, which avoids a single point of failure. Balanced scheduling of access requests works together with cache management on the application servers to improve system performance and load capacity.

4. Optimizations are proposed for the characteristics of HDFS and of massive user populations, for example: adding resumable transfer, packing small files for storage, handling redundant large files by reference, adding caches to the application servers, and mapping the file container structure onto the HDFS structure. Compared with the original HDFS system, the proposed methods add functionality while also improving performance and security.
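The abstract does not say how the redundant-file referencing in contribution 4 is implemented. As a rough illustration only, the sketch below shows one common way a service layer above a file system can store identical content once and keep per-user references to it, using a content hash and a reference count. The class `DedupStore`, its method names, and the in-memory dictionaries standing in for HDFS and the metadata database are all assumptions made for this example and are not taken from the thesis.

```python
import hashlib


class DedupStore:
    """Hypothetical service-layer index: identical content is stored once.

    In-memory dicts stand in for the backing file system (e.g. HDFS) and the
    metadata database; both are assumptions made for illustration.
    """

    def __init__(self):
        self._blobs = {}     # content digest -> stored bytes
        self._refcount = {}  # content digest -> number of user paths using it
        self._paths = {}     # (user, path) -> content digest

    def put(self, user, path, data):
        """Store data for (user, path); reuse an existing blob if one matches."""
        digest = hashlib.sha256(data).hexdigest()
        key = (user, path)
        old = self._paths.get(key)
        if old == digest:
            return digest  # same content already recorded for this path
        if old is not None:
            self._release(old)  # path is being overwritten with new content
        if digest not in self._blobs:
            self._blobs[digest] = data
            self._refcount[digest] = 0
        self._refcount[digest] += 1
        self._paths[key] = digest
        return digest

    def get(self, user, path):
        """Return the bytes referenced by (user, path)."""
        return self._blobs[self._paths[(user, path)]]

    def delete(self, user, path):
        """Drop one reference; the blob is removed when no references remain."""
        self._release(self._paths.pop((user, path)))

    def _release(self, digest):
        self._refcount[digest] -= 1
        if self._refcount[digest] == 0:
            del self._refcount[digest]
            del self._blobs[digest]

    def stored_blobs(self):
        """Number of distinct blobs physically stored."""
        return len(self._blobs)


if __name__ == "__main__":
    store = DedupStore()
    payload = b"x" * 1024
    store.put("alice", "/video.bin", payload)
    store.put("bob", "/copy.bin", payload)
    print(store.stored_blobs())  # two user paths share one stored blob
```

In a real deployment the reference counts would have to be updated atomically and the hash lookup would happen before the upload, so that a duplicate file need not be transferred at all; the sketch leaves both out.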
[Degree-granting institution]: East China Normal University
[Degree level]: Master's
[Year degree awarded]: 2013
[Classification number]: TP333;TP309



Article link: http://www.sikaile.net/kejilunwen/jisuanjikexuelunwen/2319262.html
