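Research on Key Technologies of an MPI-Based Multi-Layer Fault-Tolerant High-Performance Cloud Computing Platform
Published: 2018-05-27 09:42
Topics: MPI + fault tolerance; Reference: Wuhan University of Technology, 2013 master's thesis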
[Abstract]: As global informatization advances and computer application technology keeps iterating, the volume of information that each industry must process keeps growing. In fields such as aerospace, ocean development, and weather forecasting, data sets have already reached the TB or even PB scale, so storing and processing data of this size has become critical. Cloud computing platforms were introduced to address this problem. On the one hand, cloud platforms have two characteristic features: they store big data in a distributed fashion, and they treat task execution failure as a normal event. On the other hand, many cloud platforms are unsuitable for low-latency services and are inefficient on compute-intensive tasks, whereas MPI excels at compute-intensive work, communicates quickly, and has low message-passing latency. Implementing a cloud platform with MPI is therefore worthwhile. This study focuses on how to build and implement an MPI cloud platform that supports big data storage and provides multi-layer fault tolerance.
To address these problems, this thesis proposes and implements an MPI-based cloud platform. To support big data storage, it implements a distributed cluster built from MySQL in which the individual MySQL nodes store different data, and it adds a database middleware layer on top that ties these database nodes together. Users do not need to be aware of this storage architecture; working with it is similar to working with a single MySQL instance. In addition, because MPI itself does not provide a corresponding fault-tolerance mechanism, the author designs a three-layer fault-tolerance mechanism consisting of failed-task rescheduling, task CheckPoint/Restart, and process migration. The mechanism is separated out behind its own interface so that platform developers can customize it to their needs and extend it through secondary development, while users can set the fault-tolerance level according to their actual requirements.
Testing and evaluation show that the database middleware running on the MySQL-based distributed cluster can handle users' SQL requests, supporting data lookup as well as basic insert, delete, and update operations, and that the platform copes well with node service failures and still returns correct results to the user. The feasibility, reliability, robustness, and efficiency of the prototype system all meet the design expectations.
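The three fault-tolerance layers named in the abstract (failed-task rescheduling, CheckPoint/Restart, and process migration) can be illustrated with a minimal Python sketch. This is not the thesis's interface: every name here (`FaultToleranceLevel`, `NodeFailure`, `run_task`, `make_step`) is hypothetical. It also assumes that the levels are cumulative, that checkpoints live in memory rather than on stable storage, and that a "node" is just a label rather than an MPI process.

```python
from enum import IntEnum


class FaultToleranceLevel(IntEnum):
    """Hypothetical user-selectable levels; assumed to be cumulative."""
    RESCHEDULE = 1  # rerun a failed task from the beginning
    CHECKPOINT = 2  # resume a failed task from its last saved step
    MIGRATE = 3     # also move the task to another node after a failure


class NodeFailure(Exception):
    """Stands in for a node or process failure during a task step."""


def run_task(step_fn, total_steps, nodes, level, max_attempts=3):
    """Run steps 0..total_steps-1, recovering as allowed by `level`.

    Returns (final_state, attempts_used, node_that_finished).
    """
    checkpoint = (0, 0)  # (next step to run, state saved before that step)
    node_index = 0
    for attempt in range(1, max_attempts + 1):
        node = nodes[node_index]
        if level >= FaultToleranceLevel.CHECKPOINT:
            start, state = checkpoint  # restart from the saved progress
        else:
            start, state = 0, 0        # plain rescheduling starts over
        try:
            for step in range(start, total_steps):
                state = step_fn(node, step, state)
                if level >= FaultToleranceLevel.CHECKPOINT:
                    checkpoint = (step + 1, state)
            return state, attempt, node
        except NodeFailure:
            if level >= FaultToleranceLevel.MIGRATE:
                # Retry on the next node instead of the one that failed.
                node_index = (node_index + 1) % len(nodes)
    raise RuntimeError(f"task failed after {max_attempts} attempts")


def make_step(fail_at, bad_node, transient):
    """Build a toy step function that adds the step index to the state.

    It fails at step `fail_at` on `bad_node`: once if `transient` is true,
    on every attempt otherwise. `calls` records every (node, step) tried.
    """
    calls = []
    already_failed = [False]

    def step(node, index, state):
        calls.append((node, index))
        if index == fail_at and node == bad_node:
            if not (transient and already_failed[0]):
                already_failed[0] = True
                raise NodeFailure(f"{node} failed at step {index}")
        return state + index

    return step, calls
```

In this toy model a transient failure is survivable at every level, but rescheduling repeats the steps that had already finished, while checkpointing resumes at the failed step. A node that fails on every attempt is only survivable at the `MIGRATE` level, which is one way to read the abstract's point that users choose a level according to their needs.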
[Degree-granting institution]: Wuhan University of Technology
[Degree level]: Master's
[Year degree granted]: 2013
[Classification number]: TP333;TP302.8
Article ID: 1941532
Article link: http://www.sikaile.net/kejilunwen/jisuanjikexuelunwen/1941532.html