Research on Parallel Computing Models and Scheduling Algorithms for Heterogeneous Multi-core Systems
Source: Hunan University, master's thesis, 2012. Document type: degree thesis
Keywords: heterogeneous multi-core systems; parallel programming model; MapReduce; speculative execution; scheduling algorithm
[Abstract]: As parallel programming for heterogeneous multi-core systems grows more difficult, there is an urgent need for a parallel programming model that can process and generate very large (TB-scale) data sets, reducing the difficulty of parallel programming and speeding up development for heterogeneous multi-core systems. MapReduce is a parallel programming model that has emerged in recent years. It handles subtask partitioning, resource scheduling, and reduction in parallel computation, and offers a simple, effective solution for large-scale data processing on heterogeneous parallel systems. However, traditional MapReduce scheduling algorithms can suffer from overly long task response times and sharply reduced system throughput, which limits the efficiency of the system as a whole. Building on an in-depth study of the MapReduce parallel programming model, this thesis proposes an improved MapReduce scheduling algorithm for heterogeneous multi-core systems on the Hadoop platform. The main work is as follows:
(1) For the MapReduce scheduling problem, the thesis studies three main factors that affect scheduling performance: locality, synchronization overhead, and fairness constraints, and analyzes the scheduling methods that address them. It examines two approaches to the synchronization overhead problem in the MapReduce model: asynchronous processing and speculative execution. For fairness constraints, it discusses Hadoop's locality improvement and delay scheduling, as well as Dryad's Quincy scheduler.
(2) Taking into account the characteristics of heterogeneous multi-core environments and the shortcomings of LATE, a typical MapReduce scheduling algorithm, the thesis proposes an improved MapReduce scheduling algorithm for heterogeneous multi-core systems. The algorithm gives the system an automatic learning capability through supervised learning, a machine learning technique. It randomly selects some of the tasks as test tasks to collect processing information about the processing nodes, derives the actual time ratio of each stage of task processing, and adjusts how the program runs accordingly, launching backup tasks so as to shorten task response time.
To verify the algorithm, experiments were carried out on the Hadoop platform. The results show that the proposed algorithm achieves better task response time than both the LATE algorithm and Hadoop's original scheduling algorithm, which helps improve the processing efficiency of the whole system and contributes to heterogeneous multi-core parallel computing.
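The abstract gives no formulas, so the following is only a minimal Python sketch of the general idea it describes: measure the per-stage time ratios from a few sampled tasks, use them instead of fixed stage weights when computing a task's progress, and then apply the LATE-style estimate (remaining progress divided by observed progress rate) to choose which running task should get a backup copy. The stage names, function names, and sample numbers are all hypothetical, and LATE's caps and thresholds on speculation are omitted.

```python
import math

# Hypothetical stage names for a reduce task (assumption, not from the thesis).
STAGES = ("copy", "sort", "reduce")


def learn_stage_weights(samples):
    """Average each stage's share of total run time over the sampled tasks.

    `samples` is a list of dicts mapping stage name -> measured seconds.
    """
    shares = {stage: 0.0 for stage in STAGES}
    for durations in samples:
        total = sum(durations[stage] for stage in STAGES)
        for stage in STAGES:
            shares[stage] += durations[stage] / total
    return {stage: shares[stage] / len(samples) for stage in STAGES}


def progress_score(stage_index, stage_fraction, weights):
    """Progress in [0, 1]: finished stages plus the done part of the current one."""
    finished = sum(weights[stage] for stage in STAGES[:stage_index])
    return finished + weights[STAGES[stage_index]] * stage_fraction


def estimated_time_left(score, elapsed):
    """LATE-style estimate: remaining progress divided by observed progress rate."""
    if score <= 0.0 or elapsed <= 0.0:
        return math.inf  # no progress observed yet, so no finite estimate
    rate = score / elapsed
    return (1.0 - score) / rate


def pick_backup_candidate(running, weights):
    """Return the id of the running task expected to finish last."""
    def time_left(task):
        score = progress_score(task["stage"], task["fraction"], weights)
        return estimated_time_left(score, task["elapsed"])
    return max(running, key=time_left)["id"]


# Made-up measurements from two sampled test tasks (seconds per stage).
samples = [
    {"copy": 6.0, "sort": 1.0, "reduce": 3.0},
    {"copy": 6.0, "sort": 1.0, "reduce": 3.0},
]
weights = learn_stage_weights(samples)

# Made-up running tasks: current stage index, fraction of that stage done, elapsed seconds.
running = [
    {"id": "A", "stage": 0, "fraction": 0.5, "elapsed": 30.0},
    {"id": "B", "stage": 2, "fraction": 0.5, "elapsed": 40.0},
]
print(pick_backup_candidate(running, weights))
```

On a node where the copy stage dominates, measured weights give a task that is halfway through copying a higher progress score than a fixed one-third-per-stage assumption would, which changes its estimated time left and therefore which task is chosen for speculation.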
[Degree-granting institution]: Hunan University
[Degree level]: Master's
[Year degree granted]: 2012
[Classification number]: TP338.6