
Research on the Parallelization of Convolutional Neural Networks

Published: 2018-04-02 06:11

Topic: convolutional neural networks; Focus: parallelization; Source: Zhengzhou University, master's thesis, 2013

[Abstract]: Convolutional neural networks (CNNs) can effectively learn high-order invariant features from raw input and are widely used in license plate detection, face detection, gesture recognition, speech recognition, image restoration, and semantic analysis. At present, CNN algorithms are mainly implemented serially. Serial implementations have two main drawbacks: (1) they cannot exploit the algorithm's inherent parallelism, so training takes a long time; (2) they scale poorly, so they cannot handle data-intensive problems efficiently. MapReduce, the parallel programming framework proposed by Google, offers good scalability and fault tolerance and has become the mainstream technology for parallel processing of large-scale data on cloud computing platforms. This thesis parallelizes CNN training with MapReduce and accelerates the computation with GPUs to improve the algorithm's parallelism and scalability. The results are as follows:

1. A method for training convolutional neural networks in parallel with MapReduce (CNN-MR) is proposed and deployed on a Hadoop cloud computing platform. CNN-MR uses a data-parallel decomposition that partitions the training samples among the compute nodes of the platform. It uses batch updating: after all compute nodes have processed their local training samples, the nodes communicate once to obtain the global gradient change of the trainable parameters over the entire training set, and the network is updated. This is repeated until the network converges to a set threshold or the maximum number of iterations is reached.

2. A method for accelerating CNN-MR with GPUs (CNN-MR-G) is proposed and deployed on a G-Hadoop computing platform. The feature maps, neurons, or weights of each CNN layer are mapped to GPU thread blocks and threads, so that neurons in the same layer can compute their outputs, output errors, or local weight-gradient changes in parallel.

Experiments on the MNIST handwritten digit dataset and a self-built license plate dataset show that CNN-MR achieves good speedup and scalability relative to the serial algorithm, and that CNN-MR-G provides a good speedup over CNN-MR.
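The batch update that the abstract describes for CNN-MR can be pictured with a small, single-process Python sketch. It is not the thesis's Hadoop implementation: a linear model with squared-error loss stands in for the CNN, and the names `local_gradient`, `combine`, and `train`, the learning rate, and the toy data are all assumptions made for illustration. The map step computes each node's gradient over its local shard, the reduce step sums those partial gradients into the global gradient, and the weights are updated once per iteration until the update falls below a threshold or the iteration limit is reached.

```python
from functools import reduce


def local_gradient(weights, shard):
    """Map step: sum squared-error gradients over one node's local samples."""
    grad = [0.0] * len(weights)
    for features, target in shard:
        prediction = sum(w * x for w, x in zip(weights, features))
        error = prediction - target
        for i, x in enumerate(features):
            grad[i] += error * x
    return grad, len(shard)


def combine(a, b):
    """Reduce step: add two partial gradients and their sample counts."""
    return [ga + gb for ga, gb in zip(a[0], b[0])], a[1] + b[1]


def train(shards, weights, learning_rate=0.1, tolerance=1e-6, max_iterations=1000):
    """Batch update: one global gradient and one weight update per iteration."""
    iterations = 0
    while iterations < max_iterations:
        partials = [local_gradient(weights, shard) for shard in shards]  # map
        total, count = reduce(combine, partials)                         # reduce
        step = [learning_rate * g / count for g in total]
        weights = [w - s for w, s in zip(weights, step)]
        iterations += 1
        if max(abs(s) for s in step) < tolerance:  # converged to the threshold
            break
    return weights, iterations


# Toy data (assumed): y = 2x + 1, with a constant feature for the bias term.
samples = [([float(x), 1.0], 2.0 * x + 1.0) for x in range(4)]
shards = [samples[:2], samples[2:]]  # one shard per simulated compute node
weights, iterations = train(shards, [0.0, 0.0])
```

Because the partial gradients are summed before the update, the result matches a serial batch gradient step over the whole training set; only the per-sample work is distributed.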
[Degree-granting institution]: Zhengzhou University
[Degree level]: Master's
[Year degree granted]: 2013
[Classification number]: TP183;TP338.6
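The thread mapping mentioned in the abstract's second contribution can be emulated serially in Python to show the indexing idea. This is only a sketch under assumptions: the abstract does not give the exact layout, so the choice here of one block per feature map and one thread per output neuron in the forward pass, along with the names `conv_neuron` and `conv_layer`, is hypothetical. On a GPU the two loops would run as concurrent blocks and threads; here they run one after another.

```python
def conv_neuron(image, kernel, row, col):
    """Work of one emulated thread: a single output neuron.

    Computes a valid cross-correlation window, the usual CNN "convolution".
    """
    return sum(
        kernel[i][j] * image[row + i][col + j]
        for i in range(len(kernel))
        for j in range(len(kernel[0]))
    )


def conv_layer(image, kernels):
    """Emulate a launch with one block per feature map, one thread per neuron."""
    kernel_h = len(kernels[0])
    kernel_w = len(kernels[0][0])
    out_h = len(image) - kernel_h + 1
    out_w = len(image[0]) - kernel_w + 1
    feature_maps = [[[0.0] * out_w for _ in range(out_h)] for _ in kernels]
    for block_id in range(len(kernels)):        # block index -> feature map
        for thread_id in range(out_h * out_w):  # thread index -> neuron
            row, col = divmod(thread_id, out_w)
            feature_maps[block_id][row][col] = conv_neuron(
                image, kernels[block_id], row, col
            )
    return feature_maps


# Toy input (assumed): a 3x3 image and two 2x2 kernels, giving two 2x2 maps.
image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
kernels = [[[1, 0], [0, 0]], [[0, 0], [0, 1]]]
feature_maps = conv_layer(image, kernels)
```

Each neuron depends only on its own input window and kernel, so no emulated thread reads another thread's output. That independence is what lets neurons in the same layer be computed in parallel.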



Article link: http://www.sikaile.net/kejilunwen/jisuanjikexuelunwen/1699122.html
