Parallel Optimization and Design of a Crowd Counting Algorithm for GPGPU-Based Embedded Platforms
Published: 2018-07-31 06:02
[Abstract]: The rapid growth of China's urban population has sharply increased the likelihood of crowd gatherings in public places. Stampedes, disorder, and other abnormal crowd events caused by such gatherings have led to heavy loss of life and property. Effectively monitoring and managing crowd dynamics in public places such as subways, shopping malls, and squares has therefore become a pressing practical problem. Crowd size is the main indicator of an abnormal crowd event: if the number of people in a monitored area is known before an event occurs, managers can disperse the crowd in time and avoid the event. In recent years, rapid improvements in GPU hardware performance have made general-purpose computing on GPUs a new way to accelerate digital image algorithms. To meet the need for early warning of abnormal crowd events, this thesis proposes a crowd counting algorithm for surveillance video and uses GPGPU techniques to accelerate its bottleneck modules in hardware. First, based on the characteristics of surveillance video from public places such as squares and passageways, the crowd counting algorithm is designed and implemented using foreground extraction, edge detection, and target recognition and tracking. Timing analysis of each module shows that the bottlenecks are ViBe foreground extraction and Canny edge detection. Next, both ViBe foreground extraction and Canny edge detection are parallelized with the cross-platform OpenCL heterogeneous development framework. For ViBe, NDRange index-space optimization and asynchronous execution are used to accelerate model initialization and model update on the GPU.
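As a rough illustration of why ViBe maps well onto an NDRange index space, the sketch below shows the per-pixel classification and update steps in plain Python. It is not the thesis's OpenCL code. The parameter values (20 samples, radius 20, 2 matches, subsampling factor 16) are assumptions taken from commonly cited ViBe defaults, the function names are hypothetical, and the propagation of samples into neighbouring pixels' models is omitted for brevity. The point is that each pixel reads only its own sample set, so each loop iteration could become one work-item.

```python
import random

N_SAMPLES = 20    # samples stored per pixel (assumed default)
RADIUS = 20       # intensity distance that counts as a match (assumed)
MIN_MATCHES = 2   # matches needed to label a pixel background (assumed)
SUBSAMPLE = 16    # background pixels update their model with probability 1/16 (assumed)


def init_model(frame, rng):
    """Build each pixel's sample set from random pixels in its 3x3 neighbourhood."""
    h, w = len(frame), len(frame[0])
    model = []
    for y in range(h):
        row = []
        for x in range(w):
            samples = []
            for _ in range(N_SAMPLES):
                # Clamp the randomly chosen neighbour to the image bounds.
                ny = min(max(y + rng.randint(-1, 1), 0), h - 1)
                nx = min(max(x + rng.randint(-1, 1), 0), w - 1)
                samples.append(frame[ny][nx])
            row.append(samples)
        model.append(row)
    return model


def classify_pixel(value, samples):
    """Return 0 (background) or 255 (foreground) for a single pixel."""
    matches = 0
    for s in samples:
        if abs(value - s) < RADIUS:
            matches += 1
            if matches >= MIN_MATCHES:
                return 0
    return 255


def segment_and_update(frame, model, rng):
    """Classify every pixel, then randomly refresh the models of background pixels."""
    h, w = len(frame), len(frame[0])
    mask = [[0] * w for _ in range(h)]
    for y in range(h):          # each (y, x) iteration is independent of the others,
        for x in range(w):      # so it could be one work-item in a 2D index space
            mask[y][x] = classify_pixel(frame[y][x], model[y][x])
            if mask[y][x] == 0 and rng.randrange(SUBSAMPLE) == 0:
                model[y][x][rng.randrange(N_SAMPLES)] = frame[y][x]
    return mask


rng = random.Random(0)
background = [[100] * 4 for _ in range(4)]
model = init_model(background, rng)
frame = [row[:] for row in background]
frame[1][2] = 200               # one pixel changes sharply
mask = segment_and_update(frame, model, rng)
print(mask[1][2], mask[0][0])   # 255 0
```

Because the changed pixel differs from all of its stored samples by more than the radius, it is marked foreground, while unchanged pixels match their samples and remain background.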
For Canny edge detection, memory access optimization, separable convolution, reduced memory access counts, and a bounded number of iterations are applied, respectively, to Gaussian filtering, gradient magnitude and direction computation, non-maximum suppression, and double-threshold edge linking. Performance tests of ViBe and Canny before and after optimization show that the optimized versions reduce running time and improve efficiency without degrading the processing results. Finally, the parallelized crowd counting algorithm is integrated into a surveillance system and implemented and tested on an embedded platform. Functional comparison and performance testing of the whole system show that the OpenCL parallel design markedly improves the running efficiency of the most time-consuming bottleneck modules. With GPU hardware acceleration, overall system performance improves by 45% to 60% without affecting system operation or monitoring quality.
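The separable-convolution idea mentioned for the Canny stage can be sketched as follows. The abstract does not give the kernels used, so this example assumes a 5-tap binomial kernel as a stand-in for a smoothing filter and clamp-to-edge border handling; the function names are hypothetical, and the code is Python rather than OpenCL. It shows that one horizontal pass followed by one vertical pass gives the same result as the direct 5x5 convolution, while needing 10 multiply-adds per pixel instead of 25.

```python
KERNEL_1D = [1, 4, 6, 4, 1]   # binomial smoothing kernel (assumed); taps sum to 16
RADIUS = len(KERNEL_1D) // 2
NORM = 16 * 16                # normalisation for the equivalent 2D kernel


def clamp(i, n):
    """Clamp index i into [0, n - 1] (replicate-border handling, assumed)."""
    return max(0, min(i, n - 1))


def blur_2d(img):
    """Direct 5x5 convolution: 25 multiply-adds per pixel."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0
            for i, ky in enumerate(KERNEL_1D):
                for j, kx in enumerate(KERNEL_1D):
                    yy = clamp(y + i - RADIUS, h)
                    xx = clamp(x + j - RADIUS, w)
                    acc += ky * kx * img[yy][xx]
            out[y][x] = acc // NORM
    return out


def blur_separable(img):
    """Two 1D passes: 5 + 5 multiply-adds per pixel."""
    h, w = len(img), len(img[0])
    # Horizontal pass: keep unnormalised integer sums to avoid rounding twice.
    tmp = [
        [sum(k * img[y][clamp(x + j - RADIUS, w)] for j, k in enumerate(KERNEL_1D))
         for x in range(w)]
        for y in range(h)
    ]
    # Vertical pass over the intermediate image, then normalise once.
    return [
        [sum(k * tmp[clamp(y + i - RADIUS, h)][x] for i, k in enumerate(KERNEL_1D)) // NORM
         for x in range(w)]
        for y in range(h)
    ]


image = [[(7 * x + 13 * y) % 256 for x in range(8)] for y in range(6)]
print(blur_2d(image) == blur_separable(image))   # True
```

Both functions accumulate exact integer sums and normalise once at the end, so the outputs match exactly. On a GPU the intermediate image costs one extra buffer, which is the usual trade-off for the reduced arithmetic.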
[Degree-granting institution]: University of Electronic Science and Technology of China
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: X924;TP391.41
Article ID: 2154652
Article link: http://www.sikaile.net/kejilunwen/anquangongcheng/2154652.html