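Spatio-Temporal Deep Learning Algorithms for Dynamic Scene Understanding
Published: 2018-06-30 19:09
Topic: dynamic scene understanding + deep learning; Source: University of Electronic Science and Technology of China, master's thesis, 2017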
[Abstract]: Dynamic scene understanding is a sub-problem at the intersection of computer vision and machine learning and has long been an active research topic. This thesis proposes a rule-based algorithm for detecting specific events in dynamic surveillance scenes. To address that algorithm's shortcomings, it then proposes a spatio-temporal deep learning algorithm for dynamic surveillance scenes and applies it to dynamic traffic scenes, where it makes human-like decisions on the steering wheel angle.
For detecting specific events, the thesis proposes a rule-based dynamic scene understanding algorithm. The algorithm analyzes the characteristics of each event and formulates targeted detection rules for it, using classical computer vision algorithms such as optical flow and background modeling together with empirically set constraint checks. Applied to crowd anomaly detection in surveillance scenes, it reaches an F-Measure of 90.9% for crowd gathering anomalies. In the crowd escape anomaly detection task it is robust to illumination changes: in scenes with drastic illumination changes, where other non-learning algorithms can hardly detect anything, its F-Measure still reaches 61.24%.
Rules of this kind are hard to formulate and hard to generalize. To address this, the thesis proposes a deep learning dynamic scene analysis algorithm that does not target one particular class of events but applies to many kinds of events. Multi-stream 3D convolutional networks extract rich high-level features from dynamic scene data; these features are fused and used to classify the content of the data. Combining the network with the empirical constraints also used in the rule-based method then allows effective event detection in dynamic surveillance scenes. During training, pre-training followed by fine-tuning alleviates, to some extent, the shortage of training samples. During fine-tuning and detection, a spatio-temporal blocking strategy improves detection performance. In crowd anomaly detection in surveillance scenes, this general-purpose algorithm performs slightly better than the methods whose rules were designed for specific events.
For decision-making in dynamic traffic scenes, the thesis improves and applies the multi-stream spatio-temporal 3D convolutional network described above. Several performance-improving techniques from convolutional networks are applied to it, yielding a spatio-temporal decision network. By learning, from the driving data of experienced drivers, features that help it understand dynamic traffic scenes, the improved network successfully decides the steering wheel angle while the vehicle is driving. The mean absolute error of the resulting spatio-temporal decision network is 0.762° lower than that of the commonly used two-dimensional convolutional neural network.
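The abstract reports its results with two metrics: F-Measure for event detection and mean absolute error for the steering wheel angle. The sketch below shows how such metrics are commonly computed. It assumes the balanced F-Measure (the harmonic mean of precision and recall); the abstract does not say which variant the thesis uses. The function names and the sample numbers are hypothetical and are not taken from the thesis.

```python
from typing import Sequence


def f_measure(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Balanced F-Measure (assumed variant): harmonic mean of precision and recall."""
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    if predicted == 0 or actual == 0:
        return 0.0  # nothing detected or nothing to detect
    precision = true_positives / predicted
    recall = true_positives / actual
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def mean_absolute_error(predicted_deg: Sequence[float], actual_deg: Sequence[float]) -> float:
    """Mean absolute difference between predicted and recorded angles, in degrees."""
    if len(predicted_deg) != len(actual_deg) or len(predicted_deg) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(p - a) for p, a in zip(predicted_deg, actual_deg))
    return total / len(predicted_deg)


# Hypothetical numbers for illustration only.
example_f = f_measure(true_positives=8, false_positives=2, false_negatives=2)
example_mae = mean_absolute_error([1.0, -2.0, 0.5], [0.0, -1.0, 0.5])
```

Comparing two networks by mean absolute error, as the abstract does, amounts to evaluating `mean_absolute_error` for each network on the same recorded angles and subtracting the results.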
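[Degree-granting institution]: University of Electronic Science and Technology of China
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TP391.41; TP181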
Article ID: 2086546
Article link: http://www.sikaile.net/kejilunwen/zidonghuakongzhilunwen/2086546.html