Deep Learning-Driven Scene Analysis and Semantic Object Parsing
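Keywords: deep learning; convolutional neural network; depth estimation; optical flow estimation; fine-grained pedestrian analysis; total variation model; multi-scale correlation learning. Source: Zhejiang University, 2017 master's thesis. Document type: degree thesis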
[Abstract]: Semantic object parsing and scene analysis are important research directions in computer vision. Their main goal is to analyze and understand the objects and scenes in images and videos, and they are widely applied in video surveillance, autonomous driving, intelligent transportation, and related areas. Semantic object parsing covers the detection, recognition, and analysis of objects such as pedestrians and vehicles. Within it, fine-grained pedestrian analysis underlies many computer vision applications; its aim is to segment a pedestrian image into semantic parts and recognize their attributes. Scene analysis mainly comprises depth estimation, motion analysis, and structure analysis of a scene. Depth estimation means recovering a scene's depth information from images, which helps reconstruct the scene's three-dimensional structure. Motion analysis mainly means obtaining optical flow from consecutive video frames, which is used for action recognition of moving objects and for detecting and classifying abnormal events. Effective algorithms for fine-grained pedestrian analysis, image depth estimation, and optical flow estimation therefore have real practical value, and this thesis focuses on these three tasks.
In recent years deep learning has achieved breakthroughs in computer vision tasks such as object detection, face recognition, and scene labeling, and the design of task-oriented network models has drawn growing attention from academia and industry. This thesis proposes a different deep-learning-based model for each of the three tasks: fine-grained pedestrian analysis, single-image depth estimation, and optical flow estimation. Specifically:
1. For single-image depth estimation, the thesis first reviews existing methods. To address the weakness of current deep-learning-based depth estimation models in modeling spatial context, it then proposes a data-driven context feature learning model and a loss function model based on the total variation model. The former learns from data a set of context weights that depend on pixel position and uses them to fuse neighborhood features into the depth prediction; the latter effectively suppresses noise and smooths the result while preserving edges. Finally, the two models are combined to obtain a more effective method.
2. For optical flow estimation, deep-learning-based methods are more efficient and easier to extend than traditional methods. However, few such methods exist so far, and existing deep models fall short on large-displacement optical flow prediction. The thesis proposes a deep convolutional network architecture based on multi-scale correlation learning that handles large displacements effectively. On several large-displacement optical flow datasets, the proposed framework clearly improves on the baseline algorithm. In addition, because the predictions contain considerable noise and large errors, the thesis proposes combining a recurrent neural network with a convolutional neural network to refine the predictions further and obtain finer results.
3. For fine-grained pedestrian analysis, targeting a competition on fine-grained pedestrian recognition in surveillance video, the thesis proposes two frameworks based on Faster R-CNN. One jointly learns part detection and part attribute classification within a single network; the other first detects part locations with the Faster R-CNN framework and then trains a separate network to classify part attributes. Experiments show that the staged detect-then-classify approach reduces interference between classes and thereby reduces misclassification.
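The abstract does not give the form of the total-variation loss used for depth estimation, so the following is only a minimal sketch of the general idea, not the thesis's actual loss. It assumes an anisotropic TV penalty (sum of absolute differences between adjacent pixels) added to a mean-squared-error data term; the function names, the `weight` value, and the toy depth maps are all hypothetical.

```python
from typing import List

Grid = List[List[float]]


def total_variation(depth: Grid) -> float:
    """Anisotropic total variation: sum of absolute differences between
    horizontally and vertically adjacent pixels."""
    rows, cols = len(depth), len(depth[0])
    tv = 0.0
    for i in range(rows):
        for j in range(cols):
            if j + 1 < cols:  # right neighbour
                tv += abs(depth[i][j + 1] - depth[i][j])
            if i + 1 < rows:  # lower neighbour
                tv += abs(depth[i + 1][j] - depth[i][j])
    return tv


def tv_regularized_loss(pred: Grid, target: Grid, weight: float = 0.1) -> float:
    """Mean squared error plus a weighted TV penalty on the prediction.
    Both the data term and the weight are illustrative assumptions."""
    n = len(pred) * len(pred[0])
    mse = sum(
        (p - t) ** 2
        for pred_row, target_row in zip(pred, target)
        for p, t in zip(pred_row, target_row)
    ) / n
    return mse + weight * total_variation(pred)


# Toy depth maps: a flat region with one sharp edge, and the same map
# with a single noisy pixel added.
clean = [[1.0, 1.0, 5.0], [1.0, 1.0, 5.0]]
noisy = [[1.0, 2.0, 5.0], [1.0, 1.0, 5.0]]

print(total_variation(clean), total_variation(noisy))
```

In this toy case the noisy map has the larger penalty, while the sharp edge in the clean map costs only its height. That is the usual intuition for why an absolute-difference penalty can damp noise without heavily blurring edges.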
[Degree-granting institution]: Zhejiang University
[Degree level]: Master's
[Year degree awarded]: 2017
[Classification number]: TP391.41; TP18
[Citation]: Zhao Shanshan (趙杉杉). Deep Learning-Driven Scene Analysis and Semantic Object Parsing [D]. Zhejiang University, 2017.
Article ID: 1494140
Article link: http://www.sikaile.net/kejilunwen/zidonghuakongzhilunwen/1494140.html