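Research on Analysis and Processing Techniques for Stereoscopic Visual Media
Published: 2018-03-31 13:04
Topic: binocular stereoscopic media  Focus: depth computation  Source: Nanjing University, 2017 doctoral dissertation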
[Abstract]: VR, AR, and IMAX 3D have become familiar buzzwords in recent years, mainly because stereoscopic capture devices are now widespread and the volume of stereoscopic media has surged, giving more people the chance to learn about, use, and study it. Stereoscopic media can be represented in many ways; this dissertation focuses on binocular stereoscopic media, which records information the way human eyes do, and studies its content analysis and processing. Compared with traditional multimedia processing, the key to stereoscopic media processing lies in mining and exploiting the differences and connections between the two views. The relationship between the parallel views is one of both opposition and unity: it gives content processing more cues but also more interference, so the quality and efficiency of stereoscopic media processing can only be improved by new methods that take these new properties into account. For several key foundational problems in stereoscopic media content analysis, the dissertation surveys the state of research in China and abroad, analyzes the main open problems, and proposes corresponding solutions. It also explores the related processing techniques in depth. The main innovations and contributions are as follows.

1. A fast depth estimation method for stereoscopic video that exploits inter-frame redundancy and adaptive motion interpolation to significantly improve computational efficiency while keeping the depth sequence temporally continuous. Most existing depth computation methods for stereoscopic media are built on stereo matching of binocular image pairs. Such methods usually need a suitable disparity range to work best, so applying them directly to stereoscopic video tends to produce discontinuous depth sequences. Existing depth computation methods designed for stereoscopic video introduce many global optimization steps to keep depth temporally continuous, which makes computational efficiency hard to guarantee. By analyzing the characteristics of stereoscopic video, the dissertation combines fine-grained depth computation with coarse-grained depth estimation through motion vectors and proposes a fast depth estimation method based on motion interpolation. The method is comparable in accuracy to global optimization methods while saving more than half of the computation time.

2. A multi-object objectness proposal method. A context-aware multi-object objectness proposal model addresses the inconsistent proposals and redundant computation caused by frame-by-frame objectness proposal. Existing objectness proposal research concentrates on images; work on video mostly starts from applying image methods frame by frame and mainly targets moving or salient objects. Experiments show that frame-by-frame objectness proposal not only involves redundant computation but, more importantly, tends to produce object proposals that are inconsistent over time. To solve these problems, the dissertation proposes a context-aware multi-object objectness proposal method that uses an adaptive mapping strategy to combine spatial and temporal objectness proposals, offering a general and effective way to apply strong existing objectness proposal methods to video. In addition, because no dataset for multi-object objectness proposal in video was available, the dissertation builds a multi-object video dataset with an average of 3.34 objects, to promote related research in the field.

3. A multiple salient object detection method based on view fusion, which exploits the inconsistency of object detections across views to further improve the accuracy of salient object detection. Salient object detection currently rests mainly on the assumption that a scene contains only one salient object. Multiple salient object detection has not yet been studied at scale, and the existing work on it has mostly been carried out on monocular images. Experiments show that when monocular multiple salient object detection methods are applied to binocular images, the object proposals tend to be inconsistent between views. To address this, the dissertation proposes a multiple salient object detection method based on view fusion. It examines the relationship between salient object boxes in the parallel views and refines the box scores with a dual probability estimate of saliency and objectness, improving the accuracy and precision of the final multiple salient object detection.

4. A dynamic stereoscopic presentation method for flat displays, intended for the large body of existing stereoscopic images, which offers a new approach to glasses-free 3D for such images. Without auxiliary hardware, stereoscopic images found on the Internet and elsewhere cannot convey depth on an ordinary monitor, and this is a bottleneck for the wider adoption of stereoscopic images. Some current dynamic 3D presentation methods for flat displays rely on motion parallax but lack a complete analysis and model of how the human eye perceives depth, so their results tend to flicker and cause viewing discomfort. By analyzing the human visual system, motion parallax, persistence of vision, and related phenomena, the dissertation proposes a dynamic presentation method for stereoscopic images on flat display devices that conveys the 3D impression of a stereoscopic image to the user and opens up further possibilities for stereoscopic images.

5. A refocusing method for stereoscopic video that builds a computational photography model to produce a refocusing effect similar to DSLR shooting. Existing stereoscopic video mainly serves cinemas and VR/AR devices and is rarely found in the everyday life of ordinary users. In fact, the depth information implicit in stereoscopic video makes richer content processing possible. When video refocusing is implemented purely in software, the output can hardly avoid looking artificially processed. Based on the photographic concepts of focal plane, depth of field, and circle of confusion, the dissertation builds a computational photography model for stereoscopic video refocusing and achieves DSLR-like video refocusing for ordinary users.

Building on these key techniques and content processing methods, the dissertation also gives an outlook on several future research directions. This shows the systematic and extensible nature of the work and its supporting role for related research areas, and indicates that the results have good application prospects in stereoscopic media research.
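The refocusing contribution relies on depth recovered from the stereo pair and on the circle of confusion. The abstract does not give the dissertation's model, so the following is only a minimal sketch under textbook assumptions: a rectified parallel stereo rig, where depth is Z = f·B/d, and an ideal thin lens for the blur diameter. The function names and parameters (`depth_from_disparity`, `blur_diameter_mm`, `focal_px`, `baseline_m`, and so on) are hypothetical and chosen for illustration.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth in metres for a rectified parallel stereo pair: Z = f * B / d.

    Assumption: focal length is given in pixels and the baseline in metres.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


def blur_diameter_mm(depth_m, focus_m, focal_mm, f_number):
    """Thin-lens circle-of-confusion diameter on the sensor, in millimetres.

    c = A * f * |Z - Zf| / (Z * (Zf - f)), with aperture diameter A = f / N.
    Assumption: an ideal thin lens focused at distance focus_m.
    """
    f = focal_mm / 1000.0  # focal length in metres
    if depth_m <= 0 or focus_m <= f:
        raise ValueError("depth must be positive and focus distance must exceed f")
    aperture = f / f_number  # aperture diameter in metres
    c = aperture * f * abs(depth_m - focus_m) / (depth_m * (focus_m - f))
    return c * 1000.0


if __name__ == "__main__":
    # Hypothetical rig: 1000 px focal length, 10 cm baseline.
    z = depth_from_disparity(disparity_px=25, focal_px=1000, baseline_m=0.1)
    # Hypothetical virtual lens: 50 mm at f/2, focused at 2 m.
    print(z, blur_diameter_mm(z, focus_m=2.0, focal_mm=50.0, f_number=2.0))
```

In a refocusing pipeline of this kind, the per-pixel blur diameter would typically drive a spatially varying blur kernel, so pixels on the chosen focal plane stay sharp and blur grows with distance from it.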
Article ID: 1690856
Article link: http://www.sikaile.net/shoufeilunwen/xxkjbs/1690856.html