An Image Captioning Model Fusing Attention and Dynamic Semantic Guidance

Published: 2018-08-18 18:06

[Abstract]: Current image captioning models describe the details of objects in an image inadequately. To address this, an image captioning model is proposed that combines dynamic semantic guidance from the image with an adaptive attention mechanism. The model predicts the next word from the information available at the previous time step, and uses the adaptive attention mechanism to select the image region to process at the next time step. In addition, the model constructs dense attribute information for the image as additional supervision, so that it can jointly use image semantic information and attention information when describing image content. The model was trained and tested on the Flickr8K and Flickr30K datasets and evaluated with several metrics. The experimental results show a substantial improvement. Compared with the Guiding-Long Short-Term Memory model, the scores increase by 4.1, 1.8, 2.4, 0.8, and 3.1, improvements of 6.3%, 4.0%, 7.9%, 3.9%, and 17.3%. Compared with Soft-Attention, the scores increase by 1.9, 2.4, 3.3, 1.5, and 2.74, improvements of 2.8%, 5.5%, 11.1%, 7.5%, and 14.8%.
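The abstract reports each result as a pair: an absolute score increase and a percentage improvement. As a rough illustration only, the sketch below assumes each percentage is the absolute increase divided by the baseline model's score, and back-computes the baseline and proposed-model scores this would imply. The abstract does not say which metric each pair belongs to, so the pairs are left unlabeled. The function names are hypothetical, and because the published figures are rounded, the results are approximate.

```python
# (absolute gain, improvement in percent) pairs quoted in the abstract.
# The metric behind each pair is not named there, so positions are unlabeled.
GAINS = {
    "Guiding-LSTM": [(4.1, 6.3), (1.8, 4.0), (2.4, 7.9), (0.8, 3.9), (3.1, 17.3)],
    "Soft-Attention": [(1.9, 2.8), (2.4, 5.5), (3.3, 11.1), (1.5, 7.5), (2.74, 14.8)],
}


def implied_baseline(abs_gain, rel_gain_pct):
    """Baseline score implied by the ASSUMED relation: percent = 100 * gain / baseline."""
    return abs_gain / (rel_gain_pct / 100.0)


def implied_scores(pairs):
    """Return approximate (baseline, proposed) scores for each reported pair."""
    scores = []
    for abs_gain, rel_gain_pct in pairs:
        baseline = implied_baseline(abs_gain, rel_gain_pct)
        scores.append((round(baseline, 1), round(baseline + abs_gain, 1)))
    return scores


if __name__ == "__main__":
    for model_name, pairs in GAINS.items():
        print(model_name, implied_scores(pairs))
```

Under this assumption, a small absolute increase paired with a large percentage (the last pair in each list) points to a metric whose baseline value is low, which is why the percentages do not rank in the same order as the absolute increases.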
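[Affiliation]: Engineering Research Center of Internet of Things Technology Applications, Ministry of Education, Jiangnan University
[Funding]: Fundamental Research Funds for the Central Universities, No. JUSRP51510
[Classification number]: TP183; TP391.41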

Article ID: 2190264


Article link: http://www.sikaile.net/kejilunwen/zidonghuakongzhilunwen/2190264.html
