Research on Text Localization and Character Recognition Methods for Scene Images
[Abstract]: Text in scene images carries rich and precise information, and extracting it is in demand in industrial automation, traffic management, automatic translation, assistive services for people with disabilities, and other fields. However, non-uniform lighting, background texture, and the diversity of text appearance keep the accuracy of scene text extraction low. Accurately extracting text from such images has therefore become a research focus in pattern recognition, and this work has practical value for improving the accuracy and robustness of scene text recognition systems. The main work and contributions are as follows. First, a scene text localization method is proposed that builds on three properties of text regions (consistent gray values within characters, a convex distribution of the gradient magnitude in the x direction, and the proximity of neighboring characters) and uses the output scores of a support vector machine (SVM) operating on convolutional neural network (CNN) features. Typical points in the text region are detected from the convexity of the x-direction gradient magnitude and the consistency of the character gray values; candidate connected components are extracted by clustering on typical-point position and gray value; and further candidate connected components are extracted from the remaining regions by k-means clustering. A CNN-based SVM classifier for text connected components is then applied: the CNN extracts texture features from each connected component, the SVM output score suppresses non-text components, and neighboring components are merged into candidate text regions.
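As a rough illustration of the clustering step mentioned above, the following sketch groups pixels by gray value with plain k-means (Lloyd's algorithm). It is not the thesis implementation: the function names `kmeans` and `gray_layers`, the deterministic seeding, and the choice of k = 2 in the usage lines are assumptions made for the example, and the abstract does not state which features or which k the author used.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means on a list of tuples; returns (labels, centers)."""
    # Assumed seeding: spread the k initial centers evenly over the sorted
    # distinct points so that the result is deterministic.
    seeds = sorted(set(points))
    centers = [seeds[i * (len(seeds) - 1) // max(k - 1, 1)] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        labels = [min(range(k), key=lambda j: sq_dist(p, centers[j]))
                  for p in points]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(
                    tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centers.append(centers[j])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


def gray_layers(gray, k):
    """Cluster a 2-D list of gray values into k layers; returns (label image, centers)."""
    points = [(float(v),) for row in gray for v in row]
    labels, centers = kmeans(points, k)
    width = len(gray[0])
    label_image = [labels[r * width:(r + 1) * width] for r in range(len(gray))]
    return label_image, centers


# Hypothetical 2x3 patch: three dark pixels and three bright pixels.
patch = [[10, 12, 200],
         [11, 205, 198]]
layers, centers = gray_layers(patch, 2)
```

Connected components would then be extracted per layer; that step, and the combination with typical-point positions, is outside this sketch.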
Finally, an SVM verifies each candidate region using its histogram of oriented gradients (HOG) features. On the ICDAR2011 and ICDAR2013 scene text datasets, the localization method obtains F values of 76% and 78%, respectively, which shows that it can effectively suppress interference from complex background texture. Second, based on the similarity of character colors within a text line, a character segmentation method for text regions is proposed that combines color clustering with gradient vector flow. K-means clustering of pixel colors and their spatial distribution first yields k candidate layers, and geometric features of the connected components, such as fill ratio and aspect ratio, are then used to select the layer that contains the candidate characters. Within homogeneous regions, points far from edges are taken as candidate segmentation pixels, and, with the squared gray-level difference as the cost, the cutting path with the lowest cumulative cost is found. On the ICDAR2013 scene text dataset, this method obtains an F value of 87.9%. The experimental results show that color clustering effectively suppresses interference from non-uniform lighting and occlusion. Third, based on the rotation invariance of character structure, a multi-orientation single-character recognition model is proposed. A deformed HOG operator and concentric circular template sampling are used to extract local joint HOG texture features and the quadrant structure features between sampling points, and the two are combined into the character feature. A bag-of-words character model is then built by learning a feature dictionary, and characters are recognized with an SVM.
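The lowest-cumulative-cost cutting path used in the segmentation method can be sketched with dynamic programming. This is only a minimal illustration under stated assumptions: the path runs from the top row to the bottom row, each step moves down one row and at most one column sideways, and the step cost is the squared gray difference between consecutive path pixels. The function name `min_cost_cut_path` and the `start_cols` parameter are hypothetical, and the gradient-vector-flow step that selects candidate segmentation pixels is not reproduced; the candidate columns are simply passed in.

```python
def min_cost_cut_path(gray, start_cols=None):
    """Return (total_cost, path) for the cheapest top-to-bottom cut.

    gray is a 2-D list of gray values; path[r] is the column of the cut at row r.
    start_cols lists the allowed columns in the top row (all columns if None).
    """
    rows, cols = len(gray), len(gray[0])
    if start_cols is None:
        start_cols = range(cols)
    inf = float("inf")
    cost = [[inf] * cols for _ in range(rows)]
    back = [[-1] * cols for _ in range(rows)]
    for c in start_cols:
        cost[0][c] = 0.0
    for r in range(1, rows):
        for c in range(cols):
            # A pixel can be reached from the three nearest pixels in the row above.
            for pc in (c - 1, c, c + 1):
                if 0 <= pc < cols and cost[r - 1][pc] < inf:
                    step = (gray[r][c] - gray[r - 1][pc]) ** 2
                    candidate = cost[r - 1][pc] + step
                    if candidate < cost[r][c]:
                        cost[r][c] = candidate
                        back[r][c] = pc
    # Pick the cheapest end point and walk the back pointers up to the top row.
    end = min(range(cols), key=lambda c: cost[rows - 1][c])
    path = [end]
    for r in range(rows - 1, 0, -1):
        path.append(back[r][path[-1]])
    path.reverse()
    return cost[rows - 1][end], path


# Hypothetical patch: a bright gap (255) that shifts one column to the right.
patch = [[0, 255, 0, 0],
         [0, 255, 0, 0],
         [0, 0, 255, 0],
         [0, 0, 255, 0]]
total, path = min_cost_cut_path(patch, start_cols=[1])
# total == 0.0 and path == [1, 1, 2, 2]: the cut follows the gap.
```

Restricting `start_cols` matters here: with this cost, any uniform column is free to follow, so without candidate points the path could just as well run through a character stroke.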
Character recognition experiments were carried out on the ICDAR character dataset, the Chars74K dataset, and a manually collected dataset. The proposed method achieves accuracies of 82%, 87%, and 73%, respectively, which shows that the model is robust to rotation.
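The F values quoted for localization and segmentation are presumably the usual harmonic mean of precision and recall, although the abstract does not spell out the evaluation protocol. Under that assumption, the computation is the following; the function name `f_measure` and the sample precision and recall values are illustrative and are not figures from the thesis.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical values, not taken from the thesis.
score = f_measure(0.80, 0.75)
```

Because the harmonic mean is dominated by the smaller of the two inputs, a method cannot reach a high F value by trading away recall for precision or the reverse.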
【Degree-granting institution】: Huazhong University of Science and Technology
【Degree level】: Master's
【Year degree awarded】: 2016
【Classification number】: TP391.41