Research on Feature Extraction and Classifiers for Xixia Character Recognition

Published: 2018-11-19 10:39

[Abstract]: Character recognition is a long-standing topic in machine recognition and has produced many research results; the recognition of Chinese characters and ancient scripts is an important research topic in Chinese information processing. Machine recognition has already been commercialized and is widely used in face recognition, fingerprint recognition, license plate recognition, office automation, and financial and commercial transactions. Character recognition remains difficult, but because Chinese characters are so important in practice and so significant for theoretical research, many researchers continue to work on the problem. Recognition of Xixia (Tangut) characters is still a largely unexplored field, and research shows that recognizing Xixia characters on the basis of Chinese character forms faces several difficulties. First, the ancient Xixia script has more than 6,000 characters, so it is a large character set. Second, compared with Chinese characters, Xixia characters have a more complex structure and denser strokes, and the vast majority have more than 14 strokes, so the characters in the set are highly similar to one another. Third, handwritten Xixia characters mostly differ in size and dot matrix, which makes recognition harder and more complex. Machine recognition is the most important task in digitizing ancient scripts, and feature extraction is the foundation of character recognition research, so this thesis focuses on the algorithms and process of feature extraction for Xixia characters.
The thesis first introduces the significance of Xixia character recognition and the state of research in China and abroad. It then preprocesses the Xixia character images, including normalization, binarization, smoothing, thinning, and skew correction. Next, it extracts features from the images using a cascade of the Haar-like algorithm and the Gabor wavelet algorithm. Finally, it uses the AdaBoost algorithm to classify the extracted features and compares the classification results obtained with Haar-like features alone against those obtained with the cascaded Haar-like and Gabor wavelet features, achieving good classification and recognition results.
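The abstract names Haar-like features as one of the two feature extractors but gives no implementation details. As a rough illustration only, the sketch below shows how a basic two-rectangle Haar-like feature is commonly computed from an integral image (summed-area table). The function names, the 4x4 toy patch, and the choice of a left-minus-right feature are assumptions made for this example and are not taken from the thesis.

```python
def integral_image(img):
    """Summed-area table with an extra leading row and column of zeros."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, top, left, height, width):
    """Sum of the pixels inside a rectangle, using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def haar_left_right(ii, top, left, height, width):
    """Two-rectangle feature: left half minus right half (width must be even)."""
    half = width // 2
    return (rect_sum(ii, top, left, height, half)
            - rect_sum(ii, top, left + half, height, half))


# Hypothetical 4x4 binarized glyph patch: ink (1) on the left, background (0) on the right.
patch = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 0, 0],
]
ii = integral_image(patch)
feature = haar_left_right(ii, 0, 0, 4, 4)
print(feature)  # 8: eight ink pixels on the left, none on the right
```

Because every rectangle sum costs four lookups regardless of its size, many such features at different positions and scales can be evaluated cheaply on each normalized character image.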
[Degree-granting institution]: Ningxia University
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TP391.43
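
The abstract states that AdaBoost is used to classify the extracted features, without saying which variant or how the multi-class problem of more than 6,000 characters is handled. The following is a minimal sketch of standard discrete (binary) AdaBoost with decision stumps, written under those stated limitations; the helper names, the toy one-dimensional data, and the number of rounds are all assumptions for illustration and do not reproduce the thesis's classifier.

```python
import math


def train_stump(X, y, w):
    """Pick the (feature, threshold, polarity) stump with the lowest weighted error."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for polarity in (1, -1):
                preds = [polarity if row[j] <= thr else -polarity for row in X]
                err = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, polarity)
    return best


def stump_predict(stump, row):
    """Predict +1 or -1 for one sample with a trained stump."""
    _, j, thr, polarity = stump
    return polarity if row[j] <= thr else -polarity


def adaboost_train(X, y, rounds=3):
    """Discrete AdaBoost for labels in {-1, +1}."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        err = min(max(stump[0], 1e-10), 1 - 1e-10)  # keep the log finite
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Raise the weights of misclassified samples, lower the rest, renormalize.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for wi, yi, row in zip(w, y, X)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def adaboost_predict(ensemble, row):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1


# Hypothetical 1-D feature values: class +1 sits in the middle,
# so no single threshold separates the two classes.
X = [[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]]
y = [-1, -1, 1, 1, -1, -1]
model = adaboost_train(X, y, rounds=3)
predictions = [adaboost_predict(model, row) for row in X]
print(predictions == y)
```

In a character recognition setting, each row of `X` would hold the feature values extracted from one image (for example Haar-like responses), and a multi-class scheme such as one-vs-rest would have to be layered on top of this binary sketch.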

