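An Image Text Recognition and Speech Playback System Based on the Android Platform
Topic: Android platform. Focus: text recognition. Source: Nanjing University of Posts and Telecommunications, master's thesis, 2017. Document type: degree thesis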
[Abstract]: According to statistics, somewhat more than 1.5% of the world's population cannot study and live as sighted people do because of visual impairment. Image text recognition combined with speech playback can give these users a degree of reading assistance. Similar products already exist for Android devices, such as Yunmai Document Recognition (云脉文档识别) and OCR (Optical Character Recognition) Text Recognition, but they place high demands on how the image is captured: the text must be sharp, the image must not be skewed, and the image should contain nothing but text. If these conditions are not met, recognition fails or its accuracy drops, and such conditions are unrealistic for people with impaired vision. This thesis therefore develops Android-based text image recognition software and adds a speech playback function so that users can obtain the text by listening. The main work is as follows. First, algorithms for skew correction and text-region cropping of text images are proposed; grayscale conversion, binarization, skew correction, and text-region cropping together reduce the redundant information in the image to be recognized and form the preprocessing stage. Second, the text recognition function is built on the Tesseract recognition engine as optimized by Google, and recognition accuracy is improved by training and extending the character library. Finally, the speech playback function is built on the Shoushuo TTS (手说TTS, Text To Speech) engine; it not only reads out the recognized text but can do so with voices of different genders and at different volumes and speaking rates. Testing verifies the effectiveness of the system, and a comparison with Yunmai Document Recognition, one of the most widely used recognition applications on the market, shows that the system performs better on text images that are skewed or that contain non-text regions.
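The preprocessing stages named in the abstract can be illustrated with a minimal sketch. It is not the thesis's algorithm: it assumes dark text on a light background, uses Otsu's method as a stand-in binarization, crops to the bounding box of the text pixels, and leaves out skew correction entirely. All function names (`to_gray`, `otsu_threshold`, `binarize`, `crop_text_region`, `preprocess`) are hypothetical.

```python
from typing import List, Optional, Tuple

Pixel = Tuple[int, int, int]
Box = Tuple[int, int, int, int]  # (top, left, bottom, right), inclusive


def to_gray(rgb: List[List[Pixel]]) -> List[List[int]]:
    """Convert rows of (r, g, b) pixels to 0-255 luminance (BT.601 weights)."""
    return [[(299 * r + 587 * g + 114 * b) // 1000 for (r, g, b) in row]
            for row in rgb]


def otsu_threshold(gray: List[List[int]]) -> int:
    """Return the threshold that maximizes between-class variance (Otsu)."""
    hist = [0] * 256
    for row in gray:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels with value <= t
    sum0 = 0    # sum of the values of those pixels
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def binarize(gray: List[List[int]], t: int) -> List[List[int]]:
    """Mark pixels at or below the threshold as text (1), the rest as 0."""
    return [[1 if v <= t else 0 for v in row] for row in gray]


def crop_text_region(binary: List[List[int]]) -> Tuple[List[List[int]], Optional[Box]]:
    """Crop to the bounding box of all text pixels; (empty, None) if none."""
    rows = [y for y, row in enumerate(binary) if any(row)]
    if not rows:
        return [], None
    cols = [x for x in range(len(binary[0])) if any(row[x] for row in binary)]
    top, bottom, left, right = rows[0], rows[-1], cols[0], cols[-1]
    cropped = [row[left:right + 1] for row in binary[top:bottom + 1]]
    return cropped, (top, left, bottom, right)


def preprocess(rgb: List[List[Pixel]]) -> Tuple[List[List[int]], Optional[Box]]:
    """Grayscale, binarize, and crop; skew correction is omitted here."""
    gray = to_gray(rgb)
    return crop_text_region(binarize(gray, otsu_threshold(gray)))
```

In a full pipeline the cropped binary image would be deskewed and then handed to the recognition engine; a single bounding box is the simplest possible cropping rule and would not separate text from non-text content that also falls below the threshold.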
[Degree-granting institution]: Nanjing University of Posts and Telecommunications
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TP391.41; TN912
Article ID: 1622235
Article link: http://www.sikaile.net/kejilunwen/xinxigongchenglunwen/1622235.html