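Research on a Laboratory Test Report Recognition Method Based on OCR Technology

Published: 2018-04-04 17:51

Topic: OCR; Focus: laboratory test reports; Source: Zhejiang University, master's thesis, 2017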

[Abstract]: With the development of internet-based healthcare, the volume of health data generated by medical care is growing rapidly, and much of it takes the form of laboratory test reports issued after a clinical visit. In China, the imbalance between the numbers of doctors and patients creates a barrier to interpreting these reports, so interpreting them efficiently and accurately and managing personal health data is a current challenge for the healthcare industry. To address these problems, the thesis proposes a basic workflow for recognizing laboratory test reports with OCR technology, consisting of preprocessing, pattern recognition, content recognition, and correction of recognition results. The specific work is as follows. First, the thesis preprocesses the report images, mainly through binarization and deskewing. Three binarization methods are studied: global thresholding, adaptive thresholding, and the OTSU method. Comparative experiments are used to analyze the preprocessing effect of each, and the OTSU method is selected as the basic binarization method for report images. Next, using pattern recognition techniques, line detection based on the Hough transform extracts features from the report images, and the images are classified and processed according to these line features. The open-source Tesseract engine is then used, with data training and parameter tuning, to recognize the report content. Finally, the recognition results are corrected with a scheme that combines edit distance with a medical lexicon for laboratory reports, and the final correction is chosen by comparing the image similarity of Chinese characters. Comparative experiments on recognition results before and after correction verify the effectiveness of the scheme.
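The correction step described in the abstract, matching OCR output against a medical lexicon by edit distance, can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names, the `max_distance` cutoff, and the tiny English lexicon are assumptions made for illustration. The thesis also selects the final candidate by comparing Chinese character image similarity, which is not reproduced here; ties are simply broken alphabetically.

```python
from typing import Iterable, List


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: fewest insertions, deletions, and substitutions."""
    # At the start of row i, previous[j] is the distance between a[:i-1] and b[:j].
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance between a[:i] and the empty string
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or keep on a match
            ))
        previous = current
    return previous[-1]


def candidates(word: str, lexicon: Iterable[str], max_distance: int = 1) -> List[str]:
    """Return lexicon entries within max_distance of word, closest first.

    Ties are broken alphabetically (an assumption of this sketch).
    """
    scored = sorted((edit_distance(word, term), term) for term in lexicon)
    return [term for dist, term in scored if dist <= max_distance]


# Hypothetical lexicon; a real one would hold laboratory test item names.
LEXICON = ["hemoglobin", "glucose", "creatinine", "platelet"]

# The OCR engine misread the letter 'l' as the digit '1'.
print(candidates("hemog1obin", LEXICON))  # ['hemoglobin']
```

A small `max_distance` keeps the candidate list short, which matters when a second, more expensive check (such as the image similarity comparison mentioned in the abstract) is run on each candidate.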
[Degree-granting institution]: Zhejiang University
[Degree level]: Master's
[Year degree conferred]: 2017
[Classification number]: TP391.41

Thesis record: Wang Chenmin (王宸敏); Research on a Laboratory Test Report Recognition Method Based on OCR Technology [D]; Zhejiang University; 2017


Article ID: 1711032


Article link: http://www.sikaile.net/kejilunwen/ruanjiangongchenglunwen/1711032.html
