
Research on Key Issues in Speech Recognition Technology

Published: 2018-05-03 02:30

Topic: speech recognition + signal acquisition; Source: Shaanxi Normal University, master's thesis, 2014

[Abstract]: As globalization advances, economic and trade exchanges between countries and regions keep growing, and individuals' activities increasingly extend from the local to the global; language differences, however, have become a major obstacle to this development. Advances in computer and information technology have rapidly turned the computer into an intermediary tool for human communication, and how to use new technology to make communication simpler and more accessible has become a widely shared concern. Speech recognition is an important branch of pattern recognition. It takes the speech signal as its object of study, with the goal of enabling human-computer interaction, and is an interdisciplinary field drawing on computer technology, signal processing, pattern recognition, linguistics, and other areas. In recent decades speech recognition has become an important bridge for fluent communication between people and machines and between people. Although the technology is now widely used across industries and recognition quality and efficiency have improved greatly, 100% recognition remains unattainable because of constraints from human factors, environmental factors, and the recognition algorithms themselves. Starting from the internal and external factors that affect speech recognition, this thesis studies its key technologies and problems and discusses how to improve the recognition rate. The first part examines human factors, analyzing how sample collection affects recognition accuracy: the objects of speech recognition are signals produced by different individuals.
Therefore, the diversity and individuality of speakers mean that the same sentence yields different input signals. The thesis analyzes the influence of human factors on speech recognition from several angles: a speaker's regional characteristics, gender and physiological traits, and speaking style and emotional expression. The second part examines in depth how the external environment affects speech signal acquisition. After the speaker produces the speech signal, it is captured by the recognition device, and uncertain external factors arise in this process. Factors such as equipment noise during acquisition and incidental noise in the recording environment strongly affect the captured signal and directly lead to incorrect training and recognition results. The third part discusses the currently popular algorithms and techniques from the perspective of recognition algorithms and recognition models. Speech recognition involves many algorithms. In the early signal-processing stage, the key methods are pre-emphasis of the speech signal, windowing, short-time average energy, the short-time average magnitude function, the short-time zero-crossing rate, short-time autocorrelation analysis, and an endpoint detection algorithm based on short-time energy and zero-difference. Feature extraction is a major determinant of recognition accuracy, and the quality of a feature set depends on how completely it represents the information in the signal.
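To make the front-end steps named above more concrete, here is a minimal Python sketch of pre-emphasis, Hamming windowing, short-time energy, short-time zero-crossing rate, and a simple endpoint detector. It is an illustration only, not the thesis's implementation: the function names, the pre-emphasis coefficient 0.97, the frame length and hop, and the energy threshold ratio are all assumptions, and the detector uses energy alone rather than the thesis's combined criterion.

```python
import math

def pre_emphasis(signal, alpha=0.97):
    """Apply y[n] = x[n] - alpha * x[n-1] to boost high frequencies (alpha is an assumed value)."""
    if not signal:
        return []
    return [signal[0]] + [signal[n] - alpha * signal[n - 1] for n in range(1, len(signal))]

def hamming(length):
    """Return a Hamming window of the given length."""
    if length == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (length - 1)) for n in range(length)]

def frames(signal, frame_len, hop):
    """Split the signal into overlapping frames; an incomplete tail is dropped."""
    return [signal[i:i + frame_len] for i in range(0, len(signal) - frame_len + 1, hop)]

def short_time_energy(frame, window):
    """Sum of squared windowed samples in one frame."""
    return sum((s * w) ** 2 for s, w in zip(frame, window))

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def detect_endpoints(signal, frame_len=200, hop=100, energy_ratio=0.1):
    """Simplified, energy-only endpoint detection (hypothetical helper).

    Returns (first, last) indices of frames whose energy exceeds
    energy_ratio times the peak frame energy, or None if the signal is silent.
    """
    window = hamming(frame_len)
    energies = [short_time_energy(f, window) for f in frames(signal, frame_len, hop)]
    if not energies or max(energies) == 0:
        return None
    threshold = energy_ratio * max(energies)
    active = [i for i, e in enumerate(energies) if e > threshold]
    return active[0], active[-1]
```

A typical use would be to run `detect_endpoints` on the raw samples, then apply `pre_emphasis` and windowing only to the frames between the returned indices before feature extraction.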
Currently popular feature parameters include linear prediction coefficients (LPC), linear prediction cepstral coefficients (LPCC), and Mel-frequency cepstral coefficients (MFCC). The recognition model is another key component of speech recognition; the main approaches include dynamic time warping (DTW), hidden Markov models (HMM), and vector quantization (VQ). This thesis designs a speech recognition system that applies filter-based noise processing to speech signals recorded in high-noise environments, uses MFCC as the feature parameters, and runs VQ and HMM recognition models separately to compare experimental results and analyze recognition performance.
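As an illustration of one of the recognition models listed above, the following Python sketch shows dynamic time warping (DTW) used for template matching. It is not the thesis's experiment, which uses VQ and HMM; the function names `dtw_distance` and `recognize`, the Euclidean frame distance, and the template dictionary are assumptions chosen for the example. Each utterance is taken to be a sequence of feature vectors, for instance one MFCC vector per frame.

```python
import math

def dtw_distance(seq_a, seq_b):
    """Accumulated cost of the best monotonic alignment between two
    sequences of equal-dimension feature vectors (Euclidean frame distance)."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # cost[i][j] is the best accumulated cost aligning the first i frames
    # of seq_a with the first j frames of seq_b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # stretch seq_b frame
                                 cost[i][j - 1],      # stretch seq_a frame
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]

def recognize(utterance, templates):
    """Return the label of the stored template closest to the utterance under DTW."""
    return min(templates, key=lambda label: dtw_distance(utterance, templates[label]))
```

Because DTW lets one frame align with several frames of the other sequence, an utterance spoken more slowly than its template can still match it at low cost, which is why the method suits small-vocabulary, isolated-word recognition.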
[Degree-granting institution]: Shaanxi Normal University
[Degree level]: Master's
[Year degree conferred]: 2014
[Classification number]: TN912.34


Article ID: 1836593


Article link: http://www.sikaile.net/kejilunwen/wltx/1836593.html
