
Research on Subspace-Based Speaker Adaptation Techniques

Published: 2018-02-17 05:22

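Keywords: continuous speech recognition; speaker adaptation; manifold learning; eigenvoice; orthogonal locality preserving projection; regularization; feature parameter normalization. Source: PLA Information Engineering University, 2014 master's thesis. Document type: degree thesis.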

[Abstract]: Speaker mismatch between training and test data limits the practical deployment of continuous speech recognition systems. How to use a small amount of adaptation data to improve the match between the acoustic model and the test data has long been a central and difficult problem in continuous speech recognition research. Subspace methods model the low-dimensional manifold structure embedded in a high-dimensional space. They reduce dimensionality, which effectively avoids the curse of dimensionality, and they expose the structure of the data itself, which makes model parameter estimation more robust. This thesis studies how subspace techniques can yield more practical speaker adaptation. The main contributions are as follows.

First, the eigenvoice algorithm tends to overfit when adaptation data are scarce, which degrades system performance. To address this, a regularized eigenvoice speaker adaptation method is proposed. The method adds a suitable regularization term to the objective function and optimizes the resulting objective, which gives a better estimate of the speaker factor and a more stable solution. Language recognition experiments on the NIST LRE 2003 evaluation set show that, compared with the baseline system, the improved algorithm raises performance when the test utterances are short, and the shorter the test utterance, the larger the gain. Chinese continuous speech recognition experiments on the Microsoft corpus show that regularized eigenvoice adaptation slightly lowers performance when adaptation data are plentiful, but effectively improves robustness when adaptation data are insufficient.

Second, linear subspace methods such as eigenvoice cannot describe the intrinsic structure of a nonlinear subspace in fine detail. To address this, an orthogonal Laplacian speaker adaptation method is proposed. The method analyzes the speaker subspace with the orthogonal locality preserving projection algorithm: after removing acoustically irrelevant information, it further uncovers the intrinsic structure of the remaining information. System frameworks and implementation steps are given for applying the method to both language recognition and continuous speech recognition. Language recognition experiments on the NIST LRE 2003 evaluation set show that the orthogonal Laplacian algorithm effectively improves the discriminability of the features. Chinese continuous speech recognition experiments on the Microsoft corpus further show that the method outperforms eigenvoice speaker adaptation.

Third, speaker adaptation at the model level slows down decoding. To address this, a feature-space eigenvoice adaptation method is proposed. Drawing on the RATZ algorithm, the method models the speaker information in the feature space with a Gaussian mixture model and exploits the correlation among the estimated parameters to reduce their number. It therefore models the feature space accurately while lowering the amount of adaptation data the algorithm needs. In Chinese continuous speech recognition experiments on the Microsoft corpus, feature-space eigenvoice adaptation still performs well when very little adaptation data are available, and combining it with speaker adaptive training further reduces the word error rate.
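The regularization idea in the first contribution can be illustrated with a small sketch. The abstract does not say which regularizer or objective the thesis uses, so the code below makes its own assumptions: an L2 (ridge) penalty with weight `lam`, and a simplified least-squares objective instead of the full maximum-likelihood eigenvoice estimate. The function names and the toy data are hypothetical and are not taken from the thesis.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Build the augmented matrix [A | b] without modifying the inputs.
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def estimate_speaker_factor(eigenvoices, mean, observed, lam):
    """Assumed objective: minimize ||observed - mean - sum_k w_k e_k||^2 + lam * ||w||^2.

    eigenvoices: list of K basis vectors e_k, each of dimension D
    mean:        speaker-independent mean vector (dimension D)
    observed:    mean estimate from the adaptation data (dimension D)
    lam:         L2 regularization weight (lam = 0 gives plain least squares)
    """
    residual = [o - m for o, m in zip(observed, mean)]
    k = len(eigenvoices)
    # Normal equations: (E E^T + lam * I) w = E residual
    A = [[dot(eigenvoices[i], eigenvoices[j]) + (lam if i == j else 0.0)
          for j in range(k)] for i in range(k)]
    b = [dot(e, residual) for e in eigenvoices]
    return solve(A, b)


# Toy data: two orthonormal "eigenvoices" in a 3-dimensional space.
eigenvoices = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
mean = [0.0, 0.0, 0.0]
observed = [2.0, -4.0, 1.0]

w_plain = estimate_speaker_factor(eigenvoices, mean, observed, lam=0.0)
w_ridge = estimate_speaker_factor(eigenvoices, mean, observed, lam=1.0)
print(w_plain, w_ridge)
```

Under these assumptions, a larger `lam` shrinks the speaker factor toward zero, that is, toward the speaker-independent model. This is one plausible reading of why regularization helps with very little adaptation data and costs a little accuracy when data are plentiful.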
[Degree-granting institution]: PLA Information Engineering University
[Degree level]: Master's
[Year degree granted]: 2014
[Classification number]: TN912.34

Citation: Yang Xukui (楊緒魁). Research on Subspace-Based Speaker Adaptation Techniques [D]. PLA Information Engineering University, 2014.


Article ID: 1517315


Article link: http://www.sikaile.net/kejilunwen/wltx/1517315.html
