Construction and Control of a Three-Dimensional Geometric Articulatory Model

Published: 2019-05-22 08:32

[Abstract]: Articulatory speech synthesis models simulate the articulatory movements and aerodynamic processes of speech production. We attempt to build a more accurate articulatory movement model that approximates the morphological characteristics of the speech organs, and thereby obtain a better articulatory synthesis system. There are currently two mainstream modeling strategies: physiological models and geometric models. This thesis builds a three-dimensional geometric articulatory model from a Chinese database. Unlike a neurophysiological model, the geometric model ignores the effects of complex muscle forces. The reduced computation improves its real-time performance, which makes the geometric articulatory model suitable for applications with strict real-time requirements.

This thesis proposes a new method for building a three-dimensional geometric articulatory model from MRI (magnetic resonance imaging) and CBCT (cone-beam CT) data. MRI images the contours of the vocal tract articulators fairly clearly and does little harm to the human body, so it is increasingly used in speech synthesis research. Because bony structures cannot be imaged directly and clearly in MRI, we collected CBCT data to supply the missing bone information and fill in the upper and lower jaws. A database of articulator data acquired with MRI is highly advantageous for building a vocal tract model and for analyzing how articulator shapes change across different articulations. Building an accurate three-dimensional vocal tract model from it, and then visualizing the vocal tract during articulation, is of great value for applications such as pronunciation teaching and for analyzing the mechanisms of speech production.

This thesis studies 104 sets of articulation data from one subject in a Chinese MRI database. The method consists of the following steps: the database and its preprocessing; data annotation and three-dimensional mesh modeling; data analysis and validation; and collision detection and response. Linear component analysis shows that each articulator can be described well with no more than three parameters, and that the cumulative contribution rate of the control parameter set exceeds 80%. The root mean square error of each articulator reconstructed from this analysis is below 1.0 mm.

The contribution of this thesis is a novel method for three-dimensional modeling of the vocal tract articulators that takes the physiological boundary points of the articulators into account. The modeling process has two main improvements: data from different slices are fused to improve the accuracy of articulator contour annotation, and the three-dimensional mesh of each articulator is built according to its anatomical structure. This preserves both the integrity of each articulator and the correspondence of physiological landmark points on it. Finally, the thesis builds a three-dimensional geometric articulatory model based on Chinese articulation data, which provides a theoretical basis for applications such as Chinese speech and language teaching, the wider promotion of Mandarin (Putonghua), and the correction of speech pathologies.
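The abstract reports three quantities from its linear component analysis: the number of control parameters per articulator, their cumulative contribution rate, and the reconstruction RMSE. The following is a minimal sketch of how such quantities could be computed with an ordinary principal component analysis. It is not the thesis's implementation. The data are synthetic, and the function names, the vertex count of 50, and the noise level are all assumptions made for illustration. Only the sample count of 104 and the 80% threshold echo the abstract.

```python
import numpy as np


def fit_linear_components(shapes, target=0.80):
    """Fit PCA to flattened vertex coordinates, shape (n_samples, n_coords).

    Returns the mean shape, the smallest set of components whose cumulative
    contribution rate reaches `target`, and the cumulative rates themselves.
    """
    mean = shapes.mean(axis=0)
    centered = shapes - mean
    # Rows of vt are orthonormal component directions, ordered by variance.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    variance = s ** 2
    cumulative = np.cumsum(variance) / variance.sum()
    k = int(np.searchsorted(cumulative, target)) + 1
    return mean, vt[:k], cumulative[:k]


def reconstruct(shapes, mean, components):
    """Project shapes onto the components and map them back to coordinates."""
    params = (shapes - mean) @ components.T  # control parameters per sample
    return mean + params @ components


def rmse(a, b):
    """Root mean square of the 3D vertex-to-vertex distance."""
    diff = (a - b).reshape(len(a), -1, 3)
    return float(np.sqrt((diff ** 2).sum(axis=2).mean()))


# Synthetic stand-in for one articulator: 104 shapes of 50 vertices each,
# driven by three hidden factors plus small noise (all values hypothetical).
rng = np.random.default_rng(0)
n_samples, n_vertices = 104, 50
basis = rng.normal(size=(3, n_vertices * 3))
weights = rng.normal(size=(n_samples, 3)) * np.array([5.0, 3.0, 2.0])
noise = rng.normal(scale=0.1, size=(n_samples, n_vertices * 3))
shapes = weights @ basis + noise

mean, components, cumulative = fit_linear_components(shapes)
error = rmse(shapes, reconstruct(shapes, mean, components))
baseline = rmse(shapes, np.broadcast_to(mean, shapes.shape))

print("parameters kept:", len(components))
print("cumulative contribution rate:", round(float(cumulative[-1]), 3))
print("reconstruction RMSE:", round(error, 3), "vs mean-only:", round(baseline, 3))
```

On real data, `shapes` would hold the annotated mesh vertices of one articulator across all articulations, with vertex order kept consistent so that each column refers to the same landmark.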
[Degree-granting institution]: Tianjin University
[Degree level]: Master's
[Year degree conferred]: 2016
[Classification number]: TP391.41;TN912.3


Liu Jie; Construction and Control of a Three-Dimensional Geometric Articulatory Model [D]; Tianjin University; 2016


Article ID: 2482809


Article link: http://www.sikaile.net/kejilunwen/ruanjiangongchenglunwen/2482809.html
