Research and Implementation of Speaker Recognition Based on Deep Belief Networks

Published: 2018-05-20 17:22

Topic: speaker recognition + deep neural networks; Source: Nanjing University of Posts and Telecommunications, 2017 master's thesis

[Abstract]: With the rapid development of multimedia information technology, online speech resources have grown explosively, so classifying and recognizing speech is of great practical significance. Speaker recognition can distinguish speakers from a small amount of audio data and thereby provide identity authentication; it is a key technology in speech signal processing. Traditional speaker recognition systems, however, often suffer from insufficient learning, insufficient model depth, and, when corpus data are scarce, a model that is not complex enough to approximate the true underlying model. Building on an analysis of the strengths and weaknesses of existing speaker recognition methods, this thesis uses deep learning to design and implement a speaker recognition system. The main work is as follows:
(1) The characteristics and difficulties of speaker recognition methods and feature extraction approaches are summarized, and the advantages and disadvantages of the commonly used speaker recognition strategies, models, and algorithms are compared.
(2) A speaker recognition framework based on deep learning is studied. Deep learning theory is applied to the traditional speaker recognition system: restricted Boltzmann machines and the backpropagation algorithm are used to train a deep belief network, which overcomes the efficiency problem of training a multilayer network model directly.
(3) Speaker recognition using i-vector analysis under channel conditions is introduced. Building on the i-vector method, traditional Gaussian-mixture speaker recognition is improved, and a method that combines an uncompressed i-vector form with deep learning is proposed. Using this method, experiments examine how the recognition rate compares with that of traditional methods and how speaker gender affects the recognition rate.
(4) Following the processing flow of speaker recognition, the architecture of a deep-learning-based speaker recognition system is presented; its core modules are designed in detail and implemented in simulation. Finally, the performance of the various speaker recognition systems is tested and the results are analyzed.
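The abstract does not give the thesis's implementation, so the following is only a minimal sketch of the general idea in item (2): greedy layer-wise pretraining of a deep belief network, where each layer is a restricted Boltzmann machine trained with one step of contrastive divergence (CD-1). Everything here is an assumption made for illustration: the class and function names (`RBM`, `pretrain_dbn`), the layer sizes, the learning rate, and the random binary toy data. Real speech features such as i-vectors are real-valued and would likely need Gaussian visible units, and the supervised backpropagation fine-tuning stage is omitted.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class RBM:
    """Bernoulli-Bernoulli RBM trained with CD-1 (illustrative, hypothetical class)."""

    def __init__(self, n_visible, n_hidden, rng):
        self.rng = rng
        self.W = 0.01 * rng.standard_normal((n_visible, n_hidden))
        self.b_v = np.zeros(n_visible)
        self.b_h = np.zeros(n_hidden)

    def hidden_probs(self, v):
        return sigmoid(v @ self.W + self.b_h)

    def visible_probs(self, h):
        return sigmoid(h @ self.W.T + self.b_v)

    def cd1_step(self, v0, lr=0.1):
        # Positive phase: hidden activations driven by the data.
        h0 = self.hidden_probs(v0)
        h0_sample = (self.rng.random(h0.shape) < h0).astype(float)
        # Negative phase: one Gibbs step back to the visible layer and up again.
        v1 = self.visible_probs(h0_sample)
        h1 = self.hidden_probs(v1)
        n = v0.shape[0]
        # CD-1 update: data statistics minus reconstruction statistics.
        self.W += lr * (v0.T @ h0 - v1.T @ h1) / n
        self.b_v += lr * (v0 - v1).mean(axis=0)
        self.b_h += lr * (h0 - h1).mean(axis=0)
        return float(np.mean((v0 - v1) ** 2))  # reconstruction error


def pretrain_dbn(data, layer_sizes, epochs=50, lr=0.1, seed=0):
    """Train one RBM per layer; each layer's hidden output feeds the next."""
    rng = np.random.default_rng(seed)
    rbms, x = [], data
    for n_hidden in layer_sizes:
        rbm = RBM(x.shape[1], n_hidden, rng)
        for _ in range(epochs):
            rbm.cd1_step(x, lr)
        rbms.append(rbm)
        x = rbm.hidden_probs(x)
    return rbms


# Toy binary data standing in for real features (an assumption, not thesis data).
toy = (np.random.default_rng(1).random((64, 20)) < 0.5).astype(float)
dbn = pretrain_dbn(toy, layer_sizes=[16, 8])
features = dbn[1].hidden_probs(dbn[0].hidden_probs(toy))
print(features.shape)  # (64, 8)
```

Training each layer separately is what sidesteps the cost of optimizing all layers jointly from a random start; in a full system the stacked weights would then initialize a feed-forward network that is fine-tuned with backpropagation on speaker labels.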
[Degree-granting institution]: Nanjing University of Posts and Telecommunications
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TN912.34


Article ID: 1915554


Article link: http://www.sikaile.net/kejilunwen/xinxigongchenglunwen/1915554.html
