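Research on Key Technologies of Passive Digital Audio Forensics
Published: 2018-01-10 06:29
Topic: Research on Key Technologies of Passive Digital Audio Forensics  Source: Ningbo University, 2016 doctoral dissertation  Document type: degree thesis
Keywords: digital audio, passive forensics, source identification, compression history detection, tampering localization, steganalysis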
[Abstract]: Digital audio is among the most readily available digital media in daily life. Besides purchasing or downloading audio files, people can also produce audio/speech files by live recording. However, the continued development and refinement of audio editing and processing software has made editing and modifying audio simpler and cheaper, and the human ear can hardly perceive the traces such modifications leave. How to effectively verify the originality, integrity, and authenticity of digital audio has therefore become an urgent problem for passive digital audio forensics. This dissertation studies the key problems and techniques of passive digital audio forensics, with work in five areas: forensic audio database construction, audio source forensics, audio compression history detection, audio content tampering detection, and audio steganalysis.

1. To address the lack of benchmark audio/speech corpora in digital audio forensics, a basic audio database (CKC-AD) and a basic speech database (CKC-SD) were built by CD audio ripping and on-site speech recording, respectively. The former contains 11,172 audio files covering 2 types, more than 5 durations, 10 music genres, and 4 languages. The latter contains read speech and conversational speech from 31 speakers (21 male, 10 female), recorded with 38 different models of recording device. On the basis of CKC-SD, and according to the needs of specific research topics, a TIMIT re-recorded speech corpus, a re-recorded audio corpus, and a device noise-floor database were also built.

2. The work on audio source forensics consists of two parts: re-recorded audio detection and recording source device identification. Existing re-recorded audio detection methods consider only a single covert-recording or playback device. This dissertation analyzes in depth how different covert-recording and playback devices affect re-recorded audio during the playback and re-recording process, and builds a feature vector from the difference between re-recorded and originally recorded audio in the distribution of high-frequency information content. Experiments show that the method effectively distinguishes originally recorded audio from re-recorded audio, with an overall classification accuracy of 98.47%. Integrating the method into a GMM-UBM speaker recognition system also substantially improves the system's resistance to audio replay attacks, reducing its equal error rate (EER) by 47.06%. Most existing recording source device identification methods rely on Mel-frequency cepstral coefficients (MFCC) or other acoustic features. This dissertation instead starts from the characteristics of the recording device itself and proposes two identification methods. The first exploits the fact that different device models use the encoding parameters differently during audio encoding, and builds statistical features from those parameters to identify the source device. Experiments show an average identification rate of 99.97% for the 10 devices in CKC-SD that record MP3 audio, and an average correct detection rate of 96.53% for the 14 devices that record AAC/M4A audio. The second method removes the first method's dependence on the recording format. Based on an in-depth study of the noise floor of different recording devices, it proposes a method for estimating the device noise floor and builds spectral shape features and spectral distribution features from the estimated noise floor to characterize each device. The method distinguishes the 34 devices in CKC-SD fairly accurately, with an average classification accuracy of 95.53%.

3. For AAC double compression detection, which has so far received little attention, this dissertation proposes a detection method based on Huffman codebook indices. After analyzing how double compression changes the distribution of codebook indices, it uses the histogram of the codebook indices and their Markov one-step transition probabilities as classification features. For double-compressed audio transcoded from a lower to a higher bit rate (FAAC/FAAD2 codec), detection accuracy exceeds 99%; when the two bit rates are the same, classification accuracy is only 79.56%. Comparison with typical methods in the field shows that the proposed method is more accurate overall. The dissertation also explores compression history detection (up to 3 compressions) and bit rate estimation for MP3 audio. It studies the progressive changes in Huffman codebook indices and scale factors under repeated compression, and builds a feature vector from mean-difference, probability-distribution, and cross-correlation statistics. Experiments show that, for double-compressed MP3 audio, detection accuracy improves overall on several typical methods in the field. For triple compression detection, classification accuracy in the lower-to-higher, same, and higher-to-lower bit rate cases (under the precondition BR2=BR3) is 97.73%, 94.56%, and 80.28%, respectively. In addition, when the third bit rate is above 128 kbps, the method can fairly effectively separate singly, doubly, and triply compressed audio in a mixed set.

4. For common tampering operations, this dissertation proposes two tampering localization methods. The first, inspired by the frame offset method, exploits the inconsistency in quantization characteristics between tampered and untampered audio and uses the conversion rate of small-valued frequency coefficients before and after quantization as the detection variable. Experiments show a localization accuracy of 98% for audio whose original, untampered MP3 bit rate is 192 kbps or lower. However, the method works only when the tampered audio is saved in an uncompressed format. The second method builds on the principle that recompression has a corrective effect on audio segments whose frame structure has been destroyed. It finds that tampered and untampered segments are inconsistent in their estimated number of compressions, and uses this inconsistency for localization. Because it is limited by the precision of the double compression detection method, its localization accuracy is not yet satisfactory, but it opens a new line of research on tampering detection for compressed audio. It is also more practical, since it can handle tampered audio that has been compressed again.

5. Detection accuracy for MP3Stego is low at low embedding rates. After analyzing how MP3Stego embedding affects the quantized spectral coefficients of MP3 audio, this dissertation builds intra-block and inter-block Markov one-step transition probability features on the differences of the magnitudes of the quantized spectral coefficients, which allows effective detection of MP3Stego at low embedding rates. Experiments show an average detection accuracy of 90.74% for MP3 audio with an embedding strength of 10.6%. Detection performance declines as the bit rate decreases, but still exceeds that of existing typical methods. The dissertation also analyzes in depth the embedding principle of another MP3 steganography tool, Under MP3Cover, and finds that its core is sequential LSB replacement, with the spacing between embedding positions controlled by the parameter Bit Spacing. Based on this principle, RS analysis is adapted to detect Under MP3Cover and to estimate the length of the embedded secret message. The choice of the optimal flipping operator, the use of overlapping versus non-overlapping groups, and the effect of Bit Spacing on the accuracy of the embedding strength estimate are also discussed and analyzed.
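The abstract names the histogram of Huffman codebook indices and their Markov one-step transition probabilities as classification features, but gives no implementation details. The following is a minimal Python sketch of how such features might be computed from a sequence of integer indices. It assumes the indices have already been extracted from the bitstream (extraction is not shown). The function names, the `num_states` parameter, and the toy input are hypothetical, and the number of states, any grouping by frame, and the classifier used in the dissertation are not specified here.

```python
from collections import Counter
from typing import List, Sequence


def histogram_features(indices: Sequence[int], num_states: int) -> List[float]:
    """Normalized histogram of index values in [0, num_states)."""
    total = len(indices)
    if total == 0:
        return [0.0] * num_states
    counts = Counter(indices)
    return [counts.get(s, 0) / total for s in range(num_states)]


def markov_transition_features(
    indices: Sequence[int], num_states: int
) -> List[List[float]]:
    """Row-normalized one-step transition matrix.

    Entry [i][j] estimates Pr(next index = j | current index = i)
    from consecutive pairs in the sequence.
    """
    counts = [[0] * num_states for _ in range(num_states)]
    for cur, nxt in zip(indices, indices[1:]):
        counts[cur][nxt] += 1
    matrix = []
    for row in counts:
        row_total = sum(row)
        # Rows for states with no outgoing transition stay all zeros.
        matrix.append([c / row_total if row_total else 0.0 for c in row])
    return matrix


def feature_vector(indices: Sequence[int], num_states: int) -> List[float]:
    """Concatenate the histogram and the flattened transition matrix."""
    hist = histogram_features(indices, num_states)
    trans = markov_transition_features(indices, num_states)
    return hist + [p for row in trans for p in row]


if __name__ == "__main__":
    # Toy index sequence (assumption): real input would come from a decoder.
    toy_indices = [0, 1, 1, 2, 0, 1, 2, 2]
    vec = feature_vector(toy_indices, num_states=3)
    print(len(vec))  # 3 histogram bins + 3*3 transition probabilities
```

A vector built this way has `num_states + num_states**2` entries and could be passed to any standard classifier. The same transition-matrix idea applies to the differences of quantized spectral coefficient magnitudes mentioned in item 5, provided the differences are first clipped or mapped to a bounded range of states.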
[Degree-granting institution]: Ningbo University
[Degree level]: Doctorate
[Year degree awarded]: 2016
[Classification number]: TP309
Article ID: 1404249
Link: http://www.sikaile.net/shoufeilunwen/xxkjbs/1404249.html