Recording Authenticity Identification and Re-recording Detection

Published: 2019-03-05 09:27

[Abstract]: The widespread digitization of speech signals has greatly facilitated the storage, transmission, and sharing of speech data. At the same time, easy-to-use and powerful audio editing software has developed rapidly, so that recording a piece of speech, or processing, retouching, or otherwise modifying it, has become effortless. While these technologies bring many conveniences, they also create security risks. For example, audio processing techniques can be used to tamper with speech content or to produce forged speech. Once such fake recordings are used for illegal purposes, they threaten society and the safety of other people's lives and property. Detecting the authenticity of digital speech signals is therefore of great importance. Although there has been considerable research on digital audio, it falls far short of meeting the needs of society and the public. To address current problems in audio forensics, this thesis studies recording authenticity identification and re-recording detection. The main contents are as follows:

1) Recording authenticity identification. With the spread of audio editing software, people can use common functions (such as filtering and reverberation) to enhance and modify audio. The same easy-to-use, powerful tools that bring convenience also give wrongdoers an opening. For example, an audio forger can apply a suitable filter to spliced audio to smooth it and thereby hide the traces of splicing. If such audio is presented as evidence in court and accepted, it will undoubtedly have a significant impact on the outcome of the trial. Others use pitch shifting to imitate another person's voice for telephone fraud, which seriously threatens the victims' property. Borrowing the idea of the image co-occurrence matrix, this thesis proposes an amplitude co-occurrence vector feature suited to audio: the speech signal is quantized, and the probability distribution of the co-occurrence vectors formed by several adjacent sample points is then computed. The feature captures the fluctuation between adjacent sample points and is effective for detecting processed speech, with an experimental accuracy of up to 95%. In addition, the experiments cover 12 processing operations from two editing programs, which are detected and distinguished from one another. The results show that the feature can further identify which processing function was applied.

2) Re-recording detection for digital speech. Re-recording can be used to fake a recording scene, and it can also be used to attack identity authentication systems based on speech features, so detecting it has become very important. Taking a statistical analysis approach, this thesis uses an extended amplitude co-occurrence vector feature to distinguish original speech from re-recorded speech. We analyze the quantization threshold T of the amplitude co-occurrence vector and add combinations of sample points at different sampling intervals, which makes the feature better suited to re-recording detection. We also build a re-recording database that covers multiple recording devices and different recording environments, providing sufficient data for the experiments. Comparisons with Mel-frequency cepstral coefficient features and with the original amplitude co-occurrence vector feature confirm the performance of the extended feature, and re-recording detection based on it reaches an accuracy of 96%. We further divide the database into sub-datasets by scene and run same-scene and cross-scene detection, reaching accuracies of 99.36% and 95.69%, respectively.
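The amplitude co-occurrence vector described in the abstract can be sketched in code. The following is a minimal illustration, not the thesis's implementation: the abstract does not state the quantization rule, so the uniform step `q`, the clipping of quantized values to [-T, T], the vector length `order`, and the spacing `step` between samples are all assumptions, and the function name is hypothetical.

```python
from itertools import product


def amplitude_cooccurrence(signal, order=2, T=3, q=0.1, step=1):
    """Probability distribution of quantized sample tuples (illustrative sketch).

    All parameters are assumptions made for this example:
      order - number of samples in each co-occurrence vector
      T     - quantized values are clipped to the range [-T, T]
      q     - uniform quantization step applied to the amplitudes
      step  - spacing between the samples that form one vector
    """
    # Quantize each amplitude and clip it to [-T, T].
    levels = [max(-T, min(T, round(s / q))) for s in signal]

    # Number of complete vectors that fit in the signal.
    n = len(levels) - (order - 1) * step
    if n <= 0:
        raise ValueError("signal too short for the requested order and step")

    # Start every possible vector at zero so the feature has a fixed length.
    counts = {v: 0 for v in product(range(-T, T + 1), repeat=order)}
    for i in range(n):
        vec = tuple(levels[i + k * step] for k in range(order))
        counts[vec] += 1

    # Normalize the counts into a probability distribution.
    return {v: c / n for v, c in counts.items()}
```

With `step=1` the vectors are built from adjacent samples. Calling the function with several `step` values and concatenating the results is one possible reading of the "different sampling intervals" extension, which is again an assumption rather than something the abstract specifies.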
[Degree-granting institution]: Shenzhen University
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TN912.3



Article ID: 2434758
