
Design and Implementation of a Machine-Learning-Based Abnormal Traffic Detection System

Published: 2018-04-09 13:28

  Topic: traffic analysis. Focus: anomaly detection. Source: Beijing University of Posts and Telecommunications, master's thesis, 2017


[Abstract]: As Internet technology continues to develop, people's lives and work depend more and more on Internet applications. However, because security awareness is lacking and attack techniques keep growing more complex and diverse, many network applications suffer from a wide range of attacks and security threats, which exposes numerous security vulnerabilities. Abnormal traffic detection is the first step of attack defense and provides an effective safeguard for intercepting attacks, so detecting abnormal traffic accurately is essential to the availability and security of network applications. Building on a study of existing abnormal traffic detection techniques, this thesis introduces machine learning methods into the anomaly detection system and proposes and designs a machine-learning-based model for abnormal traffic detection. The model has four parts: 1) statistically analyzing the characteristics of abnormal traffic from a data-mining perspective to build a malicious keyword library and a multidimensional feature library; 2) testing the validity of the multidimensional feature library and optimizing the feature set; 3) selecting machine learning algorithms to learn from and validate on the training set, and evaluating the performance of the classification results; 4) deploying the system on the Hadoop and Spark cloud platforms in practical use, so that parallelized detection improves the efficiency of abnormal traffic detection.
In analyzing the characteristics of abnormal traffic, the thesis combines rule-based and statistical-analysis methods and treats abnormal traffic detection as a pattern recognition problem. It identifies what abnormal traffic has in common and how it differs from normal traffic, and generalizes these findings into feature fields that the machine learning algorithms then validate and evaluate. For feature optimization, the thesis proposes three feature extraction algorithms: a Sigmoid-based feature selection algorithm, an information-gain-based feature ranking algorithm, and a time-feedback-based feature optimization algorithm. Three steps (filtering, ranking, and performance optimization) extract the optimal feature subset from the multidimensional feature set. For the choice of machine learning algorithm, the thesis compares and evaluates three classifiers (decision tree, random forest, and GBDT), taking parallelization into account, and the experiments show the advantage of GBDT in accuracy and recall. Finally, considering the big data environment the system faces in practice, the thesis designs and implements a distributed detection system. It uses the data-processing strengths of the Hadoop and Spark distributed platforms and of cloud storage to fully parallelize data preprocessing, feature parsing, and the machine learning process, which greatly improves the detection efficiency of the system.
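The abstract names an information-gain-based feature ranking step but does not describe it in detail. The following is a minimal sketch of what such a ranking could look like, assuming discrete feature values and the standard definition of information gain (label entropy minus label entropy conditioned on the feature). The feature names `has_sql_keyword` and `long_url` and the toy data are hypothetical and are not taken from the thesis.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(feature_values, labels):
    """Label entropy minus label entropy conditioned on one discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_features(rows, labels):
    """Return (feature name, gain) pairs sorted from most to least informative."""
    names = rows[0].keys()
    gains = {n: information_gain([r[n] for r in rows], labels) for n in names}
    return sorted(gains.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy requests: feature names and values are illustrative only.
rows = [
    {"has_sql_keyword": 1, "long_url": 1},
    {"has_sql_keyword": 1, "long_url": 0},
    {"has_sql_keyword": 0, "long_url": 1},
    {"has_sql_keyword": 0, "long_url": 0},
]
labels = ["abnormal", "abnormal", "normal", "normal"]

ranking = rank_features(rows, labels)
```

In this toy data the first feature separates the two classes perfectly and the second carries no information about the label, so the first ranks above the second. Continuous traffic features would need to be discretized before a ranking of this kind could be applied.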
[Degree-granting institution]: Beijing University of Posts and Telecommunications
[Degree level]: Master's
[Year degree conferred]: 2017
[Classification number]: TP181;TP393.06



