

Research on Feature Engineering in Network Traffic Classification

Published: 2018-06-25 22:16

  Topics: network traffic classification + min-max rule; Source: Nanjing University of Posts and Telecommunications, 2017 master's thesis


[Abstract]: Analyzing and classifying network traffic is a key means of network monitoring and management, and is widely used in network intrusion detection systems, network management systems, and related fields. However, with the development of dynamic port numbers and traffic encryption, traditional traffic classification methods alone can no longer meet accuracy requirements. In recent years, machine-learning-based traffic classification has attracted wide attention: it only requires defining a set of traffic-related statistics as features and does not rely on port numbers or similar identifiers to represent traffic, thereby avoiding the limitations of traditional methods. However, traffic classification suffers from problems such as class imbalance and large data scale, so applying conventional machine learning algorithms unchanged also yields poor classification performance. Taking this as its starting point, this thesis studies and improves machine learning algorithms and applies them to network traffic classification to improve performance. It proposes an ensemble feature selection algorithm based on the min-max strategy to address the class imbalance problem in traffic classification. The algorithm combines feature selection with ensemble learning and consists of two steps: data partitioning and integration of feature selection results. The original dataset is first divided into several data subsets by some method; feature selection is performed on each subset; the per-subset results are then integrated using the min-max strategy to obtain the final feature selection result. The algorithm is compared with other ensemble feature selection algorithms, mainly to verify its performance in network traffic classification.
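The two steps named in the abstract (data partitioning, then min-max integration of per-subset feature selection results) can be illustrated with a minimal Python sketch. The abstract does not specify the partitioning method, the per-subset scoring function, or the exact form of the min-max combination, so everything below is an assumption: rows are split per class by simple striding, each feature is scored with a Fisher-like separation measure, and scores are combined in the style of min-max modular ensembles (minimum over the majority-class chunks paired with one minority-class chunk, then maximum over minority-class chunks). The names `fisher_score`, `chunks`, `min_max_feature_scores`, and `select_top` are hypothetical.

```python
from statistics import mean, pstdev


def fisher_score(pos_rows, neg_rows, j):
    """Separation of feature j between two groups: |mean gap| / (sum of std devs)."""
    p = [row[j] for row in pos_rows]
    n = [row[j] for row in neg_rows]
    spread = pstdev(p) + pstdev(n)
    return abs(mean(p) - mean(n)) / (spread + 1e-12)


def chunks(rows, k):
    """Split rows into k parts by striding (an assumed partitioning rule)."""
    return [rows[i::k] for i in range(k)]


def min_max_feature_scores(pos_rows, neg_rows, k_pos, k_neg):
    """Score every feature on each (pos chunk, neg chunk) pair, then combine.

    Assumed combination: MIN over the neg chunks paired with one pos chunk,
    then MAX over the pos chunks.
    """
    pos_parts = chunks(pos_rows, k_pos)
    neg_parts = chunks(neg_rows, k_neg)
    n_features = len(pos_rows[0])
    scores = []
    for j in range(n_features):
        per_pos_chunk = [
            min(fisher_score(p, n, j) for n in neg_parts) for p in pos_parts
        ]
        scores.append(max(per_pos_chunk))
    return scores


def select_top(scores, m):
    """Return the indices of the m highest-scoring features."""
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:m]


# Toy imbalanced data: 4 minority rows, 12 majority rows.
# Feature 0 separates the classes; feature 1 does not.
pos = [[5.0 + 0.1 * i, float(i % 3)] for i in range(4)]
neg = [[0.1 * i, float(i % 3)] for i in range(12)]

scores = min_max_feature_scores(pos, neg, k_pos=2, k_neg=3)
selected = select_top(scores, 1)
```

Splitting the majority class into several chunks means each pairwise scoring problem is closer to balanced than the full dataset, which is the usual motivation for this kind of partitioning under class imbalance.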
To further improve classification performance, the thesis takes correlations between flows into account and extracts, on top of the earlier traffic dataset, a set of features based on the temporal/spatial correlation of multiple flows, such as the number of flows that share the same source IP address as the flow to be classified. The dataset augmented with these multi-flow features is then put through the proposed ensemble feature selection strategy and classified, in order to verify the effect of multi-flow features on network traffic classification. Experiments show that classification results improve markedly once some of the multi-flow features are included. The proposed ensemble feature selection algorithm effectively handles class imbalance in traffic classification, and the extracted multi-flow features also improve classification performance to some extent.
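The multi-flow feature given as an example above (the number of flows sharing the source IP address of the flow being classified) can be computed roughly as follows. This is a sketch under assumptions: the flow record layout (`src_ip`, `start`), the function name `same_source_counts`, and the optional time window are all hypothetical, since the abstract does not describe how flows are represented or how the temporal correlation is bounded.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict


def same_source_counts(flows, window=None):
    """For each flow, count flows (itself included) with the same source IP.

    If `window` is given (seconds, an assumed parameter), only flows whose
    start time lies within +/- window of the flow's own start are counted.
    """
    starts_by_ip = defaultdict(list)
    for flow in flows:
        starts_by_ip[flow["src_ip"]].append(flow["start"])
    for starts in starts_by_ip.values():
        starts.sort()

    counts = []
    for flow in flows:
        starts = starts_by_ip[flow["src_ip"]]
        if window is None:
            counts.append(len(starts))
        else:
            lo = bisect_left(starts, flow["start"] - window)
            hi = bisect_right(starts, flow["start"] + window)
            counts.append(hi - lo)
    return counts


# Hypothetical flow records using documentation-range addresses.
flows = [
    {"src_ip": "192.0.2.1", "start": 0.0},
    {"src_ip": "192.0.2.1", "start": 1.0},
    {"src_ip": "192.0.2.1", "start": 50.0},
    {"src_ip": "192.0.2.2", "start": 0.5},
]

all_time = same_source_counts(flows)
windowed = same_source_counts(flows, window=5.0)
```

The resulting counts would be appended to each flow's per-flow statistics as one extra column before feature selection.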
[Degree-granting institution]: Nanjing University of Posts and Telecommunications
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TP393.06; TP181



Article ID: 2067730


Article link: http://www.sikaile.net/kejilunwen/zidonghuakongzhilunwen/2067730.html


