
Design and Research of a Network Security Data Visualization System

Published: 2018-07-20 12:51
[Abstract]: The exponential growth of digital information in modern society has brought data analysis into a golden age. People have long used data analysis to extract relevant information from a continuous supply of data, and in network security it has become a new way to address security problems. The volume of security log data collected is so large that it cannot be processed or used without analytical tools, and analysts also need to understand network communication patterns quickly, identify network anomalies, and discover network attacks. Network security visualization is a practical technique for these tasks: it transforms large volumes of network data into easily understood images and relies on human vision to grasp the patterns and structure of the data, building a bridge between security data and cognition. Its popularity in network security is inevitable, because the more data people must sift through, the more they want to turn the data into images and display images alongside text. As an analysis tool, visualization presents the patterns and regularities behind security data intuitively, helping people analyze the current state of the network, handle security events that have already been detected, and anticipate potential events that have not yet occurred. Visual analysis tools also help people cope with data overload and save time, and they involve people in data collection and analysis rather than only informing them. Based on the network security visualization reference model and drawing on the idea of layered architecture, this thesis studies and designs Nets.vis, a web prototype system for network security data visualization. The system covers the whole process from data processing to view generation, and its framework is layered, flexible, and lightweight.
The system uses a server-client structure: the client renders in the user's browser, and the server provides data acquisition, storage, and analysis and loads the visualization components. The Nets.vis prototype system consists of the following seven layers. (1) Data preprocessing layer: cleans the source data, removing dirty, useless, and erroneous data to obtain clean, usable data. (2) Data import layer: imports data from the MySQL database into HDFS. (3) Data storage layer: all experimental data of the Nets.vis prototype system are stored in HDFS. (4) Data management layer: the data warehouse of the whole prototype system is managed by Hive; that is, all data are output from the data storage layer to the data management layer as Hive tables. (5) Data service layer: performs analysis and data mining on the warehouse data according to the analysis requirements. (6) Data application layer: results from the data service layer must be exported back to a relational database, because the high latency of Hive execution makes it unsuitable for generating the final visualization results. (7) Visualization layer: users view the final visualization results in a browser. The functional requirements of the whole Nets.vis system can be summarized as data preprocessing, data import, data analysis, and view generation. The research proceeds along the following lines. First, a Hadoop system is deployed on a Linux server to store and manage large-scale data. The Hive data warehouse provided by the Hadoop system can be used to store data, and Sqoop transfers data between the relational database MySQL and Hadoop. The server-side data import, storage, and analysis modules in this study are all based on the Hadoop platform.
Sqoop is used to import data from the relational database MySQL into the Hive data warehouse and then to export the analysis results back to the MySQL database. The client is built with Spring MVC for the web tier, and Bootstrap is used to refine the visual interface of the prototype system. Second, because the Nets.vis prototype system frequently performs queries and similar operations, the efficiency of the Hive data analysis module is important. This thesis uses sublinear-space algorithms to improve the efficiency of data extraction, transformation, loading, and querying. The Misra-Gries algorithm for finding frequent elements is used to identify the most frequently occurring elements, for example IP addresses that appear frequently in the network; a distinct-element estimation algorithm is used to estimate the number of different elements in a data stream, for example the number of distinct IP addresses that visit a page. In addition, the data analysis module analyzes source IPs with Canopy clustering combined with k-means clustering. When selecting attribute dimensions in the data analysis module, the Pearson product-moment correlation coefficient and the correlation matrix, both common in probability theory and statistics, are used to verify the correlation between dimensions. Third, the main purpose of the visualization module of the Nets.vis prototype system is to filter data sets according to the user's wishes. In this module, ECharts and D3 are used to design visualization components suited to the attributes of network security data, including a bubble chart, treemap, parallel coordinates plot, relationship graph, bar chart, line chart, and rectangular heat map. An SVG-based rendering method for the visualization components is designed and implemented, making the visualization results richer and more intuitive.
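As an illustration of the frequent-element step mentioned above, the following is a minimal Python sketch of the textbook Misra-Gries summary. It is not the thesis's Hive-side implementation, which the abstract does not describe; the function name `misra_gries`, the parameter `k`, and the sample IP addresses are assumptions made for this example.

```python
from typing import Dict, Hashable, Iterable


def misra_gries(stream: Iterable[Hashable], k: int) -> Dict[Hashable, int]:
    """Summarize a stream with at most k - 1 counters.

    Every element that occurs more than n / k times in a stream of
    length n is guaranteed to be among the returned keys. The stored
    counts are lower bounds that undercount by at most n / k.
    """
    if k < 2:
        raise ValueError("k must be at least 2")
    counters: Dict[Hashable, int] = {}
    for item in stream:
        if item in counters:
            counters[item] += 1
        elif len(counters) < k - 1:
            counters[item] = 1
        else:
            # No free counter: decrement every counter and drop zeros.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters


# Hypothetical source IPs, invented purely for illustration.
sample_ips = ["10.0.0.1"] * 6 + ["10.0.0.2"] * 3 + ["10.0.0.3", "10.0.0.4", "10.0.0.5"]
print(misra_gries(sample_ips, k=3))
```

The summary only yields candidates: a second pass over the data (or an exact count restricted to the returned keys) is needed to confirm which candidates really exceed the n / k threshold.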
In addition, the BIRCH algorithm is used to improve the layout of the bubble chart. Finally, following the visualization guideline of "overview first, details afterwards", some of the visualization components of the Nets.vis prototype system are selected, and the TCP flow log data provided by the Vis China 2015 challenge are used to verify the feasibility of the Nets.vis system. In the first step, the bubble chart improved with hierarchical clustering, the bar chart, and the relationship graph are used to identify the servers and clients in the network and to uncover the network topology. In the second step, the servers are classified by protocol characteristics and by time-series characteristics. In the third step, network traffic characteristics are mined: taking into account the hierarchical and temporal attributes of network traffic data, a line chart visualizes the overall temporal characteristics of the data and reveals the network's "holiday pattern" and "workday pattern". In the fourth step, a treemap visualizes the local temporal characteristics of the data and reveals the specific hosts that produce anomalies. The experiments show that visual analysis of the TCP flow data set with Nets.vis achieves network analysis from the whole to the part: the system can identify network servers and clients, classify servers, recognize network traffic patterns, and detect network anomalies, which helps analysts manage the network and perceive the network security situation.
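The abstract states that the Pearson product-moment correlation coefficient and a correlation matrix are used to check the correlation between candidate attribute dimensions. A minimal pure-Python sketch of that calculation follows; the function names and the notion of passing each dimension as a list of numbers are assumptions for illustration, not details taken from the thesis.

```python
import math
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation coefficient of two columns."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("columns must have equal length of at least 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant column")
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(columns: Sequence[Sequence[float]]) -> List[List[float]]:
    """Symmetric matrix whose (i, j) entry is pearson(columns[i], columns[j])."""
    return [[pearson(a, b) for b in columns] for a in columns]
```

Pairs of dimensions whose coefficient is close to 1 or -1 carry largely redundant information, so one of them can usually be dropped before clustering or plotting.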
[Degree-granting institution]: Lanzhou Jiaotong University
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TP393.08


Article ID: 2133572


Article link: http://www.sikaile.net/guanlilunwen/ydhl/2133572.html

