Research and Implementation of Virtual Machine Monitoring in Cloud Environments
Published: 2018-04-15 06:30
Topic: cloud monitoring + virtual machine; Source: Dalian University of Technology, master's thesis, 2016
[Abstract]: Cloud computing has become the most widespread Internet service model. Monitoring cloud platform resources effectively, promptly, and with low overhead is key to guaranteeing the quality of cloud services, and it also provides the basis for downstream job management, load management, and load balancing. Resource monitoring of virtual nodes in a cloud platform covers three tasks: collecting the running state of virtual machines, analyzing virtual machine anomalies, and predicting faults. This thesis proposes a clustering-based anomaly detection algorithm and, building on it, designs and implements a virtual machine anomaly detection system for cloud environments. The system's main functions are monitoring the running state of virtual machine nodes in the cloud platform and issuing early fault warnings.

The clustering-based anomaly detection algorithm has two parts: a clustering-based modeling method and an anomaly analysis method based on nonparametric CUSUM. The first part builds models with two clustering methods, k-means and k-modes. Training data is input and the cluster centers are specified; each algorithm then models the virtual machine state, and the results are obtained and corrected; finally, the modeling results are used to divide virtual machine states into three classes: normal, abnormal, and fault. Because all the data collected in this work is numeric, k-means performs better than k-modes. The second part processes the data classified as abnormal. Using the CUSUM algorithm, when the system detects an abnormal virtual machine state it raises the sampling frequency and accumulates the abnormal data, issuing an early warning once the warning threshold is reached.

The virtual machine monitoring system is implemented on both the Hadoop and Spark platforms. It adopts a centralized monitoring architecture built around master and slave nodes. Slave nodes collect data on the running state of virtual machines, send it to the master node through the Kafka messaging system, and store it in a Redis database. The master node receives the monitoring data through the messaging system and applies the algorithms above for anomaly analysis and early fault warning; it also provides a user interface where users can view the running state of virtual machines and detailed alarm information. Experimental results show that the monitoring system on the Spark platform achieves the expected functionality, while the system on the Hadoop platform is somewhat less timely.
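The two-stage approach described in the abstract (k-means modeling followed by nonparametric CUSUM accumulation) can be sketched in Python as below. This is a minimal illustration under stated assumptions, not the thesis implementation. The two-dimensional (CPU, memory) utilization features, the initial centers, the mapping of center order to the three state labels, the 0/1 anomaly score, the drift and threshold values, and the sampling intervals are all hypothetical choices made for the example.

```python
from math import dist

# Hypothetical encoding: the i-th cluster center corresponds to the i-th label.
LABELS = ("normal", "abnormal", "fault")


def kmeans(samples, centers, iterations=20):
    """Plain k-means (Lloyd's algorithm) starting from caller-specified centers."""
    centers = [tuple(c) for c in centers]
    for _ in range(iterations):
        # Assignment step: group each sample with its nearest center.
        groups = [[] for _ in centers]
        for s in samples:
            nearest = min(range(len(centers)), key=lambda i: dist(s, centers[i]))
            groups[nearest].append(s)
        # Update step: move each center to the mean of its group (keep it if empty).
        updated = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else c
            for g, c in zip(groups, centers)
        ]
        if updated == centers:
            break
        centers = updated
    return centers


def classify(sample, centers):
    """Label a sample by its nearest center; center order follows LABELS."""
    nearest = min(range(len(centers)), key=lambda i: dist(sample, centers[i]))
    return LABELS[nearest]


def sampling_interval(label, base_seconds=10.0, fast_seconds=1.0):
    """Sample more often while the state is not normal (interval values are assumed)."""
    return base_seconds if label == "normal" else fast_seconds


def cusum_alarm(scores, drift, threshold):
    """Nonparametric CUSUM: S_n = max(0, S_{n-1} + x_n - drift).

    Returns the index of the first sample at which S_n reaches the threshold,
    or None if no alarm is raised.
    """
    s = 0.0
    for n, x in enumerate(scores):
        s = max(0.0, s + x - drift)
        if s >= threshold:
            return n
    return None


# Toy training data: (CPU utilization, memory utilization) pairs, invented for the demo.
train = [(0.20, 0.30), (0.25, 0.35), (0.70, 0.75),
         (0.75, 0.80), (0.97, 0.98), (0.99, 0.99)]
centers = kmeans(train, centers=[(0.2, 0.3), (0.7, 0.7), (0.95, 0.95)])

# Score each incoming sample 1 if it is not normal, else 0 (an assumed scoring rule).
stream = [(0.22, 0.31), (0.72, 0.78), (0.71, 0.74), (0.21, 0.33),
          (0.74, 0.79), (0.73, 0.77), (0.76, 0.81)]
labels = [classify(s, centers) for s in stream]
scores = [0 if label == "normal" else 1 for label in labels]
print(labels)
print(cusum_alarm(scores, drift=0.5, threshold=2.0))
```

The drift term keeps isolated abnormal samples from accumulating into an alarm, so only a sustained run of abnormal readings reaches the threshold. In a live collector, `sampling_interval` would decide how long to wait before taking the next sample.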
[Degree-granting institution]: Dalian University of Technology
[Degree level]: Master's
[Year degree conferred]: 2016
[Classification number]: TP302
Article ID: 1752948
Article link: http://www.sikaile.net/kejilunwen/jisuanjikexuelunwen/1752948.html