天堂国产午夜亚洲专区-少妇人妻综合久久蜜臀-国产成人户外露出视频在线-国产91传媒一区二区三区

基于圖論的復雜網絡社團挖掘與結構分析

發(fā)布時間:2017-12-26 20:35

  本文關鍵詞:基于圖論的復雜網絡社團挖掘與結構分析 出處:《西安電子科技大學》2016年博士論文 論文類型:學位論文


  更多相關文章: 圖理論 社團挖掘 結構分析 復雜網絡 算法


【摘要】:網絡科學是當前研究現實世界事物之間關系的有力工具之一。將復雜系統建模為復雜網絡是進一步分析和研究現實世界對象之間關系的前提,其中點表示復雜系統中的對象,邊表示對象之間的關系。面對建模成的復雜網絡,迫切的問題是探索網絡中蘊含的信息,從中等尺度角度揭示網絡的組成及其物理意義,即如何從復雜網絡中提取元數據組。元數據組是一組具有真實物理意義的節(jié)點,可通過實驗驗證節(jié)點的各種子集是否具有相同或相似的功能提取元數據組。由于材料成本、人工成本、時間成本等客觀限制,面對當前的海量數據這種枚舉式的驗證是不可能完成的任務。因此人們將目光轉向計算的方法,具體的流程是先將要研究的復雜系統建模成復雜網絡,對其中的元數據組進行建模如通常建模為社團或模塊,并設計相應的社團挖掘算法。通過計算方法,可方便、快速、低成本地提取復雜系統中的元數據組。本質上計算方法包含兩個核心科學問題:首先是如何定義社團,即根據網絡中元數據組的特性怎樣描述社團的問題,如傳統地將社團定義為內部邊連接緊密外部邊稀疏的稠密子圖;其次是如何設計出高效的社團挖掘算法。本文主要基于圖論圍繞社團挖掘的兩個核心科學問題開展研究,提出了邊反三角中心性并基于此設計了社團挖掘算法EACH,定義了兩種新型社團,即Cograph社團和2-club社團,并設計了相應的挖掘算法EPCA和DIVANC,最后給出2-club社團在蛋白質相互作用網絡中功能模塊結構分析中的應用。主要內容詳述如下:1.針對當前經典的社團挖掘算法主要基于內緊外松的社團設計,且較多的考慮如何設計衡量社團‘內緊’的相關指標而對‘外松’關注不夠的問題。本文從路徑外切的角度提出用來衡量社團外松程度的邊反三角中心性,進一步基于此設計劃分的社團挖掘算法EACH。邊反三角中心性定義為該邊參與形成的P4(由4個節(jié)點,3條邊組成的簡單路徑)數目與該邊參與的可能的P4數目之比。EACH迭代地刪去邊反三角中心性最高的邊直到當前所有邊的邊反三角中心性全部為0,并基于孤立點處理策略將刪邊過程中的孤立點加入到當前合適的連通分支中,輸出的連通分支即為社團。該算法簡單高效,挖掘的社團緊湊,物理意義顯著。2.以稠密子圖為中尺度結構研究復雜網絡中元數據組時忽略了元數據組內部的結構,然而其對理解該元數據組的功能形成機制及運作模式至關重要。為了探索元數據組各成員之間的內部拓撲結構,定義了具有圖論特征的Cograph社團,其結構唯一地對應著Cotree;贑otree,可進一步清晰地揭示Cograph社團中規(guī)模和層次可變且結構等價的子結構,為用戶提供從微觀尺度到中觀尺度連續(xù)觀察社團內部結構構造的獨特視角。本文提出邊P4中心性并基于此設計了挖掘Cograph社團的劃分算法EPCA。邊P4中心性定義為該邊參與形成P4的數目,由于P4是由4個節(jié)點3條邊順序相連組成的簡單無環(huán)路徑,所以若邊P4中心性較大,這條邊可能是連接社團之間的邊。EPCA迭代地刪去邊P4中心性最高的邊,刪到當前網絡中所有邊的邊P4中心性為0停時結束,當前連通分支全為Cograph,輸出其作為Cograph社團。源于邊P4中心性,EPCA計算成本低、無參數、不依賴任何外部的度量。3.面對當前復雜網絡社團挖掘研究中遇到的挑戰(zhàn):內緊外松的稠密子圖并不能刻畫元數據組的本質特征,以及受蛋白質相互作用網絡中的元數據組由稠密的相互作用三元組模體組成啟發(fā),本文定義了富含相互作用三元組的2-club社團刻畫復雜網絡中的元數據組。2-club社團是連通的2-club,而2-club是直徑為2的子圖;谶呅∩持行男栽O計了算法DIVANC挖掘2-club社團,進一步提出簡單的兩步重疊策略將DIVANC擴展成可挖掘重疊的2-club社團。構造邊小生境中心性時考慮其參與的P4數目和其參與形成的三角形數目。邊小生境中心性較大,該邊可能是社團之間的邊。DIVANC迭代地刪去邊小生境中心性最高的邊,刪到當前網絡中所有邊的邊小生境中心性為0停時結束,當前連通分支全為2-club,輸出非重疊2-club社團,若需要重疊2-club社團,執(zhí)行兩步重疊策略,得到重疊2-club社團。4.透徹地理解PPI網絡中的功能模塊仍然是當前系統生物學研究中的一個重要挑戰(zhàn)。針對當前大多數關于模塊的分析主要集中在如何從整個PPI網絡中挖掘具有生物意義的功能模塊,而較少關注其內部結構,沒有給出其內部結構與功能的關聯分析的問題,本文基于圖論分析功能模塊的內部結構,揭示功能模塊的相關特征情況,并進一步推斷其在整個PPI網絡中的功能,給出了理解PPI網絡中功能模塊組織結構和特性分布的新視角。
[Abstract]:Network science is one of the powerful tools to study the relationship between things in the real world. Modeling complex systems as complex networks is a prerequisite for further analysis and study of relationships between real-world objects, where points represent objects in complex systems, and edges represent relations between objects. Faced with the complex network modeled, the urgent problem is to explore the information contained in the network, and reveal the composition and physical meaning of the network from a moderate scale. That is to say, how to extract metadata from complex networks. Metadata group is a set of nodes that have real physical meaning. It can verify whether all kinds of subset of nodes have the same or similar functions to extract metadata group by experiment. Because of the objective limitations of material cost, labor cost and time and cost, it is impossible to complete the enumerated verification in the face of the current mass data. Therefore, people turn their attention to the calculation method. The specific process is to first model the complex system to be complex network, and model the metadata group, such as usually modeling as a community or module, and design the corresponding community mining algorithm. Through the calculation method, the metadata group in the complex system can be extracted conveniently, fast and low. The essence of calculation method consists of two core scientific problems: the first is how to define the community, according to the characteristics of the network in data set to describe associations, such as the traditional community is defined as the internal side is tightly connected with the external side of the sparse dense sub graph; second is how to design an efficient algorithm for mining association. In this paper, two key scientific problems in graph theory on Community Mining Based on the research, put forward and reverse triangle edge center is designed based on this association mining algorithm EACH, defines two new clubs, namely the Cograph community and the 2-club community, and designed the corresponding mining algorithm EPCA and DIVANC application are presented in the 2-club community the function in the protein interaction network module in structural analysis. The main contents are as follows: 1., aiming at the current classical community mining algorithm, mainly based on tight and loose community design, and much consideration is given to how to design related indicators to measure the "tightness" of the society. From the angle of path cutting, this paper puts forward the edge anti triangle centrality which is used to measure the degree of community loosening, and further is based on the association mining algorithm EACH of this design. The side inverse triangle centrality is defined as the ratio of the number of P4 (a simple path composed of 4 nodes, 3 sides) to the possible number of P4 that the side participates in. EACH iteratively cuts out the top edge of the reverse triangle center, until the edges of all the edges are all 0, and based on the outlier processing strategy, we add the outliers in the edge deletion process to the suitable connected branches, and the output connected components are the communities. The algorithm is simple and efficient, the mining community is compact and the physical meaning is significant. 2., the dense subgraph as a mesoscale structure is used to study the meta data set in complex network. It ignores the internal structure of metadata group. However, it is very important for understanding the function mechanism and operation mode of the metadata group. In order to explore the internal topology between the members of the metadata group, the Cograph community with graph theory features is defined, whose structure is uniquely corresponding to the Cotree. Based on Cotree, we can further reveal clearly the variable and equivalent sub structure of Cograph community in terms of scale and hierarchy, and provide users with a unique perspective to continuously observe the internal structure of communities from microscale to mesoscale. In this paper, the edge P4 centrality is proposed and based on this, the partition algorithm for mining Cograph community is designed, EPCA. The edge P4 centrality is defined as the number of P4 that the side participates in, because P4 is a simple acyclic path consisting of 4 nodes and 3 edges sequentially, so if the center of the edge P4 is large, the edge may connect the edges between the communities. EPCA iteratively eliminates the edges with the highest P4 edge, and deletes the edges of all the edges in the current network. The P4 centrality is 0 at the end of the stopping time, and the current connected branches are all Cograph, and output them as Cograph communities. Due to the edge P4 centrality, EPCA has low computing cost, no parameter, and no external measurement. 3. in the face of the current complex network community mining research challenges: the essential characteristics of the tight loose dense subgraph can not describe the metadata group, and metadata group in protein interaction networks by the interaction of dense three tuple module body inspiration, this paper in the interaction of three tuple meaning 2-club Association describe the set of metadata in complex networks. The 2-club community is a connected 2-club, and 2-club is a subgraph with a diameter of 2. Based on the niche niche design, we design algorithm DIVANC to mine 2-club communities, and further propose a simple two step overlap strategy to extend DIVANC to 2-club communities that can be mined. The number of P4 and the number of triangles involved in the formation of the niche centrality are considered. The side niche is centrality, which may be the edge of the community. DIVANC iteratively deleting edges of the highest niche center, to delete all the edges in the network side of the center of the end of the 0 niche to stop, the connectedness of all 2-club, the output of non overlapping 2-club community, if need to overlap the 2-club community, the implementation of the two step strategy to overlap, overlapping the 2-club community. 4. a thorough understanding of the functional modules in the PPI network is still an important challenge in the current research of system biology. For most of the current on the module of the analysis focused on how to dig out from the entire PPI network function module has a biological significance, and less concerned about its internal structure, correlation analysis has given its internal structure and function of the problem, this paper analyze the internal structure of graph theory based on function module, function module is revealed
【學位授予單位】:西安電子科技大學
【學位級別】:博士
【學位授予年份】:2016
【分類號】:TP311.13;O157.5

【相似文獻】

相關期刊論文 前10條

1 鐘柯;肖昱;許s,

本文編號:1338770


資料下載
論文發(fā)表

本文鏈接:http://www.sikaile.net/shoufeilunwen/jckxbs/1338770.html


Copyright(c)文論論文網All Rights Reserved | 網站地圖 |

版權申明:資料由用戶f6034***提供,本站僅收錄摘要或目錄,作者需要刪除請E-mail郵箱bigeng88@qq.com