稀疏數(shù)據(jù)的貝葉斯網(wǎng)絡(luò)結(jié)構(gòu)學(xué)習(xí)
本文選題:稀疏數(shù)據(jù) 切入點(diǎn):貝葉斯網(wǎng)絡(luò) 出處:《山東師范大學(xué)》2017年碩士論文
【摘要】:圖模型被廣泛用于表示和分析隨機(jī)變量之間的因果關(guān)系以及條件獨(dú)立性.圖模型中主要包括有向無環(huán)圖,無向圖和鏈圖.其中有向無環(huán)圖,也被稱作貝葉斯網(wǎng)絡(luò),圖中的邊都是有向邊,并且不能構(gòu)成有向環(huán).貝葉斯網(wǎng)絡(luò)用來描述隨機(jī)變量的因果關(guān)系.本文主要提出對(duì)貝葉斯網(wǎng)絡(luò)結(jié)構(gòu)進(jìn)行學(xué)習(xí)的算法.貝葉斯網(wǎng)絡(luò)結(jié)構(gòu)學(xué)習(xí)算法主要有三類:1基于獨(dú)立性檢驗(yàn)的約束算法;2基于評(píng)分搜索的算法;3將獨(dú)立性檢驗(yàn)和評(píng)分搜索綜合利用起來的算法.2008年,Xie和Geng針對(duì)貝葉斯網(wǎng)絡(luò)結(jié)構(gòu)學(xué)習(xí)提出貝葉斯網(wǎng)絡(luò)結(jié)構(gòu)學(xué)習(xí)的遞歸分解算法.這個(gè)算法是將大規(guī)模的貝葉斯網(wǎng)絡(luò)結(jié)構(gòu)學(xué)習(xí)問題遞歸地分割為較小規(guī)模的結(jié)構(gòu)學(xué)習(xí)的問題.這個(gè)算法主要應(yīng)用無向獨(dú)立圖的構(gòu)建,這就導(dǎo)致了它有兩個(gè)困難:一是在數(shù)據(jù)稀疏,變量較多的情況下無向獨(dú)立圖的構(gòu)建不夠準(zhǔn)確;二是當(dāng)變量較多時(shí),無向圖的構(gòu)建也是比較復(fù)雜的.2013年Cai等人提出了可測(cè)因果分割算法(Scalable cAusation Discovery Al-gorithm, SADA) .這個(gè)算法將變量集 V 分割成三個(gè)集合 (V1,C,V2),其 中只要保證在給定C的情況下,V1與V2之間沒有邊直接相連即可,并不需要它們條件獨(dú)立,所以能夠解決數(shù)據(jù)稀疏的困難.但是,Cai的算法在合并的過程中有可能出現(xiàn)假邊,針對(duì)這個(gè)問題,本論文提出一種再學(xué)習(xí)的檢查學(xué)習(xí)算法,這種算法的提出,結(jié)合了Cai和Xie算法中的優(yōu)勢(shì),解決了稀疏數(shù)據(jù)貝葉斯網(wǎng)絡(luò)結(jié)構(gòu)學(xué)習(xí)的問題.本文提出的算法,首先將一個(gè)貝葉斯網(wǎng)絡(luò)的變量集合不斷地調(diào)用可測(cè)因果分割算法進(jìn)行分割;然后在每一組因果分割上,先進(jìn)行局部結(jié)構(gòu)學(xué)習(xí)再進(jìn)行合并得到可能有假邊存在整體結(jié)構(gòu);最后尋找因果分割集及其鄰居集合,在這之上調(diào)用再學(xué)習(xí)檢查算法,進(jìn)行修正學(xué)習(xí)以得到正確貝葉斯網(wǎng)絡(luò)的骨架圖,最后利用Meek準(zhǔn)則確定出等價(jià)類來.
[Abstract]:Graph models are widely used to express and analyze causality and conditional independence between random variables.The graph model mainly includes directed acyclic graph, undirected graph and chain graph.The directed acyclic graph is also called Bayesian network. The edges of the graph are directed edges and cannot form directed rings.Bayesian networks are used to describe the causality of random variables.In this paper, a learning algorithm for Bayesian network structure is proposed.There are three kinds of Bayesian network structure learning algorithms: one is a constraint algorithm based on independence test; the other is an algorithm based on score search. In 2008, Xie and Geng aimed at BayesianA recursive decomposition algorithm for learning Bayesian network structure is proposed.This algorithm recursively divides the large-scale Bayesian network structure learning problem into smaller scale structural learning problems.This algorithm mainly applies the construction of undirected independent graph, which leads to two difficulties: one is that the construction of undirected independent graph is not accurate enough when the data is sparse, and the other is that when there are more variables, the construction of undirected independent graph is not accurate enough.The construction of undirected graph is also complicated. In 2013, Cai et al proposed scalable cAusation Discovery algorithm (SADAA).In this algorithm, the variable set V is divided into three sets: v _ 1 / C _ 1 / V _ 2, which ensures that there is no direct connection between V _ 1 and V _ 2 under a given C condition, and they do not need to be conditional independent, so the difficulty of data sparsity can be solved.However, false edges may appear in the merging process of Cai algorithm. In view of this problem, this paper proposes a relearning check learning algorithm, which combines the advantages of Cai and Xie algorithms.The problem of sparse data Bayesian network structure learning is solved.The algorithm proposed in this paper firstly segments a set of Bayesian network variables by calling the testable causality segmentation algorithm continuously, and then on each set of causal segmentation,The local structure learning is carried out first, then the global structure with false edges is obtained. Finally, the causal partition set and its neighbor set are found, and then the learning and re-checking algorithm is called.The correct skeleton diagram of Bayesian network is obtained by modified learning, and the equivalent class is determined by using Meek criterion.
【學(xué)位授予單位】:山東師范大學(xué)
【學(xué)位級(jí)別】:碩士
【學(xué)位授予年份】:2017
【分類號(hào)】:F224
【相似文獻(xiàn)】
相關(guān)期刊論文 前1條
1 祝琴;戴愛明;;高維稀疏數(shù)據(jù)對(duì)象—屬性的非關(guān)聯(lián)子空間分析[J];中國管理信息化;2011年09期
相關(guān)會(huì)議論文 前2條
1 李博多;李建中;高宏;彭麗萍;;一種支持大規(guī)模稀疏數(shù)據(jù)表上相似性查詢的索引設(shè)計(jì)[A];第二十五屆中國數(shù)據(jù)庫學(xué)術(shù)會(huì)議論文集(一)[C];2008年
2 陳郁馨;程序;趙鵬;孟必平;李紅燕;王騰蛟;;云環(huán)境中一種面向海量稀疏數(shù)據(jù)存儲(chǔ)的缺失值處理方法[A];第29屆中國數(shù)據(jù)庫學(xué)術(shù)會(huì)議論文集(B輯)(NDBC2012)[C];2012年
相關(guān)博士學(xué)位論文 前2條
1 曾玉華;稀疏數(shù)據(jù)恢復(fù)的結(jié)構(gòu)優(yōu)化模型及其算法研究[D];湖南大學(xué);2016年
2 袁宇;稀疏數(shù)據(jù)條件下河流入海污染物通量的估算[D];東北大學(xué);2009年
相關(guān)碩士學(xué)位論文 前10條
1 楊玉;稀疏數(shù)據(jù)的貝葉斯網(wǎng)絡(luò)結(jié)構(gòu)學(xué)習(xí)[D];山東師范大學(xué);2017年
2 陳婷婷;面向稀疏數(shù)據(jù)的微分隱私數(shù)據(jù)發(fā)布[D];福州大學(xué);2014年
3 劉帥;高維稀疏數(shù)據(jù)的相關(guān)性度量方法研究[D];首都經(jīng)濟(jì)貿(mào)易大學(xué);2014年
4 尹松;高屬性維稀疏數(shù)據(jù)動(dòng)態(tài)抽象聚類方法研究[D];廣西大學(xué);2005年
5 張珍;貝葉斯Meta-分析[D];山東師范大學(xué);2017年
6 任健;基于壓縮感知的非均勻空間立體陣SAR三維層析成像[D];哈爾濱工業(yè)大學(xué);2014年
7 李曉萍;有約束條件優(yōu)化問題的MM算法[D];蘭州大學(xué);2017年
8 田正東;基于子空間分析的DOA估計(jì)算法研究[D];南京郵電大學(xué);2017年
9 趙程檐;花授粉算法的研究及應(yīng)用[D];廣西民族大學(xué);2017年
10 葉曉平;高階多模型狀態(tài)估計(jì)算法及應(yīng)用[D];哈爾濱工業(yè)大學(xué);2017年
,本文編號(hào):1728547
本文鏈接:http://www.sikaile.net/jingjifazhanlunwen/1728547.html