含有計(jì)數(shù)量詞的圖模式匹配研究與實(shí)現(xiàn)

發(fā)布時(shí)間：2019-03-09 18:32

【摘要】：計(jì)數(shù)量詞作為一種增強(qiáng)表達(dá)的方式,加入到圖模式匹配中可以更加準(zhǔn)確地描述客觀世界。通過簡單地在查詢圖的邊上附加計(jì)數(shù)量詞可以很自然的表達(dá)出全部和存在量化表達(dá),比例和數(shù)量聚集表達(dá),以及否定表達(dá)。這種增加了表達(dá)能力的圖模式匹配在社會(huì)媒體營銷,知識(shí)發(fā)現(xiàn),網(wǎng)絡(luò)安全等領(lǐng)域都有廣泛且直接的應(yīng)用。關(guān)于量詞化的圖模式匹配問題的研究目前仍然比較新,而且該問題本身泛化了傳統(tǒng)圖模式匹配問題條件,傳統(tǒng)圖模式匹配問題僅屬于本文問題的一個(gè)特例。因而本文所研究的量詞化的圖模式匹配問題復(fù)雜度遠(yuǎn)難于傳統(tǒng)的圖模式匹配。用邊上計(jì)數(shù)量詞是否包含否定計(jì)數(shù)量詞可以將該問題劃分成兩個(gè)復(fù)雜度相異的兩個(gè)子問題,第一類子問題模式圖邊上只包含肯定的計(jì)數(shù)量詞,它的復(fù)雜度可以證明是NP完全的,并且在特定條件下復(fù)雜度的上下界都為NP難,而含有否定計(jì)數(shù)量詞邊的匹配問題復(fù)雜度則是DP完全的,特定條件下復(fù)雜度上下界為DP難,如此高的復(fù)雜度使得蠻力法解決該問題的開銷無法令人接受。本文實(shí)現(xiàn)了(1)正QGP問題的匹配算法QMatch,該算法通過調(diào)用子圖同構(gòu)子算法而改進(jìn)實(shí)現(xiàn),本文通過實(shí)驗(yàn)驗(yàn)證發(fā)現(xiàn)算法在模式圖規(guī)模在4以內(nèi)時(shí)的匹配時(shí)間是比較好的,規(guī)模變大后的匹配時(shí)間呈指數(shù)式增長。(2)負(fù)QGP問題的匹配算法Inc QMatch,該算法避免了直接驗(yàn)證負(fù)計(jì)數(shù)量詞的高復(fù)雜度,在正問題解決的基礎(chǔ)上,利用集差思想,構(gòu)建兩個(gè)新的查詢圖結(jié)構(gòu)得到最后的結(jié)果集,通過記錄終結(jié)結(jié)果集的方式可以避免第二次查詢從頭開始有效地減少了運(yùn)行時(shí)間,并且通過多線程的方式對(duì)該問題的匹配實(shí)現(xiàn)了并行化,實(shí)驗(yàn)表明當(dāng)邊上的負(fù)計(jì)數(shù)量詞個(gè)數(shù)只有一時(shí),Inc QMatch的時(shí)間開銷在查詢圖規(guī)模超過4以后同樣呈顯指數(shù)式的增長。(3)利用哈希編碼技術(shù)給節(jié)點(diǎn)周圍邊標(biāo)簽和節(jié)點(diǎn)標(biāo)簽進(jìn)行編碼,通過與的手段壓縮周圍結(jié)構(gòu)信息,并將周圍的標(biāo)簽種類信息壓縮到一串二進(jìn)制字串當(dāng)中,通過或運(yùn)算匹配節(jié)點(diǎn),得到小規(guī)模候選集合,在本文實(shí)驗(yàn)中,使用8+8位編碼,可以過濾掉95%的不匹配,優(yōu)化效率非常明顯,且在查詢圖規(guī)模是6的時(shí)候都表現(xiàn)出了小于10秒的運(yùn)行時(shí)間,優(yōu)化效果非常顯著。
[Abstract]:Counting quantifiers, as an enhanced expression, can describe the objective world more accurately when added to graph pattern matching. By simply appending quantifiers to the edges of the query graph, it is natural to express all and existing quantitative expressions, proportion and number aggregation, and negative expressions. This kind of graph pattern matching, which increases the expression ability, is widely and directly applied in the fields of social media marketing, knowledge discovery, network security and so on. The research on quantized graph pattern matching problem is still relatively new, and the problem itself generalizes the condition of traditional graph pattern matching problem, and the traditional graph pattern matching problem is only a special case of this problem. Therefore, the complexity of quantized graph pattern matching problem studied in this paper is far more difficult than the traditional graph pattern matching problem. Whether the edge quantifier contains negative quantifier can divide the problem into two sub-problems with different complexity. The first subproblem contains only positive quantifiers on the edge of the pattern diagram, and its complexity can be proved to be NP complete. And the upper and lower bounds of complexity are difficult for NP under certain conditions, while the complexity of matching problems with negative number of words is DP complete, and the upper and lower bounds of complexity are difficult for DP under specific conditions. Such high complexity makes the cost of brute force solving the problem unacceptable. In this paper, (1) the matching algorithm QMatch, for positive QGP problem is implemented. This algorithm is improved by calling the subgraph isomorphism sub-algorithm. The experiment shows that the matching time of the algorithm is better when the scale of the pattern graph is less than 4, and the matching time of the algorithm is better when the scale of the pattern graph is less than 4. The matching time increases exponentially after the scale increases. (2) the matching algorithm Inc QMatch, of the negative QGP problem avoids the high complexity of verifying the negative quantifier directly, on the basis of solving the positive problem, the idea of set difference is used. Building two new query graph structures to get the final result set, by recording the end result set, can prevent the second query from starting from scratch effectively reducing the run time. And the problem is parallelized by multi-thread matching. The experiment shows that when the number of negative quantifiers on the edge is only one time, the number of negative quantifiers on the edge is only one time. The time cost of Inc QMatch increases exponentially after the size of the query graph exceeds 4. (3) Hash coding technology is used to encode the edge tags and node labels around the nodes and compress the surrounding structural information by means of the same method. And the surrounding label type information is compressed into a string of binary strings, and small-scale candidate sets can be obtained by or operation of matching nodes. In this experiment, 95% mismatch can be filtered out by using 8.8-bit encoding. The optimization efficiency is very obvious, and the run time is less than 10 seconds when the query graph scale is 6, and the optimization effect is very remarkable.
【學(xué)位授予單位】：哈爾濱工業(yè)大學(xué)
【學(xué)位級(jí)別】：碩士
【學(xué)位授予年份】：2017
【分類號(hào)】：TP391.41

【參考文獻(xiàn)】

相關(guān)期刊論文前6條

1 張麗霞;王偉平;高建良;王建新;;利用局部評(píng)估的分布式圖模式匹配算法[J];國防科技大學(xué)學(xué)報(bào);2016年02期

2 張麗霞;王偉平;高建良;王建新;;面向模式圖變化的增量圖模式匹配[J];軟件學(xué)報(bào);2015年11期

3 于靜;劉燕兵;張宇;劉夢雅;譚建龍;郭莉;;大規(guī)模圖數(shù)據(jù)匹配技術(shù)綜述[J];計(jì)算機(jī)研究與發(fā)展;2015年02期

4 呂雪棟;馮志勇;王鑫;饒國政;付宇新;;StepMatch:一種基于BSP計(jì)算模型的SPARQL基本圖模式匹配算法[J];計(jì)算機(jī)研究與發(fā)展;2013年S2期

5 張航;王宏志;李建中;高宏;;基于2-hop優(yōu)化的子圖模式匹配算法[J];黑龍江大學(xué)自然科學(xué)學(xué)報(bào);2010年01期

6 吳祉群;吉方;黃文;;基于遺傳算法圖像模式匹配的收斂性研究[J];現(xiàn)代電子技術(shù);2007年01期

，

本文編號(hào)：2437744

資料下載

論文發(fā)表

支付寶下載

Download by Alipay
微信下載

Download by Wechat
會(huì)員下載

Download by Member

本文鏈接：http://www.sikaile.net/kejilunwen/ruanjiangongchenglunwen/2437744.html

上一篇：數(shù)字圖像處理中二維經(jīng)驗(yàn)?zāi)Ｊ椒纸怅P(guān)鍵問題研究
下一篇：面向嵌入式系統(tǒng)的Android圖形系統(tǒng)的再設(shè)計(jì)

論文發(fā)表

·知網(wǎng)|萬方|維普|龍?jiān)磡省級(jí)|國家級(jí)|科技核心|北大核心|南大核心CSSCI|EI|SCI|SSCI|

天堂国产午夜亚洲专区-少妇人妻综合久久蜜臀-国产成人户外露出视频在线-国产91传媒一区二区三区

含有計(jì)數(shù)量詞的圖模式匹配研究與實(shí)現(xiàn)