
Research on Relational Domain Knowledge Fusion Methods for Data Mining

Published: 2018-12-10 12:21
[Abstract]: Most existing data mining techniques operate on data at the raw level. The corresponding mining methods either use no domain knowledge at all or rely on users to incorporate domain knowledge manually during knowledge discovery. Data in real application domains, however, differ in level: some data are raw, while others are closely related to other data, and an appropriate combination or generalization granularity of the related data may better reveal the underlying regularities. Making full use of the domain knowledge related to the raw data to guide data mining has therefore become an important research topic. The goal is to be able to "observe and analyze the same problem at very different granularities", to acquire knowledge at a reasonable data level, and to move back and forth between data levels freely. In real application domains, a large amount of data carries domain knowledge in the form of attribute extensions, and such domain knowledge mostly appears as relational tables. This thesis therefore focuses on the representation of relational domain knowledge and on methods for fusing it with data mining, so that knowledge discovery can be carried out automatically and effectively. The main work is as follows:
(1) A structured representation model for domain knowledge based on the relational model, DKMRM (Domain Knowledge of Multi-Relations Model), is proposed. The model uses the relational model to map or project the domain knowledge of the relevant attributes in a data table, forming context relation tables of domain knowledge and, from them, a complex multi-relational representation model. When mining a relational database system, this model and the necessary transformation strategies can generalize or instantiate certain raw data to a reasonable level, yielding forms of knowledge that better match users' individual needs.
(2) Data mining based on DKMRM is studied, and a relational domain knowledge fusion method for data mining is proposed. Taking classification as a practical case, a framework for classification mining that fuses relational domain knowledge is established. To address the limitations of traditional mining methods, the framework resolves key issues such as the transfer source, the transfer path, the termination strategy, and the statistics of transfer deviation.
(3) Two multi-relational classification mining algorithms are proposed: CC-DKMR (Classification of Characters based on Domain Knowledge of Multi-Relations), based on attribute selection, and CS-DKMR (Classification of Sheets based on Domain Knowledge of Multi-Relations), based on relational table selection. They seek mining patterns at different data granularity levels together with flexible conversion mechanisms, in order to obtain more valuable knowledge from domain knowledge. Experiments show that the method is effective.
(4) An evaluation method for mining algorithms that fuses domain knowledge in the evaluation stage of data mining is proposed. It addresses the "oracle" problem of data mining algorithms (programs), to which traditional evaluation methods are hard to adapt. Based on metamorphic testing, the method makes effective use of domain knowledge, analyzes concrete cases of classification, association, and clustering mining algorithms, and constructs metamorphic relations for the specific algorithms. Experimental results show that the method achieves the evaluation goal and can feasibly be extended to other domains.
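The generalization step described in item (1) can be pictured with a small sketch. It is not the DKMRM model itself, whose details the abstract does not give. It only assumes that a context table attaches a coarser level (here a hypothetical `province` column) to a raw attribute (`city`), so that records can be lifted to that level before mining. All table contents and function names are illustrative.

```python
from collections import Counter

# Hypothetical raw data table: one raw-level attribute and a class label.
records = [
    {"city": "Hefei", "label": "yes"},
    {"city": "Wuhu", "label": "yes"},
    {"city": "Nanjing", "label": "no"},
    {"city": "Wuxi", "label": "no"},
]

# Hypothetical context table: domain knowledge that extends "city"
# with a coarser level, stored like a relational table keyed by city.
city_context = {
    "Hefei": {"province": "Anhui"},
    "Wuhu": {"province": "Anhui"},
    "Nanjing": {"province": "Jiangsu"},
    "Wuxi": {"province": "Jiangsu"},
}


def generalize(rows, attribute, context, level):
    """Return copies of rows with `attribute` replaced by its value at `level`."""
    lifted = []
    for row in rows:
        new_row = dict(row)
        new_row[attribute] = context[row[attribute]][level]
        lifted.append(new_row)
    return lifted


def label_counts(rows, attribute):
    """Count class labels per distinct value of `attribute`."""
    counts = {}
    for row in rows:
        counts.setdefault(row[attribute], Counter())[row["label"]] += 1
    return counts


raw_view = label_counts(records, "city")
province_view = label_counts(
    generalize(records, "city", city_context, "province"), "city"
)
print(raw_view)
print(province_view)
```

At the raw level every city is supported by a single record, whereas at the province level each value is supported by two records with the same label. This is the kind of regularity that a coarser granularity can expose.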
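The metamorphic testing idea in item (4) can be illustrated as follows. When no oracle gives the correct output of a classifier for an arbitrary input, one can still check relations that must hold between the outputs of related runs. The abstract does not list the metamorphic relations the thesis constructs, so the two relations below (training-set reordering and consistent label renaming) and the toy 1-nearest-neighbour classifier are assumptions chosen for illustration only.

```python
def predict_1nn(train, query):
    """Label of the training point closest to `query` (squared Euclidean distance)."""
    nearest = min(
        train,
        key=lambda item: (item[0][0] - query[0]) ** 2 + (item[0][1] - query[1]) ** 2,
    )
    return nearest[1]


def predict_faulty(train, query):
    """Hypothetical buggy variant: only searches the first two training points."""
    return predict_1nn(train[:2], query)


def outputs(predict, train, queries):
    return [predict(train, q) for q in queries]


def holds_under_reordering(predict, train, queries):
    """Relation 1: reordering the training set must not change any prediction."""
    reordered = list(reversed(train))  # reversal is one fixed permutation
    return outputs(predict, reordered, queries) == outputs(predict, train, queries)


def holds_under_relabeling(predict, train, queries, mapping):
    """Relation 2: renaming classes consistently must rename predictions the same way."""
    renamed = [(point, mapping[label]) for point, label in train]
    expected = [mapping[label] for label in outputs(predict, train, queries)]
    return outputs(predict, renamed, queries) == expected


# Illustrative data, chosen so that no query is equidistant from two points.
train = [((0.0, 0.0), "a"), ((1.0, 0.2), "a"), ((5.0, 5.0), "b"), ((6.0, 4.5), "b")]
queries = [(0.2, 0.1), (5.5, 5.0), (2.0, 2.0)]

print(holds_under_reordering(predict_1nn, train, queries))
print(holds_under_reordering(predict_faulty, train, queries))
print(holds_under_relabeling(predict_1nn, train, queries, {"a": "x", "b": "y"}))
```

Neither check needs to know the true class of any query. The faulty variant is exposed only because its outputs change when the training set is reordered.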
[Degree-granting institution]: Hefei University of Technology
[Degree level]: Doctoral
[Year degree awarded]: 2016
[Classification number]: TP311.13

Article ID: 2370556


Article link: http://www.sikaile.net/shoufeilunwen/xxkjbs/2370556.html

