Research on Query Expansion Technology Based on Association Rules
[Abstract]: As the volume of information on the web grows rapidly, it remains difficult to find exactly the information one wants through a search engine. Low recall and low precision are pressing problems that search engines need to solve. To address them, this thesis follows Van Rijsbergen's view that retrieval performance can be improved by modifying the original query, and studies query expansion based on association rules. The main contents are as follows:
1. The thesis first introduces the background in detail: data mining, association rules, and query expansion. It then analyzes existing query expansion techniques based on association rules and points out their advantages and disadvantages. A shortcoming they share is that they pay little attention to the efficiency of the association rule mining algorithm, or to whether the chosen mining algorithm suits the task.
2. To address this shortcoming, the thesis proposes, for the first time, a query expansion algorithm based on maximal frequent itemset mining, built on vector space model retrieval. The top n documents returned by the initial retrieval are segmented into words, and the resulting terms are stored in vertical data format, so that the support of an itemset is obtained by intersecting the document-ID sets of its items. A set-enumeration tree serves as the data structure, and pruning strategies are applied while mining the maximal frequent itemsets. The mined itemsets yield the expansion lexicon, and the expansion terms are combined with the original query terms for a second retrieval. Experimental results show that the algorithm is more efficient than earlier algorithms.
3. The algorithm based on maximal frequent itemset mining assumes that the original query terms and the expansion terms are equally important, so it assigns them no weights. In addition, mining only the maximal frequent itemsets loses the support information of some frequent itemsets. To solve these problems, the thesis proposes a query expansion algorithm based on frequent closed itemsets. The algorithm uses an HT-struct linked structure and a depth-first search strategy, combined with pruning, to mine the frequent closed itemsets, derive association rules from them, and obtain the expansion lexicon. It also weights each expansion term according to the confidence of the corresponding rule. Experiments show that the algorithm is feasible and that its efficiency is improved.
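The steps described in items 2 and 3 can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the function names (`vertical_format`, `maximal_frequent_itemsets`, `expansion_weights`) are hypothetical, a naive superset check stands in for the thesis's pruning strategies, and rule confidence is computed directly from document-ID sets instead of from frequent closed itemsets stored in an HT-struct.

```python
def vertical_format(docs):
    """Vertical data format: map each term to the set of ids of documents containing it."""
    tidlists = {}
    for doc_id, terms in enumerate(docs):
        for term in set(terms):
            tidlists.setdefault(term, set()).add(doc_id)
    return tidlists


def maximal_frequent_itemsets(tidlists, min_support):
    """Depth-first walk of a set-enumeration tree over the frequent terms.

    Support of an itemset is the size of the intersection of its terms'
    document-id sets. Infrequent branches are cut; maximality is then
    checked naively (assumption: the thesis uses smarter pruning).
    """
    items = sorted(t for t, tids in tidlists.items() if len(tids) >= min_support)
    frequent = []

    def extend(prefix, prefix_tids, start):
        for i in range(start, len(items)):
            item = items[i]
            tids = tidlists[item] if prefix_tids is None else prefix_tids & tidlists[item]
            if len(tids) >= min_support:  # prune: supersets of an infrequent set are infrequent
                candidate = prefix + (item,)
                frequent.append(frozenset(candidate))
                extend(candidate, tids, i + 1)

    extend((), None, 0)
    # Keep only itemsets that have no frequent proper superset.
    return [s for s in frequent if not any(s < other for other in frequent)]


def expansion_weights(query_terms, tidlists):
    """Weight each candidate term t by confidence(query -> t).

    confidence = support(query plus t) / support(query), both obtained
    from intersections of document-id sets.
    """
    query = [t for t in query_terms if t in tidlists]
    if not query:
        return {}
    query_tids = set.intersection(*(tidlists[t] for t in query))
    if not query_tids:
        return {}
    weights = {}
    for term, tids in tidlists.items():
        if term in query:
            continue
        confidence = len(query_tids & tids) / len(query_tids)
        if confidence > 0:
            weights[term] = confidence
    return weights


# Toy stand-in for the segmented top-n documents of the first retrieval.
docs = [
    ["query", "expansion", "rule"],
    ["query", "expansion", "mining"],
    ["query", "rule", "mining"],
    ["query", "expansion", "rule", "mining"],
]
tidlists = vertical_format(docs)
maximal = maximal_frequent_itemsets(tidlists, min_support=3)
weights = expansion_weights(["query", "expansion"], tidlists)
```

In a second retrieval, the weights returned by `expansion_weights` could scale the expansion terms relative to the original query terms, which is the motivation the abstract gives for moving from maximal to frequent closed itemsets.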
[Degree-granting institution]: PLA Information Engineering University
[Degree level]: Master's
[Year degree conferred]: 2012
[Classification number]: TP311.13