Research on Query Expansion Based on Ontology and User Logs

Published: 2018-05-03 19:12

  Topic: ontology + query expansion; Source: Hunan University, master's thesis, 2013


[Abstract]: With the explosive growth of information on the Internet, it is increasingly difficult for users to find the information they actually want among the vast amount available. Search engines solve this problem to some extent. However, users often cannot express their query intent accurately: poorly chosen or overly short query terms lower the recall and precision of the search engine, which then fails to return useful information. Expanding user queries has therefore become urgent.

Query expansion has developed over several decades, and researchers in China and abroad have proposed many expansion methods. These common methods, however, usually cannot interpret user input at the semantic level, and because the source of their expansion terms is uncertain, they tend to add terms unrelated to the query, which causes the "query drift" problem. This thesis combines a domain ontology with user query logs and proposes a query expansion algorithm based on ontology and user logs. The domain ontology is used to expand the user query at the semantic level into an initial set of expansion concepts, and word co-occurrence analysis over the user query logs then filters that initial set a second time. The main contents are as follows:

(1) The research background and significance of the topic are described, the progress and shortcomings of current query expansion techniques are analyzed, and the relevant background knowledge and theory are introduced, laying the theoretical foundation for the work that follows.

(2) An ontology-based formula for the semantic similarity of concepts is proposed. It is used to compute the semantic similarity of candidate expansion terms, so that the user query is expanded at the semantic level.

(3) A word co-occurrence formula based on user logs is proposed. It is applied to the initial expansion terms, and the result serves as each term's co-occurrence weight. The semantic similarity weight and the co-occurrence weight of each expansion term are then combined in a second filtering pass, which avoids the "query drift" that the initial expansion is prone to.

(4) Based on the proposed algorithm, a prototype system is designed and implemented for the query requirements of an after-sales service tracking system for domestically produced software and hardware. The overall framework of the system and each of its modules are described. Finally, comparative experiments are carried out on the system. The results show that, compared with traditional query expansion methods, the proposed method effectively improves retrieval accuracy while maintaining good robustness.
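The two-stage process described in the abstract can be sketched roughly as follows. The abstract does not give the thesis's actual formulas, so this sketch substitutes stand-ins: a Wu-Palmer-style depth measure for the ontology similarity, the Dice coefficient for log co-occurrence, and a linear combination weighted by `alpha` for the second filtering pass. The toy ontology `PARENT`, the log `SAMPLE_LOG`, and every function name and parameter are hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy ontology (child -> parent), a tree rooted at "device".
PARENT = {
    "computer": "device",
    "printer": "device",
    "laptop": "computer",
    "server": "computer",
    "inkjet printer": "printer",
}

# Hypothetical query log: each entry is the list of terms in one past query.
SAMPLE_LOG = [
    ["laptop", "computer"],
    ["laptop", "server"],
    ["laptop", "computer", "printer"],
    ["printer", "inkjet printer"],
]


def path_to_root(concept):
    """Return [concept, parent, ..., root]."""
    path = [concept]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def depth(concept):
    """Depth in the tree, with the root at depth 1."""
    return len(path_to_root(concept))


def semantic_similarity(a, b):
    """Stand-in measure (Wu-Palmer style): 2*depth(lcs) / (depth(a) + depth(b))."""
    ancestors_a = set(path_to_root(a))
    # First ancestor of b that also subsumes a is the lowest common subsumer.
    lcs = next(c for c in path_to_root(b) if c in ancestors_a)
    return 2 * depth(lcs) / (depth(a) + depth(b))


def cooccurrence_weights(query_log):
    """Stand-in measure (Dice): 2*n(a, b) / (n(a) + n(b)), counted per query."""
    term_count, pair_count = Counter(), Counter()
    for query in query_log:
        terms = set(query)
        term_count.update(terms)
        pair_count.update(frozenset(p) for p in combinations(sorted(terms), 2))

    def weight(a, b):
        total = term_count[a] + term_count[b]
        return 2 * pair_count[frozenset((a, b))] / total if total else 0.0

    return weight


def expand_query(term, query_log, sim_threshold=0.5, alpha=0.6, top_k=2):
    """Stage 1: ontology-based candidates. Stage 2: rescore with log co-occurrence."""
    concepts = set(PARENT) | set(PARENT.values())
    initial = {}
    for concept in concepts - {term}:
        sim = semantic_similarity(term, concept)
        if sim >= sim_threshold:  # initial expansion concept set
            initial[concept] = sim

    cooc = cooccurrence_weights(query_log)
    scored = {
        c: alpha * sim + (1 - alpha) * cooc(term, c)  # assumed linear combination
        for c, sim in initial.items()
    }
    # Highest combined score first; ties broken alphabetically for determinism.
    return sorted(scored.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]


if __name__ == "__main__":
    for concept, score in expand_query("laptop", SAMPLE_LOG):
        print(f"{concept}: {score:.3f}")
```

In this toy run, "device" passes the ontology threshold for the query term "laptop" but never co-occurs with it in the log, so the second pass ranks it below "computer" and "server" and it is cut by `top_k`. That is the kind of drift-prone candidate the second filtering pass is meant to remove.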
[Degree-granting institution]: Hunan University
[Degree level]: Master's
[Year degree granted]: 2013
[Classification number]: TP391.1



Article link: http://www.sikaile.net/kejilunwen/sousuoyinqinglunwen/1839733.html

