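Research on a User-Preference-Oriented Web Search Ranking Model
Published: 2018-02-28 19:37
Keywords: information retrieval; user preference; PageRank algorithm; search engine. Source: Tianjin University of Technology (天津理工大學), master's thesis, 2013. Document type: degree thesis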
[Abstract]: With the development and growing popularity of Internet technology, Web search provides people with abundant information and convenient services. To improve service quality, however, the question of how to return results that better match the current user's request has become a hot research topic in computer application technology. If a Web information retrieval system can identify the current user's interest preferences and query intent, the results it retrieves are highly relevant to the purpose of the query, and because the query topic is known, search speed also improves to some extent. This thesis therefore studies the current user's interest preferences and query topics.

First, to address topic drift and concept ambiguity in traditional Web search systems, a ranking optimization method based on user feedback is proposed. It uses user-related feedback to refine the user's query keywords, which reduces the potential ambiguity of the query terms; the refined query then serves as the basis for the search.

Second, to address users' growing demand for specialized services from Web search systems, a search ranking method based on user query intent is proposed. It models the problem with a Markov chain and, drawing on stochastic complement theory, combines user preferences and query intent with existing search ranking techniques, improving query efficiency with the aim of meeting users' query needs.

Finally, to address the problems and shortcomings of the theoretical study, the thesis builds a Heritrix-based search system on the Lucene development platform. Drawing on an analysis of key techniques in the crawling process, such as page relevance, concepts, and links, together with the ranking methods proposed in the thesis, it constructs a Web search system intended to provide users with higher-quality information retrieval services. Simulation experiments are used to verify the improvements needed for the performance of a practical search system.
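The abstract does not give the thesis's actual ranking formulas, so the following is only an illustrative sketch of the general idea it describes: a Markov chain over pages whose random jump is biased toward a user's preferences. It uses standard personalized PageRank, which is a swapped-in technique and not the author's stochastic-complement method. The function name `personalized_pagerank`, the toy link graph, and the preference weights are all assumptions made for illustration.

```python
def personalized_pagerank(links, preference, damping=0.85, tol=1e-10, max_iter=200):
    """Power iteration on a Markov chain whose teleport step follows `preference`.

    links:      dict mapping each page to a list of pages it links to (hypothetical data)
    preference: dict mapping pages to non-negative user-interest weights (hypothetical data)
    """
    # Collect every page that appears as a source or as a link target.
    pages = sorted(set(links) | {q for outs in links.values() for q in outs})
    total = sum(preference.get(p, 0.0) for p in pages)
    if total <= 0:
        raise ValueError("preference must give positive weight to at least one page")
    # Normalised teleport distribution: where the surfer jumps when not following links.
    teleport = {p: preference.get(p, 0.0) / total for p in pages}

    rank = dict(teleport)
    for _ in range(max_iter):
        # Rank held by pages without out-links is redistributed via the teleport vector.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1.0 - damping + damping * dangling) * teleport[p] for p in pages}
        for p in pages:
            outs = links.get(p, [])
            if outs:
                share = damping * rank[p] / len(outs)
                for q in outs:
                    new_rank[q] += share
        delta = sum(abs(new_rank[p] - rank[p]) for p in pages)
        rank = new_rank
        if delta < tol:
            break
    return rank


# Toy example: page "d" has no in-links, so only the teleport step can reach it.
links = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
uniform = personalized_pagerank(links, {p: 1.0 for p in links})
biased = personalized_pagerank(links, {"d": 1.0})
for page in sorted(links):
    print(page, round(uniform[page], 4), round(biased[page], 4))
```

In this sketch the user's preference only changes the teleport distribution. A system like the one described in the abstract would presumably also feed query intent and feedback into the ranking, which this example does not attempt.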
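[Degree-granting institution]: Tianjin University of Technology (天津理工大學)
[Degree level]: Master's
[Year degree awarded]: 2013
[Classification number]: TP391.3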
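[Citing literature]
Related doctoral dissertations (1 entry)
1 史斌 (Shi Bin); Research on Key Technologies of Semantic Search Engines for the Semantic Web [D]; Beijing University of Technology; 2010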
Article ID: 1548663
Article link: http://www.sikaile.net/kejilunwen/sousuoyinqinglunwen/1548663.html