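Search Optimization Based on the Session Process
Published: 2018-07-05 06:14
Topics: information retrieval + session Markov random field. Source: Beijing University of Posts and Telecommunications, master's thesis, 2013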
[Abstract]: With the explosive growth of information on the Internet, search engines play a vital role in finding information online. When faced with massive data, however, traditional search algorithms have limitations. First, keyword-oriented search demands considerable skill from users in formulating queries. Second, matching a user's short query against a massive body of information yields low precision and recall. Finally, general-purpose search algorithms cannot provide personalized retrieval services. To address these problems, this thesis takes session information as its object and studies search optimization based on the session process.

The session process refers to the series of query reformulations and interactions with search results that a user carries out during a search in order to satisfy a predefined search need, including clicks on the search results page, browsing time, and similar information. Building on session information, this thesis proposes a session retrieval model based on Markov random fields in order to optimize search. The main work covers the following aspects.

First, with Markov random fields as the theoretical foundation, a retrieval model oriented to the session process is constructed. By analyzing users' search behavior patterns and starting from the temporal characteristics of the session process, a dynamically evolving session retrieval model is built.

Second, on the basis of an analysis of linguistic characteristics, the thesis studies how the term dependence assumption affects the optimization of session retrieval. Starting from the full independence pattern (FIP) and the sequential dependence pattern (SDP) of terms, two kinds of session retrieval models, FISM and SDSM, are constructed, and the influence of the term dependence assumption on session retrieval is then examined.

Third, on the basis of a categorization of session information, the thesis studies the influence of each type of session information on retrieval. Session information is divided into two categories: historical queries (HQ) and historically clicked web pages (HC). Through the definition of the session retrieval model, four ways of constructing query elements, E(Qi), E(Ci), E(Qi+Ci), and E(WAFi), are used to combine each type of historical information effectively with the retrieval process.

Fourth, with word activation force as the theoretical foundation, query expansion is performed using session information, and the effectiveness of the session retrieval model based on word activation force is studied.

For the research points above, the thesis designs and implements categorized experiments on the session retrieval models. The experimental results show that the session retrieval model based on Markov random fields can optimize search.
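The abstract does not give the model's equations, so the following is only a minimal sketch of the ideas it names, not the thesis's implementation. It assumes that the full independence and sequential dependence variants follow the usual Markov random field retrieval formulation, in which a document score is a weighted sum of smoothed log-probability features. All function names, the weights (0.85, 0.15, 0.3), the smoothing constants, and the way history terms are mixed into the score are hypothetical choices made for illustration. The word activation force variant E(WAFi) is left out because the abstract does not define it.

```python
import math

MU = 10.0      # Dirichlet smoothing strength (assumed toy value)
FLOOR = 1e-6   # background probability for unseen terms and pairs (assumed)


def smoothed_log_prob(count, doc_len, background_prob):
    """Log of a Dirichlet-smoothed estimate of count / doc_len."""
    return math.log((count + MU * background_prob) / (doc_len + MU))


def ordered_pair_count(doc, first, second):
    """Number of positions where `first` is immediately followed by `second`."""
    return sum(1 for a, b in zip(doc, doc[1:]) if a == first and b == second)


def fi_score(query, doc, background):
    """Full independence: each query term contributes on its own."""
    return sum(
        smoothed_log_prob(doc.count(t), len(doc), background.get(t, FLOOR))
        for t in query
    )


def sd_score(query, doc, background, w_term=0.85, w_pair=0.15):
    """Sequential dependence: adds a feature for adjacent query-term pairs."""
    pair_part = sum(
        smoothed_log_prob(ordered_pair_count(doc, a, b), len(doc), FLOOR)
        for a, b in zip(query, query[1:])
    )
    return w_term * fi_score(query, doc, background) + w_pair * pair_part


def session_terms(past_queries, clicked_pages, mode):
    """Collect history terms from past queries ('Q'), clicked pages ('C'), or both ('Q+C')."""
    terms = []
    if mode in ("Q", "Q+C"):
        terms += [t for q in past_queries for t in q]
    if mode in ("C", "Q+C"):
        terms += [t for page in clicked_pages for t in page]
    return terms


def session_score(query, history, doc, background, history_weight=0.3):
    """Current-query score plus a down-weighted average over history terms (assumed mixing rule)."""
    score = sd_score(query, doc, background)
    if history:
        score += history_weight * fi_score(history, doc, background) / len(history)
    return score


# Toy data: both documents contain the same terms, but only doc_a keeps
# "markov random" adjacent and in order.
doc_a = ["markov", "random", "field", "retrieval"]
doc_b = ["random", "retrieval", "markov", "field"]
query = ["markov", "random"]
history = session_terms([["session", "search"]], [["retrieval", "model"]], "Q+C")
```

In this toy setup `fi_score` cannot tell `doc_a` from `doc_b`, because both contain the same terms equally often, while `sd_score` prefers `doc_a`, where the query terms appear in order. Passing `history` to `session_score` additionally rewards documents that share terms with earlier queries or clicked pages, which is one plausible reading of the E(Qi), E(Ci), and E(Qi+Ci) constructions.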
[Degree-granting institution]: Beijing University of Posts and Telecommunications
[Degree level]: Master's
[Year degree awarded]: 2013
[Classification number]: TP391.3