Research and Application of an Enterprise-Level Meta Search Engine

Published: 2018-03-06 07:37

  Topic: enterprise-level  Entry point: meta search engine  Source: Fudan University, 2012 master's thesis  Type: degree thesis


[Abstract]: With the sustained, rapid progress of global informatization, information of every kind, stored as electronic files, has grown explosively on people's storage devices. At the same time, people expect more from information retrieval. Finding the desired information quickly and accurately in a vast sea of electronic files has become a hard problem. The emergence and development of search engines addresses this problem to some extent, but search engines have their own limitations: no single search engine can be expected to satisfy users' varying search needs across different scenarios.

The search scenario addressed in this thesis is enterprise-level search within a corporate group. Each branch company in the group maintains its own search engine, which provides full-text document search within that branch, yet the group also needs to search across the documents of all branches. This calls for a group-oriented, enterprise-level meta search engine. Unlike a web meta search engine, the main concern here is the meta-search document ranking algorithm for a specific enterprise scenario.

This thesis surveys classic ranking algorithms for full-text search engines and for meta search engines, and analyzes and summarizes the characteristics and applicable scenarios of each. It then examines Lucene's document scoring mechanism in detail and proposes a normalization formula for the meta-search scenario, which removes the inappropriate local weighting of document scores in the original Lucene member search engines. Finally, drawing on classic ideas from document-score-based algorithms, weighting-based algorithms, and the HITS algorithm, it proposes a hybrid weighting algorithm that iteratively weights document scores in the meta-search environment, adjusting document relevance scores to improve the ranking of results. Building on this research, a branch-level full-text search engine system and a group-level meta search engine system were implemented.
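The abstract does not state the normalization formula or the hybrid weighting algorithm, so the following is only a minimal sketch of the general idea of making scores from different member engines comparable before merging. It assumes plain min-max normalization per engine and a fixed, hypothetical per-engine weight; the names `normalize_scores`, `merge_results`, and the branch identifiers are illustrative and do not come from the thesis.

```python
from typing import Dict, List, Tuple


def normalize_scores(results: Dict[str, float]) -> Dict[str, float]:
    """Min-max normalize one member engine's raw scores into [0, 1].

    Assumption: min-max scaling stands in for the thesis's own
    (unstated) normalization formula.
    """
    if not results:
        return {}
    lo = min(results.values())
    hi = max(results.values())
    if hi == lo:
        # All scores equal: treat every document as equally relevant.
        return {doc: 1.0 for doc in results}
    return {doc: (score - lo) / (hi - lo) for doc, score in results.items()}


def merge_results(
    per_engine: Dict[str, Dict[str, float]],
    engine_weights: Dict[str, float],
) -> List[Tuple[str, float]]:
    """Merge normalized results from all member engines into one ranking.

    Each normalized score is multiplied by a hypothetical engine weight;
    a document returned by several engines keeps its best weighted score.
    """
    merged: Dict[str, float] = {}
    for engine, results in per_engine.items():
        weight = engine_weights.get(engine, 1.0)
        for doc, score in normalize_scores(results).items():
            merged[doc] = max(merged.get(doc, 0.0), weight * score)
    # Highest score first; ties broken by document id for determinism.
    return sorted(merged.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    raw = {
        "branch_a": {"a1": 8.0, "a2": 2.0, "a3": 5.0},  # unbounded raw scores
        "branch_b": {"b1": 0.9, "b2": 0.3},             # scores on another scale
    }
    weights = {"branch_a": 1.0, "branch_b": 0.8}
    for doc, score in merge_results(raw, weights):
        print(doc, round(score, 3))
```

Without the normalization step, documents from `branch_a` would outrank every document from `branch_b` simply because that engine reports scores on a larger scale, which is the kind of engine-local bias the abstract says the thesis sets out to remove.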
[Degree-granting institution]: Fudan University
[Degree level]: Master's
[Year degree granted]: 2012
[Classification number]: TP391.3

