Research on Ranking Algorithms for Mathematical Search
Published: 2018-08-02 14:48
[Abstract]: The volume of mathematical content on the Web is growing steadily, and mathematical search has become a focus of attention. In recent years the problems of displaying and storing mathematical formulas in browsers have gradually been solved, which provides a sound basis for researching and developing search engines for mathematical formulas. Although formulas can be stored in Web documents, searching for them on the Web remains limited. A formula has a complex two-dimensional structure and carries complex mathematical meaning: differently written formulas may have the same meaning, a single formula may be written in several forms, and a user's query may be a subformula of a larger formula. Traditional text retrieval systems are therefore inadequate for formula search. Formula retrieval systems that exist or are under study internationally have made steady progress in index construction, but most still rank their result sets with the ranking algorithms of text search engines; ranking algorithms designed for formula search results have not been studied in depth. This thesis therefore studies the principles of existing text search engine ranking algorithms and, building on them, attempts to propose a ranking algorithm for formula search that takes into account the characteristics of formulas and the relationships between them (equivalence, algebraic relatedness, subformula, and so on). It combines a computer algebra system (CAS) with a formula search engine to mine the relationships between formulas, which not only provides a more reasonable and reliable measure of relevance between a query formula and a Web page but also strengthens the system's capacity for semantic retrieval of formulas.
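The idea of ranking by formula relationships can be illustrated with a small sketch. It is not the thesis's algorithm: the abstract gives no scoring rule, so the relation names, the weights in `WEIGHTS`, and every function name below are assumptions made for illustration. In place of a real CAS, the sketch tests equivalence by exact evaluation at a few rational sample points, which is only a probabilistic check, and it expects formulas written in Python expression syntax with `+ - * / **` only.

```python
import ast
from fractions import Fraction

# Hypothetical weights: the abstract names the relations but not their scores.
WEIGHTS = {"identical": 1.0, "equivalent": 0.8, "subformula": 0.5, "unrelated": 0.0}
SAMPLE_VALUES = [Fraction(2), Fraction(-3), Fraction(5, 7), Fraction(11, 4)]
EXPR_NODES = (ast.BinOp, ast.UnaryOp, ast.Name, ast.Constant)


def parse(formula):
    """Parse a formula in Python expression syntax into an expression tree."""
    return ast.parse(formula, mode="eval").body


def canonical(node):
    """Structural key that ignores operand order for + and *."""
    if isinstance(node, ast.BinOp):
        left, right = canonical(node.left), canonical(node.right)
        if isinstance(node.op, (ast.Add, ast.Mult)):
            left, right = sorted((left, right))
        return f"({type(node.op).__name__} {left} {right})"
    if isinstance(node, ast.UnaryOp):
        return f"({type(node.op).__name__} {canonical(node.operand)})"
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Constant):
        return repr(node.value)
    raise ValueError("unsupported syntax")


def subtrees(node):
    """Yield the node and every expression nested inside it."""
    yield node
    for child in ast.iter_child_nodes(node):
        if isinstance(child, EXPR_NODES):
            yield from subtrees(child)


def evaluate(node, env):
    """Evaluate the tree exactly, using Fraction arithmetic."""
    if isinstance(node, ast.BinOp):
        a, b = evaluate(node.left, env), evaluate(node.right, env)
        if isinstance(node.op, ast.Add):
            return a + b
        if isinstance(node.op, ast.Sub):
            return a - b
        if isinstance(node.op, ast.Mult):
            return a * b
        if isinstance(node.op, ast.Div):
            return a / b
        if isinstance(node.op, ast.Pow):
            return a ** b
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return -evaluate(node.operand, env)
    if isinstance(node, ast.Name):
        return env[node.id]
    if isinstance(node, ast.Constant):
        return Fraction(node.value)
    raise ValueError("unsupported syntax")


def probably_equivalent(a, b):
    """Stand-in for a CAS check: compare both trees at several sample points."""
    names = sorted({n.id for tree in (a, b) for n in ast.walk(tree)
                    if isinstance(n, ast.Name)})
    tested = 0
    for shift in range(len(SAMPLE_VALUES)):
        env = {name: SAMPLE_VALUES[(i + shift) % len(SAMPLE_VALUES)]
               for i, name in enumerate(names)}
        try:
            if evaluate(a, env) != evaluate(b, env):
                return False
            tested += 1
        except ZeroDivisionError:
            continue  # skip points where either formula is undefined
    return tested > 0


def relation(query, candidate):
    """Classify how a candidate formula relates to the query formula."""
    q, c = parse(query), parse(candidate)
    q_key = canonical(q)
    if q_key == canonical(c):
        return "identical"
    if probably_equivalent(q, c):
        return "equivalent"
    if any(canonical(s) == q_key for s in subtrees(c)):
        return "subformula"
    return "unrelated"


def rank(query, candidates):
    """Order candidates by the weight of their relation to the query."""
    return sorted(candidates, key=lambda f: WEIGHTS[relation(query, f)], reverse=True)
```

For example, `relation("(x+1)**2", "x**2 + 2*x + 1")` is classified as `"equivalent"` even though the two trees differ structurally, which is the kind of match a purely text-based ranking would miss. A production system would replace `probably_equivalent` with a call to an actual CAS and would combine this relation score with page-level signals.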
[Degree-granting institution]: Lanzhou University
[Degree level]: Master's
[Year degree awarded]: 2012
[Classification number]: TP391.3; O223
Article ID: 2159783
Article link: http://www.sikaile.net/kejilunwen/sousuoyinqinglunwen/2159783.html