Research on Key Technologies of Agricultural Information Recommendation Based on a Vertical Search Engine
Topic: agricultural information recommendation  Focus: vertical search engine  Source: Shenyang Agricultural University, 2016 doctoral dissertation  Document type: degree thesis
[Abstract]: The Internet has become an important channel for obtaining information resources, and given the vast volume of information online, personalized information recommendation is the direction in which information services are heading. Meanwhile, governments and departments at all levels have invested heavily in information platforms covering agricultural science and technology, animal husbandry, aquaculture, land reclamation, agricultural machinery and other fields. Because rural areas lack information infrastructure, and most agricultural producers and operators have limited ability to analyze and process information, this information does not reach them in a targeted way, even though it offers important guidance for agricultural production. Relying only on mass media, agricultural information agencies and word of mouth, agriculture-related personnel find it difficult to obtain personalized agricultural information services. The goal of this research is to collect, analyze and process the large volume of agriculture-related information scattered across the Internet, to capture accurately the intentions and needs of agriculture-related users, and to deliver the information they need proactively and precisely, thereby strengthening the guiding role of agricultural information in production and increasing its social and economic benefits.

Existing recommender systems show three main problems when applied to agriculture: first, they are not sufficiently focused on agricultural-domain information; second, they suffer from overfitting of user interests and from the cold-start problem; third, they do not classify agriculture-related users or personalize recommendations according to the attributes specific to agriculture. To address these problems, this study investigates in depth the key technologies behind the three main components of an agricultural information recommender system: the data source, the user interest model and the recommendation algorithm. The key technologies covered are agricultural information collection and analysis, user interest model construction, recommendation model construction and recommendation algorithm improvement, and a software autonomous decision-making mechanism, which together provide technical support for personalized agricultural information recommendation services. The main research work is summarized as follows:

1. Based on a comparative study of search engine functions and search quality, an agricultural vertical search engine built on Nutch was designed. It collects, filters and analyzes agricultural information from the Internet and is used to build a resource base for agricultural information recommendation. In view of the characteristics and shortcomings of vertical search in the agricultural domain, the word segmentation module of the search engine was improved by combining character-tagging word segmentation with new-word recognition against a corpus of agricultural terminology. Experiments show that, compared with other word segmentation systems, this module segments agricultural-domain text more accurately. Combined with quality control of the seed URLs, it improves the precision and depth with which agriculture-related web pages are crawled.

2. Agricultural web resources represent spatial attributes inconsistently and often lack explicit spatial expressions. To address this, methods for extracting spatial attribute information in the agricultural domain were studied, and a method was proposed that identifies and extracts spatial attributes with the help of an ontology library of administrative divisions. An explicit spatial attribute extraction algorithm and an implicit spatial attribute extraction algorithm based on a general-purpose search engine were designed, and a chi-square test was used to resolve the case where the implicit method returns more than one candidate spatial attribute. The two algorithms can effectively annotate the spatial attributes of web page information and extract the regional features of users and items. This supplies the information needed to build regional labels in the agriculture-related user interest model and to implement a personalized recommendation mode based on regional features.

3. A questionnaire survey was used to study the agricultural information needs of agriculture-related personnel and the ways in which they obtain information. Because existing agricultural information services cannot deliver personalized service, a model named ATBUIM was built to reflect the interests of agriculture-related users comprehensively. Explicit and implicit sources of user information were selected, and the methods and weights for estimating user interest from user background and browsing behavior were studied. A Bayesian network user interest model was then built on mutual information and on the classification labels of agricultural-domain resources. The mutual information between agricultural-domain labels serves as the conditional probability of the nodes, and structure learning is used to update and optimize the model. The model weights the user interest information so that each type of information carries an appropriate share in model construction. As a result, it reflects the interest areas of agriculture-related users more comprehensively and accurately, and lays the foundation for precise and effective agricultural information recommendation algorithms.

4. Three recommendation algorithms were analyzed and compared. Solutions and strategies were proposed for the cold-start and data-sparsity problems of traditional recommendation algorithms, and an efficient combined recommendation algorithm model was designed. A method that adds feature labels to improve the similarity computation was proposed, which solves the inability of traditional content-based recommendation to serve new users. For the data-sparsity problem in collaborative filtering, a collaborative filtering algorithm was proposed that combines the ratings and feature factors of agriculture-related users with the ratings and feature factors of agricultural items. In this algorithm, the predicted ratings for the target user and for the target item are both computed from nearest neighbors selected by combining rating similarity and feature similarity, and the two predicted ratings are combined by weighting to obtain the final recommendation. Experiments show that, at the same level of data sparsity, the improved collaborative filtering algorithm has a smaller mean absolute error and better recommendation accuracy. To overcome the limitations of any single recommendation algorithm, a combined recommendation algorithm based on functional networks was proposed and a combined recommendation model was built. Experiments show that the ratings predicted by the combined algorithm are closer to the ratings users actually give.

5. Information recommendation services in new network environments need to adjust their own structure, state and behavior proactively. To meet this need, a software autonomous decision-making mechanism for the agricultural domain was proposed. Ontologies are used to model the domain knowledge, messages, service information and other information found in agricultural web resources, and a thinking and decision model oriented to agricultural domain knowledge, AKDM, was designed. AKDM converts environmental information into sets of beliefs, desires and intentions, and uses the decision and reasoning relations among beliefs, desires and intentions to guide an Agent in carrying out agricultural information recommendation. Analysis and experiments show that, under the constraints of agricultural domain knowledge and rules, the mechanism realizes an autonomous decision-making process and completes the recommendation of agricultural information.

In summary, the research in this thesis on effective search of agricultural information on the Internet, construction of the agriculture-related user interest model, precise agricultural information recommendation algorithms, and the software autonomous decision-making mechanism can provide technical support for personalized information recommendation services in the agricultural domain.
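The collaborative filtering idea summarized in the abstract (neighbors chosen by a blend of rating similarity and feature similarity, then a weighted mix of a user-side and an item-side prediction, evaluated by mean absolute error) can be illustrated with a small Python sketch. This is not the thesis's implementation: the use of cosine similarity, the weights `alpha` and `beta`, the neighborhood size `k`, all function names, and the toy data are assumptions made only for illustration.

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def rating_similarity(r1, r2):
    """Cosine similarity over co-rated positions only; None marks a missing rating."""
    pairs = [(x, y) for x, y in zip(r1, r2) if x is not None and y is not None]
    if not pairs:
        return 0.0
    return cosine([p[0] for p in pairs], [p[1] for p in pairs])

def predict_from_neighbors(target, rows, feats, col, alpha=0.5, k=2):
    """Predict rows[target][col] from the k most similar rows that have a value in col.

    Assumed blend: alpha * rating similarity + (1 - alpha) * feature similarity.
    """
    scored = []
    for other, row in enumerate(rows):
        if other == target or row[col] is None:
            continue
        sim = (alpha * rating_similarity(rows[target], row)
               + (1 - alpha) * cosine(feats[target], feats[other]))
        scored.append((sim, row[col]))
    top = sorted(scored, reverse=True)[:k]
    total = sum(s for s, _ in top)
    if total <= 0:
        return None  # no usable neighbor
    return sum(s * r for s, r in top) / total

def hybrid_predict(ratings, user_feats, item_feats, u, i, alpha=0.5, beta=0.5, k=2):
    """Weighted mix (beta vs. 1 - beta) of the user-side and item-side predictions."""
    by_item = [list(c) for c in zip(*ratings)]  # transpose so that rows are items
    p_user = predict_from_neighbors(u, ratings, user_feats, i, alpha, k)
    p_item = predict_from_neighbors(i, by_item, item_feats, u, alpha, k)
    avail = [(w, p) for w, p in ((beta, p_user), (1 - beta, p_item)) if p is not None]
    weight = sum(w for w, _ in avail)
    if weight == 0:
        return None
    return sum(w * p for w, p in avail) / weight

def mean_absolute_error(pairs):
    """Mean absolute error over (predicted, actual) rating pairs."""
    return sum(abs(p - a) for p, a in pairs) / len(pairs)

# Hypothetical toy data: 4 users x 4 items, ratings 1-5, None = not rated.
ratings = [
    [5, 3, None, 1],
    [4, None, 4, 1],
    [1, 1, 4, 5],
    [None, 1, 5, 4],
]
# Hypothetical binary feature vectors (for example region and topic labels).
user_feats = [[1, 0, 1], [1, 0, 0], [0, 1, 1], [0, 1, 0]]
item_feats = [[1, 0], [1, 1], [0, 1], [0, 1]]

print(hybrid_predict(ratings, user_feats, item_feats, u=0, i=2))
```

Because feature similarity does not depend on the rating matrix, a neighbor can still be found when two users or items share few co-rated entries, which is the intuition behind using feature factors against sparsity. In practice `alpha`, `beta` and `k` would be tuned on held-out ratings, for example by minimizing the mean absolute error.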
[Degree-granting institution]: Shenyang Agricultural University
[Degree level]: Doctoral
[Year degree awarded]: 2016
[Classification number]: TP391.3
Article ID: 1628439
Article link: http://www.sikaile.net/kejilunwen/sousuoyinqinglunwen/1628439.html