Research on Key Techniques of Sentiment Analysis for Restaurant Reviews
Published: 2018-06-18 03:02
Topics: recurrent neural network + LSTM; Source: Harbin Institute of Technology, 2017 master's thesis
[Abstract]: With the growth of the Internet and e-commerce, convenient services such as online shopping and online food ordering have become part of daily life, and the volume of reviews that users post on these platforms is growing exponentially. This large body of reviews has considerable research value. Analyzing it to determine consumers' sentiment polarity toward each opinion target not only guides other consumers' purchasing decisions but also helps merchants understand customer needs and improve their products. This thesis studies two sentiment analysis subtasks for restaurant reviews, opinion target extraction and opinion target polarity classification, and applies the best-performing methods to a restaurant review sentiment analysis system. The work is organized as follows.
First, methods for opinion target extraction are studied. An output-dependent bidirectional LSTM model is proposed. Building on the LSTM model, it uses two independent hidden layers to process the text in both directions, making full use of the features in both the preceding and the following context. It also adds self-connections between output layers to exploit the dependencies within the output label sequence, and it incorporates part-of-speech, syntactic, sentiment orientation, and named entity recognition features to improve performance. A conditional random field (CRF) method is also implemented, with improvements coming mainly from feature selection and feature combination. In addition, a BLSTM-CRF extraction method is implemented, in which the BLSTM output vectors are fed directly into a CRF model that computes the best output label sequence.
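The abstract says only that the CRF layer computes the best label sequence from the BLSTM outputs; it does not name the decoding algorithm. For a linear-chain CRF, Viterbi decoding is the usual choice, so the sketch below assumes it. The BIO tag set, the score values, and the function name `viterbi_decode` are illustrative assumptions, not taken from the thesis.

```python
from typing import List, Sequence

def viterbi_decode(emissions: Sequence[Sequence[float]],
                   transitions: Sequence[Sequence[float]]) -> List[int]:
    """Return the highest-scoring tag index sequence.

    emissions[t][j]   : score of tag j at token t (e.g. one BLSTM output vector).
    transitions[i][j] : score of moving from tag i to tag j (a CRF parameter).
    """
    if not emissions:
        return []
    num_tags = len(emissions[0])
    # best[j] = best score of any path that ends in tag j at the current token
    best = list(emissions[0])
    backpointers: List[List[int]] = []
    for t in range(1, len(emissions)):
        new_best, pointers = [], []
        for j in range(num_tags):
            # previous tag that maximises the path score into tag j
            prev = max(range(num_tags), key=lambda i: best[i] + transitions[i][j])
            pointers.append(prev)
            new_best.append(best[prev] + transitions[prev][j] + emissions[t][j])
        best = new_best
        backpointers.append(pointers)
    # trace back from the best final tag
    path = [max(range(num_tags), key=lambda j: best[j])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path

# Hypothetical BIO tag set and made-up scores for a four-token sentence.
TAGS = ["O", "B", "I"]
FORBIDDEN = -1e4  # "I" may not directly follow "O"
transitions = [
    [0.0, 0.0, FORBIDDEN],  # from O
    [0.0, 0.0, 0.0],        # from B
    [0.0, 0.0, 0.0],        # from I
]
emissions = [
    [2.0, 0.1, 0.0],
    [0.5, 1.0, 1.2],  # a per-token argmax would pick "I" here, right after "O"
    [0.2, 0.3, 1.5],
    [2.0, 0.0, 0.1],
]
labels = [TAGS[i] for i in viterbi_decode(emissions, transitions)]
```

With these scores, choosing the top tag per token would produce an "I" immediately after an "O", which is not a valid BIO sequence. Decoding over the whole sequence with the transition scores avoids that, which is the practical reason to place a CRF on top of the BLSTM outputs.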
For opinion target polarity classification, a model based on bidirectional LSTM is proposed. It uses two BLSTM networks, BLSTML and BLSTMR, to capture the semantic information in the context preceding and following the opinion target, respectively. At each time step, the current word's embedding is concatenated with the opinion target vector before being fed to the model, so the model can capture the semantic relationship between each word and the opinion target. The thesis reports that this model achieves the best results among comparable models. In addition, a boosting-based model fusion method is proposed that combines a support vector machine with a random forest: after one classifier is trained, the weights of the samples it misclassified are increased and the weights of the samples it classified correctly are decreased, and the final prediction combines the models' outputs weighted by each model's performance. This combines the strengths of linear and nonlinear classifiers.
Finally, a sentiment analysis system for restaurant reviews is designed and implemented. It applies the output-dependent bidirectional LSTM extraction method and the bidirectional LSTM polarity classification method, which improves the accuracy of both opinion target extraction and polarity classification. The system presents the opinion targets and the proportion of each polarity as pie charts.
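The boosting-based fusion is described only in outline: reweight samples after each classifier, then weight the classifiers by how well they did. The abstract gives no update rule, so the following minimal sketch assumes the standard AdaBoost formulas. Two one-feature decision stumps stand in for the support vector machine and the random forest so the example runs with the standard library alone. All names (`boosted_fusion`, `stump_trainer`, `predict`) and the toy data are hypothetical.

```python
import math
from typing import Callable, List, Sequence, Tuple

Predictor = Callable[[Sequence[float]], int]  # features -> label in {-1, +1}
Trainer = Callable[[List[Sequence[float]], List[int], List[float]], Predictor]

def stump_trainer(feature: int) -> Trainer:
    """Stand-in learner: best weighted threshold rule on a single feature."""
    def train(X, y, weights):
        best = None  # (weighted error, threshold, sign)
        for threshold in sorted({x[feature] for x in X}):
            for sign in (1, -1):
                err = sum(w for x, label, w in zip(X, y, weights)
                          if (sign if x[feature] >= threshold else -sign) != label)
                if best is None or err < best[0]:
                    best = (err, threshold, sign)
        _, threshold, sign = best
        return lambda x: sign if x[feature] >= threshold else -sign
    return train

def boosted_fusion(trainers: List[Trainer], X, y) -> List[Tuple[float, Predictor]]:
    """Train the models in order, reweighting samples AdaBoost-style (assumed)."""
    weights = [1.0 / len(X)] * len(X)
    ensemble = []
    for train in trainers:
        model = train(X, y, weights)
        wrong = [model(x) != label for x, label in zip(X, y)]
        err = sum(w for w, bad in zip(weights, wrong) if bad)
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the logarithm finite
        alpha = 0.5 * math.log((1 - err) / err)  # weight of this model's vote
        ensemble.append((alpha, model))
        # raise the weight of misclassified samples, lower the rest, renormalise
        weights = [w * math.exp(alpha if bad else -alpha)
                   for w, bad in zip(weights, wrong)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def predict(ensemble: List[Tuple[float, Predictor]], x: Sequence[float]) -> int:
    """Sign of the alpha-weighted vote over all models."""
    score = sum(alpha * model(x) for alpha, model in ensemble)
    return 1 if score >= 0 else -1

# Toy data: two numeric features per review, label +1 (positive) or -1 (negative).
X = [(1.0, 5.0), (2.0, 4.0), (3.0, 1.0), (4.0, 2.0), (1.5, 0.5)]
y = [-1, -1, 1, 1, 1]
ensemble = boosted_fusion([stump_trainer(0), stump_trainer(1)], X, y)
predictions = [predict(ensemble, x) for x in X]
```

On this toy data the first stump misclassifies one sample, that sample's weight rises before the second stump is trained, and the second stump then receives the larger vote. To fuse real classifiers, each `Trainer` would need to wrap a learner that accepts per-sample weights during training.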
[Degree-granting institution]: Harbin Institute of Technology
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TP391.1
Article ID: 2033735
Article link: http://www.sikaile.net/kejilunwen/ruanjiangongchenglunwen/2033735.html