Research on Sentiment Classification of Weibo Comments Based on CRFs
Published: 2019-04-11 16:46
[Abstract]: Information travels in many ways in the information society, and Weibo, a convenient channel for exchanging it, has carried information into every corner of daily life. The platform has tens of thousands of users, who often post opinions colored by personal feeling about particular events or trending topics. Analyzing the large body of text retained on the platform can therefore reveal the prevailing moods, sentiments, and value orientations of most users, and give decision makers concerned with the related issues a basis for analysis.
This thesis first surveys and summarizes existing research on sentiment analysis of text corpora. It then compares several commonly used sentiment classification models, including similarity-based methods, Bayesian classifiers, and support vector machines. After weighing the strengths and weaknesses of each model, it adopts conditional random fields (CRFs), a widely accepted approach to sentiment classification. Next, the Chinese sentences in the text are annotated with features at word granularity, a CRF model is trained on the experimental corpus, and the trained model is used to judge the sentiment orientation of comments. Finally, a grading mechanism for sentiment strength is proposed so that the analysis is no longer limited to the three cases of positive, neutral, and negative: the experimental results quantify these three categories, and the quantified results are used to rank sentiments by strength.
The results of analyzing the corpus with CRFs show that CRFs classify sentiment-bearing sentences well, and the experiments largely confirm the feasibility of the proposed grading mechanism for sentiment strength; the quantified results can give decision makers supporting data. Some aspects of the study still need improvement, such as the incompleteness of the corpus, and will be refined in future work.
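The abstract does not give the feature set, the trained weights, or the grading formula, so the following is only a rough sketch of the pipeline's shape: word-level labels are decoded with Viterbi over a linear-chain model, and a comment-level intensity score is then used for ranking. Everything named here is an assumption for illustration: the English tokens, the hand-set `EMISSION` scores and `STAY_BONUS` (standing in for weights a trained CRF would learn), and the `intensity` rule (positive minus negative word labels, divided by length).

```python
# Hypothetical sketch: scores and the intensity rule are illustrative
# assumptions, not values or methods taken from the thesis.
LABELS = ["POS", "NEU", "NEG"]

# Hand-set emission scores standing in for learned CRF feature weights.
EMISSION = {
    "great": {"POS": 2.0},
    "love": {"POS": 2.5},
    "bad": {"NEG": 2.0},
    "hate": {"NEG": 2.5},
}
STAY_BONUS = 0.25  # assumed reward for keeping the same label on adjacent words


def emission(word, label):
    """Score of tagging `word` with `label`; other cases lean neutral."""
    default = 0.5 if label == "NEU" else 0.0
    return EMISSION.get(word, {}).get(label, default)


def transition(prev_label, label):
    """Score of moving from `prev_label` to `label`."""
    return STAY_BONUS if prev_label == label else 0.0


def viterbi(words):
    """Return the highest-scoring label sequence for pre-segmented words."""
    if not words:
        return []
    scores = [{lab: emission(words[0], lab) for lab in LABELS}]
    backpointers = []
    for word in words[1:]:
        prev = scores[-1]
        cur, ptr = {}, {}
        for lab in LABELS:
            best_prev = max(LABELS, key=lambda p: prev[p] + transition(p, lab))
            cur[lab] = prev[best_prev] + transition(best_prev, lab) + emission(word, lab)
            ptr[lab] = best_prev
        scores.append(cur)
        backpointers.append(ptr)
    # Trace back from the best final label.
    path = [max(LABELS, key=lambda lab: scores[-1][lab])]
    for ptr in reversed(backpointers):
        path.append(ptr[path[-1]])
    return path[::-1]


def intensity(words):
    """Assumed intensity in [-1, 1]: (positive - negative labels) / length."""
    labels = viterbi(words)
    if not labels:
        return 0.0
    return (labels.count("POS") - labels.count("NEG")) / len(labels)


def rank_by_intensity(comments):
    """Sort comments from most positive to most negative intensity."""
    return sorted(comments, key=intensity, reverse=True)
```

With these assumed scores, `rank_by_intensity` orders comments by a signed number rather than by a three-way label, which is the kind of finer-grained comparison the grading mechanism is described as enabling. A real system would learn the weights from an annotated corpus with a CRF toolkit instead of setting them by hand.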
[Degree-granting institution]: Northeast Normal University
[Degree level]: Master's
[Year degree awarded]: 2014
[Classification number]: TP393.092
Article ID: 2456587
Article link: http://www.sikaile.net/guanlilunwen/ydhl/2456587.html