Research on Key Problems of Text Classification Based on Deep Learning
Published: 2022-01-25 15:01
Text classification has a long history, and with the rapid development of artificial intelligence and machine learning in recent years, many new methods have emerged. As the technology has advanced, the quality and quantity of text corpora have changed dramatically, and the accumulation of large-scale corpora provides the data foundation that more complex models require. At the same time, improvements in computing performance provide the computational resources needed to process and analyze such corpora. With the progress of machine learning and deep learning, deep-learning methods have shown strong advantages in many fields. This thesis investigates fundamental problems of text classification on the basis of deep learning. It introduces different deep-learning methods, such as the Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM). We propose classifiers based on CNN and LSTM respectively, use Naive Bayes (NB) as a baseline for comparison, take PyCharm as the development platform, conduct experiments on a public sentiment-classification dataset, and analyze the results. The results show that the proposed methods outperform the baseline.
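The Naive Bayes baseline described above can be illustrated with a minimal sketch; the toy corpus, tokenization, and function names below are illustrative assumptions, not the thesis's actual dataset or code.

```python
# Minimal Naive Bayes sentiment classifier with add-one (Laplace) smoothing.
# The training corpus below is a hypothetical toy example; the thesis uses
# a public sentiment-classification dataset that is not reproduced here.
import math
from collections import Counter, defaultdict

def train_nb(docs):
    """docs: list of (tokens, label). Returns class priors, per-class word counts, vocabulary."""
    priors = Counter(label for _, label in docs)
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in docs:
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return priors, word_counts, vocab

def predict_nb(tokens, priors, word_counts, vocab):
    """Pick the label maximizing log P(label) + sum of log P(token | label)."""
    total = sum(priors.values())
    best, best_score = None, float("-inf")
    for label, prior in priors.items():
        score = math.log(prior / total)
        denom = sum(word_counts[label].values()) + len(vocab)  # Laplace denominator
        for t in tokens:
            score += math.log((word_counts[label][t] + 1) / denom)
        if score > best_score:
            best, best_score = label, score
    return best

# Toy training data (hypothetical)
train = [
    ("good great film".split(), "pos"),
    ("wonderful acting good plot".split(), "pos"),
    ("bad boring film".split(), "neg"),
    ("terrible plot bad acting".split(), "neg"),
]
priors, counts, vocab = train_nb(train)
print(predict_nb("good acting".split(), priors, counts, vocab))  # → pos
```

In the thesis's experiments this baseline is compared against the CNN and LSTM models of Chapters 3 and 4.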
Source: North China Electric Power University (Beijing), a Project 211 university directly administered by the Ministry of Education
Pages: 60
Degree: Master's
Table of Contents:
Abstract (in Chinese)
ABSTRACT
CHAPTER 1 INTRODUCTION
1.1 Text Classification
1.1.1 Definition
1.1.2 Basic Concepts of Text Classification
1.1.3 Text Classification Processes
1.1.4 Applications of Text Classification
1.2 Deep Learning
1.2.1 Definition
1.2.2 History of Deep Learning
1.2.3 Applications of Deep Learning in Text Mining
1.3 Literature Review
1.4 Research Motivation
1.5 Thesis Layout
CHAPTER 2 DEEP LEARNING TECHNIQUES
2.1 Data Preprocessing
2.1.1 Stemming
2.1.2 Word Segmentation
2.2 Text Representation
2.2.1 Word Embedding
2.2.2 One-hot Vector
2.3 Classification
2.3.1 Convolutional Neural Network (CNN)
2.3.2 Convolutional Neural Network (CNN) Applications
2.3.3 Long Short-Term Memory (LSTM)
2.3.4 Gated Recurrent Unit (GRU)
2.3.5 Long Short-Term Memory (LSTM) Applications
2.4 Evaluation
2.4.1 Confusion Matrix
2.4.2 Accuracy
2.4.3 Precision
2.4.4 Recall
2.4.5 False Positive Rate (FP), True Negative Rate (TN), and False Negative Rate (FN)
2.4.6 F-Measure
CHAPTER 3 CONVOLUTIONAL NEURAL NETWORK (CNN)-BASED TEXT CLASSIFICATION
3.1 Dataset
3.2 Baseline Method: Naive Bayes (NB)
3.2.1 Definition
3.2.2 Environment
3.2.3 Experiment
3.2.4 Results and Analysis
3.3 Proposed Method 1: Convolutional Neural Network (CNN)
3.3.1 Definition
3.3.2 Environment
3.3.3 Model
3.3.4 Experiment
3.3.5 Results and Analysis
CHAPTER 4 LONG SHORT-TERM MEMORY (LSTM)-BASED TEXT CLASSIFICATION
4.1 Definition
4.2 Environment
4.3 Model
4.4 Experiment
4.5 Results and Analysis
CHAPTER 5 CONCLUSIONS AND FUTURE WORK
5.1 Conclusions
5.2 Future Work
REFERENCES
APPENDIX A: Selected Code from the Baseline Method: Naive Bayes (NB)
APPENDIX B: Code from the Convolutional Neural Network (CNN) Model
APPENDIX C: Code from the Long Short-Term Memory (LSTM) Model
ACKNOWLEDGEMENT
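The evaluation measures listed in Section 2.4 (confusion matrix, accuracy, precision, recall, false positive rate, F-measure) can be sketched as follows; the counts below are assumed toy values for a binary confusion matrix, not results from the thesis.

```python
# Evaluation metrics from a binary confusion matrix.
# tp, fp, fn, tn are assumed example counts, not the thesis's results.
def metrics(tp, fp, fn, tn):
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)                     # true positive rate
    f1 = 2 * precision * recall / (precision + recall)  # F-measure (beta = 1)
    fpr = fp / (fp + tn)                        # false positive rate
    return accuracy, precision, recall, f1, fpr

acc, p, r, f1, fpr = metrics(tp=40, fp=10, fn=10, tn=40)
print(round(acc, 2), round(p, 2), round(r, 2), round(f1, 2), round(fpr, 2))
# → 0.8 0.8 0.8 0.8 0.2
```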
Article ID: 3608745
Link: http://www.sikaile.net/kejilunwen/zidonghuakongzhilunwen/3608745.html