Research on Multi-Label Classification of Chinese Books Based on the LSTM Model
Published: 2018-04-25 20:44
Topics: LSTM model + deep learning; Reference: Data Analysis and Knowledge Discovery (数据分析与知识发现), 2017, Issue 07
[Abstract]: [Objective] To build a classification system using the LSTM model and character embedding, and to propose a solution for multi-label classification of Chinese books. [Methods] A deep learning approach is introduced: the classification system is built with character embedding and an LSTM model, which is trained on strings composed of fields such as the title and subject headings. The multi-label problem is handled by constructing multiple binary classifiers, and bibliographic data covering five categories from three universities are used for the experiments. [Results] Evaluated on overall accuracy and on per-category precision, recall, and F1 score, the proposed model performs well on all of these metrics and has strong practical value. [Limitations] The data cover only five categories of the Chinese Library Classification, and the classification granularity considered is relatively coarse. [Conclusions] A Chinese book classification system based on the LSTM model offers simple preprocessing, incremental learning, and high transferability, and is both feasible and practical.
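The preprocessing and multi-label setup described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the field names (`title`, `subjects`), the five class codes in `LABELS`, the toy records, and all function names are assumptions made for illustration, and the LSTM itself is omitted. The sketch only shows how bibliographic fields might be joined into one string and encoded character by character, how a multi-label target could be split into one binary target per category, and how per-category precision, recall, and F1 are computed.

```python
from typing import Dict, List, Sequence, Set, Tuple

# Hypothetical label set: five class codes chosen for illustration only.
LABELS = ["F", "G", "H", "I", "TP"]


def build_input(record: Dict[str, str],
                fields: Sequence[str] = ("title", "subjects")) -> str:
    """Concatenate the chosen bibliographic fields into one string."""
    return "".join(record.get(f, "") for f in fields)


def build_vocab(texts: Sequence[str]) -> Dict[str, int]:
    """Map each character to an integer id; 0 is reserved for padding/unknown."""
    vocab: Dict[str, int] = {}
    for text in texts:
        for ch in text:
            vocab.setdefault(ch, len(vocab) + 1)
    return vocab


def encode(text: str, vocab: Dict[str, int], max_len: int) -> List[int]:
    """Convert a string to a fixed-length list of character ids.

    Longer strings are truncated; shorter ones are right-padded with 0.
    These ids are what an embedding layer would look up.
    """
    ids = [vocab.get(ch, 0) for ch in text[:max_len]]
    return ids + [0] * (max_len - len(ids))


def binary_targets(label_sets: Sequence[Set[str]], label: str) -> List[int]:
    """Build the 0/1 target vector for one category (one binary classifier)."""
    return [1 if label in labels else 0 for labels in label_sets]


def precision_recall_f1(y_true: Sequence[int],
                        y_pred: Sequence[int]) -> Tuple[float, float, float]:
    """Compute precision, recall, and F1 for one binary classifier."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Toy records, invented for this example only.
records = [
    {"title": "深度学习导论", "subjects": "机器学习"},
    {"title": "信息管理学", "subjects": "信息管理"},
]
label_sets = [{"TP"}, {"G", "TP"}]

texts = [build_input(r) for r in records]
vocab = build_vocab(texts)
encoded = [encode(t, vocab, max_len=12) for t in texts]
targets = {label: binary_targets(label_sets, label) for label in LABELS}
```

In a full system, each entry of `targets` would train its own binary classifier over the same `encoded` inputs, and a book would receive every category whose classifier predicts 1. Working at the character level avoids a Chinese word-segmentation step, which is one plausible reading of the "simple preprocessing" advantage the abstract mentions.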
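[Author affiliations]: School of Information Management, Nanjing University; Jiangsu Key Laboratory of Data Engineering and Knowledge Service (Nanjing University)
[Funding]: This work is one of the research outcomes of the National Natural Science Foundation of China project "Research on the Measurement and Analysis of TSD and TDC for Academic Resources" (Project No. 71503121) and the Fundamental Research Funds for the Central Universities key project "Research on the Knowledge Structure and Evolutionary Dynamics of Library and Information Science in China" (Project No. 20620140645).
[Classification numbers]: TP181; TP391.1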
Article ID: 1802892
Article link: http://www.sikaile.net/kejilunwen/ruanjiangongchenglunwen/1802892.html