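Research on Hot Topic Discovery Based on the DBSCAN Algorithm and Inter-Sentence Relations
Published: 2018-03-18 05:15
Topic: information users. Focus: hot topics. Source: Library and Information Service (《圖書情報工作》), 2017, Issue 12. Paper type: journal article.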
[Abstract]: [Purpose/Significance] In the era of big data, users are sometimes at a loss when faced with massive amounts of data. As a result, a growing number of scholars have turned their attention to algorithms for discovering hot topics on the Internet, which help users find hot topics quickly. [Method/Process] Building on the DBSCAN algorithm, the approach optimizes the algorithm by dynamically adjusting its parameters in order to discover hot topics. A hot-topic filtering model is then constructed from an analysis of syntactic structure and inter-sentence relations; it filters out ordinary topics that merely contain hot terms. [Result/Conclusion] Experiments were run on a news dataset from mainstream websites, and the algorithm's effectiveness was tested with evaluation metrics such as the false detection rate and the miss rate. The results show that the improved algorithm performs better and can offer information users an efficient way to study network data scientifically.
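The abstract does not say how the DBSCAN parameters are adjusted or how the two error rates are defined, so the following is only a minimal sketch under stated assumptions. It picks `eps` from a quantile of the k-nearest-neighbour distances (a common heuristic, not necessarily the authors' method), runs plain DBSCAN on toy 2-D points that stand in for document vectors, and computes miss and false detection rates using the usual topic-detection definitions. The names `choose_eps`, `dbscan`, and `miss_and_false_rates` are hypothetical.

```python
import math


def knn_distances(points, k):
    """Distance from each point to its k-th nearest neighbour (self excluded)."""
    result = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        result.append(dists[k - 1])
    return result


def choose_eps(points, min_pts, quantile=0.5):
    """Assumed heuristic: derive eps from the data instead of fixing it by hand."""
    kd = sorted(knn_distances(points, min_pts - 1))
    return kd[min(len(kd) - 1, int(quantile * len(kd)))]


def dbscan(points, eps, min_pts):
    """Plain DBSCAN. Returns one cluster label per point; -1 marks noise."""
    noise, unseen = -1, None
    labels = [unseen] * len(points)

    def neighbours(i):
        # The neighbourhood includes the point itself, as in standard DBSCAN.
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not unseen:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = noise
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == noise:
                labels[j] = cluster  # noise reachable from a core point becomes a border point
            if labels[j] is not unseen:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_pts:  # only core points expand the cluster
                queue.extend(reachable)
    return labels


def miss_and_false_rates(truth, predicted):
    """Assumed definitions: miss = missed hot / all hot; false = wrongly flagged / all non-hot."""
    missed = sum(t and not p for t, p in zip(truth, predicted))
    false_hits = sum(p and not t for t, p in zip(truth, predicted))
    hot = sum(truth)
    return missed / max(1, hot), false_hits / max(1, len(truth) - hot)


# Toy data: two dense groups (candidate topics) and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]
eps = choose_eps(points, min_pts=3)
labels = dbscan(points, eps, min_pts=3)
rates = miss_and_false_rates([True, True, False, False], [True, False, True, False])
```

In a real pipeline the points would be feature vectors of news items (for example TF-IDF vectors), and the resulting clusters would still need the filtering step the abstract describes to drop ordinary topics that merely contain hot terms.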
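[Author affiliations]: Changchun University of Science and Technology Library; Changchun Agricultural Information Center
[Classification number]: G254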
Article ID: 1628151
Article link: http://www.sikaile.net/tushudanganlunwen/1628151.html