

Research on Thematic Event Mining and Dynamic Evolution Analysis

Published: 2018-07-25 16:19
[Abstract]: Thematic event mining and evolution analysis presents events of interest in structured form. It extracts the key information about an event, such as time, place, and people, and then organizes and analyzes that information to uncover the relationships among events and the way they develop, so that readers can understand an event more clearly and quickly. Thematic event mining mainly involves time series analysis, information retrieval, automatic summarization, topic detection and tracking, event detection, burst detection, and outlier detection. The groundwork is data collection: obtaining event-related data and processing it into structured or semi-structured form.

This thesis proceeds from the sentence to the document and then to multiple documents. Its object of study is topic-oriented events, and its main task is a deep understanding of thematic events, that is, thematic event extraction and event analysis across multiple documents. Thematic event extraction comprises sentence- or phrase-level recognition of event information (time, place, people, shallow semantic analysis), document-level recognition of event information (mainly time, key actions, place, and people), and information fusion for thematic events across multiple documents. Event analysis comprises dynamic evolution analysis of subtopics, person influence analysis, and outlier detection. The thesis covers four key aspects of thematic event mining, with a different emphasis in each research problem.

(1) Information extraction and temporal characteristics of thematic events. Event arguments taken sentence by sentence cannot reflect how a thematic event unfolds. This study takes the thematic event as its object, while treating meta-events that carry an action meaning as the necessary building blocks of a thematic event; it therefore covers event extraction within a sentence, within a document, and across multiple documents. The thesis proposes a time recognition model for thematic events that turns sentence- or phrase-level time recognition into document-level time recognition, and so identifies the time of each thematic event fragment. The model normalizes time expressions with a dynamic reference time selection mechanism. Because event elements usually correspond to the arguments governed by the verb, the study combines event extraction with shallow semantic analysis and maps event elements onto semantic role labels. This improves time recognition for thematic event fragments over methods based purely on keywords or on a static reference time.

(2) Person influence analysis based on a momentum representation and stock price analysis indicators. The study combines event elements with the idea of burst detection to examine the influence of people over the whole course of an event. A physical model is used to define and construct the dynamics of a person's influence. The analysis incorporates the person's social attributes rather than relying on arrival rate alone, which avoids the problem of "stop-word" persons who simply appear very frequently. Stock analysis indicators are used to characterize and analyze the momentum of a person's influence, and several Moving Average Convergence Divergence (MACD) technical indicators are considered jointly, which avoids reporting a burst when a single indicator is high but no burst has actually occurred. In this way the thesis analyzes the elements of an event and the part they play as the thematic event develops.

(3) A dynamic incremental strategy for subtopic evolution analysis of thematic events. Traditional topic detection and tracking automatically identifies new topics in news media streams and dynamically tracks known topics. Those topics may be unrelated, independent topics, or may not describe the same event. Treating subtopic evolution as a dynamic data stream, this study combines the Single-Pass clustering method, the idea of multi-category membership, and the dynamic incremental idea to detect and track subtopics, and so to follow the development of an event in real time. In light of the temporal and dynamic nature of subtopics, the algorithm is analyzed with respect to threshold selection, similarity smoothing, and the time factor.

(4) Outlier detection through the combined use of statistical theory and fuzzy set theory. Outlier detection is also a form of time series analysis, one that takes the temporal order and dynamics of the data stream into account. Outliers are data points that differ markedly from the rest of the data set. Some can be regarded as noise, while others are key information: outliers in the development of an event often reveal its critical period or turning point. Outlier detection techniques typically suffer from the need for large amounts of labeled data, unknown statistical distributions, the need for many parameters, difficulty in setting control limits, and the fuzziness of the data itself. To address these problems, the thesis defines the concepts of outlier and degree of abnormality on the basis of statistical process control theory. Because the notion of an outlier is itself complex, it detects outliers in events with a technique that combines fuzzy theory and statistical methods. The method requires no labeled data and is distribution-free; its parameters are determined through an intensified fuzzification process and an optimization model.
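Contribution (3) names the Single-Pass clustering method as one ingredient of the subtopic detection and tracking approach. The following is a minimal sketch of generic Single-Pass clustering only, not the thesis's algorithm: the bag-of-words vectors, cosine similarity, the summed-count centroid update, the threshold value of 0.3, and the function names `cosine` and `single_pass` are all assumptions made for illustration. The multi-category membership, similarity smoothing, and time factor mentioned in the abstract are not modeled here.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def single_pass(docs, threshold=0.3):
    """Assign each document, in arrival order, to its most similar cluster.

    A document joins the best-matching cluster when the similarity reaches
    `threshold` (an assumed value); otherwise it starts a new cluster.
    Each document is examined exactly once, hence "single pass".
    """
    clusters = []  # each cluster: {"centroid": Counter, "members": [doc indices]}
    for index, text in enumerate(docs):
        vector = Counter(text.lower().split())  # naive whitespace tokenization
        best, best_sim = None, 0.0
        for cluster in clusters:
            sim = cosine(vector, cluster["centroid"])
            if sim > best_sim:
                best, best_sim = cluster, sim
        if best is not None and best_sim >= threshold:
            best["members"].append(index)
            best["centroid"].update(vector)  # incremental update: add term counts
        else:
            clusters.append({"centroid": Counter(vector), "members": [index]})
    return clusters


if __name__ == "__main__":
    stream = [
        "earthquake hits city",
        "city earthquake rescue",
        "stock market falls",
        "market stock rally",
    ]
    for cluster in single_pass(stream, threshold=0.3):
        print(cluster["members"])
```

Because the centroid is updated as each document arrives, the result depends on arrival order and on the threshold, which is presumably why the abstract singles out threshold selection as a point of analysis.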
[Degree-granting institution]: Harbin Institute of Technology
[Degree level]: Doctoral
[Year degree conferred]: 2016
[Classification number]: TP391.1

[Author and source]: Li Fenghuan; Research on Thematic Event Mining and Dynamic Evolution Analysis [D]; Harbin Institute of Technology; 2016


Article ID: 2144370


Article link: http://www.sikaile.net/shoufeilunwen/xxkjbs/2144370.html

