
Research and Implementation of Key Technologies for Intelligent Identification of Illegal Online Advertising in Industry and Commerce Supervision

Published: 2019-03-17 15:25
[Abstract]: With advances in technology and the development of society, online business and online shopping are increasingly favored by advertisers and consumers, and Internet advertising plays an irreplaceable role in the economy and society. Along with this convenience come many problems: false claims, exaggerated curative effects, guarantees of cure, and other practices that mislead or deceive consumers. Effective supervision and regulation of Internet advertising is therefore of great importance.

This thesis targets the field of industry and commerce supervision and studies and implements the key technologies for intelligent identification of illegal online text advertisements. Because different categories of illegal advertisements are handled differently, an improved text classification algorithm is first used to classify text advertisements. Semantic features mined from Wikipedia are added to each document to improve the vector space model. Based on these expanded Wikipedia semantic features, a new document similarity measure is then proposed; a clustering process assigns labels to high-confidence unlabeled samples, which enlarges the labeled training set and improves advertisement text classification.

For identifying illegal advertisements, two cases are treated. For advertisements containing prohibited words, traditional keyword matching is improved into a context-based logical keyword matching technique. For advertisements containing illegal descriptive sentences, an identification model based on probabilistic latent semantic analysis is proposed, taking into account that advertisement texts are short and semantically sparse. Experiments show that the proposed algorithms improve the identification of illegal advertisements.

Finally, an intelligent identification system for illegal advertising in industry and commerce is designed and implemented. The thesis describes the system's goals and overall design, the training process of the illegal advertisement identification model, the acquisition of system data, and the task management and violation report management platform that the system provides to users.
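The abstract does not specify how the context-based logical keyword matching works, so the following is only an illustrative sketch of the general idea, not the thesis's algorithm. It assumes a rule is a prohibited keyword plus words that must (AND) or must not (NOT) appear within a fixed character window around it. The names `Rule`, `match_rule`, `flag_ad`, the window sizes, and the sample rules are all hypothetical.

```python
import re
from dataclasses import dataclass, field
from typing import List


@dataclass
class Rule:
    """Hypothetical rule: a keyword plus context conditions (all lowercase)."""
    keyword: str
    required: List[str] = field(default_factory=list)  # all must be in the window
    excluded: List[str] = field(default_factory=list)  # none may be in the window
    window: int = 20                                   # characters on each side


def match_rule(text: str, rule: Rule) -> bool:
    """Return True if any occurrence of the keyword satisfies its context."""
    text = text.lower()
    for m in re.finditer(re.escape(rule.keyword), text):
        left = text[max(0, m.start() - rule.window):m.start()]
        right = text[m.end():m.end() + rule.window]
        context = left + " " + right
        has_required = all(w in context for w in rule.required)
        has_excluded = any(w in context for w in rule.excluded)
        if has_required and not has_excluded:
            return True
    return False


def flag_ad(text: str, rules: List[Rule]) -> List[str]:
    """Return the keywords of every rule the advertisement text triggers."""
    return [r.keyword for r in rules if match_rule(text, r)]


# Sample rules, invented for illustration only.
RULES = [
    Rule(keyword="cure", excluded=["cannot"]),
    Rule(keyword="100%", required=["effective"], window=15),
]

if __name__ == "__main__":
    print(flag_ad("Guaranteed to cure diabetes in 7 days", RULES))
    print(flag_ad("This product cannot cure any disease.", RULES))
```

Compared with plain keyword matching, the context conditions let a negated use of a word such as "cannot cure" pass, while an unqualified claim is still flagged.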
[Degree-granting institution]: Zhejiang University
[Degree level]: Master's
[Year degree conferred]: 2013
[Classification number]: TP391.1



Article ID: 2442428


Article link: http://www.sikaile.net/wenyilunwen/guanggaoshejilunwen/2442428.html

