
Robust Estimation for Generalized Linear Models and Its Medical Applications

Published: 2018-04-30 11:40

Topic: generalized linear models + robust estimation; source: master's thesis, Shanxi Medical University, 2009

[Abstract]: The generalized linear model (GLM) is a widely applicable class of models that can handle both continuous and discrete response variables, in particular the latter, such as categorical data and count data. This matters in practice, especially for the statistical analysis of biological, medical, economic, and social data. However, the classical fitting method, maximum likelihood estimation (MLE), is easily affected by outliers and can even lead to wrong conclusions. Studying robust estimation methods that effectively resist outliers is therefore important. This thesis reviews and compares four robust estimation methods applicable to GLMs: the Mallows quasi-likelihood estimator, the conditionally unbiased bounded-influence estimator (CUBIF), the Mallows estimator that downweights leverage points, and the consistent misclassification model estimator. Building on the basic theory of robust regression estimation, it first describes in detail the basic ideas and robustness properties of these four methods; the latter two apply only to the logistic regression model. In the simulation analysis, three measures for downweighting in the x direction were considered for the Mallows quasi-likelihood estimator: the hat matrix, the minimum volume ellipsoid (MVE), and the minimum covariance determinant (MCD). Two downweighting functions, Carroll and Huber, were considered for the Mallows leverage-downweighting estimator. The simulations were designed around two common GLMs, logistic regression and Poisson regression. In the simulated samples for each model, outliers of two types (in the y direction only, and in both the x and y directions) were constructed at various proportions, and the ability of the estimators applicable to each model to resist the different types and proportions of outliers was examined. The simulation study led to the following conclusions:

1. Compared with classical MLE, these robust estimators can, to some extent, better resist the influence of outliers and describe the structure that best fits the bulk of the data. They identify more clearly the outliers, the highly influential points in the model, and the structure that deviates from the model. When the data contain no influential points, their estimates are as good as those of classical MLE, but when the conditions for MLE are not met, the robust estimates are far better than MLE.

2. Under both the logistic and the Poisson regression model, the Mallows quasi-likelihood estimator with MVE-based or MCD-based downweighting resisted outliers more strongly than the other estimators. Hat-matrix-based downweighting had a lower breakdown point because the hat matrix is itself not robust.

3. Because its weight function is based on outliers in the x direction, the Mallows leverage-downweighting estimator already fails with as little as 1% of outliers purely in the y direction, but it resists outliers well when x and y are anomalous at the same time. However, because its weight function assigns weight 0 to observations that are outlying in the x direction in order to exclude them, a growing proportion of outliers very easily leads to perfect separation in the logistic regression model, in which case the estimate has no solution. The downweighting also discards a large amount of sample information.

4. The consistent misclassification model estimator performed worse than the preceding two methods but was more robust than MLE. Its drawback is that it may force the downweighting of normal observations.

5. CUBIF is by design a bounded-influence estimator and can account for anomalies in both the x and y directions, but it performed worse than the other robust estimators.

Finally, two real examples are used to explore the practical application of these methods.
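The x-direction downweighting idea discussed in the abstract can be illustrated with a small simulation. The sketch below is not the thesis's implementation: it fits a one-predictor logistic regression by Newton-Raphson, once unweighted (plain MLE) and once with weights that depend only on how far each x lies from the bulk of the data. The weight form `(1 - (d/c)^2)^3` with hard rejection beyond `c`, the tuning constant `c = 6`, the median/MAD distance (a one-dimensional stand-in for MVE or MCD distances), and all function names are assumptions chosen for illustration.

```python
import math
import random
from statistics import median


def fit_logistic(x, y, w=None, iters=50):
    """Weighted logistic regression (intercept + one slope) via Newton-Raphson.

    With w=None this is the ordinary MLE; with weights that depend only on x
    it solves the weighted score equations sum_i w_i (y_i - p_i) (1, x_i) = 0.
    """
    w = w or [1.0] * len(x)
    b0, b1 = 0.0, 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi, wi in zip(x, y, w):
            eta = max(-30.0, min(30.0, b0 + b1 * xi))  # clip to avoid overflow
            p = 1.0 / (1.0 + math.exp(-eta))
            g0 += wi * (yi - p)
            g1 += wi * (yi - p) * xi
            v = wi * p * (1.0 - p)
            h00 += v
            h01 += v * xi
            h11 += v * xi * xi
        det = h00 * h11 - h01 * h01
        d0 = (h11 * g0 - h01 * g1) / det  # Newton step: solve H d = g (2x2)
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < 1e-10:
            break
    return b0, b1


def leverage_weights(x, c=6.0):
    """Hypothetical redescending weights based on a robust distance in x.

    d is the distance from the median in units of the scaled MAD; points with
    d >= c get weight 0, so they are dropped from the fit entirely.
    """
    med = median(x)
    scale = 1.4826 * median([abs(xi - med) for xi in x])
    weights = []
    for xi in x:
        d = abs(xi - med) / scale
        weights.append((1.0 - (d / c) ** 2) ** 3 if d < c else 0.0)
    return weights


# Clean data: true intercept 0, true slope 2.
random.seed(0)
x = [random.gauss(0.0, 1.0) for _ in range(200)]
y = [1 if random.random() < 1.0 / (1.0 + math.exp(-2.0 * xi)) else 0 for xi in x]

# Contamination in both x and y: far-out x values with the "wrong" response.
x += [10.0] * 10
y += [0] * 10

b_mle = fit_logistic(x, y)
b_rob = fit_logistic(x, y, leverage_weights(x))
print(f"MLE slope:      {b_mle[1]:.2f}")
print(f"weighted slope: {b_rob[1]:.2f}")
```

Because the weights depend on x alone, this sketch offers no protection against outliers that are unusual only in y, and zero weights shrink the effective sample. Both limitations match the ones the abstract reports for the leverage-downweighting estimator.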
[Degree-granting institution]: Shanxi Medical University
[Degree level]: Master's
[Year degree conferred]: 2009
[Classification number]: R311



Article link: http://www.sikaile.net/yixuelunwen/shiyanyixue/1824416.html
