
Construction of Propensity Score Models for Multi-Group Comparisons and the Study and Application of Matching Methods

Published: 2018-05-08 20:04

Topics: propensity score matching + nearest-neighbor matching; Source: Second Military Medical University, 2014 doctoral dissertation

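[Summary]:

Background: With the continuing development of information technology, observational studies have grown in number and improved in accuracy, and large-sample observational studies play an increasingly important role in medical research. In an observational study, however, group membership is not randomly assigned but arises naturally, so subjects with certain characteristics are more likely to fall into the treatment group or the control group, which produces confounding bias between groups. The propensity score (PS) method is a common approach to confounding bias in observational studies. It is easy to understand and its steps are highly standardized, and in recent years it has been widely used in non-randomized, large-sample observational studies. Its main applications are matching, stratification, and regression adjustment, of which matching has the greatest advantages and the widest use. Propensity score matching mainly includes nearest-neighbor matching, caliper matching, and Mahalanobis distance matching. Several problems in applying propensity score matching remain unresolved. For example, which types of covariates should enter the propensity score model is still debated; which matching method is superior has not been settled; and propensity score matching is currently used mainly for observational data with a binary grouping factor, with few studies applying it to data with a multi-category grouping factor.

Objective: To construct propensity score matching methods for a grouping factor with three ordered categories. Simulation studies are used to screen the covariates included in the propensity score model, to compare several matching methods when the grouping factor has three ordered categories, to determine by adjusting parameters the most advantageous matching approach for different data characteristics, and to compare different ways of applying the propensity score in the same setting. Finally, the best propensity score matching method established in the simulation study is applied to real data.

Methods: Data sets were simulated by the Monte Carlo method. The grouping factor was simulated as three ordered categories, with the sample size ratio across groups set to 1:1:1, 2:3:5, 1:2:3, and 1:4:5. Covariates of different types were simulated according to their relationship with the grouping factor and the outcome: covariates associated with both the grouping factor and the outcome, covariates associated with the grouping factor only, covariates associated with the outcome only, and covariates associated with neither. By including different types of covariates in the propensity score model, the study determined which types should be included when the grouping factor has three ordered categories. Following the basic idea of propensity score matching for a binary grouping factor, matching methods for a three-category ordered grouping factor were constructed, including nearest-neighbor matching, caliper matching, and Mahalanobis distance matching, and each was implemented as a SAS macro. Different matching parameters, such as the matching ratio and the caliper width, were set within each method, and the most advantageous matching approach for different data characteristics was determined by comparing methods and parameter settings. The simulated data were also used to compare different ways of applying the propensity score: matching, stratification, regression adjustment, and regression adjustment after matching. Propensity scores for subjects under the three-category ordered grouping factor were calculated by ordinal logistic regression. The covariates entered into the propensity score model must be tested for balance before and after matching, and this study used standardized differences (SD) to evaluate covariate balance between groups. A pilot experiment showed that, when the grouping factor has three ordered categories, the covariates are not yet balanced across the three groups if the maximum absolute standardized difference between groups exceeds 0.1. After matching, the bias and precision of the model must also be evaluated. Relative bias (RB) was used to assess bias: the smaller the absolute value of RB, the smaller the bias. Mean squared error (MSE) was used to assess precision: the smaller the MSE, the higher the precision. Finally, the matching method established in the simulation study was applied to a real example. The example data came from the "Epidemiological Survey of Gastrointestinal Diseases in Mainland China" undertaken by the Second Military Medical University. Using the respondents' general information, the physical examination questionnaire, and the SF-36 health survey, the study evaluated the relationship between abdominal obesity and health-related quality of life (HRQOL). Demographic information included sex, age, height, weight, education level, occupation, and chronic disease status. Abdominal characteristics were classified into three categories: "normal waist circumference", "mild abdominal obesity", and "severe abdominal obesity". HRQOL was assessed with the Chinese version of the Short Form 36 Health Survey (SF-36). With abdominal characteristics as the grouping factor and the score on each HRQOL dimension as the outcome, variables selected from the demographic information served as covariates in the propensity score model. The matching method established in the simulation study was used to control the influence of confounders on the outcome and thereby evaluate the effect of abdominal obesity on HRQOL.

Results: (1) Covariate screening: When the grouping factor has three ordered categories, including the covariates associated with the outcome in the propensity score model yields a relatively high proportion of matched subjects, and the estimated treatment effect has the smallest bias and the highest precision. When covariates are removed from the model one at a time, removing a covariate associated with both the grouping factor and the outcome greatly increases the bias of the treatment effect estimate and reduces its precision, which shows that all such covariates must be included. Additionally including covariates associated with the outcome but not with the grouping factor further reduces the bias and increases the precision of the estimate. Therefore, when the grouping factor has three ordered categories, the propensity score model should include the covariates associated with the outcome, whether or not they are associated with the grouping factor. (2) Construction and comparison of matching methods: The study constructed propensity score matching methods for a three-category ordered grouping factor, including nearest-neighbor matching, caliper matching, and Mahalanobis distance matching, and compared them. Caliper matching performed best at every sample size ratio. With a group ratio of 1:1:1, caliper matching (caliper 0.005) with 1:1:1 matching performed best; with a ratio of 2:3:5, caliper matching (caliper 0.01) with 1:1:1 matching performed best; with a ratio of 1:2:3, caliper matching (caliper 0.01) with 1:1:1 matching performed best; and with a ratio of 1:4:5, caliper matching (caliper 0.01) with 1:2:2 matching performed best. (3) Comparison of propensity score applications: All of the propensity score methods greatly reduced the bias of the treatment effect estimate and improved its precision. Regardless of the sample size ratio, matching and regression adjustment after matching outperformed the other methods. With a ratio of 1:1:1, regression adjustment was better than stratification; as the group sizes became more unequal, stratification was better than regression adjustment. (4) Case study: After propensity score matching, all covariates associated with the outcome were balanced across the abdominal characteristic groups, so the effect of abdominal obesity on HRQOL could be evaluated directly. On the physical functioning dimension, the severe abdominal obesity group scored significantly lower than the normal waist circumference group, while the mild abdominal obesity group scored significantly higher than the normal waist circumference group. On the social functioning dimension, only the severe abdominal obesity group scored significantly lower than the normal waist circumference group; the mild abdominal obesity group did not differ significantly from the normal waist circumference group.

Conclusions: When the grouping factor has three ordered categories, the propensity score model should include the covariates associated with the outcome. For propensity score matching, caliper matching performs best, with the caliper width and matching ratio adjusted to the sample size ratio across groups. Among the ways of applying the propensity score, matching and regression adjustment after matching perform best. Compared with traditional multivariable statistical methods, the matching method established here for a three-category ordered grouping factor can quantify differences in a continuous outcome between groups while controlling for confounders.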
[Abstract]: Background:

With the development of information technology, observational studies have increased in number and improved in accuracy, and large-sample observational studies play an increasingly important role in medical research. Several issues in applying propensity score matching remain unresolved: which types of covariates belong in the propensity score model is still debated, and which matching method is most advantageous has not been settled. In addition, propensity score matching is currently used mainly for observational data with a binary grouping factor; few studies have applied it to observational data with a multi-category grouping factor.

Objective:

This study constructs propensity score matching methods for a grouping factor with three ordered categories. Through simulation, it screens the covariates to include in the propensity score model, compares several matching methods under a three-category ordered grouping factor, identifies by adjusting parameters the most advantageous matching approach for different data characteristics, and compares different ways of applying the propensity score in the same setting. Finally, the best matching method established in the simulation study is applied to real data.

Methods:

Data sets were simulated by the Monte Carlo method. The grouping factor was simulated as three ordered categories, with the sample size ratio across groups set to 1:1:1, 2:3:5, 1:2:3, and 1:4:5.
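As an illustration of the simulation design described above, the following is a minimal Python sketch, not the thesis's own program (which was written as SAS macros). Everything specific in it is an assumption made for the example: the function name `simulate_dataset`, the coefficients of 0.5 and 1.0, the latent-variable mechanism with logistic noise, and the linear effect of the group level on the outcome. It generates the four covariate types named in the abstract and cuts a latent score so that the three ordered groups follow a chosen sample size ratio.

```python
import math
import random


def simulate_dataset(n=3000, ratio=(1, 2, 3), effect=2.0, seed=0):
    """Simulate one data set with an ordered three-category grouping factor.

    All coefficients are hypothetical illustration values.
    """
    rng = random.Random(seed)
    subjects = []
    for _ in range(n):
        x_both = rng.gauss(0, 1)      # associated with group and outcome
        x_group = rng.gauss(0, 1)     # associated with group only
        x_outcome = rng.gauss(0, 1)   # associated with outcome only
        x_neither = rng.gauss(0, 1)   # associated with neither
        # Latent score: linear predictor plus standard logistic noise.
        u = min(max(rng.random(), 1e-12), 1 - 1e-12)
        latent = 0.5 * x_both + 0.5 * x_group + math.log(u / (1 - u))
        subjects.append({
            "x_both": x_both, "x_group": x_group,
            "x_outcome": x_outcome, "x_neither": x_neither,
            "latent": latent,
        })

    # Cut the sorted latent scores so group sizes follow `ratio`.
    total = sum(ratio)
    n_low = round(n * ratio[0] / total)
    n_mid = round(n * ratio[1] / total)
    order = sorted(range(n), key=lambda i: subjects[i]["latent"])
    for rank, i in enumerate(order):
        if rank < n_low:
            group = 0
        elif rank < n_low + n_mid:
            group = 1
        else:
            group = 2
        s = subjects[i]
        s["group"] = group
        # Continuous outcome: assumed linear group effect plus covariates.
        s["y"] = effect * group + s["x_both"] + s["x_outcome"] + rng.gauss(0, 1)
    return subjects
```

Calling `simulate_dataset(ratio=r)` once for each of `(1, 1, 1)`, `(2, 3, 5)`, `(1, 2, 3)`, and `(1, 4, 5)`, and repeating with different seeds, would give the kind of replicated data sets a Monte Carlo comparison needs.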

Propensity scores for subjects under the three-category ordered grouping factor were calculated by ordinal logistic regression. Covariate balance between groups was evaluated with standardized differences (SD). A pilot experiment showed that, with a three-category ordered grouping factor, the covariates were not yet balanced across the three groups when the maximum absolute standardized difference between groups exceeded 0.1. After propensity score matching, the bias and precision of the model were evaluated. Relative bias (RB) was used to assess bias: the smaller the absolute value of RB, the smaller the bias. Mean squared error (MSE) was used to assess precision: the smaller the MSE, the higher the precision.
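The three evaluation measures can be sketched in a few lines of Python. The abstract does not give the exact formulas used in the thesis, so the definitions below are the commonly used ones and should be read as assumptions: the standardized difference divides the difference in means by the square root of the average of the two sample variances, relative bias is the mean estimation error divided by the true effect, and MSE is the mean squared estimation error. The function names are hypothetical.

```python
import itertools
import math
import statistics


def standardized_difference(a, b):
    """Standardized difference of one continuous covariate between two groups."""
    pooled_sd = math.sqrt((statistics.variance(a) + statistics.variance(b)) / 2)
    return (statistics.mean(a) - statistics.mean(b)) / pooled_sd


def max_abs_standardized_difference(groups):
    """Largest absolute standardized difference over all pairs of groups."""
    return max(
        abs(standardized_difference(a, b))
        for a, b in itertools.combinations(groups, 2)
    )


def relative_bias(estimates, true_effect):
    """Mean estimate minus the true effect, relative to the true effect."""
    return (statistics.mean(estimates) - true_effect) / true_effect


def mean_squared_error(estimates, true_effect):
    """Average squared distance between the estimates and the true effect."""
    return statistics.mean((e - true_effect) ** 2 for e in estimates)
```

With three groups, a covariate would be treated as unbalanced under the 0.1 rule when `max_abs_standardized_difference([g0, g1, g2]) > 0.1`, where each `g` is the list of that covariate's values in one group.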

Finally, the relationship between abdominal obesity and health-related quality of life (HRQOL) was evaluated using the respondents' general information, the physical examination questionnaire, and the SF-36 health survey. Demographic information included sex, age, height, weight, education level, occupation, and chronic disease status. Abdominal characteristics were classified as "normal waist circumference", "mild abdominal obesity", and "severe abdominal obesity".

Results:

(1) Covariate screening: When the grouping factor has three ordered categories, including the covariates associated with the outcome in the propensity score model yields a relatively high proportion of matched subjects, and the estimated treatment effect has the smallest bias and the highest precision. Removing a covariate associated with both the grouping factor and the outcome greatly increases the bias of the treatment effect estimate and reduces its precision, so all such covariates must be included. Additionally including covariates associated with the outcome but not with the grouping factor further reduces bias and increases precision. Therefore, when the grouping factor has three ordered categories, the covariates associated with the outcome should be included in the propensity score model whether or not they are associated with the grouping factor.

(2) Construction and comparison of matching methods: This study constructed propensity score matching methods for a three-category ordered grouping factor, including nearest-neighbor matching, caliper matching, and Mahalanobis distance matching. Caliper matching performed best at every sample size ratio. When the sample size ratio across groups is 1:1:1, caliper matching (caliper set to 0.005) with 1:1:1 matching performs best.
When the sample size ratio across groups is 2:3:5, caliper matching (caliper set to 0.01) with 1:1:1 matching performs best.
When the sample size ratio across groups is 1:2:3, caliper matching (caliper set to 0.01) with 1:1:1 matching performs best.
When the sample size ratio across groups is 1:4:5, caliper matching (caliper set to 0.01) with 1:2:2 matching performs best.
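The abstract does not describe the matching algorithm itself, which the thesis implemented as SAS macros, so the following Python sketch shows only one plausible reading of 1:1:1 caliper matching across three groups. The assumptions are: each subject has a single scalar propensity score, one group (for instance the smallest) serves as the reference, matching is greedy and without replacement, and a triplet is kept only when both partners lie within the caliper of the reference subject. The function name and the data layout are hypothetical.

```python
def caliper_match_triplets(reference, group_a, group_b, caliper=0.01):
    """Greedy 1:1:1 caliper matching without replacement.

    Each argument is a list of (subject_id, score) pairs. Returns a list of
    (reference_id, a_id, b_id) triplets.
    """
    pool_a = dict(group_a)  # unmatched subjects still available in group A
    pool_b = dict(group_b)  # unmatched subjects still available in group B
    triplets = []
    for ref_id, ref_score in reference:
        picks = []
        for pool in (pool_a, pool_b):
            if not pool:
                break
            # Nearest remaining subject by absolute score difference.
            nearest = min(pool, key=lambda sid: abs(pool[sid] - ref_score))
            if abs(pool[nearest] - ref_score) > caliper:
                break  # nearest candidate is outside the caliper
            picks.append(nearest)
        if len(picks) == 2:
            # Remove partners only when the full triplet is formed.
            del pool_a[picks[0]]
            del pool_b[picks[1]]
            triplets.append((ref_id, picks[0], picks[1]))
    return triplets
```

A reference subject with no partner inside the caliper in either of the other groups is left unmatched, which is why a narrower caliper such as 0.005 trades a smaller matched sample for closer matches. A 1:2:2 variant would draw two partners per group instead of one.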

(3) Comparison of propensity score applications: All of the propensity score methods greatly reduced the bias of the treatment effect estimate and improved its precision. Regardless of the sample size ratio, matching and regression adjustment after matching outperformed the other methods. When the sample size ratio across groups is 1:1:1, regression adjustment is better than stratification.
As the group sizes become more unequal, stratification is better than regression adjustment.

(4) Case study: After propensity score matching, all covariates associated with the outcome were balanced across the abdominal characteristic groups, so the effect of abdominal obesity on health-related quality of life could be evaluated directly. On the physical functioning dimension, the severe abdominal obesity group scored significantly lower than the normal waist circumference group, while the mild abdominal obesity group scored significantly higher than the normal waist circumference group. On the social functioning dimension, only the severe abdominal obesity group scored significantly lower than the normal waist circumference group.

Conclusions:

When the grouping factor has three ordered categories, the covariates associated with the outcome should be included in the propensity score model. Caliper matching performs best, with the caliper width and matching ratio adjusted to the sample size ratio across groups. Among the ways of applying the propensity score, matching and regression adjustment after matching perform best. Compared with traditional multivariable statistical methods, the matching method established in this study can quantify differences in a continuous outcome between groups while controlling for confounders.

[Degree-granting institution]: Second Military Medical University
[Degree level]: Doctoral
[Year degree granted]: 2014
[Classification number]: R181.2
