Research on Disease Risk Analysis Methods Based on Multi-Label Health Check-up Data
[Abstract]: Health check-ups are an important part of disease prevention. Doctors can analyze underlying symptoms on the basis of an individual's check-up results and then provide health guidance. Traditionally, experienced doctors assess overall health condition and disease risk from the examination results for each part of the body. As the volume of data grows and doctors' experience varies, this manual analysis can no longer meet the increasing demand for physical examinations in terms of efficiency and accuracy. With the development of data mining technology, artificial intelligence and machine learning methods have been widely used in computer-aided medical diagnosis and disease risk analysis. Data preprocessing is one of the important steps in machine learning. In physical examination data, results often differ considerably between individuals. For a given feature, the standard deviation of its values across the whole population is relatively large, and the number of values below the mean is far higher than the number above it, so the distribution of the data is extremely uneven. Traditional data normalization methods do not handle this problem well. It can be addressed by a mathematical transformation, which also improves the convergence speed and precision of the model to a certain extent.
The main work of this paper is as follows: (1) The FN (Fusion normalization) method is proposed to stabilize the features and normalize the feature values to (0,1). (2) For the multi-label problem, three combined models, SVMs, GBDTs, and LRs, are built on the SVM, GBDT, and LR classifiers respectively to handle multi-label medical data. (3) Because the normal population is larger than the abnormal population, the data are imbalanced; to deal with this, a different penalty factor is set for each label according to the ratio of that label's data. The data set contains 62 features such as gender, fasting blood glucose, hypertension, diabetes, and fatty liver. The data types in the data set are character and numeric. The experimental results show that, in the combined models SVMs, GBDTs, and LRs, the accuracy obtained with the FN (Fusion normalization) method is improved to some extent compared with non-normalized data, min-max normalization, and standard normalization.
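The abstract does not give the formula of the FN method or the exact penalty factors, so the following is only a minimal sketch of the two ideas under stated assumptions, not the thesis's implementation. It assumes a log transform followed by min-max rescaling as one possible "mathematical transformation" for a right-skewed, non-negative feature, and it assumes the penalty factor for a label's abnormal class is the ratio of normal to abnormal samples. The function names `log_minmax` and `positive_class_weights` and the toy values are hypothetical.

```python
import math


def log_minmax(values, eps=1e-6):
    """Compress a right-skewed, non-negative feature with log1p,
    then rescale it into the open interval (0, 1).

    Assumption: this is a generic stand-in, not the FN method itself.
    """
    logged = [math.log1p(v) for v in values]
    lo, hi = min(logged), max(logged)
    if hi == lo:
        # A constant feature carries no information; map it to the midpoint.
        return [0.5 for _ in logged]
    span = hi - lo
    # The eps margin keeps every result strictly inside (0, 1).
    return [eps + (1 - 2 * eps) * (x - lo) / span for x in logged]


def positive_class_weights(label_rows):
    """Return one penalty factor per label: n_normal / n_abnormal.

    label_rows holds one list of 0/1 labels per person
    (1 = abnormal, 0 = normal). Rarer abnormal labels get larger factors.
    """
    n_labels = len(label_rows[0])
    weights = []
    for j in range(n_labels):
        pos = sum(row[j] for row in label_rows)
        neg = len(label_rows) - pos
        # Fall back to 1.0 when a label has no abnormal samples.
        weights.append(neg / pos if pos else 1.0)
    return weights


# Made-up toy values: most readings are low, one is far above the rest.
glucose = [4.8, 5.1, 5.0, 5.3, 4.9, 16.2]
scaled = log_minmax(glucose)

# Made-up toy labels: columns are two hypothetical disease labels.
labels = [[0, 0], [0, 1], [0, 0], [1, 0], [0, 0], [0, 1]]
weights = positive_class_weights(labels)
```

In a per-label (one classifier per disease) setup, each factor could be passed as the weight of the abnormal class when training that label's classifier, so that misclassifying a rare abnormal case costs more than misclassifying a normal one.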
[Degree-granting institution]: Zhengzhou University
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: R194.3; TP18