The Effect of the Choice of Response Values in Linear Regression Models on Binary Classification
Published: 2018-05-29 04:07
Topic: binary classification + linear regression model; Source: North China Electric Power University (Beijing), master's thesis, 2016
[Abstract]: This thesis studies, under the multiple linear regression model, how the choice of response values and of the critical value (cutoff) affects classification between two populations. First, taking the critical value in the discriminant rule to be either the mean or the midpoint of the response values, we discuss in each case how the choice of response values affects binary classification of balanced and unbalanced data. We also fix the critical value at the mean of the response values, assign the response variable three different sets of values, and compare the resulting discriminant rules with classical discriminant analysis methods such as the distance discriminant method and the Bayes discriminant method, identifying the connections among them and their respective advantages and disadvantages. In addition, with the response values held fixed, we examine the three discriminant rules obtained from three critical values and, by the criterion of minimum misclassification probability, select the most suitable critical value. Building on this theory, we use R and 5-fold cross-validation to analyze balanced simulated data, unbalanced simulated data, and the real WDBC data set in the three cases where the response variable takes each of three sets of values and the critical value is the mean of the response values; the simulation results agree with the theory. Furthermore, we simulate the misclassification probabilities for the nine cases formed by combining three sets of response values with three critical values (0, the mean of the response values, or the midpoint of the response values). The results agree with the theoretical proofs, reveal the relationships among the nine cases, and identify the critical value that gives the smaller misclassification rate, so that new data can be classified more accurately.
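The procedure the abstract describes (code the two classes as numeric responses, fit a linear regression, classify by comparing the fitted value with a critical value, and estimate the misclassification rate by 5-fold cross-validation) can be sketched roughly as follows. This is an illustrative Python sketch, not the thesis's R code: the simulated Gaussian data, the function names `make_data`, `fit_ols`, and `cv_error`, and the two example codings (0, 1) and (-1, 1) are all assumptions, since the abstract does not list the three sets of response values it uses.

```python
import random

def make_data(n0, n1, seed=0):
    """Hypothetical simulated data: class 0 ~ N((0,0), I), class 1 ~ N((2,2), I)."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(n0)]
    X += [[rng.gauss(2, 1), rng.gauss(2, 1)] for _ in range(n1)]
    return X, [0] * n0 + [1] * n1

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def fit_ols(X, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    Z = [[1.0] + list(row) for row in X]
    p = len(Z[0])
    A = [[sum(z[i] * z[j] for z in Z) for j in range(p)] for i in range(p)]
    b = [sum(z[i] * yi for z, yi in zip(Z, y)) for i in range(p)]
    return solve(A, b)

def cv_error(X, labels, codes, cutoff_rule, k=5, seed=0):
    """k-fold cross-validated misclassification rate.

    codes: (a, b), the response values for class 0 and class 1, with a < b.
    cutoff_rule: "zero", "mean" (mean of the training responses),
                 or "midpoint" ((a + b) / 2).
    """
    a, b = codes
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    errors = 0
    for fold in (idx[i::k] for i in range(k)):
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        y = [b if labels[i] == 1 else a for i in train]
        beta = fit_ols([X[i] for i in train], y)
        if cutoff_rule == "zero":
            cutoff = 0.0
        elif cutoff_rule == "mean":
            cutoff = sum(y) / len(y)
        else:
            cutoff = (a + b) / 2
        for i in fold:
            fitted = beta[0] + sum(c * v for c, v in zip(beta[1:], X[i]))
            # Assign to class 1 when the fitted response exceeds the cutoff.
            errors += (fitted > cutoff) != (labels[i] == 1)
    return errors / len(X)

if __name__ == "__main__":
    X, labels = make_data(150, 50, seed=1)  # unbalanced: 150 vs 50
    for codes in [(0.0, 1.0), (-1.0, 1.0)]:
        for rule in ("zero", "mean", "midpoint"):
            print(codes, rule, cv_error(X, labels, codes, rule))
```

One general property of least squares with an intercept (stated here as background, not as a result quoted from the thesis) is that an increasing affine recoding of the response shifts and scales the fitted values in the same way. The mean and midpoint cutoffs move with the recoding, so the resulting classifications should coincide, whereas a fixed cutoff of 0 does not move and can give different results.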
[Degree-granting institution]: North China Electric Power University (Beijing)
[Degree level]: Master's
[Year degree granted]: 2016
[Classification number]: O212.1
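[Related literature]
Related master's theses (top 1)
1 Yang Yanli; The Effect of the Choice of Response Values in Linear Regression Models on Binary Classification [D]; North China Electric Power University (Beijing); 2016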
Article ID: 1949424
Article link: http://www.sikaile.net/kejilunwen/yysx/1949424.html