Dr. Kuang-Fu Cheng 鄭光甫
Goodness-of-Fit Tests for Regression Models
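Linear and nonlinear regression models are widely used in many fields. Their basic assumption is that the n sets of observations follow a regression relationship with regression function m(·). Most of the existing literature assumes that m(·) belongs to a given family of functions whose regression coefficient θ is an unknown parameter that must be estimated or tested. To avoid erroneous conclusions in statistical inference, however, one should first test whether this assumption is appropriate, that is, test whether m(·) really belongs to the assumed family. This is the so-called goodness-of-fit test. Although statistics has been developing for a long time, systematic research on this problem only began in the 1980s.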
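In standard notation, the goodness-of-fit problem described above might be written as follows. This is only a sketch that assumes an additive-error regression model; the symbols $X_i$, $Y_i$, $\varepsilon_i$, and $\Theta$ are illustrative assumptions rather than the authors' own notation.

```latex
% Assumed data model: n observations with an unknown regression function m
Y_i = m(X_i) + \varepsilon_i, \qquad i = 1, \dots, n,
\qquad E(\varepsilon_i \mid X_i) = 0 .

% Parametric family indexed by the unknown coefficient \theta
\mathcal{M} = \{\, m(\cdot, \theta) : \theta \in \Theta \,\}

% Goodness-of-fit hypotheses
H_0 : m(\cdot) \in \mathcal{M}
\qquad \text{versus} \qquad
H_1 : m(\cdot) \notin \mathcal{M}
```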
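We (Cheng and Wu (1998)) considered a testing problem for the mean function in which g(·) is a known function and β is an unknown p-dimensional row vector of parameters. We proposed a new test whose properties depend on the choice of a weight function, and we showed that the optimal choice of weight function is tied to the direction of the local alternative.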
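In our earlier papers (Cheng and Wu (1994a, 1994b)), we considered a similar problem in the quasi-likelihood setting and used the "information equivalence" relationship to propose a new test statistic. Simulations showed that the power of this statistic even exceeds that of the widely used deviance statistic.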
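For readers unfamiliar with the deviance statistic mentioned above as the usual benchmark, the following is a minimal sketch of a deviance goodness-of-fit check for a logistic regression fitted to grouped binomial data. It is not the "information equivalence" test or any of the authors' procedures. The data, the function names `fit_logistic` and `deviance`, and the single-covariate model are all hypothetical choices made for illustration.

```python
import math

def fit_logistic(x, n, y, iters=25):
    """Fit logit(p) = a + b*x to grouped binomial data by Newton-Raphson."""
    a, b = 0.0, 0.0
    for _ in range(iters):
        s0 = s1 = i00 = i01 = i11 = 0.0  # score vector and information matrix
        for xi, ni, yi in zip(x, n, y):
            p = 1.0 / (1.0 + math.exp(-(a + b * xi)))
            r = yi - ni * p              # observed minus fitted count
            w = ni * p * (1.0 - p)       # binomial variance weight
            s0 += r
            s1 += r * xi
            i00 += w
            i01 += w * xi
            i11 += w * xi * xi
        det = i00 * i11 - i01 * i01
        # Newton step: (a, b) += information^{-1} * score
        a += (i11 * s0 - i01 * s1) / det
        b += (i00 * s1 - i01 * s0) / det
    return a, b

def deviance(x, n, y, a, b):
    """Binomial deviance of the fitted model against the saturated model."""
    d = 0.0
    for xi, ni, yi in zip(x, n, y):
        mu = ni / (1.0 + math.exp(-(a + b * xi)))  # fitted count
        if yi > 0:
            d += yi * math.log(yi / mu)
        if yi < ni:
            d += (ni - yi) * math.log((ni - yi) / (ni - mu))
    return 2.0 * d

# Hypothetical grouped data: 5 covariate levels, 50 trials each.
dose = [0, 1, 2, 3, 4]
trials = [50] * 5
events_linear = [4, 10, 22, 36, 45]   # roughly linear on the logit scale
events_ushape = [30, 12, 5, 12, 30]   # U-shaped, so a linear logit is wrong

# 5% critical value of chi-square with 5 groups - 2 parameters = 3 df.
CHI2_95_DF3 = 7.815

for label, events in (("linear", events_linear), ("u-shape", events_ushape)):
    a, b = fit_logistic(dose, trials, events)
    d = deviance(dose, trials, events, a, b)
    verdict = "reject fit" if d > CHI2_95_DF3 else "do not reject fit"
    print(label, round(d, 2), verdict)
```

The chi-square reference distribution for the deviance is only a large-sample approximation and relies on grouped data with reasonably large group sizes.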
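In recent research we have focused on goodness-of-fit testing for the logistic regression model, the model most widely used in biostatistics. We found that an approximate score test derived from a certain random effects model is closely related to the test built on the "information equivalence" relationship; simulations also show that this approximate score test further improves power. The score test can also be applied to the analysis of case-control data (Cheng and Chen (2001)). We proved that in large samples it is more efficient than other methods (Qin & Zhang (1997), Biometrika; Zhang (1999), Biometrika). More recently the new method has been extended to the analysis of two-stage case-control data. Extending the theory to two-stage case-control data involves a wide range of issues, the most important of which include the theory of the semiparametric MLE. These studies cover classical existence theory and large-sample distributional properties, and these properties are shown to depend on the second-stage sampling design.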
Goodness-of-Fit Tests for Regression Models
In many fields of application, linear or nonlinear regression models play an important role. Basically, one assumes that the n observations follow a regression model with mean function m(·). Much of the existing literature is concerned with parametric modeling, in that m(·) is assumed to belong to a given family of functions indexed by an unknown parameter. Interestingly enough, a systematic study of goodness-of-fit problems for such models only started in the late 1980s. We (Cheng and Wu (1998)) considered a test of the mean function hypothesis.
“Information equivalence” testing is a general test procedure. Recently, we found that when this procedure is used to test the logistic regression model, the most popular model, it can be characterized by the approximate score of certain random effects models. This enables us to further improve the “information equivalence” test. Such a score-type statistic can also be applied in case-control data analysis (Cheng & Chen (2001)). We showed that this test procedure is asymptotically more efficient than the tests proposed by Qin & Zhang (1997), Biometrika, and Zhang (1999), Biometrika. The goodness-of-fit problem was also studied with two-stage case-control data. The theoretical development of the properties of such a test is much more involved; it includes the semiparametric MLE and the corresponding asymptotic distribution.
Selected Recent Publications:
1. Cheng, K. F., and Wu, J. W. (1994a), “Testing Goodness of Fit for a Parametric Family of Link Functions.” JASA 89, 657-664.
2. Cheng, K. F., and Wu, J. W. (1994b), “Adjusted least squares estimates for regression coefficients with censored data.” JASA 89, 1483-1491.
3. Cheng, K. F., and Chu, C. K. (1995), “Nonparametric Regression Estimates Using Misclassified Responses.” Biometrika 82, 315-325.
4. Cheng, K. F., Hsueh, H. M., and Chieh, T. H. (1998), “Goodness of Fit Tests with Misclassified Data.” Comm. in Statistics 27, 1379-1394.
5. Cheng, K. F., and Wu, J. W. (1998), “An Optimal Test for the Mean Function Hypothesis.” Statistica Sinica 8, 477-487.
6. Cheng, K. F., and Hsueh, H. M. (2000), “Correcting Bias Due to Misclassification in the Estimation of Logistic Regression Models.” Accepted by Statistics and Probability Letters subject to minor revision.
7. Cheng, K. F., and Cheng, L. (2001), “Testing goodness-of-fit of a logistic regression model with case-control data.” To appear in Biometrics.