Comparing the Predictive Powers of Alternative Multiple Regression Models

Published online by Cambridge University Press:  01 January 2025

Michael R. Hagerty
Affiliation:
Graduate School of Management, University of California, Davis
V. Srinivasan*
Affiliation:
Graduate School of Business, Stanford University
* Requests for reprints should be addressed to V. Srinivasan, Graduate School of Business, Stanford University, Stanford, CA 94305-5015.

Abstract

Mean squared error of prediction is used as the criterion for determining which of two multiple regression models (not necessarily nested) is more predictive. We show that an unrestricted (or true) model with $t$ parameters should be chosen over a restricted (or misspecified) model with $m$ parameters if $(P_t^2 - P_m^2) > (1 - P_t^2)(t - m)/n$, where $P_t^2$ and $P_m^2$ are the population coefficients of determination of the unrestricted and restricted models, respectively, and $n$ is the sample size. The left-hand side of the above inequality represents the squared bias in prediction by using the restricted model, and the right-hand side gives the reduction in variance of prediction error by using the restricted model. Thus, model choice amounts to the classical statistical tradeoff of bias against variance. In practical applications, we recommend that $P^2$ be estimated by adjusted $R^2$. Our recommendation is equivalent to performing the $F$-test for model comparison, and using a critical value of $2 - (m/n)$; that is, if $F > 2 - (m/n)$, the unrestricted model is recommended; otherwise, the restricted model is recommended.
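The decision rule in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. The function names (`adjusted_r2`, `prefer_unrestricted`, `f_statistic`) and the example numbers are hypothetical. The sketch assumes that the parameter counts t and m include the intercept, so that adjusted R^2 is computed as 1 - (1 - R^2)(n - 1)/(n - p); under that assumption, and for nested models, the adjusted-R^2 rule and the F > 2 - (m/n) rule give the same answer.

```python
def adjusted_r2(r2: float, n: int, p: int) -> float:
    """Adjusted R^2 for a model with p estimated parameters.

    Assumption (not stated in the abstract): p counts the intercept,
    so the residual degrees of freedom are n - p.
    """
    if n <= p:
        raise ValueError("sample size n must exceed the number of parameters p")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p)


def prefer_unrestricted(r2_t: float, r2_m: float, t: int, m: int, n: int) -> bool:
    """Apply the rule (P_t^2 - P_m^2) > (1 - P_t^2)(t - m)/n,
    with each population P^2 estimated by the sample adjusted R^2."""
    adj_t = adjusted_r2(r2_t, n, t)
    adj_m = adjusted_r2(r2_m, n, m)
    squared_bias = adj_t - adj_m                       # left-hand side
    variance_reduction = (1.0 - adj_t) * (t - m) / n   # right-hand side
    return squared_bias > variance_reduction


def f_statistic(r2_t: float, r2_m: float, t: int, m: int, n: int) -> float:
    """Usual F statistic for comparing nested models (requires t > m),
    written in terms of the unadjusted sample R^2 values."""
    return ((r2_t - r2_m) / (t - m)) / ((1.0 - r2_t) / (n - t))


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to exercise the functions.
    n, t, m = 50, 5, 3
    for r2_t, r2_m in [(0.60, 0.55), (0.60, 0.58)]:
        by_rule = prefer_unrestricted(r2_t, r2_m, t, m, n)
        by_f = f_statistic(r2_t, r2_m, t, m, n) > 2.0 - m / n
        print(r2_t, r2_m, by_rule, by_f)
```

The inputs to both functions are the ordinary (unadjusted) sample R^2 values; the adjustment is applied inside `prefer_unrestricted`. For non-nested models only the adjusted-R^2 form applies, since the F statistic above is defined for nested comparisons.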

Type
Original Paper
Copyright
Copyright © 1991 The Psychometric Society

Footnotes

The authors thank three reviewers and the editor for their useful comments on an earlier version of this manuscript.
