Automatic selection of reliability estimates for individual regression predictions

Published online by Cambridge University Press:  01 March 2010

Zoran Bosnić*
Affiliation:
University of Ljubljana, Faculty of Computer and Information Science, Tržaška 25, Ljubljana, Slovenia
Igor Kononenko*
Affiliation:
University of Ljubljana, Faculty of Computer and Information Science, Tržaška 25, Ljubljana, Slovenia

Abstract

In machine learning and its risk-sensitive applications (e.g. medicine, engineering, business), reliability estimates for individual predictions provide more information about the individual prediction error (the difference between the true label and the regression prediction) than the average accuracy of the predictive model (e.g. relative mean squared error). Furthermore, they enable users to distinguish between more and less reliable predictions. Empirical evaluations of the existing individual reliability estimates revealed that the performance of the successful estimates depends on the regression model used and on the particular problem domain. In the current paper, we focus on this dependence and propose and empirically evaluate two approaches for automatic selection of the most appropriate estimate for a given domain and regression model: the internal cross-validation approach and the meta-learning approach. The testing results of both approaches demonstrated that dynamically chosen reliability estimates outperform the individual reliability estimates. The best results were achieved using the internal cross-validation procedure, where the reliability estimates correlated significantly and positively with the prediction error in 73% of experiments. In addition, preliminary testing of the proposed methodology on a medical domain demonstrated its potential for use in practice.
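The internal cross-validation idea can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the k-nearest-neighbour regressor, the two candidate estimates (`local_variance` and `mean_distance`), the synthetic data, and the choice of Pearson correlation with the absolute error as the selection criterion. The sketch only shows the general pattern of scoring each candidate estimate on held-out folds and keeping the one that tracks the prediction error best.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def neighbours(train, x, k):
    """The k training pairs (xi, yi) closest to x (1-D inputs)."""
    return sorted(train, key=lambda p: abs(p[0] - x))[:k]


def knn_predict(train, x, k=3):
    """Stand-in regression model: mean label of the k nearest neighbours."""
    nb = neighbours(train, x, k)
    return sum(y for _, y in nb) / len(nb)


# Hypothetical candidate estimates: a larger value means "less reliable".
def local_variance(train, x, k=3):
    ys = [y for _, y in neighbours(train, x, k)]
    m = sum(ys) / len(ys)
    return sum((y - m) ** 2 for y in ys) / len(ys)


def mean_distance(train, x, k=3):
    nb = neighbours(train, x, k)
    return sum(abs(xi - x) for xi, _ in nb) / len(nb)


def select_estimate(data, estimates, n_folds=5):
    """Return the estimate whose held-out values correlate best with |error|."""
    errors = []
    scores = {name: [] for name in estimates}
    for f in range(n_folds):
        held_out = data[f::n_folds]
        train = [p for i, p in enumerate(data) if i % n_folds != f]
        for x, y in held_out:
            errors.append(abs(y - knn_predict(train, x)))
            for name, fn in estimates.items():
                scores[name].append(fn(train, x))
    corr = {name: pearson(vals, errors) for name, vals in scores.items()}
    return max(corr, key=corr.get), corr


# Synthetic data (assumed): noise grows with x, so errors vary by region.
random.seed(0)
data = []
for _ in range(60):
    x = random.uniform(0.0, 10.0)
    data.append((x, math.sin(x) + random.gauss(0.0, 0.05 * x)))

candidates = {"local_variance": local_variance, "mean_distance": mean_distance}
best, corr = select_estimate(data, candidates)
print("selected estimate:", best)
print("correlations with |error|:", corr)
```

Because the selection uses only the training data, the chosen estimate can differ from one domain or regression model to another, which is the behaviour the abstract refers to as dynamic choice.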

Type
Articles
Copyright
Copyright © Cambridge University Press 2010
