Book contents
- Frontmatter
- Contents
- Preface
- Acronyms
- 1 Introduction
- 2 Machine Learning and Statistics Overview
- 3 Performance Measures I
- 4 Performance Measures II
- 5 Error Estimation
- 6 Statistical Significance Testing
- 7 Datasets and Experimental Framework
- 8 Recent Developments
- 9 Conclusion
- Appendix A Statistical Tables
- Appendix B Additional Information on the Data
- Appendix C Two Case Studies
- Bibliography
- Index
4 - Performance Measures II
Published online by Cambridge University Press: 05 August 2011
Summary
Our discussion in the last chapter focused on performance measures that rely solely on the information in the confusion matrix. Consequently, it did not consider measures that either incorporate information beyond what the confusion matrix conveys or account for classifiers that are not discrete. In this chapter, we extend our discussion to some of these measures, focusing in particular on measures associated with scoring classifiers. A scoring classifier typically outputs a real-valued score for each instance. This score need not be the probability that the test instance belongs to a class, although probabilistic classifiers can be considered a special case of scoring classifiers. The scores that a classifier outputs over the test instances can then be thresholded to obtain class memberships (e.g., all examples with scores above the threshold are labeled as positive, whereas those with scores below it are labeled as negative). Graphical analysis methods and the associated performance measures have proven to be very effective tools for studying both the behavior and the performance of such scoring classifiers. Among these, receiver operating characteristic (ROC) analysis has shown significant promise and has therefore gained considerable popularity as a graphical measure of choice. We discuss ROC analysis in significant detail. We also discuss some alternative graphical measures that can be applied depending on the application domain and the assessment criterion of interest.
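As a rough illustration of the thresholding idea described in the summary, the following Python sketch turns real-valued scores into class labels and then sweeps the threshold to trace ROC points. It is not taken from the chapter: the function names (`threshold_scores`, `roc_points`, `auc`), the toy scores and labels, and the convention that a score equal to the threshold counts as positive are all assumptions made for this example. The area under the curve is computed with the trapezoidal rule, one common choice.

```python
def threshold_scores(scores, threshold):
    """Label an instance positive (1) if its score reaches the threshold, else negative (0).

    Assumption: a score exactly equal to the threshold is labeled positive.
    """
    return [1 if s >= threshold else 0 for s in scores]


def roc_points(scores, labels):
    """Return (false positive rate, true positive rate) pairs, one per candidate threshold.

    Assumes binary labels (1 = positive, 0 = negative) with at least one of each.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]  # threshold above every score: nothing is labeled positive
    # Lowering the threshold past each distinct score labels more instances positive.
    for t in sorted(set(scores), reverse=True):
        predicted = threshold_scores(scores, t)
        tp = sum(1 for p, y in zip(predicted, labels) if p == 1 and y == 1)
        fp = sum(1 for p, y in zip(predicted, labels) if p == 1 and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under a piecewise-linear ROC curve, by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


if __name__ == "__main__":
    # Hypothetical scores from some scoring classifier, with true labels.
    scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4]
    labels = [1, 1, 0, 1, 0, 0]
    print(threshold_scores(scores, 0.65))
    for fpr, tpr in roc_points(scores, labels):
        print(f"FPR={fpr:.2f}  TPR={tpr:.2f}")
    print(f"AUC={auc(roc_points(scores, labels)):.4f}")
```

Each threshold yields one confusion matrix and hence one point in ROC space; sweeping the threshold is what turns a single scoring classifier into a whole curve rather than a single point.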
- Type: Chapter
- Information: Evaluating Learning Algorithms: A Classification Perspective, pp. 111-160
- Publisher: Cambridge University Press
- Print publication year: 2011