7 - Statistical Analysis
from Part II - Evaluation for Classification
Published online by Cambridge University Press: 07 November 2024
Summary
In Chapter 7, the history of statistical analysis is reviewed and its legacy discussed. Four situations of interest to machine learning evaluation are subsequently discussed within different statistical paradigms: the comparison of two classifiers on a single domain; the comparison of multiple classifiers on a single domain; the comparison of two classifiers on multiple domains; and the comparison of multiple classifiers on multiple domains. The three statistical paradigms considered for each of these situations are the null hypothesis statistical testing (NHST) setting; an enhanced Fisher-flavored methodology that adds the notions of confidence intervals, effect size, and power analysis to NHST; and a newer approach based on Bayesian reasoning.
- Type: Chapter
- Information: Machine Learning Evaluation: Towards Reliable and Responsible AI, pp. 154-208
- Publisher: Cambridge University Press
- Print publication year: 2024