Book contents
- Frontmatter
- Contents
- List of contributors
- Preface
- Frequently used notations and symbols
- 1 Algebraic and geometric methods in statistics
- Part I Contingency tables
- 2 Maximum likelihood estimation in latent class models for contingency table data
- 3 Algebraic geometry of 2×2 contingency tables
- 4 Model selection for contingency tables with algebraic statistics
- 5 Markov chains, quotient ideals and connectivity with positive margins
- 6 Algebraic modelling of category distinguishability
- 7 The algebraic complexity of maximum likelihood estimation for bivariate missing data
- 8 The generalised shuttle algorithm
- Part II Designed experiments
- Part III Information geometry
- Part IV Information geometry and algebraic statistics
- Part V On-line supplements
2 - Maximum likelihood estimation in latent class models for contingency table data
from Part I - Contingency tables
Published online by Cambridge University Press: 27 May 2010
Summary
Abstract
Statistical models with latent structure have a history going back to the 1950s and have seen widespread use in the social sciences and, more recently, in computational biology and in machine learning. Here we study the basic latent class model proposed originally by the sociologist Paul F. Lazarsfeld for categorical variables, and we explain its geometric structure. We draw parallels between the statistical and geometric properties of latent class models and we illustrate geometrically the causes of many problems associated with maximum likelihood estimation and related statistical inference. In particular, we focus on issues of non-identifiability and determination of the model dimension, of maximisation of the likelihood function and on the effect of symmetric data. We illustrate these phenomena with a variety of synthetic and real-life tables, of different dimension and complexity. Much of the motivation for this work stems from the ‘100 Swiss Francs’ problem, which we introduce and describe in detail.
Introduction
Latent class (LC) or latent structure analysis models were introduced in the 1950s in the social science literature to model the distribution of dichotomous attributes based on a survey sample from a population of individuals organised into distinct homogeneous classes on the basis of an unobservable attitudinal feature. See (Anderson 1954, Gibson 1955, Madansky 1960) and, in particular, (Henry and Lazarsfeld 1968). These models were later generalised in (Goodman 1974, Haberman 1974, Clogg and Goodman 1984) as models for the joint marginal distribution of a set of manifest categorical variables, assumed to be conditionally independent given an unobservable or latent categorical variable, building upon the then recently developed literature on log-linear models for contingency tables.
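As a rough illustration of the conditional independence assumption described above, the following sketch computes the joint marginal distribution of the manifest variables by summing, over the latent classes, the class probability times the product of the per-variable conditional probabilities. The function name, the argument names and the numerical values are hypothetical and chosen only for illustration; they are not taken from the chapter.

```python
from itertools import product


def latent_class_joint(class_probs, cond_probs):
    """Joint marginal distribution of the manifest variables.

    class_probs[h]      : P(H = h) for latent class h (hypothetical name)
    cond_probs[i][h][x] : P(X_i = x | H = h) (hypothetical name)
    Returns a dict mapping each cell (x_1, ..., x_k) to its probability.
    """
    # Number of levels of each manifest variable, read from class 0.
    levels = [range(len(var[0])) for var in cond_probs]
    table = {}
    for cell in product(*levels):
        total = 0.0
        for h, weight in enumerate(class_probs):
            term = weight
            # Conditional independence: multiply the per-variable
            # conditional probabilities within latent class h.
            for i, x in enumerate(cell):
                term *= cond_probs[i][h][x]
            total += term
        table[cell] = total
    return table


# Illustrative values only: two latent classes, two binary manifest variables.
class_probs = [0.5, 0.5]
cond_probs = [
    [[0.9, 0.1], [0.2, 0.8]],  # P(X_1 | H = 0), P(X_1 | H = 1)
    [[0.7, 0.3], [0.4, 0.6]],  # P(X_2 | H = 0), P(X_2 | H = 1)
]
table = latent_class_joint(class_probs, cond_probs)
```

The resulting table is a mixture of independence tables, one per latent class; the latent variable itself does not appear in the output, which is why only the manifest margin is observed in data.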
- Type: Chapter
- Information: Algebraic and Geometric Methods in Statistics, pp. 27-62
- Publisher: Cambridge University Press
- Print publication year: 2009