Book contents
- Frontmatter
- Contents
- List of contributors
- Preface
- Frequently used notations and symbols
- 1 Algebraic and geometric methods in statistics
- Part I Contingency tables
- 2 Maximum likelihood estimation in latent class models for contingency table data
- 3 Algebraic geometry of 2×2 contingency tables
- 4 Model selection for contingency tables with algebraic statistics
- 5 Markov chains, quotient ideals and connectivity with positive margins
- 6 Algebraic modelling of category distinguishability
- 7 The algebraic complexity of maximum likelihood estimation for bivariate missing data
- 8 The generalised shuttle algorithm
- Part II Designed experiments
- Part III Information geometry
- Part IV Information geometry and algebraic statistics
- Part V On-line supplements
7 - The algebraic complexity of maximum likelihood estimation for bivariate missing data
from Part I - Contingency tables
Published online by Cambridge University Press: 27 May 2010
Abstract
We study the problem of maximum likelihood estimation for general patterns of bivariate missing data for normal and multinomial random variables, under the assumption that the data is missing at random (MAR). For normal data, the score equations have nine complex solutions, at least one of which is real and statistically relevant. Our computations suggest that the number of real solutions is related to whether or not the MAR assumption is satisfied. In the multinomial case, all solutions to the score equations are real and the number of real solutions grows exponentially in the number of states of the underlying random variables, though there is always precisely one statistically relevant local maximum.
Introduction
A common problem in statistical analysis is dealing with missing data in some of the repeated measures of response variables. A typical instance arises during longitudinal studies in the social and biological sciences, when participants may miss appointments or drop out of the study altogether. Over very long-term studies nearly all measurements will involve some missing data, so it is usually impractical to throw out the incomplete cases. Furthermore, the underlying cause of the missing data (e.g. a subject dies) might play an important role in inference, and ignoring it, as a complete-case analysis does, can lead to false conclusions. Thus, specialised techniques are needed in the setting where some of the data is missing. A useful reference for this material is (Little and Rubin 2002), from which we will draw notation and definitions. See also (Dempster et al. 1977) and (Little and Rubin 1983) for reviews, and (Rubin 1976) for an early reference.
- Type: Chapter
- Information: Algebraic and Geometric Methods in Statistics, pp. 123-134
- Publisher: Cambridge University Press
- Print publication year: 2009
- Cited by: 4