Book contents
- Frontmatter
- Contents
- List of contributors
- Preface
- Frequently used notations and symbols
- 1 Algebraic and geometric methods in statistics
- Part I Contingency tables
- 2 Maximum likelihood estimation in latent class models for contingency table data
- 3 Algebraic geometry of 2×2 contingency tables
- 4 Model selection for contingency tables with algebraic statistics
- 5 Markov chains, quotient ideals and connectivity with positive margins
- 6 Algebraic modelling of category distinguishability
- 7 The algebraic complexity of maximum likelihood estimation for bivariate missing data
- 8 The generalised shuttle algorithm
- Part II Designed experiments
- Part III Information geometry
- Part IV Information geometry and algebraic statistics
- Part V On-line supplements
8 - The generalised shuttle algorithm
from Part I - Contingency tables
Published online by Cambridge University Press: 27 May 2010
- Frontmatter
- Contents
- List of contributors
- Preface
- Frequently used notations and symbols
- 1 Algebraic and geometric methods in statistics
- Part I Contingency tables
- 2 Maximum likelihood estimation in latent class models for contingency table data
- 3 Algebraic geometry of 2×2 contingency tables
- 4 Model selection for contingency tables with algebraic statistics
- 5 Markov chains, quotient ideals and connectivity with positive margins
- 6 Algebraic modelling of category distinguishability
- 7 The algebraic complexity of maximum likelihood estimation for bivariate missing data
- 8 The generalised shuttle algorithm
- Part II Designed experiments
- Part III Information geometry
- Part IV Information geometry and algebraic statistics
- Part V On-line supplements
Summary
Abstract
Bounds for the cell counts in multi-way contingency tables given a set of marginal totals arise in a variety of different statistical contexts including disclosure limitation. We describe the Generalised Shuttle Algorithm for computing integer bounds of multi-way contingency tables induced by arbitrary linear constraints on cell counts. We study the convergence properties of our method by exploiting the theory of discrete graphical models and demonstrate the sharpness of the bounds for some specific settings. We give a procedure for adjusting these bounds to the sharp bounds that can also be employed to enumerate all tables consistent with the given constraints. Our algorithm for computing sharp bounds and enumerating multi-way contingency tables is the first approach that relies exclusively on the unique structure of the categorical data and does not employ any other optimisation techniques such as linear or integer programming. We illustrate how our algorithm can be used to compute exact p-values of goodness-of-fit tests in exact conditional inference.
Introduction
Many statistical research problems involve working with sets of multi-way contingency tables defined by a set of constraints, e.g., marginal totals or structural zeros. Four inter-related aspects involve: (1) the computation of sharp integer bounds, (2) counting, (3) exhaustive enumeration and (4) sampling. Each of these areas or some combination of them play important roles in solving complex data analysis questions arising in seemingly unrelated fields. The computation of bounds is central to the task of assessing the disclosure risk of small cell counts (e.g., cells with entries of 1 or 2) when releasing marginals from a high-dimensional sparse contingency table - for example, see (Fienberg 1999, Dobra and Fienberg 2000) and (Dobra 2001).
- Type
- Chapter
- Information
- Algebraic and Geometric Methods in Statistics , pp. 135 - 156Publisher: Cambridge University PressPrint publication year: 2009
- 1
- Cited by