Skip to main content Accessibility help
×
Hostname: page-component-78c5997874-j824f Total loading time: 0 Render date: 2024-11-08T10:31:28.377Z Has data issue: false hasContentIssue false

8 - The generalised shuttle algorithm

from Part I - Contingency tables

Published online by Cambridge University Press:  27 May 2010

Paolo Gibilisco
Affiliation:
Università degli Studi di Roma 'Tor Vergata'
Eva Riccomagno
Affiliation:
Università degli Studi di Genova
Maria Piera Rogantin
Affiliation:
Università degli Studi di Genova
Henry P. Wynn
Affiliation:
London School of Economics and Political Science
Get access

Summary

Abstract

Bounds for the cell counts in multi-way contingency tables given a set of marginal totals arise in a variety of different statistical contexts including disclosure limitation. We describe the Generalised Shuttle Algorithm for computing integer bounds of multi-way contingency tables induced by arbitrary linear constraints on cell counts. We study the convergence properties of our method by exploiting the theory of discrete graphical models and demonstrate the sharpness of the bounds for some specific settings. We give a procedure for adjusting these bounds to the sharp bounds that can also be employed to enumerate all tables consistent with the given constraints. Our algorithm for computing sharp bounds and enumerating multi-way contingency tables is the first approach that relies exclusively on the unique structure of the categorical data and does not employ any other optimisation techniques such as linear or integer programming. We illustrate how our algorithm can be used to compute exact p-values of goodness-of-fit tests in exact conditional inference.

Introduction

Many statistical research problems involve working with sets of multi-way contingency tables defined by a set of constraints, e.g., marginal totals or structural zeros. Four inter-related aspects involve: (1) the computation of sharp integer bounds, (2) counting, (3) exhaustive enumeration and (4) sampling. Each of these areas or some combination of them play important roles in solving complex data analysis questions arising in seemingly unrelated fields. The computation of bounds is central to the task of assessing the disclosure risk of small cell counts (e.g., cells with entries of 1 or 2) when releasing marginals from a high-dimensional sparse contingency table - for example, see (Fienberg 1999, Dobra and Fienberg 2000) and (Dobra 2001).

Type
Chapter
Information
Publisher: Cambridge University Press
Print publication year: 2009

Access options

Get access to the full version of this content by using one of the access options below. (Log in options will check for institutional or personal access. Content may require purchase if you do not have access.)

Save book to Kindle

To save this book to your Kindle, first ensure [email protected] is added to your Approved Personal Document E-mail List under your Personal Document Settings on the Manage Your Content and Devices page of your Amazon account. Then enter the ‘name’ part of your Kindle email address below. Find out more about saving to your Kindle.

Note you can select to save to either the @free.kindle.com or @kindle.com variations. ‘@free.kindle.com’ emails are free but can only be saved to your device when it is connected to wi-fi. ‘@kindle.com’ emails can be delivered even when you are not connected to wi-fi, but note that service fees apply.

Find out more about the Kindle Personal Document Service.

Available formats
×

Save book to Dropbox

To save content items to your account, please confirm that you agree to abide by our usage policies. If this is the first time you use this feature, you will be asked to authorise Cambridge Core to connect with your account. Find out more about saving content to Dropbox.

Available formats
×

Save book to Google Drive

To save content items to your account, please confirm that you agree to abide by our usage policies. If this is the first time you use this feature, you will be asked to authorise Cambridge Core to connect with your account. Find out more about saving content to Google Drive.

Available formats
×