Optimal sample size for estimating the proportion of transgenic plants using the Dorfman model with a random confidence interval

Published online by Cambridge University Press: 22 March 2011

Osval Antonio Montesinos-López*
Affiliation:
Facultad de Telemática, Universidad de Colima, Bernal Díaz del Castillo No. 340, Col. Villas San Sebastián, C.P. 28045, Colima, Colima, México
Abelardo Montesinos-López
Affiliation:
Departamento de Estadística, Centro de Investigación en Matemáticas (CIMAT), Guanajuato, Guanajuato, México
José Crossa
Affiliation:
Biometrics and Statistics Unit of the Crop Informatics Laboratory (CRIL) of the Maize and Wheat Improvement Center (CIMMYT), Apdo. Postal 6-641, Mexico, D.F., Mexico
Kent Eskridge
Affiliation:
Department of Statistics, University of Nebraska, Lincoln, Nebraska, USA
Ricardo A. Sáenz
Affiliation:
CUICBAS, Facultad de Ciencias, Universidad de Colima, Bernal Díaz del Castillo 340, Colima, Colima, México 28040
*Correspondence Email: [email protected]

Abstract

Group testing is a procedure in which groups that contain several units (plants) are analysed without having to inspect individual plants, with the purpose of estimating the prevalence of genetically modified plants (adventitious presence of unwanted transgenic plants, AP) in a population at a low cost, without losing precision. When pool (group) testing is used to estimate the proportion of AP (p), there are several procedures that can be used for computing the confidence interval (CI); however, they usually do not ensure precision in the estimation of p. This research proposes a formula for determining the required number of pools (g), given a pool size (k), for estimating the proportion of AP plants using the Dorfman model. The proposed formula ensures precision in the estimated proportion of AP because it guarantees that the width (W) of the CI will be equal to, or narrower than, the desired width (ω), with a probability of γ. This probability accounts for the stochastic nature of the sample variance of p. We give examples to show how to use the proposed sample-size formula. Simulated data were created and tables are presented showing the different scenarios that a researcher may encounter. The Monte Carlo method was used to study the coverage and the level of assurance achieved by the proposed sample sizes. An R program that reproduces the results in the tables and makes it easy for the researcher to create other scenarios is given in the Appendix.
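The idea of choosing g so that the CI width W satisfies W ≤ ω with probability γ can be illustrated with a minimal sketch. This is not the formula proposed in the article, which is not reproduced on this page. It assumes a perfect assay, the usual group-testing estimator p̂ = 1 − (1 − y/g)^(1/k) for y positive pools, and a Wald interval with a delta-method variance, and it finds g by brute-force search over exact binomial probabilities rather than by Monte Carlo simulation. The function names (`wald_width`, `assurance`, `smallest_g`) are hypothetical.

```python
import math
from statistics import NormalDist


def wald_width(y, g, k, conf_level=0.95):
    """Width of the Wald CI for p when y of g pools (each of size k) test positive."""
    if y == 0 or y == g:
        # Assumption: degenerate outcomes give no usable interval, so they
        # are counted as failing the width target.
        return math.inf
    z = NormalDist().inv_cdf(0.5 + conf_level / 2.0)
    s = 1.0 - y / g                  # estimated P(pool is negative) = (1 - p_hat)^k
    q = s ** (1.0 / k)               # 1 - p_hat
    var = (1.0 - s) / (g * k**2 * q ** (k - 2))  # delta-method variance of p_hat
    return 2.0 * z * math.sqrt(var)


def assurance(g, k, p, omega, conf_level=0.95):
    """Exact P(W <= omega) when the true proportion is p (requires 0 < p < 1)."""
    pi = 1.0 - (1.0 - p) ** k        # P(a pool tests positive)
    total = 0.0
    for y in range(g + 1):
        if wald_width(y, g, k, conf_level) <= omega:
            # Binomial(g, pi) probability of y positive pools, on the log scale.
            log_pmf = (math.lgamma(g + 1) - math.lgamma(y + 1)
                       - math.lgamma(g - y + 1)
                       + y * math.log(pi) + (g - y) * math.log(1.0 - pi))
            total += math.exp(log_pmf)
    return total


def smallest_g(k, p, omega, gamma, conf_level=0.95, max_g=5000):
    """First g (searching upward from 2) whose assurance reaches gamma."""
    for g in range(2, max_g + 1):
        if assurance(g, k, p, omega, conf_level) >= gamma:
            return g
    raise ValueError("no g <= max_g reaches the requested assurance")


if __name__ == "__main__":
    # Illustrative planning values only: k = 10, p = 0.01, omega = 0.01, gamma = 0.90.
    g = smallest_g(k=10, p=0.01, omega=0.01, gamma=0.90)
    print(g, assurance(g, 10, 0.01, 0.01))
```

Because the number of positive pools is discrete, the assurance is not strictly increasing in g, so the value returned is the first g that reaches γ rather than a guaranteed threshold for all larger g.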

Type
Research Article
Copyright
Copyright © Cambridge University Press 2011

References

Beal, S.L. (1989) Sample size determination for confidence intervals on the population mean and on the difference between two population means. Biometrics 45, 969–977.
Brown, L.D., Cai, T.T. and DasGupta, A. (2001) Interval estimation for a binomial proportion. Statistical Science 16, 101–133.
Brown, L.D., Cai, T.T. and DasGupta, A. (2002) Confidence interval for a binomial proportion and asymptotic expansions. Annals of Statistics 30, 160–201.
Dorfman, R. (1943) The detection of defective members of large populations. The Annals of Mathematical Statistics 14, 436–440.
Dyer, G.A., Serratos-Hernández, J.A., Perales, H.R., Gepts, P., Piñeyro-Nelson, A., Chavez, A., Salinas-Arreortua, N., Yúnez-Naude, A., Taylor, J.E. and Alvarez-Buylla, E.R. (2009) Dispersal of transgenes through maize seed systems in Mexico. PLoS ONE 4, e5734.
Hepworth, G. (1996) Exact CIs for proportions estimated by group testing. Biometrics 52, 1134–1146.
Hepworth, G. (2005) Confidence intervals for proportions estimated by group testing with groups of unequal size. Journal of Agricultural, Biological, and Environmental Statistics 10, 478–497.
Hernández-Suárez, C.M., Montesinos-López, O.A., McLaren, G. and Crossa, J. (2008) Probability models for detecting transgenic plants. Seed Science Research 18, 77–89.
Katholi, C.R. and Unnasch, T.R. (2006) Important experimental parameters for determining infection rates in arthropod vectors using pool screening approaches. American Journal of Tropical Medicine and Hygiene 74, 779–785.
Kelley, K. (2007) Sample size planning for the coefficient of variation from the accuracy in parameter estimation approach. Behavior Research Methods 39, 755–766.
Kelley, K. and Maxwell, S.E. (2003) Sample size for multiple regression: obtaining regression coefficients that are accurate, not simply significant. Psychological Methods 8, 305–321.
Kelley, K. and Rausch, J.R. (2006) Sample size planning for the standardized mean difference: accuracy in parameter estimation via narrow confidence intervals. Psychological Methods 11, 363–385.
Kelley, K., Maxwell, S.E. and Rausch, J.R. (2003) Obtaining power or obtaining precision: delineating methods of sample size planning. Evaluation & the Health Professions 26, 258–287.
Kupper, L.L. and Hafner, K.B. (1989) How appropriate are popular sample size formulas? The American Statistician 43, 101–105.
Mace, A.E. (1964) Sample size determination. New York, Reinhold.
Montesinos-López, O.A., Montesinos-López, A., Crossa, J., Eskridge, K. and Hernández-Suárez, C.M. (2010) Sample size for detecting and estimating the proportion of transgenic plants with narrow confidence intervals. Seed Science Research 20, 123–136.
Pan, Z. and Kupper, L. (1999) Sample size determination for multiple comparison studies treating confidence interval width as random. Statistics in Medicine 18, 1475–1488.
Piñeyro-Nelson, A., van Heerwaarden, J., Perales, H.R., Serratos-Hernández, J.A., Rangel, A., Hufford, M.B., Gepts, P., Garay-Arroyo, A., Rivera-Bustamante, R. and Álvarez-Buylla, E.R. (2009) Transgenes in Mexican maize: molecular evidence and methodological considerations for GMO detection in landrace populations. Molecular Ecology 18, 750–761.
Quist, D. and Chapela, I.H. (2001) Transgenic DNA introgressed into traditional maize landraces in Oaxaca, Mexico. Nature 414, 541–543.
R Development Core Team (2007) R: A language and environment for statistical computing [Computer software and manual]. R Foundation for Statistical Computing. Available at www.r-project.org (accessed 18 February 2011).
Schaarschmidt, F. (2007) Experimental design for one-sided confidence interval or hypothesis tests in binomial group testing. Communications in Biometry and Crop Science 2, 32–40.
Swallow, W.H. (1985) Group testing for estimating infection rates and probabilities of disease transmission. Phytopathology 75, 882–889.
Tebbs, J.M. and Bilder, C.R. (2004) Confidence interval procedures for the probability of disease transmission in multiple-vector-transfer designs. Journal of Agricultural, Biological, and Environmental Statistics 9, 75–90.
Tebbs, J.M., Bilder, C.R. and Moser, B.K. (2003) An empirical Bayes group-testing approach to estimating small proportions. Communications in Statistics: Theory and Methods 32, 983–995.
Thompson, K.H. (1962) Estimation of the proportion of vectors in a natural population of insects. Biometrics 18, 568–587.
Wang, H., Chow, S.C. and Chen, M. (2005) A Bayesian approach on sample size calculation for comparing means. Journal of Biopharmaceutical Statistics 15, 799–807.
Wang, Y. and Kupper, L.L. (1997) Optimal sample sizes for estimating the difference in means between two normal populations treating confidence interval length as a random variable. Communications in Statistics 26, 727–741.
Worlund, D.D. and Taylor, G. (1983) Estimation of disease incidence in fish populations. Canadian Journal of Fisheries and Aquatic Sciences 40, 2194–2197.