References

Published online by Cambridge University Press: 17 April 2022

Michael P. Fay
Affiliation:
National Institute of Allergy and Infectious Diseases
Erica H. Brittain
Affiliation:
National Institute of Allergy and Infectious Diseases
Type: Chapter
Information: Statistical Hypothesis Testing in Context: Reproducibility, Inference, and Science, pp. 404–419
Publisher: Cambridge University Press
Print publication year: 2022

References

Aalen, O., Borgan, O., and Gjessing, H. (2008), Survival and Event History Analysis, New York: Springer.
Aalen, O. O., Cook, R. J., and Røysland, K. (2015), “Does Cox analysis of a randomized survival study yield a causal treatment effect?” Lifetime Data Analysis, 21, 579–593.
Agresti, A. (2013), Categorical Data Analysis, 3rd ed., Hoboken, NJ: John Wiley & Sons.
Agresti, A. and Caffo, B. (2000), “Simple and effective confidence intervals for proportions and differences of proportions result from adding two successes and two failures,” The American Statistician, 54, 280–288. [66]
Agresti, A. and Min, Y. (2001), “On small-sample confidence intervals for parameters in discrete distributions,” Biometrics, 57, 963–971. [114, 121]
Agresti, A. and Min, Y. (2002), “Unconditional small-sample confidence intervals for the odds ratio,” Biostatistics, 3, 379–386. [114, 121]
Ali, M. M. and Sharma, S. C. (1996), “Robustness to nonnormality of regression F-tests,” Journal of Econometrics, 71, 175–205. [274]
Andersen, P., Borgan, O., Gill, R., and Keiding, N. (1993), Statistical Models Based on Counting Processes, New York: Springer. [304, 306, 308, 315, 316, 317, 323, 325]
Andersen, P. K. (2005), “Censored data,” Encyclopedia of Biostatistics, 2nd ed., 1, 722–727. [324]
Anderson, J. M., Samake, S., Jaramillo-Gutierrez, et al. (2011), “Seasonality and prevalence of Leishmania major infection in Phlebotomus duboscqi Neveu-Lemaire from two neighboring villages in central Mali,” PLoS Neglected Tropical Diseases, 5, e1139. [64]
Anderson-Bergman, C. (2017), “icenReg: regression models for interval censored data in R,” Journal of Statistical Software, 81, 1–23. [324]
Angrist, J. D., Imbens, G. W., and Rubin, D. B. (1996), “Identification of causal effects using instrumental variables,” Journal of the American Statistical Association, 91, 444–455. [287, 297, 300]
Baggerly, K. A., Morris, J. S., and Coombes, K. R. (2004), “Reproducibility of SELDI-TOF protein patterns in serum: comparing datasets from different experiments,” Bioinformatics, 20, 777–785. [39]
Baiocchi, M., Cheng, J., and Small, D. S. (2014), “Instrumental variable methods for causal inference,” Statistics in Medicine, 33, 2297–2340. [297, 298, 300]
Baker, S. G. (1994), “The multinomial-Poisson transformation,” The Statistician, 43, 495–504. [198]
Banerjee, M. and Wellner, J. A. (2005), “Confidence intervals for current status data,” Scandinavian Journal of Statistics, 32, 405–424. [320]
Barber, R. F. and Candès, E. J. (2015), “Controlling the false discovery rate via knockoffs,” The Annals of Statistics, 43, 2055–2085. [273]
Barnhart, H. X., Haber, M. J., and Lin, L. I. (2007), “An overview on assessing agreement with continuous measurements,” Journal of Biopharmaceutical Statistics, 17, 529–569. [101]
Basu, D. (1980), “Randomization analysis of experimental data, the Fisher randomization test (with discussion),” Journal of the American Statistical Association, 75, 575–595. [47]
Bauer, P., Bretz, F., Dragalin, V., König, F., and Wassmer, G. (2016), “Twenty-five years of confirmatory adaptive designs: opportunities and pitfalls,” Statistics in Medicine, 35, 325–347. [352, 354, 356]
Bauer, P. and Köhne, K. (1994), “Evaluation of experiments with adaptive interim analyses,” Biometrics, 1029–1041. [352, 355, 356]
Begg, C. B. (1990), “On inferences from Wei’s biased coin design for clinical trials,” Biometrika, 77, 467–473. [34]
Benjamini, Y. (2010), “Simultaneous and selective inference: current successes and future challenges,” Biometrical Journal, 52, 708–721. [239]
Benjamini, Y. (2016), “It’s not the P-values’ fault,” The American Statistician, 70, 1–2. [xi]
Benjamini, Y. and Hochberg, Y. (1995), “Controlling the false discovery rate: a practical and powerful approach to multiple testing,” Journal of the Royal Statistical Society: Series B (Methodological), 57, 289–300. [241]
Benjamini, Y. and Yekutieli, D. (2001), “The control of the false discovery rate in multiple testing under dependency,” Annals of Statistics, 1165–1188. [241, 250]
Beran, R. (1997), “Diagnosing bootstrap success,” Annals of the Institute of Statistical Mathematics, 49, 1–24. [192]
Berger, J. O., Bernardo, J. M., and Sun, D. (2009), “The formal definition of reference priors,” The Annals of Statistics, 905–938. [392, 401]
Berger, J. O. and Sellke, T. (1987), “Testing a point null hypothesis: the irreconcilability of P values and evidence,” Journal of the American Statistical Association, 82, 112–122. [402]
Berger, R. L. and Boos, D. D. (1994), “P values maximized over a confidence set for the nuisance parameter,” Journal of the American Statistical Association, 89, 1012–1016. [114]
Bernardo, J. M. (1979), “Reference posterior distributions for Bayesian inference,” Journal of the Royal Statistical Society: Series B (Methodological), 41, 113–147. [392, 401]
Bernardo, J. M. (2011), “Integrated objective Bayesian estimation and hypothesis testing,” Bayesian Statistics, 9, 1–68. [395, 401, 402]
Bickel, P. and Freedman, D. (1981), “Some asymptotic theory for the bootstrap,” Annals of Statistics, 9, 1196–1217. [192]
Bishop, Y., Fienberg, S., and Holland, P. (1975), Discrete Multivariate Analysis: Theory and Practice, Cambridge, MA: MIT Press. [211]
Blaker, H. (2000), “Confidence curves and improved exact confidence intervals for discrete distributions,” Canadian Journal of Statistics, 28, 783–798. [55, 56, 63, 75, 111]
Bland, J. M. and Altman, D. G. (1999), “Measuring agreement in method comparison studies,” Statistical Methods in Medical Research, 8, 135–160. [101]
Blyth, C. and Still, H. (1983), “Binomial confidence intervals,” Journal of the American Statistical Association, 78, 108–116. [63]
Boos, D. D. and Stefanski, L. (2013), Essential Statistical Inference, New York: Springer. [170, 171, 173, 175, 183, 190, 191, 192, 193]
Boschloo, R. (1970), “Raised conditional level of significance for the 2 × 2-table when testing the equality of two probabilities,” Statistica Neerlandica, 24, 1–9. [112]
Box, G. E. and Cox, D. R. (1964), “An analysis of transformations,” Journal of the Royal Statistical Society: Series B (Methodological), 26, 211–252. [274]
Box, G. E., Hunter, J. S., and Hunter, W. G. (2005), Statistics for Experimenters: Design, Innovation, and Discovery, vol. 2, New York: Wiley-Interscience. [274]
Box, G. E. and Watson, G. S. (1962), “Robustness to non-normality of regression tests,” Biometrika, 49, 93–106. [274]
Brazzale, A. R., Davison, A. C., and Reid, N. (2007), Applied Asymptotics: Case Studies in Small-Sample Statistics, vol. 23, Cambridge: Cambridge University Press. [256]
Breslow, N. (1972), “Contribution to the discussion of Cox (1972),” Journal of the Royal Statistical Society: Series B (Statistical Methodology), 34, 216–217. [308]
Breslow, N. and Chatterjee, N. (1999), “Design and analysis of two-phase studies with binary outcome applied to Wilms tumour prognosis,” Journal of the Royal Statistical Society: Series C (Applied Statistics), 48, 457–468. [307]
Breslow, N. E. (1996), “Statistics in epidemiology: the case-control study,” Journal of the American Statistical Association, 91, 14–28. [122]
Bretz, F., Hothorn, T., and Westfall, P. (2011), Multiple Comparisons using R, Boca Raton, FL: CRC Press. [208, 209, 210, 211, 212, 244, 250, 251]
Bretz, F., Maurer, W., Brannath, W., and Posch, M. (2009), “A graphical approach to sequentially rejective multiple test procedures,” Statistics in Medicine, 28, 586–604. [246, 247]
Brillinger, D. R. (1986), “The natural variability of vital rates and associated statistics (with discussion),” Biometrics, 42, 693–734. [229]
Brittain, E. and Lin, D. (2005), “A comparison of intent-to-treat and per-protocol results in antibiotic non-inferiority trials,” Statistics in Medicine, 24, 1–10. [372]
Brittain, E. H., Fay, M. P., and Follmann, D. A. (2012), “A valid formulation of the analysis of noninferiority trials under random effects meta-analysis,” Biostatistics, 13, 637–649. [227, 369, 370]
Brown, B. M. and Hettmansperger, T. P. (2002), “Kruskal–Wallis, multiple comparisons and Efron dice,” Australian & New Zealand Journal of Statistics, 44, 427–438. [157, 159]
Brown, L. D., Cai, T. T., and DasGupta, A. (2001), “Interval estimation for a binomial proportion (with discussion),” Statistical Science, 16, 101–133. [60]
Brown, M. B. and Forsythe, A. B. (1974a), “372: the ANOVA and multiple comparisons for data with heterogeneous variances,” Biometrics, 30, 719–724. [200, 211]
Brown, M. B. and Forsythe, A. B. (1974b), “The small sample behavior of some statistics which test the equality of several means,” Technometrics, 16, 129–132. [200]
Brunner, E., Konietschke, F., Pauly, M., and Puri, M. L. (2017), “Rank-based procedures in factorial designs: hypotheses about non-parametric treatment effects,” Journal of the Royal Statistical Society: Series B (Statistical Methodology), 79, 1463–1485. [221]
Brunner, E. and Munzel, U. (2000), “The nonparametric Behrens-Fisher problem: asymptotic theory and a small-sample approximation,” Biometrical Journal, 42, 17–25. [95, 147, 157, 361]
Bühlmann, P., Kalisch, M., and Meier, L. (2014), “High-dimensional statistics with a view toward applications in biology,” Annual Review of Statistics and its Application, 1, 255–278. [271]
Burnham, K. P. and Anderson, D. R. (2002), Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach, 2nd ed., New York: Springer. [273]
Candès, E., Fan, Y., Janson, L., and Lv, J. (2018), “Panning for gold: ‘model-X’ knockoffs for high dimensional controlled variable selection,” Journal of the Royal Statistical Society: Series B (Statistical Methodology), 80, 551–577. [273]
Carlin, B. P. and Louis, T. A. (2009), Bayesian Methods for Data Analysis, 3rd ed., Boca Raton, FL: CRC Press. [40, 402]
Carpenter, J. and Kenward, M. (2013), Multiple Imputation and its Application, Chichester: John Wiley & Sons. [339, 340]
Caruso, J. C. and Cliff, N. (1997), “Empirical size, coverage, and power of confidence intervals for Spearman’s Rho,” Educational and Psychological Measurement, 57, 637–654. [97]
Casella, G. (1986), “Refining binomial confidence intervals,” The Canadian Journal of Statistics/La Revue Canadienne de Statistique, 14, 113–129. [63]
Casella, G. (1989), “Refining Poisson confidence intervals,” The Canadian Journal of Statistics/La Revue Canadienne de Statistique, 17, 45–57. [75]
Casella, G. and Berger, R. L. (1987), “Reconciling Bayesian and frequentist evidence in the one-sided testing problem,” Journal of the American Statistical Association, 82, 106–111. [402]
Casella, G. and Berger, R. L. (2002), Statistical Inference, 2nd ed., Pacific Grove, CA: Duxbury Press. [xi, 102, 123, 306, 374, 397, 402]
Chen, Y. J., DeMets, D. L., and Lan, K. G. (2004), “Increasing the sample size when the unblinded interim result is promising,” Statistics in Medicine, 23, 1023–1038. [354, 355]
Cheng, S., Wei, L., and Ying, Z. (1995), “Analysis of transformation models with censored data,” Biometrika, 82, 835–845. [261]
Chow, S.-C., Shao, J., Wang, H., and Lokhnygina, Y. (2018), Sample Size Calculations in Clinical Research, 3rd ed., Boca Raton, FL: Chapman and Hall/CRC. [382, 386]
Chung, E. and Romano, J. P. (2016), “Asymptotically valid and exact permutation tests based on two-sample U-statistics,” Journal of Statistical Planning and Inference, 168, 97–105. [157, 160]
Ciarleglio, M. M., Arendt, C. D., and Peduzzi, P. N. (2016), “Selection of the effect size for sample size determination for a continuous response in a superiority clinical trial using a hybrid classical and Bayesian procedure,” Clinical Trials, 13, 275–285. [385]
Cole, S. R. and Hernán, M. A. (2008), “Constructing inverse probability weights for marginal structural models,” American Journal of Epidemiology, 168, 656–664. [292, 338]
Coulibaly, Y. I., Dembele, B., Diallo, A. A., et al. (2009), “A randomized trial of doxycycline for Mansonella perstans infection,” New England Journal of Medicine, 361, 1448–1458. [43, 104]
Cox, D. (1972), “Regression models and life-tables (with discussion),” Journal of the Royal Statistical Society: Series B (Statistical Methodology), 34, 187–220. [315]
Cox, D. and Hinkley, D. (1974), Theoretical Statistics, London: Chapman and Hall. [364]
Cox, D. R. (1975), “Partial likelihood,” Biometrika, 62, 269–276. [45]
Crow, E. (1956), “Confidence intervals for a proportion,” Biometrika, 43, 423–435. [63]
Cui, Y. and Hannig, J. (2019), “Nonparametric generalized fiducial inference for survival functions under censoring,” Biometrika, 106, 501–518. [306]
D’Agostino Sr., R. B., Massaro, J. M., and Sullivan, L. M. (2003), “Non-inferiority trials: design concepts and issues–the encounters of academic consultants in statistics,” Statistics in Medicine, 22, 169–186. [369, 373]
Dagum, C. (2006), “Income inequality measures,” Encyclopedia of Statistical Sciences. DOI:10.1002/0471667196.ess6030.pub2. [80]
Davison, A. and Hinkley, D. (1997), Bootstrap Methods and Their Application, New York: Cambridge University Press. [181, 184, 190]
De Neve, J., Thas, O., and Gerds, T. A. (2019), “Semiparametric linear transformation models: Effect measures, estimators, and applications,” Statistics in Medicine, 38, 1484–1501. [274]
De Veaux, R. D. and Hand, D. J. (2005), “How to lie with bad data,” Statistical Science, 20, 231–238. [47]
DeMets, D. L. and Lan, K. G. (1994), “Interim analysis: the alpha spending function approach,” Statistics in Medicine, 13, 1341–1352. [356]
DerSimonian, R. and Kacker, R. (2007), “Random-effects model for meta-analysis of clinical trials: an update,” Contemporary Clinical Trials, 28, 105–114. [227]
DerSimonian, R. and Laird, N. (1986), “Meta-analysis in clinical trials,” Controlled Clinical Trials, 7, 177–188. [227]
Diaconis, P. and Efron, B. (1985), “Testing for independence in a two-way table: new interpretations of the chi-square statistic,” The Annals of Statistics, 13, 845–874. [198]
DiCiccio, T. J. and Efron, B. (1996), “Bootstrap confidence intervals,” Statistical Science, 11, 189–212. [185]
Ding, P., Feller, A., and Miratrix, L. (2016), “Randomization inference for treatment effect variation,” Journal of the Royal Statistical Society: Series B (Statistical Methodology), 78, 655–671. [300]
Druesne-Pecollo, N., Latino-Martel, P., Norat, T., et al. (2010), “Beta-carotene supplementation and cancer risk: a systematic review and metaanalysis of randomized controlled trials,” International Journal of Cancer, 127, 172–184. [28, 47]
Dudewicz, E. and Mishra, S. (1988), Modern Mathematical Statistics, New York: Wiley. [82, 139]
Dudoit, S. and Van Der Laan, M. (2008), Multiple Testing Procedures with Applications to Genomics, New York: Springer. [245, 250]
Dunnett, C. W. (1955), “A multiple comparison procedure for comparing several treatments with a control,” Journal of the American Statistical Association, 50, 1096–1121. [208]
Edgington, E. S. (1987), Randomization Tests, 2nd ed., New York: Marcel Dekker. [34]
Efron, B. and Hinkley, D. V. (1978), “Assessing the accuracy of the maximum likelihood estimator: observed versus expected Fisher information,” Biometrika, 65, 457–483. [171]
Efron, B. and Tibshirani, R. (1993), An Introduction to the Bootstrap, vol. 57, Boca Raton, FL: CRC Press. [183, 184, 190]
Ellenberg, J. (2014), How Not to Be Wrong: The Power of Mathematical Thinking, New York: Penguin Press. [37]
Fagerland, M. W. and Hosmer, D. W. (2013), “A goodness-of-fit test for the proportional odds regression model,” Statistics in Medicine, 32, 2235–2249. [365]
Farrington, C. P. and Manning, G. (1990), “Test statistics and sample size formulae for comparative binomial trials with null hypothesis of non-zero risk difference or non-unity relative risk,” Statistics in Medicine, 9, 1447–1454. [116, 121]
Fay, M. P. (2010a), “Confidence intervals that match Fisher’s exact or Blaker’s exact tests,” Biostatistics, 11, 373–374. [22, 122]
Fay, M. P. (1999a), “Approximate confidence intervals for rate ratios from directly standardized rates with sparse data,” Communications in Statistics-Theory and Methods, 28, 2141–2160. [230, 234]
Fay, M. P. (1999b), “Comparing several score tests for interval censored data (Corr: 1999V18 p2681),” Statistics in Medicine, 18, 273–285. [321]
Fay, M. P. (2005), “Random marginal agreement coefficients: rethinking the adjustment for chance when measuring agreement,” Biostatistics, 6, 171–180. [99, 100, 103]
Fay, M. P. (2010b), “Two-sided exact tests and matching confidence intervals for discrete data,” R Journal, 2, 53–58. [22, 75, 80]
Fay, M. P. and Brittain, E. H. (2016), “Finite sample pointwise confidence intervals for a survival distribution with right-censored data,” Statistics in Medicine, 35, 2726–2740. [115, 305, 306, 325]
Fay, M. P., Brittain, E. H., and Proschan, M. A. (2013), “Pointwise confidence intervals for a survival distribution for right censored data with small samples or heavy censoring,” Biostatistics, 14, 723–736. [305, 306, 307, 322, 325]
Fay, M. P., Brittain, E. H., Shih, J. H., Follmann, D. A., and Gabriel, E. E. (2018a), “Causal estimands and confidence intervals associated with Wilcoxon-Mann-Whitney tests in randomized experiments,” Statistics in Medicine, 37, 2923–2937. [146, 157, 159, 160, 301]
Fay, M. P. and Feuer, E. J. (1997), “Confidence intervals for directly standardized rates: a method based on the gamma distribution,” Statistics in Medicine, 16, 791–801. [230]
Fay, M. P. and Follmann, D. A. (2002), “Designing Monte Carlo implementations of permutation or bootstrap hypothesis tests,” The American Statistician, 56, 63–70. [181]
Fay, M. P., Follmann, D. A., Lynn, F., et al. (2012), “Anthrax vaccine–induced antibodies provide cross-species prediction of survival to aerosol challenge,” Science Translational Medicine, 4, 151ra126. [47]
Fay, M. P., Freedman, L. S., Clifford, C. K., and Midthune, D. N. (1997), “Effect of different types and amounts of fat on the development of mammary tumors in rodents: a review,” Cancer Research, 57, 3979–3988. [175]
Fay, M. P. and Graubard, B. I. (2001), “Small-sample adjustments for Wald-type tests using sandwich estimators,” Biometrics, 57, 1198–1206. [175, 315]
Fay, M. P., Graubard, B. I., Freedman, L. S., and Midthune, D. N. (1998), “Conditional logistic regression with sandwich estimators: application to a meta-analysis,” Biometrics, 54, 195–208. [273]
Fay, M. P., Halloran, M. E., and Follmann, D. A. (2007), “Accounting for variability in sample size estimation with applications to nonadherence and estimation of variance and effect size,” Biometrics, 63, 465–474. [384, 385, 386, 387]
Fay, M. P. and Hunsberger, S. A. (2021), “Practical valid inferences for the two-sample binomial problem,” Statistics Surveys, 15, 72–110. [16, 22, 121, 122]
Fay, M. P. and Kim, S. (2017), “Confidence intervals for directly standardized rates using mid-p gamma intervals,” Biometrical Journal, 59, 377–387. [230, 232]
Fay, M. P. and Lumbard, K. (2021), “Confidence intervals for difference in proportions for matched pairs compatible with exact McNemar’s or sign tests,” Statistics in Medicine, 40, 1147–1159. [102]
Fay, M. P. and Malinovsky, Y. (2018), “Confidence intervals of the Mann-Whitney parameter that are compatible with the Wilcoxon-Mann-Whitney test,” Statistics in Medicine, 37, 3991–4006. [5, 128, 147, 149, 157, 160, 180, 234, 259, 316, 326, 387]
Fay, M. P. and Proschan, M. A. (2010), “Wilcoxon-Mann-Whitney or t-test? On assumptions for hypothesis tests and multiple interpretations of decision rules,” Statistics Surveys, 4, 1–39. [82, 128, 157, 158, 160]
Fay, M. P., Proschan, M. A., and Brittain, E. (2015), “Combining one-sample confidence procedures for inference in the two-sample case,” Biometrics, 146–156. [81, 143, 190, 400, 401]
Fay, M. P., Proschan, M. A., Brittain, E. H., and Tiwari, R. (2021), “Interpreting p-values and confidence intervals using well-calibrated null preference priors,” Statistical Science. https://imstat.org/journals-and-publications/statistical-science/statistical-science-future-papers/ [397, 399, 400, 401, 402]
Fay, M. P., Sachs, M. C., and Miura, K. (2018b), “Measuring precision in bioassays: rethinking assay validation,” Statistics in Medicine, 37, 519–529. [79]
Fay, M. P. and Shaw, P. A. (2010), “Exact and asymptotic weighted logrank tests for interval censored data: the interval R package,” Journal of Statistical Software, 36, 1–34. [321, 325]
Fay, M. P. and Shih, J. H. (1998), “Permutation tests using estimated distribution functions,” Journal of the American Statistical Association, 93, 387–396. [214, 221]
Fay, M. P. and Shih, J. H. (2012), “Weighted logrank tests for interval censored data when assessment times depend on treatment,” Statistics in Medicine, 31, 3760–3772. [321]
Fay, M. P., Tiwari, R. C., Feuer, E. J., and Zou, Z. (2006), “Estimating average annual percent change for disease rates without assuming constant change,” Biometrics, 62, 847–854. [234]
FDA (2016), “Guidance for industry: non-inferiority clinical trials to establish effectiveness,” US Department of Health and Human Services and US Food and Drug Administration, Washington, DC. [369]
Finkelstein, D. M., Goggins, W. B., and Schoenfeld, D. A. (2002), “Analysis of failure time data with dependent interval censoring,” Biometrics, 58, 298–304. [323]
Finniss, D. G., Kaptchuk, T. J., Miller, F., and Benedetti, F. (2010), “Biological, clinical, and ethical advances of placebo effects,” The Lancet, 375, 686–695. [47]
Firth, D. (1993), “Bias reduction of maximum likelihood estimates,” Biometrika, 80, 27–38. [258, 273]
Fitzmaurice, G. M., Laird, N. M., and Ware, J. H. (2004), Applied Longitudinal Analysis, Hoboken, NJ: John Wiley & Sons. [275]
Fleiss, J. L., Levin, B., and Paik, M. C. (2003), Statistical Methods for Rates and Proportions, 3rd ed., Hoboken, NJ: John Wiley & Sons. [76]
Fleming, T. and Harrington, D. (1991), Counting Processes and Survival Analysis, New York: Wiley. [316, 323]
Follmann, D., Brittain, E., and Powers, J. H. (2013), “Discordant minimum inhibitory concentration analysis: a new path to licensure for anti-infective drugs,” Clinical Trials, 10, 876–885. [369, 375]
Follmann, D. and Fay, M. (2010), “Exact inference for complex clustered data using within-cluster resampling,” Journal of Biopharmaceutical Statistics, 20, 850–869. [188]
Follmann, D., Proschan, M., and Leifer, E. (2003), “Multiple outputation: inference for complex clustered data by averaging analyses from independent data,” Biometrics, 59, 420–429. [188]
Freedman, L. S. (2008), “An analysis of the controversy over classical one-sided tests,” Clinical Trials, 5, 635–640. [395, 396]
Freeman, G. and Halton, J. H. (1951), “Note on an exact treatment of contingency, goodness of fit and other problems of significance,” Biometrika, 38, 141–149. [197]
Freidlin, B., Korn, E. L., Hunsberger, S., et al. (2007), “Proposal for the use of progression-free survival in unblinded randomized trials,” Journal of Clinical Oncology, 25, 2122–2126. [324]
Freireich, E. J., Gehan, E., Frei, E., et al. (1963), “The effect of 6-mercaptopurine on the duration of steroid-induced remissions in acute leukemia: a model for evaluation of other potentially useful therapy,” Blood, 21, 699–716. [310]
Friedman, J., Hastie, T., and Tibshirani, R. (2010), “Regularization paths for generalized linear models via coordinate descent,” Journal of Statistical Software, 33, 1. [273]
Gail, M. H. (1974), “Power computations for designing comparative Poisson trials,” Biometrics, 30(2), 231–237. [388]
Gail, M. H., Lubin, J. H., and Rubinstein, L. V. (1981), “Likelihood calculations for matched case-control studies and survival studies with tied death times,” Biometrika, 68, 703–707. [273]
Galton, F. (1886), “Regression towards mediocrity in hereditary stature,” Journal of the Anthropological Institute of Great Britain and Ireland, 15, 246–263. [33]
Gautret, P., Lagier, J.-C., Parola, P., et al. (2020), “Hydroxychloroquine and azithromycin as a treatment of COVID-19: results of an open-label non-randomized clinical trial,” International Journal of Antimicrobial Agents, 56, 105949. [1, 2]
Gehan, E. A. (1965), “A generalized Wilcoxon test for comparing arbitrarily singly-censored samples,” Biometrika, 52, 203–224. [310, 311, 317]
Gelman, A., Carlin, J., Stern, H., et al. (2013), Bayesian Data Analysis, 3rd ed., New York: CRC Press. [40, 392, 400, 401]
Gelman, A. (2005), “Analysis of variance–why it is more important than ever (with discussion),” The Annals of Statistics, 33, 1–53. [212]
Gentleman, R. and Geyer, C. (1994), “Maximum likelihood for interval censored data: consistency and computation,” Biometrika, 81, 618–623. [320]
Gentleman, R. and Vandal, A. (2001), “Computational algorithms for censored-data problems using intersection graphs,” Journal of Computational and Graphical Statistics, 10, 403–421. [324]
Gentleman, R. and Vandal, A. (2002), “Nonparametric estimation of the bivariate CDF for arbitrarily censored data,” Canadian Journal of Statistics, 30, 557–571. [320]
Ghosh, B. (2006), “Sequential analysis,” in Encyclopedia of Statistical Sciences, eds. Kotz, S., Read, C. B., Balakrishnan, N., et al., Wiley Online Library. DOI:10.1002/0471667196.ess2398.pub2 [356]
Ghosh, M. (2011), “Objective priors: an introduction for frequentists,” Statistical Science, 26, 187–202. [401]
Gibbons, J. (1971), Nonparametric Statistical Inference, New York: McGraw-Hill Book Company. [95, 96, 97]
Goeman, J. J. and Solari, A. (2014), “Multiple hypothesis testing in genomics,” Statistics in Medicine, 33, 1946–1978. [245, 246]
Gould, S. and Norris, S. L. (2021), “Contested effects and chaotic policies: the 2020 story of (hydroxy) chloroquine for treating COVID-19,” Cochrane Database of Systematic Reviews, 2021, 3 (ED000151), 1–5. [2]
Graybill, F. A. (1976), Theory and Application of the Linear Model, Pacific Grove, CA: Wadsworth Publishing Company. [190, 254]
Greenland, S. (2017), “Invited commentary: the need for cognitive science in methodology,” American Journal of Epidemiology, 186, 639–645. [6, 21]
Greenland, S. (2019), “Valid p-values behave exactly as they should: some misleading criticisms of p-values and their resolution with s-values,” American Statistician, 73, 106–114. [9]
Grimes, D. A. and Schulz, K. F. (2002), “Uses and abuses of screening tests,” The Lancet, 359, 881–884. [32]
Groeneboom, P., Jongbloed, G., and Wellner, J. (2008), “The support reduction algorithm for computing nonparametric function estimates in mixture models,” Scandinavian Journal of Statistics, 35, 385–399. [324]
Guo, X., Pan, W., Connett, J. E., Hannan, P. J., and French, S. A. (2005), “Small-sample performance of the robust score test and its modifications in generalized estimating equations,” Statistics in Medicine, 24, 3479–3495. [175]
Hall, W. J. and Wellner, J. A. (1980), “Confidence bands for a survival curve from censored data,” Biometrika, 67, 133–143. [306]
Halloran, M. E., Longini, I. M., and Struchiner, C. J. (2010), Design and Analysis of Vaccine Studies, New York: Springer. [41, 279, 281]
Hampel, F. R., Ronchetti, E. M., and Rousseeuw, P. J. (1986), Robust Statistics: The Approach Based on Influence Functions, New York: John Wiley & Sons. [153]
Hand, D. J. (1992), “On comparing two treatments,” The American Statistician, 46, 190–192. [160]
Hand, D. J. (1994), “Deconstructing statistical questions,” Journal of the Royal Statistical Society: Series A (Statistics in Society), 157, 317–338. [47, 128]
Hanley, J. A. and McNeil, B. J. (1982), “The meaning and use of the area under a receiver operating characteristic (ROC) curve,” Radiology, 143, 29–36. [130]
Harrell, F., Lee, K. L., and Mark, D. B. (1996), “Tutorial in biostatistics multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors,” Statistics in Medicine, 15, 361–387. [130]
Harrington, D. P. and Fleming, T. R. (1982), “A class of rank test procedures for censored survival data,” Biometrika, 69, 553566. [317]Google Scholar
Hastie, T., Tibshirani, R., and Friedman, J. (2009), The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd ed., New York: Springer. [268, 273]Google Scholar
Haybittle, J. (1971), “Repeated assessment of results in clinical trials of cancer treatment,” The British Journal of Radiology, 44, 793797. [348]Google Scholar
Hayes, R. J. and Moulton, L. H. (2009), Cluster Randomised Trials, Boca Raton, FL: Chapman and Hall/CRC. [216, 232]Google Scholar
Heinze, G. (2006), “A comparative investigation of methods for logistic regression with separated or nearly separated data,” Statistics in Medicine, 25, 42164226. [258]Google Scholar
Hennekens, C. H., Buring, J. E., Manson, J. E., et al. (1996), “Lack of effect of long-term supplementation with beta carotene on the incidence of malignant neoplasms and cardiovascular disease,” New England Journal of Medicine, 334, 11451149. [47]Google Scholar
Hepworth, G. (1996), “Exact confidence intervals for proportions estimated by group testing,” Biometrics, 11341146. [64, 65]Google Scholar
Hepworth, G. (2005), “Confidence intervals for proportions estimated by group testing with groups of unequal size,” Journal of Agricultural, Biological, and Environmental Statistics, 10, 478–497. [64, 65]
Hernán, M. A., Alonso, A., Logan, R., et al. (2008), “Observational studies analyzed like randomized experiments: an application to postmenopausal hormone therapy and coronary heart disease,” Epidemiology, 19, 766–779. [47]
Hernán, M. A. and Hernández-Díaz, S. (2012), “Beyond the intention-to-treat in comparative effectiveness research,” Clinical Trials, 9, 48–55. [36]
Hernán, M. A. and Robins, J. M. (2020), Causal Inference: What If, Boca Raton, FL: Chapman and Hall/CRC. [292, 296, 300, 371]
Hirji, K. (2006), Exact Analysis of Discrete Data, New York: Chapman and Hall/CRC. [121]
Hochberg, Y. and Tamhane, A. C. (1987), Multiple Comparison Procedures, New York: Wiley. [201, 206, 207, 211, 250]
Hoffman, E. B., Sen, P. K., and Weinberg, C. R. (2001), “Within-cluster resampling,” Biometrika, 88, 420–429. [187]
Hollander, M., Wolfe, D. A., and Chicken, E. (2014), Nonparametric Statistical Methods, 3rd ed., Hoboken, NJ: John Wiley & Sons. [91, 97, 101, 102, 204, 363]
Hommel, G. (1988), “A stagewise rejective multiple test procedure based on a modified Bonferroni test,” Biometrika, 75, 383–386. [240]
Hosmer, D. and Lemeshow, S. (1980), “Goodness of fit statistics tests for the multiple regression model,” Communications in Statistics A, 9, 1043–1069. [365]
Hu, X., Jung, A., and Qin, G. (2020), “Interval estimation for the correlation coefficient,” The American Statistician, 74, 29–36. [97, 101]
Huang, J., Lee, C., and Yu, Q. (2008), “A generalized log-rank test for interval-censored failure time data via multiple imputation,” Statistics in Medicine, 27, 3217–3226. [321]
Hudgens, M. G. (2005), “On nonparametric maximum likelihood estimation with interval censoring and left truncation,” Journal of the Royal Statistical Society: Series B (Statistical Methodology), 67, 573–587. [321]
Hudgens, M. G. and Halloran, M. E. (2008), “Toward causal inference with interference,” Journal of the American Statistical Association, 103, 832–842. [289]
Hyndman, R. J. and Fan, Y. (1996), “Sample quantiles in statistical packages,” The American Statistician, 50, 361–365. [361]
Ignatova, I., Deutsch, R. C., and Edwards, D. (2012), “Closed sequential and multistage inference on binary responses with or without replacement,” The American Statistician, 66, 163–172. [349]
Imbens, G. W. and Rubin, D. B. (2015), Causal Inference in Statistics, Social, and Biomedical Sciences, New York: Cambridge University Press. [277, 278, 287, 288, 300]
Ioannidis, J. P. (2005), “Why most published research findings are false,” PLoS Medicine, 2, e124. [7]
Irwin, J. (1935), “Tests of significance for differences between percentages based on small numbers,” Metron, 12, 84–94. [110]
Jefferys, W. H. (1990), “Bayesian analysis of random event generator data,” Journal of Scientific Exploration, 4, 153–169. [394]
Jennison, C. and Turnbull, B. W. (2000), Group Sequential Methods with Applications to Clinical Trials, Boca Raton, FL: Chapman and Hall/CRC. [349, 356]
Jennison, C. and Turnbull, B. W. (2007), “Adaptive seamless designs: selection and prospective testing of hypotheses,” Journal of Biopharmaceutical Statistics, 17, 1135–1161. [354]
Johnson, N. L., Kemp, A. W., and Kotz, S. (2005), Univariate Discrete Distributions, 3rd ed., New York: John Wiley & Sons. [123]
Johnson, N. L., Kotz, S., and Balakrishnan, N. (1995), Continuous Univariate Distributions, vol. 2, New York: John Wiley & Sons. [97]
Kahneman, D. (2011), Thinking, Fast and Slow, New York: Farrar, Straus, and Giroux. [32]
Kalbfleisch, J. and Prentice, R. (2002), The Statistical Analysis of Failure Time Data, 2nd ed., New York: Wiley. [260, 306, 311, 315, 316, 323, 324]
Kang, J. D. and Schafer, J. L. (2007), “Demystifying double robustness: A comparison of alternative strategies for estimating a population mean from incomplete data (with discussion),” Statistical Science, 22, 523–539 (discussion: 540–580). [338]
Kaplan, E. and Meier, P. (1958), “Nonparametric estimation from incomplete observations,” Journal of the American Statistical Association, 53, 457–481. [305]
Karlin, S. and Taylor, H. (1975), A First Course in Stochastic Processes, 2nd ed., New York: Academic Press. [82, 357]
Kass, R. E. and Raftery, A. E. (1995), “Bayes factors,” Journal of the American Statistical Association, 90, 773–795. [393, 394, 401]
Kauermann, G. and Carroll, R. J. (2001), “A note on the efficiency of sandwich covariance matrix estimation,” Journal of the American Statistical Association, 96, 1387–1396. [175]
Kawaguchi, A. and Koch, G. G. (2015), “sanon: an R package for stratified analysis with nonparametric covariable adjustment,” Journal of Statistical Software, 67, 9. [226]
Kim, S., Fay, M. P., and Proschan, M. A. (2021), “Valid and approximately valid confidence intervals for current status data,” Journal of the Royal Statistical Society: Series B, DOI:10.1111/rssb.12422 [320]
Kirk, J. L. and Fay, M. P. (2014), “An introduction to practical sequential inferences via single-arm binary response studies using the binseqtest R package,” The American Statistician, 68, 230–242. [349]
Konietschke, F., Hothorn, L. A., Brunner, E., et al. (2012), “Rank-based multiple test procedures and simultaneous confidence intervals,” Electronic Journal of Statistics, 6, 738–759. [210, 212, 244]
Konietschke, F., Placzek, M., Schaarschmidt, F., and Hothorn, L. A. (2015), “nparcomp: an R software package for nonparametric multiple comparisons and simultaneous confidence intervals,” Journal of Statistical Software, 64, 9, 1–17. [212]
Koopmans, L. H., Owen, D. B., and Rosenblatt, J. (1964), “Confidence intervals for the coefficient of variation for the normal and log normal distributions,” Biometrika, 51, 25–32. [80]
Korn, E. L. and Graubard, B. I. (1999), Analysis of Health Surveys, vol. 323, New York: John Wiley & Sons. [47, 64]
Koziol, J. A. and Jia, Z. (2009), “The concordance index C and the Mann–Whitney parameter Pr(X > Y) with randomly censored data,” Biometrical Journal, 51, 467–474. [130]
Kuznetsova, A., Brockhoff, P. B., and Christensen, R. H. B. (2017), “lmerTest package: tests in linear mixed effects models,” Journal of Statistical Software, 82, 13, 1–26. [219]
Lachin, J. M. (1981), “Introduction to sample size determination and power analysis for clinical trials,” Controlled Clinical Trials, 2, 93–113. [386]
Lan, K. G. and DeMets, D. L. (1983), “Discrete sequential boundaries for clinical trials,” Biometrika, 70, 659–663. [346]
Lan, K. G. and Wittes, J. (1988), “The B-value: a tool for monitoring data,” Biometrics, 579–585. [347]
Lan, K. G. and Wittes, J. T. (2012), “Some thoughts on sample size: a Bayesian-frequentist hybrid approach,” Clinical Trials, 9, 561–569. [384]
Lang, Z. and Reiczigel, J. (2014), “Confidence limits for prevalence of disease adjusted for estimated sensitivity and specificity,” Preventive Veterinary Medicine, 113, 13–22. [64]
Lehmann, E. (1975), Nonparametrics: Statistical Methods Based on Ranks, Oakland, CA: Holden-Day. [203, 226]
Lehmann, E. (1999), Elements of Large Sample Theory, New York: Springer. [83, 101, 153, 166, 172, 190]
Lehmann, E. and Romano, J. (2005), Testing Statistical Hypotheses, 3rd ed., New York: Springer. [xii, xiii, 9, 20, 21, 71, 77, 81, 82, 151, 158, 190, 191, 363, 374]
Lilliefors, H. W. (1967), “On the Kolmogorov-Smirnov test for normality with mean and variance unknown,” Journal of the American Statistical Association, 62, 399–402. [374]
Lin, D. Y. and Wei, L.-J. (1989), “The robust inference for the Cox proportional hazards model,” Journal of the American Statistical Association, 84, 1074–1078. [315]
Lin, L. I.-K. (1989), “A concordance correlation coefficient to evaluate reproducibility,” Biometrics, 45, 255–268. [99]
Lindley, D. V. and Phillips, L. (1976), “Inference for a Bernoulli process (a Bayesian view),” The American Statistician, 30, 112–119. [47]
Little, R. J. and Rubin, D. B. (2020), Statistical Analysis with Missing Data, 3rd ed., New York: John Wiley & Sons. [340]
Little, R. J., Wang, J., Sun, X., et al. (2016), “The treatment of missing data in a large cardiovascular clinical outcomes study,” Clinical Trials, 13, 344–351. [339]
Liublinska, V. and Rubin, D. B. (2014), “Sensitivity analysis for a partially missing binary outcome in a two-arm randomized clinical trial,” Statistics in Medicine, 33, 4170–4185. [332]
Lloyd, C. J. (2008), “Exact p-values for discrete models obtained by estimation and maximization,” Australian & New Zealand Journal of Statistics, 50, 329–345. [114]
Loughin, T. M. (2004), “A systematic comparison of methods for combining p-values from independent tests,” Computational Statistics & Data Analysis, 47, 467–485. [352]
Lunceford, J. K. and Davidian, M. (2004), “Stratification and weighting via the propensity score in estimation of causal treatment effects: a comparative study,” Statistics in Medicine, 23, 2937–2960. [39, 292]
Lydersen, S., Pradhan, V., Senchaudhuri, P., and Laake, P. (2007), “Choice of test for association in small sample unordered r × c tables,” Statistics in Medicine, 26, 4328–4343. [365]
Mann, H. B. and Whitney, D. R. (1947), “On a test of whether one of two random variables is stochastically larger than the other,” The Annals of Mathematical Statistics, 18, 50–60. [143]
Mantel, N. (1966), “Evaluation of survival data and two new rank order statistics arising in its consideration,” Cancer Chemotherapy Reports, 50, 163–170. [316]
Marcus, R., Peritz, E., and Gabriel, K. R. (1976), “On closed testing procedures with special reference to ordered analysis of variance,” Biometrika, 63, 655–660. [248]
Martín Andrés, A., Sánchez Quevedo, M., and Silva Mato, A. (1998), “Fisher’s mid-p-value arrangement in 2 × 2 comparative trials,” Computational Statistics & Data Analysis, 29, 107–115. [112]
Mayo, D. G. (1996), Error and the Growth of Experimental Knowledge, Chicago, IL: University of Chicago Press. [46]
McCullagh, P. (1980), “Regression models for ordinal data,” Journal of the Royal Statistical Society: Series B (Methodological), 42, 109–142. [151]
McCullagh, P. and Nelder, J. A. (1989), Generalized Linear Models, 2nd ed., London: Chapman and Hall. [133, 212, 217, 257, 259, 273]
Mee, R. W. (1990), “Confidence intervals for probabilities and tolerance regions based on a generalization of the Mann-Whitney statistic,” Journal of the American Statistical Association, 85, 793–800. [130]
Mehta, C. R. and Patel, N. R. (1995), “Exact logistic regression: theory and examples,” Statistics in Medicine, 14, 2143–2160. [258, 273]
Mehta, C. R., Patel, N. R., and Gray, R. (1985), “Computing an exact confidence interval for the common odds ratio in several 2 × 2 contingency tables,” Journal of the American Statistical Association, 80, 969–973. [123]
Mehta, J. and Srinivasan, R. (1970), “On the Behrens–Fisher problem,” Biometrika, 57, 649–655. [139]
Meng, X.-L. (1994), “Posterior predictive p-values,” The Annals of Statistics, 22, 1142–1160. [402]
Michael, H., Thornton, S., Xie, M., and Tian, L. (2019), “Exact inference on the random-effects model for meta-analyses with few studies,” Biometrics, 75, 485–493. [227, 232, 233]
Miettinen, O. and Nurminen, M. (1985), “Comparative analysis of two rates,” Statistics in Medicine, 4, 213–226. [121]
Morgan, S. L. and Winship, C. (2015), Counterfactuals and Causal Inference, 2nd ed., New York: Cambridge University Press. [298, 300, 301]
Moser, B. K., Stevens, G. R., and Watts, C. L. (1989), “The two-sample t test versus Satterthwaite’s approximate F test,” Communications in Statistics – Theory and Methods, 18, 3963–3975. [139]
Mullen, G. E., Ellis, R. D., Miura, K., et al. (2008), “Phase 1 trial of AMA1-C1/Alhydrogel plus CPG 7909: an asexual blood-stage vaccine for Plasmodium falciparum malaria,” PLoS One, 3, e2940. [131]
Murphy, S., Rossini, A., and van der Vaart, A. W. (1997), “Maximum likelihood estimation in the proportional odds model,” Journal of the American Statistical Association, 92, 968–976. [261, 315]
Murphy, S. A. and van der Vaart, A. W. (2000), “On profile likelihood,” Journal of the American Statistical Association, 95, 449–465. [45, 188, 315]
National Research Council. (2010), The Prevention and Treatment of Missing Data in Clinical Trials, Washington, DC: National Academies Press. [327, 329, 339, 340]
Nel, D. G., van der Merwe, C. A., and Moser, B. (1990), “The exact distributions of the univariate and multivariate Behrens-Fisher statistics with a comparison of several solutions in the univariate case,” Communications in Statistics – Theory and Methods, 19, 279–298. [139]
Neubert, K. and Brunner, E. (2007), “A studentized permutation test for the non-parametric Behrens–Fisher problem,” Computational Statistics & Data Analysis, 51, 5192–5204. [157]
Newcombe, R. G. (2006), “Confidence intervals for an effect size measure based on the Mann–Whitney statistic. Part 2: asymptotic methods and evaluation,” Statistics in Medicine, 25, 559–573. [157]
Neyman, J. and Scott, E. L. (1948), “Consistent estimates based on partially consistent observations,” Econometrica, 16, 1–32. [263]
Ng, H. K. T., Filardo, G., and Zheng, G. (2008), “Confidence interval estimating procedures for standardized incidence rates,” Computational Statistics & Data Analysis, 52, 3501–3516. [230]
Oakes, D. (2016), “On the win-ratio statistic in clinical trials with multiple types of event,” Biometrika, 103, 742–745. [260]
O’Brien, P. C. and Fleming, T. R. (1987), “A paired Prentice-Wilcoxon test for censored paired data,” Biometrics, 43, 169–180. [103]
Oller, R., Gómez, G., and Calle, M. L. (2007), “Interval censoring: identifiability and the constant-sum property,” Biometrika, 94, 61–70. [319]
Owen, A. B. (2001), Empirical Likelihood, Boca Raton, FL: Chapman and Hall/CRC. [188]
Park, M. Y. and Hastie, T. (2007), “L1-regularization path algorithm for generalized linear models,” Journal of the Royal Statistical Society: Series B (Statistical Methodology), 69, 659–677. [273]
Paule, R. C. and Mandel, J. (1982), “Consensus values and weighting factors,” Journal of Research of the National Bureau of Standards, 87, 377–385. [227]
Pauly, M., Asendorf, T., and Konietschke, F. (2016), “Permutation-based inference for the AUC: a unified approach for continuous and discontinuous data,” Biometrical Journal, 58, 1319–1337. [157, 160]
Pearl, J. (2009a), “Causal inference in statistics: an overview,” Statistics Surveys, 3, 96–146. [300]
Pearl, J. (2009b), Causality: Models, Reasoning, and Inference, 2nd ed., New York: Cambridge University Press. [24, 46, 222, 232, 277, 296, 300]
Pearl, J., Glymour, M., and Jewell, N. P. (2016), Causal Inference in Statistics: A Primer, Chichester: John Wiley & Sons. [270, 295, 296, 300]
Peikes, D. N., Moreno, L., and Orzol, S. M. (2008), “Propensity score matching: a note of caution for evaluators of social programs,” The American Statistician, 62, 222–231. [26]
Perlman, M. and Wu, L. (1999), “The emperor’s new tests (with discussion),” Statistical Science, 14, 355–381. [21]
Peto, R. and Peto, J. (1972), “Asymptotically efficient rank invariant test procedures,” Journal of the Royal Statistical Society A, 135, 185–207. [316, 317, 325]
Peto, R., Pike, M., Armitage, P., et al. (1976), “Design and analysis of randomized clinical trials requiring prolonged observation of each patient. I. Introduction and design,” British Journal of Cancer, 34, 585. [348]
Plesser, H. E. (2018), “Reproducibility vs. replicability: a brief history of a confused terminology,” Frontiers in Neuroinformatics, 11, 76. [xi]
Popper, K. (1963), Conjectures and Refutations: The Growth of Scientific Knowledge, London: Routledge. [28]
Posch, M. and Bauer, P. (1999), “Adaptive two stage designs and the conditional error function,” Biometrical Journal: Journal of Mathematical Methods in Biosciences, 41, 689–696. [356]
Pratt, J. W. (1959), “Remarks on zeros and ties in the Wilcoxon signed rank procedures,” Journal of the American Statistical Association, 54, 655–667. [87, 89, 90]
Pratt, J. W. (1964), “Robustness of some procedures for the two-sample location problem,” Journal of the American Statistical Association, 59, 665–680. [157]
Prentice, R. L. (1978), “Linear rank tests with right censored data,” Biometrika, 65, 167–179. [317]
Prentice, R. L., Langer, R., Stefanick, M. L., et al. (2005), “Combined postmenopausal hormone therapy and cardiovascular disease: toward resolving the discrepancy between observational studies and the Women’s Health Initiative clinical trial,” American Journal of Epidemiology, 162, 404–414. [25, 47]
Prentice, R. L. and Marek, P. (1979), “A qualitative discrepancy between censored data rank tests,” Biometrics, 35, 861–867. [317, 325]
PREVAIL II Writing Group. (2016), “A randomized, controlled trial of ZMapp for Ebola virus infection,” The New England Journal of Medicine, 375, 1448. [111]
Proschan, M. and Brittain, E. (2020), “A primer on strong versus weak control of familywise error rate,” Statistics in Medicine, 39, 1407–1413. [213]
Proschan, M., Brittain, E., and Kammerman, L. (2011), “Minimize the use of minimization with unequal allocation,” Biometrics, 67, 1135–1141. [34]
Proschan, M. and Follmann, D. (2008), “Cluster without fluster: the effect of correlated outcomes on inference in randomized clinical trials,” Statistics in Medicine, 27, 795–809. [41]
Proschan, M. A. (1999), “Miscellanea. Properties of spending function boundaries,” Biometrika, 86, 466–473. [357]
Proschan, M. A., Follmann, D. A., and Waclawiw, M. A. (1992), “Effects of assumption violations on type I error rate in group sequential monitoring,” Biometrics, 1131–1143. [348]
Proschan, M. A. and Hunsberger, S. A. (1995), “Designed extension of studies based on conditional power,” Biometrics, 51, 1315–1324. [352, 353, 355, 356]
Proschan, M. A., Lan, K. G., and Wittes, J. T. (2006), Statistical Monitoring of Clinical Trials: A Unified Approach, New York: Springer. [346, 347, 348, 349, 351, 356, 357]
Proschan, M. A., McMahon, R. P., Shih, J. H., et al. (2001), “Sensitivity analysis using an imputation method for missing binary data in clinical trials,” Journal of Statistical Planning and Inference, 96, 155–165. [330]
Reiczigel, J., Földi, J., and Ózsvári, L. (2010), “Exact confidence limits for prevalence of a disease with an imperfect diagnostic test,” Epidemiology and Infection, 138, 1674–1678. [64, 65]
Robins, J., Breslow, N., and Greenland, S. (1986), “Estimators of the Mantel-Haenszel variance consistent in both sparse data and large-strata limiting models,” Biometrics, 42, 311–323. [225, 233]
Röhmel, J. (2005), “Problems with existing procedures to calculate exact unconditional p-values for non-inferiority/superiority and confidence intervals for two binomials and how to resolve them,” Biometrical Journal, 47, 37–47. [16]
Röhmel, J. and Mansmann, U. (1999), “Unconditional non-asymptotic one-sided tests for independent binomial proportions when the interest lies in showing non-inferiority and/or superiority,” Biometrical Journal, 41, 149–170. [113]
Rosenbaum, P. R. (2002), Observational Studies, 2nd ed., New York: Springer. [47, 292]
Rosenbaum, P. R. (2010), Design of Observational Studies, New York: Springer. [47, 298, 300]
Rosendaal, F. R. (2020), “Review of: ‘hydroxychloroquine and azithromycin as a treatment of COVID-19: results of an open-label non-randomized clinical trial Gautret et al 2010’,” International Journal of Antimicrobial Agents, 56, 106063. [2]
Rothmann, M. D., Wiens, B. L., and Chan, I. S. (2012), Design and Analysis of Non-Inferiority Trials, Boca Raton, FL: Chapman and Hall/CRC. [370, 373]
Rubin, D. (2006), Matched Sampling for Causal Effects, New York: Cambridge University Press. [47]
Rubin, D. B. (1997), “Estimating causal effects from large data sets using propensity scores,” Annals of Internal Medicine, 127, 757–763. [38]
Rubin, D. B. (1984), “Bayesianly justifiable and relevant frequency calculations for the applied statistician,” The Annals of Statistics, 12, 1151–1172. [402]
Sadoff, J., Gray, G., Vandebosch, A., et al. (2021), “Safety and efficacy of single-dose Ad26.COV2.S vaccine against Covid-19,” New England Journal of Medicine, 384, 2187–2201. [120]
Sagara, I., Ellis, R. D., Dicko, A., et al. (2009), “A randomized and controlled Phase 1 study of the safety and immunogenicity of the AMA1-C1/Alhydrogel® + CPG 7909 vaccine for Plasmodium falciparum malaria in semi-immune Malian adults,” Vaccine, 27, 7292–7298. [91, 92]
Samara, B. and Randles, R. H. (1988), “A test for correlation based on Kendall’s tau,” Communications in Statistics – Theory and Methods, 17, 3191–3205. [97]
Samuelsen, S. O. (2003), “Exact inference in the proportional hazard model: possibilities and limitations,” Lifetime Data Analysis, 9, 239–260. [315]
Sarkar, S. K. and Chang, C.-K. (1997), “The Simes method for multiple hypothesis testing with positively dependent test statistics,” Journal of the American Statistical Association, 92, 1601–1608. [250]
Schenker, N. and Gentleman, J. F. (2001), “On judging the significance of differences by examining the overlap between confidence intervals,” The American Statistician, 55, 182–186. [194]
Schilling, M. and Doi, J. (2014), “A coverage probability approach to finding an optimal binomial confidence procedure,” The American Statistician, 68, 133–145. [63]
Schoenfeld, D. (1981), “The asymptotic properties of nonparametric tests for comparing survival distributions,” Biometrika, 68, 316–319. [382]
Schouten, H. J. (1999), “Sample size formula with a continuous outcome for unequal group sizes and unequal variances,” Statistics in Medicine, 18, 87–91. [381]
Schweder, T. and Hjort, N. L. (2016), Confidence, Likelihood, Probability: Statistical Inference with Confidence Distributions, New York: Cambridge University Press. [190]
Seaman, S. R. and Vansteelandt, S. (2018), “Introduction to double robust methods for incomplete data,” Statistical Science, 33, 184–197. [338, 340]
Seber, G. A. (1984), Multivariate Observations, Hoboken, NJ: John Wiley & Sons. [192]
Self, S. G. and Liang, K.-Y. (1987), “Asymptotic properties of maximum likelihood estimators and likelihood ratio tests under nonstandard conditions,” Journal of the American Statistical Association, 82, 605–610. [170, 273]
Sen, B. and Banerjee, M. (2007), “A pseudolikelihood method for analyzing interval censored data,” Biometrika, 94, 71–86. [320]
Sen, B. and Xu, G. (2015), “Model based bootstrap methods for interval censored data,” Computational Statistics & Data Analysis, 81, 121–129. [320]
Sen, P. (1985), “Permutational Central Limit Theorems,” in Encyclopedia of Statistics, eds. Kotz, S. and Johnson, N. L., Hoboken, NJ: Wiley, vol. 6, pp. 683–687. [192]
Serfling, R. and Mazumder, S. (2009), “Exponential probability inequality and convergence results for the median absolute deviation and its modifications,” Statistics & Probability Letters, 79, 1767–1773. [80]
Shao, J. and Tu, D. (1995), The Jackknife and Bootstrap, New York: Springer. [184]
Shapiro, S. S. and Wilk, M. B. (1965), “An analysis of variance test for normality (complete samples),” Biometrika, 52, 591–611. [374]
Shaw, P. A. (2018), “Use of composite outcomes to assess risk–benefit in clinical trials,” Clinical Trials, 15, 352–358. [327]
Simmons, J. P., Nelson, L. D., and Simonsohn, U. (2011), “False-positive psychology: undisclosed flexibility in data collection and analysis allows presenting anything as significant,” Psychological Science, 22, 1359–1366. [238]
Singal, A. G., Higgins, P. D., and Waljee, A. K. (2014), “A primer on effectiveness and efficacy trials,” Clinical and Translational Gastroenterology, 5, e45. [371]
Singh, B., Ryan, H., Kredo, T., Chaplin, M., and Fletcher, T. (2021), “Chloroquine or hydroxychloroquine for prevention and treatment of COVID-19,” The Cochrane Database of Systematic Reviews, 2, CD013587. [3]
Skou, S. T., Roos, E. M., Laursen, M. B., et al. (2015), “A randomized, controlled trial of total knee replacement,” New England Journal of Medicine, 373, 1597–1606. [237]
Snee, R. D. (1974), “Graphical display of two-way contingency tables,” The American Statistician, 28, 9–12. [198]
Sommer, A. and Zeger, S. L. (1991), “On estimating efficacy from clinical trials,” Statistics in Medicine, 10, 45–52. [286, 288]
Steering Committee for PHS. (1989), “Final report on the aspirin component of the ongoing Physicians’ Health Study,” New England Journal of Medicine, 321, 129–135. [30]
Sterne, T. E. (1954), “Some remarks on confidence or fiducial limits,” Biometrika, 41, 1–2, 275–278. [55, 63, 64]
Strassburger, K. and Bretz, F. (2008), “Compatible simultaneous lower confidence bounds for the Holm procedure and other Bonferroni-based closed tests,” Statistics in Medicine, 27, 4914–4927. [240]
Stuart, E. A. (2010), “Matching methods for causal inference: A review and a look forward,” Statistical Science: A Review Journal of the Institute of Mathematical Statistics, 25, 1. [291]
Tamhane, A. C. and Gou, J. (2017), “Advances in p-value based multiple test procedures,” Journal of Biopharmaceutical Statistics, 1–18. [250]
Tan, W. (1982), “Sampling distributions and robustness of t, F and variance-ratio in two samples and ANOVA models with respect to departure from normality,” Communications in Statistics – Theory and Methods, 11, 2485–2511. [201]
Tang, R., Banerjee, M., Kosorok, M. R., et al. (2012), “Likelihood based inference for current status data on a grid: A boundary phenomenon and an adaptive inference procedure,” The Annals of Statistics, 40, 45–72. [320]
Tarone, R. E. and Gart, J. J. (1980), “On the robustness of combined tests for trends in proportions,” Journal of the American Statistical Association, 75, 110–116. [203]
Tchetgen Tchetgen, E. J. and VanderWeele, T. J. (2012), “On causal inference in the presence of interference,” Statistical Methods in Medical Research, 21, 55–75. [289]
Thangavelu, K. and Brunner, E. (2007), “Wilcoxon–Mann–Whitney test for stratified samples and Efron’s paradox dice,” Journal of Statistical Planning and Inference, 137, 720–737. [210]
Thas, O., Neve, J. D., Clement, L., and Ottoy, J.-P. (2012), “Probabilistic index models,” Journal of the Royal Statistical Society: Series B (Statistical Methodology), 74, 623–671. [130, 261]
The Open Science Collaboration. (2015), “Estimating the reproducibility of psychological science,” Science, 349, aac4716. [3]
Therneau, T. (2015), A Package for Survival Analysis in S, R package version 2.38. https://CRAN.R-project.org/package=survival [315]
Therneau, T. M. and Grambsch, P. M. (2000), Modeling Survival Data: Extending the Cox Model, New York: Springer. [308, 323]
Tibshirani, R. (1996), “Regression shrinkage and selection via the lasso,” Journal of the Royal Statistical Society: Series B (Methodological), 58, 267–288. [267]
Tibshirani, R. (2011), “Regression shrinkage and selection via the lasso: a retrospective,” Journal of the Royal Statistical Society: Series B (Statistical Methodology), 73, 273–282. [273]
Tsiatis, A. (2006), Semiparametric Theory and Missing Data, New York: Springer. [340]
Tsiatis, A. A., Davidian, M., Zhang, M., and Lu, X. (2008), “Covariate adjustment for two-sample treatment comparisons in randomized clinical trials: a principled yet flexible approach,” Statistics in Medicine, 27, 4658–4677. [289]
Væth, M. (1985), “On the use of Wald’s test in exponential families,” International Statistical Review, 53, 199–214. [172]
Vandal, A. C., Gentleman, R., and Liu, X. (2005), “Constrained estimation and likelihood intervals for censored data,” Canadian Journal of Statistics, 33, 71–83. [320]
VanderWeele, T. (2015), Explanation in Causal Inference: Methods for Mediation and Interaction, New York: Oxford University Press. [296]
Veronese, P. and Melilli, E. (2015), “Fiducial and confidence distributions for real exponential families,” Scandinavian Journal of Statistics, 42, 471–484. [399, 402]
Vonesh, E. and Chinchilli, V. M. (1997), Linear and Nonlinear Models for the Analysis of Repeated Measurements, New York: Marcel Dekker. [190]
Vos, P. and Hudson, S. (2008), “Problems with binomial two-sided tests and the associated confidence intervals,” Australian and New Zealand Journal of Statistics, 50, 81–89. [56]
Wacholder, S., McLaughlin, J. K., Silverman, D. T., and Mandel, J. S. (1992), “Selection of controls in case-control studies: I. Principles,” American Journal of Epidemiology, 135, 1019–1028. [122]
Wald, A. (1947), Sequential Analysis, New York: Dover. [343, 356]
Wang, R., Lagakos, S., and Gray, R. (2010), “Testing and interval estimation for two-sample survival comparisons with small sample sizes and unequal censoring,” Biostatistics, 11, 676–692. [122]
Wang, W. (2010), “On construction of the smallest one-sided confidence interval for the difference of two proportions,” The Annals of Statistics, 38, 1227–1243. [114]
Wang, W. and Shan, G. (2015), “Exact confidence intervals for the relative risk and the odds ratio,” Biometrics, 71, 985–995. [114]
Wasserstein, R. L. and Lazar, N. A. (2016), “The ASA’s statement on p-values: context, process, and purpose,” The American Statistician, 70, 129–133. [xi]
Wasserstein, R. L., Schirm, A. L., and Lazar, N. A. (2019), “Moving to a World Beyond ‘p <0.05’,” The American Statistician, 73, 119. [xi, 6]Google Scholar
Webster, W., Walsh, D., McEwen, S. E., and Lipson, A. (1983), “Some teratogenic properties of ethanol and acetaldehyde in C57BL/6J mice: implications for the study of the fetal alcohol syndrome,” Teratology, 27, 231243. [215]Google Scholar
Welch, B. and Peers, H. (1963), “On formulae for confidence points based on integrals of weighted likelihoods,” Journal of the Royal Statistical Society: Series B (Methodological), 25, 318329. [402]Google Scholar
Westfall, P. H. (1997), “Multiple testing of general contrasts using logical constraints and correlations,” Journal of the American Statistical Association, 92, 299306. [248]Google Scholar
Westfall, P. H., Tobias, R. D., Rom, D., Wolfinger, R. D., and Hochberg, Y. (1999), Multiple Comparisons and Multiple Tests Using the SAS System, Cary, NC: SAS Institute. [206]Google Scholar
Westfall, P. H. and Troendle, J. F. (2008), “Multiple testing with minimal assumptions,” Biometrical Journal, 50, 745755. [245]Google Scholar
Westfall, P. H. and Young, S. S. (1993), Resampling-Based Multiple Testing: Examples and Methods for p-Value Adjustment, New York: John Wiley & Sons. [245, 250]Google Scholar
Whidden, C., Treleaven, E., Liu, J., et al. (2019), “Proactive community case management and child survival: protocol for a cluster randomised controlled trial,” BMJ Open, 9, e027487. [379]
Wilcoxon, F. (1945), “Individual comparisons by ranking methods,” Biometrics Bulletin, 1, 80–83. [143]
Wittes, J. (2002), “Sample size calculations for randomized controlled trials,” Epidemiologic Reviews, 24, 39–53. [382, 386]
Wittes, J., Barrett-Connor, E., Braunwald, E., et al. (2007), “Monitoring the randomized trials of the Women’s Health Initiative: the experience of the Data and Safety Monitoring Board,” Clinical Trials, 4, 218–234. [25]
Wittes, J. and Brittain, E. (1990), “The role of internal pilot studies in increasing the efficiency of clinical trials,” Statistics in Medicine, 9, 65–72. [386]
Wu, C. J. (1985), “Efficient sequential designs with binary data,” Journal of the American Statistical Association, 80, 974–984. [387]
Xie, M.-g. and Singh, K. (2013), “Confidence distribution, the frequentist distribution estimator of a parameter: a review (with discussion),” International Statistical Review, 81, 3–77. [190]
Yan, X., Lee, S., and Li, N. (2009), “Missing data handling methods in medical device clinical trials,” Journal of Biopharmaceutical Statistics, 19, 1085–1098. [331]
Yates, F. (1984), “Tests of significance for 2 × 2 contingency tables,” Journal of the Royal Statistical Society: Series A (General), 147, 426–463. [110, 111, 121]
Zeger, S. L., Liang, K.-Y., and Albert, P. S. (1988), “Models for longitudinal data: a generalized estimating equation approach,” Biometrics, 44, 1049–1060. [266, 275]
Zeileis, A. (2004), “Econometric computing with HC and HAC covariance matrix estimators,” Journal of Statistical Software, Articles, 11. [255, 273]
Zeileis, A., Kleiber, C., and Jackman, S. (2008), “Regression models for count data in R,” Journal of Statistical Software, 27, 1–25. [259]
Zhang, H., Lu, N., Feng, C., et al. (2011), “On fitting generalized linear mixed-effects models for binary responses using different statistical packages,” Statistics in Medicine, 30, 2562–2572. [218]
