
The rescaled Pólya urn: local reinforcement and chi-squared goodness-of-fit test

Published online by Cambridge University Press: 18 October 2022

Giacomo Aletti*
Affiliation:
Università degli Studi di Milano
Irene Crimaldi**
Affiliation:
IMT School for Advanced Studies Lucca
*Postal address: ADAMSS Center, Università degli Studi di Milano, Milan, Italy.
**Postal address: IMT School for Advanced Studies Lucca, Lucca, Italy.

Abstract

Motivated by recent studies of big samples, this work aims to construct a parametric model which is characterized by the following features: (i) a ‘local’ reinforcement, i.e. a reinforcement mechanism mainly based on the last observations, (ii) a random persistent fluctuation of the predictive mean, and (iii) a long-term almost sure convergence of the empirical mean to a deterministic limit, together with a chi-squared goodness-of-fit result for the limit probabilities. This triple purpose is achieved by the introduction of a new variant of the Eggenberger–Pólya urn, which we call the rescaled Pólya urn. We provide a complete asymptotic characterization of this model, pointing out that, for a certain choice of the parameters, it has properties different from the ones typically exhibited by the other urn models in the literature. Therefore, beyond the possible statistical application, this work could be interesting for those who are concerned with stochastic processes with reinforcement.

Type
Original Article
Copyright
© The Author(s), 2022. Published by Cambridge University Press on behalf of Applied Probability Trust
