
Bayesian Plackett–Luce Mixture Models for Partially Ranked Data

Published online by Cambridge University Press:  01 January 2025

Cristina Mollica*
Affiliation:
Sapienza Università di Roma
Luca Tardella
Affiliation:
Sapienza Università di Roma
* Correspondence should be made to Cristina Mollica, Dipartimento di Scienze Statistiche, Sapienza Università di Roma, Piazzale A. Moro 5, 00185 Rome, Italy. Email: [email protected]

Abstract

Many psychological and behavioral experiments elicit ordinal judgments on multiple alternatives to investigate the preference or choice orientation of a specific population. The Plackett–Luce model is one of the most widely applied parametric distributions for analyzing rankings of a finite set of items. The present work introduces a Bayesian finite mixture of Plackett–Luce models to account for unobserved sample heterogeneity in partially ranked data. We describe an efficient way to incorporate the latent group structure into the data augmentation approach and show that existing maximum likelihood procedures arise as special instances of the proposed Bayesian method. Inference combines the Expectation-Maximization algorithm, used for maximum a posteriori estimation, with the Gibbs sampling iterative procedure. We additionally investigate several Bayesian criteria for selecting the optimal mixture configuration and describe diagnostic tools for assessing the fit of ranking distributions, both conditionally and unconditionally on the number of ranked items. The utility of the novel Bayesian parametric Plackett–Luce mixture for characterizing sample heterogeneity is illustrated with several applications to simulated and real preference ranked data. We compare our method with the frequentist approach and with a Bayesian nonparametric mixture model, both of which assume the Plackett–Luce model as the mixture component. Our analysis of real datasets reveals the importance of an accurate diagnostic check for an in-depth understanding of the heterogeneous nature of partial ranking data.
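
As an illustration of the model class named in the abstract, the following is a minimal sketch of the standard Plackett–Luce probability of a partial (top-t) ordering and of a finite mixture built from it. It is not the authors' implementation: the function names (`pl_probability`, `mixture_probability`, `membership_posteriors`) and the numeric values in the usage note are hypothetical, and the sketch assumes that unranked items occupy the bottom positions in an unspecified order.

```python
def pl_probability(ordering, support):
    """Plackett-Luce probability of a (possibly partial) top-t ordering.

    ordering: item indices from most to least preferred (ranked items only).
    support:  positive support parameters, one per item.
    """
    remaining = sum(support)  # total support of the items still available
    prob = 1.0
    for item in ordering:
        # Choose `item` among the items not yet ranked, proportionally to support.
        prob *= support[item] / remaining
        remaining -= support[item]
    return prob


def mixture_probability(ordering, weights, supports):
    """Probability of an ordering under a finite Plackett-Luce mixture.

    weights:  mixture weights (non-negative, summing to one).
    supports: one support-parameter vector per mixture component.
    """
    return sum(w * pl_probability(ordering, p) for w, p in zip(weights, supports))


def membership_posteriors(ordering, weights, supports):
    """Posterior probability that the ordering came from each component.

    These are the quantities computed in an E-step, or used to sample the
    latent group label in a Gibbs sweep, for fixed weights and supports.
    """
    joint = [w * pl_probability(ordering, p) for w, p in zip(weights, supports)]
    total = sum(joint)
    return [j / total for j in joint]
```

For example, with hypothetical supports `[0.5, 0.3, 0.2]` and `[0.2, 0.3, 0.5]` and weights `[0.6, 0.4]`, calling `membership_posteriors([0, 1], ...)` assigns the top-2 ordering "item 0, then item 1" mostly to the first component, because that component gives item 0 the largest support.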

Type
Original Paper
Copyright
Copyright © 2016 The Psychometric Society

Footnotes

Electronic supplementary material: The online version of this article (doi:10.1007/s11336-016-9530-0) contains supplementary material, which is available to authorized users.
