
Reliability of relational event model estimates under sampling: How to fit a relational event model to 360 million dyadic events

Published online by Cambridge University Press: 22 November 2019

Jürgen Lerner*
Affiliation:
Department of Computer and Information Science, University of Konstanz, Konstanz, Germany
Alessandro Lomi
Affiliation:
University of Italian Switzerland, Lugano, Switzerland; University of Exeter Business School, Exeter, UK
*Corresponding author. Email: [email protected]

Abstract

We assess the reliability of relational event model (REM) parameters estimated under two sampling schemes: (1) uniform sampling from the observed events and (2) case–control sampling, which samples nonevents, or null dyads (“controls”), from a suitably defined risk set. We experimentally determine the variability of estimated parameters as a function of the number of sampled events and of the number of controls per event. Results suggest that REMs can be reliably fitted to networks with more than 12 million nodes connected by more than 360 million dyadic events by analyzing a sample of some tens of thousands of events and a small number of controls per event. Using data that we collected on the Wikipedia editing network, we illustrate how network effects commonly included in empirical studies based on REMs need widely different sample sizes to be reliably estimated. For our analysis we use open-source software that implements the two sampling schemes, allowing analysts to fit REMs to the same data or to data collected in other empirical settings, while varying sampling parameters or model specification.
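As a rough illustration of the two sampling schemes named in the abstract, the following Python sketch keeps each observed event with a fixed probability and, for each kept event, draws a fixed number of non-event dyads as controls. It is not the authors' software: the function name, its parameters, and the assumption of a small, static risk set held in memory are all hypothetical simplifications made for this example.

```python
import random


def sample_events_with_controls(events, risk_set, p_event, m_controls, seed=0):
    """Return a list of (event, controls) pairs.

    Hypothetical helper for illustration only.
    events     -- observed dyads (source, target), in time order
    risk_set   -- all dyads that could have occurred (assumed static here)
    p_event    -- probability of keeping each observed event
    m_controls -- number of non-event dyads drawn per kept event
    """
    rng = random.Random(seed)
    sampled = []
    for event in events:
        # Scheme 1: uniform sampling from the observed events.
        if rng.random() >= p_event:
            continue
        # Scheme 2: case-control sampling of null dyads from the risk set,
        # excluding the dyad on which the event actually occurred.
        candidates = [dyad for dyad in risk_set if dyad != event]
        k = min(m_controls, len(candidates))
        controls = rng.sample(candidates, k)
        sampled.append((event, controls))
    return sampled


if __name__ == "__main__":
    users = ["u1", "u2", "u3"]
    articles = ["a1", "a2"]
    risk_set = [(u, a) for u in users for a in articles]
    events = [("u1", "a1"), ("u2", "a2"), ("u1", "a2")]
    for event, controls in sample_events_with_controls(
        events, risk_set, p_event=1.0, m_controls=2
    ):
        print(event, controls)
```

Enumerating the candidate dyads for every event is only workable for toy inputs. With millions of nodes, a practical implementation would presumably draw controls without materializing the full risk set, and the risk set itself may change over time.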

Type
Research Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.
Copyright
© Cambridge University Press 2019
