Reliability of relational event model estimates under sampling: How to fit a relational event model to 360 million dyadic events

Jürgen Lerner; Alessandro Lomi

doi:10.1017/nws.2019.57

Published online by Cambridge University Press: 22 November 2019

Jürgen Lerner*: Department of Computer and Information Science, University of Konstanz, Konstanz, Germany
Alessandro Lomi: University of Italian Switzerland, Lugano, Switzerland; University of Exeter Business School, Exeter, UK
*Corresponding author. Email: [email protected]

Abstract

We assess the reliability of relational event model (REM) parameters estimated under two sampling schemes: (1) uniform sampling from the observed events and (2) case–control sampling, which samples nonevents, or null dyads (“controls”), from a suitably defined risk set. We experimentally determine the variability of the estimated parameters as a function of the number of sampled events and of the number of controls per event. Results suggest that REMs can be reliably fitted to networks with more than 12 million nodes connected by more than 360 million dyadic events by analyzing a sample of some tens of thousands of events and a small number of controls per event. Using data that we collected on the Wikipedia editing network, we illustrate how network effects commonly included in empirical studies based on REMs need widely different sample sizes to be reliably estimated. For our analysis we use open-source software that implements the two sampling schemes, allowing analysts to fit REMs to the same data or to other data collected in different empirical settings, while varying sampling parameters or model specification.
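The two sampling schemes can be illustrated with a minimal Python sketch. It is not the authors' software: the function name `sample_observations`, its parameters, and the choice of risk set (every source–target pair) are assumptions made for illustration only. Uniform event sampling is implemented here as independent inclusion of each event with probability `p_event`, which is one possible reading of scheme (1).

```python
import random


def sample_observations(events, sources, targets, p_event, m_controls, seed=0):
    """Return (time, case_dyad, control_dyads) for each sampled event.

    events     : iterable of (source, target, time) tuples
    sources    : list of distinct source nodes (hypothetical risk-set definition)
    targets    : list of distinct target nodes (hypothetical risk-set definition)
    p_event    : probability of keeping each observed event (scheme 1)
    m_controls : number of non-event dyads drawn per kept event (scheme 2)
    """
    # Assumption: the risk set is the full Cartesian product sources x targets.
    if len(sources) * len(targets) - 1 < m_controls:
        raise ValueError("risk set too small for the requested number of controls")

    rng = random.Random(seed)
    sampled = []
    for source, target, time in events:
        # Scheme 1: keep each observed event independently with probability p_event.
        if rng.random() >= p_event:
            continue
        # Scheme 2: draw m_controls distinct dyads from the risk set,
        # excluding the dyad on which the event was actually observed.
        controls = set()
        while len(controls) < m_controls:
            dyad = (rng.choice(sources), rng.choice(targets))
            if dyad != (source, target):
                controls.add(dyad)
        sampled.append((time, (source, target), sorted(controls)))
    return sampled


if __name__ == "__main__":
    toy_events = [("u1", "a1", 1), ("u2", "a2", 2), ("u1", "a2", 3)]
    for row in sample_observations(toy_events, ["u1", "u2"], ["a1", "a2", "a3"],
                                   p_event=1.0, m_controls=2):
        print(row)
```

Each returned row pairs one observed event (the case) with its sampled controls; such strata are the kind of input a conditional-logit or Cox-type partial likelihood would take. Repeating the sampling with different seeds and refitting would give a rough picture of how much the estimates vary with `p_event` and `m_controls`.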

Keywords

relational event models; dynamic networks; large networks; parameter estimation; sampling; Wikipedia

Type: Research Article
Information: Network Science, Volume 8, Issue 1, March 2020, pp. 97–135

DOI: https://doi.org/10.1017/nws.2019.57
Creative Commons: This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.
Copyright: © Cambridge University Press 2019
