Outliers and Influential Observations in Exponential Random Graph Models

Johan Koskinen; Peng Wang; Garry Robins; Philippa Pattison

doi:10.1007/s11336-018-9635-8


Published online by Cambridge University Press: 01 January 2025


Affiliations:
Johan Koskinen*: University of Manchester; The University of Melbourne; University of Linköping
Peng Wang: Swinburne University of Technology
Garry Robins: The University of Melbourne
Philippa Pattison: The University of Sydney
* Correspondence should be made to Johan Koskinen, The Mitchell Centre for Social Network Analysis and the Department of Social Statistics, School of Social Sciences, University of Manchester, Manchester M13 9PL, UK. Email: [email protected]


Abstract

We discuss measuring and detecting influential observations and outliers in the context of exponential family random graph (ERG) models for social networks. We focus on the level of the nodes of the network and consider those nodes whose removal would result in changes to the model as extreme or “central” with respect to the structural features that “matter”. We construe removal in terms of two case-deletion strategies: the tie-variables of an actor are assumed to be unobserved, or the node is removed resulting in the induced subgraph. We define the difference in inferred model resulting from case deletion from the perspective of information theory and difference in estimates, in both the natural and mean-value parameterisation, representing varying degrees of approximation. We arrive at several measures of influence and propose the use of two that do not require refitting of the model and lend themselves to routine application in the ERGM fitting procedure. MCMC p values are obtained for testing how extreme each node is with respect to the network structure. The influence measures are applied to two well-known data sets to illustrate the information they provide. From a network perspective, the proposed statistics offer an indication of which actors are most distinctive in the network structure, in terms of not abiding by the structural norms present across other actors.
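The two case-deletion strategies named in the abstract can be illustrated on a plain adjacency matrix. The following is a minimal sketch, not the authors' implementation: the function names (`mask_node_ties`, `delete_node_induced`, `edge_count`, `mc_p_value`) are hypothetical, the network is assumed to be undirected, and the p value shown is a generic Monte Carlo tail proportion rather than the specific MCMC p value defined in the paper.

```python
import numpy as np


def mask_node_ties(adj, i):
    """Case deletion by missingness: every tie-variable involving node i
    is treated as unobserved (NaN); the node itself stays in the network."""
    masked = adj.astype(float).copy()
    masked[i, :] = np.nan
    masked[:, i] = np.nan
    return masked


def delete_node_induced(adj, i):
    """Case deletion by removal: drop node i and return the subgraph
    induced by the remaining nodes."""
    keep = np.delete(np.arange(adj.shape[0]), i)
    return adj[np.ix_(keep, keep)]


def edge_count(adj):
    """Number of observed edges in an undirected graph (NaN entries ignored)."""
    upper = adj[np.triu_indices(adj.shape[0], k=1)]
    return int(np.nansum(upper))


def mc_p_value(observed, simulated):
    """Generic Monte Carlo upper-tail p value with the usual +1 correction
    (an assumption here, not necessarily the paper's definition)."""
    simulated = np.asarray(simulated)
    return (1 + np.sum(simulated >= observed)) / (1 + simulated.size)


# Toy example: a 4-node star whose hub is node 0.
adj = np.array([
    [0, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
])

masked = mask_node_ties(adj, 0)         # 4 x 4, hub ties unobserved
induced = delete_node_induced(adj, 0)   # 3 x 3, hub removed entirely
```

In this toy star, deleting the hub under either strategy leaves no observed edges, which is the kind of node whose removal would visibly change any model fitted to the network. The two strategies differ in what is retained: masking keeps the node set at four with some tie-variables missing, while removal shrinks the network to three nodes.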

Keywords

statistical analysis of social networks; exponential random graph models; outliers; leverage; missing data principle; case deletion

Type: Original Paper
Information: Psychometrika, Volume 83, Issue 4, December 2018, pp. 809–830

DOI: https://doi.org/10.1007/s11336-018-9635-8
Copyright: Copyright © 2018 The Psychometric Society


Footnotes

Johan Koskinen would like to acknowledge financial support from the Leverhulme Trust Grant RPG-2013-140 and SRG2012.

