Hostname: page-component-cd9895bd7-8ctnn Total loading time: 0 Render date: 2024-12-26T20:05:19.433Z Has data issue: false hasContentIssue false

NetworKit: A tool suite for large-scale complex network analysis

Published online by Cambridge University Press:  28 December 2016

CHRISTIAN L. STAUDT
Affiliation:
Institute of Theoretical Informatics, Karlsruhe Institute of Technology (KIT), 76131 Karlsruhe, Germany (e-mail: [email protected])
ALEKSEJS SAZONOVS
Affiliation:
Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SA, UK (e-mail: [email protected])
HENNING MEYERHENKE
Affiliation:
Institute of Theoretical Informatics, Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany (e-mail: [email protected])
Rights & Permissions [Opens in a new window]

Abstract

Core share and HTML view are not available for this content. However, as you have access to this content, a full PDF is available via the ‘Save PDF’ action button.

We introduce NetworKit, an open-source software package for analyzing the structure of large complex networks. Appropriate algorithmic solutions are required to handle increasingly common large graph data sets containing up to billions of connections. We describe the methodology applied to develop scalable solutions to network analysis problems, including techniques like parallelization, heuristics for computationally expensive problems, efficient data structures, and modular software architecture. Our goal for the software is to package results of our algorithm engineering efforts and put them into the hands of domain experts. NetworKit is implemented as a hybrid combining the kernels written in C++ with a Python frontend, enabling integration into the Python ecosystem of tested tools for data analysis and scientific computing. The package provides a wide range of functionality (including common and novel analytics algorithms and graph generators) and does so via a convenient interface. In an experimental comparison with related software, NetworKit shows the best performance on a range of typical analysis tasks.

Type
Research Article
Copyright
Copyright © Cambridge University Press 2016 

References

Aiello, W., Chung, F., & Lu, L. (2000). A random graph model for massive graphs. In Proceedings of the 32nd Annual ACM Symposium on Theory of Computing, Portland, OR, USA. ACM, pp. 171–180.CrossRefGoogle Scholar
Alahakoon, T., Tripathi, R., Kourtellis, N., Simha, R., & Iamnitchi, A. (2011). K-path centrality: A new centrality measure in social networks. In Proceedings of the 4th Workshop on Social Network Systems, Salzburg, Austria. ACM, p. 1.CrossRefGoogle Scholar
Albert, R., & Barabási, A. (2002). Statistical mechanics of complex networks. Reviews of Modern Physics, 74 (1), 47.CrossRefGoogle Scholar
Bader, D. A., Meyerhenke, H., Sanders, P., Schulz, C., Kappes, A., & Wagner, D. (2014). Benchmarking for graph clustering and partitioning. In Encyclopedia of Social Network Analysis and Mining (pp. 7382). Springer.CrossRefGoogle Scholar
Bastian, M., Heymann, S., & Jacomy, M. (2009). Gephi: An open source software for exploring and manipulating networks. In Proceedings of the Third International Conference on Weblogs and Social Media, ICWSM 2009, San Diego, CA, USA, May 17–20, 2009.CrossRefGoogle Scholar
Batagelj, V., & Brandes, U. (2005). Efficient generation of large random networks. Physical Review E, 71 (3), 036113.CrossRefGoogle ScholarPubMed
Batagelj, V., & Mrvar, A. (2004). Pajek—analysis and visualization of large networks, Lecture Notes in Computer Science, vol. 2265 (pp 477478). Springer.Google Scholar
Batagelj, V., & Zaveršnik, M. (2011). Fast algorithms for determining (generalized) core groups in social networks. Advances in Data Analysis and Classification, 5 (2), 129145.CrossRefGoogle Scholar
Behnel, S., Bradshaw, R., Citro, C., Dalcin, L., Seljebotn, D. S., & Smith, K. (2011). Cython: The best of both worlds. Computing in Science & Engineering, 13 (2), 3139.CrossRefGoogle Scholar
Bergamini, E., & Meyerhenke, H. (2015). Fully-dynamic approximation of betweenness centrality. In Proceedings of the Algorithms - ESA 2015 - 23rd Annual European Symposium, Patras, Greece, September 14–16. Springer, pp. 155–166.CrossRefGoogle Scholar
Bergamini, E., Meyerhenke, H., & Staudt, C. (2015). Approximating betweenness centrality in large evolving networks. In Proceedings of the 17th Workshop on Algorithm Engineering and Experiments, ALENEX 2015. San Diego, CA, USA, January 5, 2015. SIAM, pp. 133–146.CrossRefGoogle Scholar
Blondel, V. D., Guillaume, J.-L., Lambiotte, R., & Lefebvre, E. (2008). Fast unfolding of communities in large networks. Journal of Statistical Mechanics: Theory and Experiment, 2008 (10), P10008.CrossRefGoogle Scholar
Boccaletti, S., Latora, V., Moreno, Y., Chavez, M., & Hwang, D.-U. (2006). Complex networks: Structure and dynamics. Physics Reports, 424 (4), 175308.CrossRefGoogle Scholar
Bode, M., Fountoulakis, N., & Müller, T. (2015). On the largest component of a hyperbolic model of complex networks. Electronic Journal of Combinatorics, 22 (3), P3.24.CrossRefGoogle Scholar
Boldi, P., Codenotti, B., Santini, M., & Vigna, S. (2004). Ubicrawler: A scalable fully distributed web crawler. Software: Practice & Experience, 34 (8), 711726.Google Scholar
Brandes, U. (2001). A faster algorithm for betweenness centrality. Journal of Mathematical Sociology, 25 (2), 163177.CrossRefGoogle Scholar
Brin, S., & Page, L. (2012). Reprint of: The anatomy of a large-scale hypertextual web search engine. Computer Networks, 56 (18), 38253833.CrossRefGoogle Scholar
CAIDA (2003). Caida skitter router-level topology measurements. Retrieved from http://www.caida.org/data/router-adjacencies/.Google Scholar
Chakrabarti, D., Zhan, Y., & Faloutsos, C. (2004). R-MAT: A recursive model for graph mining. In SDM (vol. 4, pp. 442446). Orlando, FL, USA: SIAM.Google Scholar
Costa, L. d. F., Oliveira, O. N. Jr, Travieso, G., Rodrigues, F. A., Villas Boas, P. R., Antiqueira, L., . . . Correa Rocha, L. E. (2011). Analyzing and modeling real-world phenomena with complex networks: A survey of applications. Advances in Physics, 60 (3), 329412.CrossRefGoogle Scholar
Crescenzi, P., Grossi, R., Habib, M., Lanzi, L., & Marino, A. (2013). On computing the diameter of real-world undirected graphs. Theoretical Computer Science, 514, 8495.CrossRefGoogle Scholar
Csardi, G., & Nepusz, T. (2006). The igraph software package for complex network research. InterJournal, Complex Systems, 1695 (5), 19.Google Scholar
Dasari, N. S., Ranjan, D., & Zubair, M. (2014). ParK: An efficient algorithm for k-core decomposition on multicore processors. In Lin, J., Pei, J., Hu, X., Chang, W., Nambiar, R., Aggarwal, C., Cercone, N., Honavar, V., Huan, J., Mobasher, B., & Pyne, S. (Eds.), IEEE International Conference on Big Data, Big Data 2014 (pp. 9–16). Washington, DC: IEEE, October 27–30, 2014.Google Scholar
Ediger, D., Jiang, K., Riedy, E. J., & Bader, D. A. (2013). GraphCT: Multithreaded algorithms for massive graph analysis. IEEE Transactions on Parallel and Distributed Systems, 24 (11), 22202229.Google Scholar
Ediger, D., McColl, R., Riedy, E. J., & Bader, D. A. (2012). STINGER: High performance data structure for streaming graphs. In IEEE Conference on High Performance Extreme Computing HPEC 2012, Waltham, MA, USA, September 10–12, 2012, pp. 1–5.CrossRefGoogle Scholar
Eppstein, D., & Wang, J. (2004). Fast approximation of centrality. Journal of Graph Algorithms Applications, 8, 3945.CrossRefGoogle Scholar
Esders, K. (2015). Link prediction in large-scale complex networks. Master's thesis, Karlsruhe Institute of Technology.Google Scholar
Flick, P. (2014). Analysis of human tissue-specific protein-protein interaction networks. Master's thesis, Karlsruhe Institute of Technology.Google Scholar
Freeman, L. C. (1979). Centrality in social networks conceptual clarification. Social Networks, 1 (3), 215239.CrossRefGoogle Scholar
Geisberger, R., Sanders, P., & Schultes, D. (2008). Better approximation of betweenness centrality. In Proceedings of the Tenth Workshop on Algorithm Engineering and Experiments (ALENEX), San Francisco, CA, USA. SIAM, pp. 90–100.CrossRefGoogle Scholar
Girvan, M., & Newman, M. (2002). Community structure in social and biological networks. Proceedings of the National Academy of Sciences, 99 (12), 7821.CrossRefGoogle ScholarPubMed
Gugelmann, L., Panagiotou, K., & Peter, U. (2012). Random hyperbolic graphs: Degree sequence and clustering - (extended abstract). In Automata, Languages, and Programming - 39th International Colloquium, ICALP 2012, Proceedings, Part II, pp. 573–585.Google Scholar
Hagberg, A., Swart, P., & S Chult, D. (2008). Exploring network structure, dynamics, and function using NetworkX. Technical report, Los Alamos National Laboratory (LANL).Google Scholar
Hakimi, S. L. (1962). On realizability of a set of integers as degrees of the vertices of a linear graph. I. Journal of the Society for Industrial & Applied Mathematics, 10 (3), 496506.CrossRefGoogle Scholar
Hamann, M., Lindner, G., Meyerhenke, H., Staudt, C. L., & Wagner, D. (2016). Structure-preserving sparsification methods for social networks. Social Network Analysis and Mining, 6 (1), 22:122:22.CrossRefGoogle Scholar
Katz, L. (1953). A new status index derived from sociometric analysis. Psychometrika, 18 (1), 3943.CrossRefGoogle Scholar
Kiwi, M. A., & Mitsche, D. (2015). A bound for the diameter of random hyperbolic graphs. In Proceedings of the Twelfth Workshop on Analytic Algorithmics and Combinatorics, ANALCO 2015, San Diego, CA, USA, January 4, 2015, pp. 26–39.CrossRefGoogle Scholar
Koch, J., Staudt, C. L., Vogel, M., & Meyerhenke, H. (2015). Complex network analysis on distributed systems: An empirical comparison. In Proceedings of the 2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, Paris, France. ACM, pp. 1169–1176.CrossRefGoogle Scholar
Krioukov, D., Papadopoulos, F., Kitsak, M., Vahdat, A., & Boguñá, M. (2010). Hyperbolic geometry of complex networks. Physical Review E, 82, 036106.CrossRefGoogle ScholarPubMed
Kunegis, J. (2013). KONECT: The koblenz network collection. In 22nd International World Wide Web Web Conference, WWW '13, Rio de Janeiro, Brazil, May 13–17, 2013, Companion Volume, pp. 1343–1350.CrossRefGoogle Scholar
Lambiotte, R. (2010). Multi-scale modularity in complex networks. In Proceedings of the Eighth International Symposium on Modeling and Optimization in Mobile, Ad-Hoc and Wireless Networks (WiOpt 2010), May 31–June 4, 2010, University of Avignon, Avignon, France, pp. 546–553.Google Scholar
Lancichinetti, A., & Fortunato, S. (2009). Benchmarks for testing community detection algorithms on directed and weighted graphs with overlapping communities. Physical Review E, 80 (1), 016118.CrossRefGoogle ScholarPubMed
Leskovec, J., & Krevl, A. (2014). SNAP Datasets: Stanford large network dataset collection. Retrieved from http://snap.stanford.edu/data.Google Scholar
Leskovec, J., & Sosič, R. (2014). SNAP: A general purpose network analysis and graph mining library in C++. Retrieved from http://snap.stanford.edu/snap.Google Scholar
Low, Y., Bickson, D., Gonzalez, J., Guestrin, C., Kyrola, A., & Hellerstein, J. M. (2012). Distributed graphlab: A framework for machine learning and data mining in the cloud. Proceedings of the VLDB Endowment, 5 (8), 716727.CrossRefGoogle Scholar
Lugowski, A., Alber, D.M., Buluç, A., Gilbert, J. R., Reinhardt, S.P., Teng, Y., & Waranis, A. (2012). A flexible open-source toolbox for scalable complex graph analysis. In Proceedings of the Twelfth SIAM International Conference on Data Mining, Anaheim, CA, USA, April 26–28, 2012, pp. 930–941.CrossRefGoogle Scholar
Newman, M. (2010). Networks: An introduction. New York, NY, USA: Oxford University Press.CrossRefGoogle Scholar
O'Madadhain, J., Fisher, D., White, S., & Boey, Y. (2003). The JUNG (java universal network/graph) framework. Irvine, California: University of California.Google Scholar
Erdős, P., & Rényi, A. (1960). On the evolution of random graphs. Publication of the Mathematical Institute of the Hungarian Academy of Sciences, 5.Google Scholar
Peixoto, T. P. (2006). graph-tool. Retrieved from http://graph-tool.skewed.de.Google Scholar
Perez, F., Granger, B. E., & Obispo, C. (2013). An open source framework for interactive, collaborative and reproducible scientific computing and education.Google Scholar
Raghavan, U. N., Albert, R., & Kumara, S. (2007). Near linear time algorithm to detect community structures in large-scale networks. Physical Review E, 76 (3), 036106.CrossRefGoogle ScholarPubMed
Riondato, M., & Kornaropoulos, E. M. (2016). Fast approximation of betweenness centrality through sampling. Data Mining and Knowledge Discovery, 30 (2), 438475.CrossRefGoogle Scholar
Schank, T., & Wagner, D. (2005). Approximating clustering coefficient and transitivity. Journal of Graph Algorithms and Applications, 9 (2), 265275.CrossRefGoogle Scholar
Shun, J., & Blelloch, G. E. (2013). Ligra: A lightweight graph processing framework for shared memory. In ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP '13, Shenzhen, China, February 23–27, 2013. ACM, pp. 135146.CrossRefGoogle Scholar
Sporns, O., & Betzel, R. F. (2016). Modular brain networks. Annual Review of Psychology, 67, 613640.CrossRefGoogle ScholarPubMed
Staudt, C. L., & Meyerhenke, H. (2016). Engineering parallel algorithms for community detection in massive networks. IEEE Transactions on Parallel and Distributed Systems, 27 (1), 171184.CrossRefGoogle Scholar
Traud, A. L., Mucha, P. J., & Porter, M. A. (2012). Social structure of facebook networks. Physica A: Statistical Mechanics and its Applications, 391 (16), 41654180.CrossRefGoogle Scholar
von Looz, M., Meyerhenke, H., & Prutkin, R. (2015). Generating random hyperbolic graphs in subquadratic time. In Proceedings of 26th International Symposium on Algorithms and Computation (ISAAC 2015), LNCS, Nagoya, Japan. Springer.Google Scholar
Zafarani, R., & Liu, H. (2009). Social computing data repository at ASU. Retrieved from http://socialcomputing.asu.edu.Google Scholar