
Randomized optimal transport on a graph: framework and new distance measures

Published online by Cambridge University Press:  25 April 2019

Guillaume Guex*
Affiliation:
ICTEAM, Universite catholique de Louvain, Louvain-la-Neuve, Belgium (email: [email protected])
Ilkka Kivimäki
Affiliation:
Department of Computer Science, Aalto University, Finland & ICTEAM, Universite catholique de Louvain, Belgium (email: [email protected])
Marco Saerens
Affiliation:
ICTEAM, Universite catholique de Louvain & Universite Libre de Bruxelles, Belgium (email: [email protected])
*Corresponding author. Email: [email protected]

Abstract

The recently developed bag-of-paths (BoP) framework defines a Gibbs–Boltzmann distribution on all feasible paths of a graph. This probability distribution favors short paths over long ones, with a free parameter (the temperature T) controlling the entropic level of the distribution. This formalism enables the computation of new distances or dissimilarities, interpolating between the shortest-path and the resistance distance, which have been shown to perform well in clustering and classification tasks. In this work, the bag-of-paths formalism is extended by adding two independent equality constraints fixing the distributions of the starting and ending nodes of paths (the margins). When the temperature is low, this formalism is shown to be equivalent to a relaxation of the optimal transport problem on a network, where paths carry a flow between two discrete distributions on nodes. The randomization is achieved by considering free energy minimization instead of traditional cost minimization. Algorithms computing the optimal free energy solution are developed for two types of paths: hitting (or absorbing) paths and non-hitting (regular) paths. Both require the inversion of an n × n matrix, where n is the number of nodes. Interestingly, for regular paths on an undirected graph, the resulting optimal policy interpolates between the deterministic optimal transport policy (T → 0+) and the solution to the corresponding electrical circuit (T → ∞). Two distance measures between nodes and a dissimilarity between groups of nodes, both integrating weights on nodes, are derived from this framework.
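The two ingredients named in the abstract, an n × n matrix inversion and fixed margins on starting and ending nodes, can be illustrated with a small numerical sketch. It is not the authors' algorithm. It assumes the construction commonly used in the bag-of-paths literature, W = P_ref ∘ exp(−C/T) with P_ref the transition matrix of a natural random walk, and it imposes the margins with a generic iterative proportional fitting (Sinkhorn-style) rescaling chosen here for illustration. The function names `fundamental_matrix` and `fit_margins`, the toy graph, and the margin vectors are all hypothetical.

```python
import numpy as np


def fundamental_matrix(A, C, T):
    """Return Z = (I - W)^{-1}, where W = P_ref * exp(-C / T) elementwise."""
    A = np.asarray(A, dtype=float)
    C = np.asarray(C, dtype=float)
    # Reference transition probabilities of a natural random walk
    # (an assumed choice, not stated in the abstract).
    P_ref = A / A.sum(axis=1, keepdims=True)
    # Costly edges are damped more strongly as the temperature T decreases.
    W = P_ref * np.exp(-C / T)
    # The n x n matrix inversion mentioned in the abstract.
    return np.linalg.inv(np.eye(A.shape[0]) - W)


def fit_margins(Z, sigma_in, sigma_out, n_iter=1000):
    """Rescale rows and columns of Z until its margins match the targets.

    Generic iterative proportional fitting, used here only as an illustration.
    """
    sigma_in = np.asarray(sigma_in, dtype=float)
    sigma_out = np.asarray(sigma_out, dtype=float)
    u = np.ones_like(sigma_in)
    v = np.ones_like(sigma_out)
    for _ in range(n_iter):
        u = sigma_in / (Z @ v)     # match the starting-node margin
        v = sigma_out / (Z.T @ u)  # match the ending-node margin
    return u[:, None] * Z * v[None, :]


# Toy example: an undirected path graph on 4 nodes with unit edge costs.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
C = A.copy()
Z = fundamental_matrix(A, C, T=1.0)

sigma_in = np.array([0.4, 0.3, 0.2, 0.1])   # hypothetical starting distribution
sigma_out = np.array([0.1, 0.2, 0.3, 0.4])  # hypothetical ending distribution
coupling = fit_margins(Z, sigma_in, sigma_out)
```

In this sketch, `coupling[i, j]` can be read as the probability mass carried by paths starting in node i and ending in node j. Its row sums approximate `sigma_in` and its column sums match `sigma_out`. Lowering `T` concentrates the mass on low-cost paths, while raising it moves the result toward random-walk behavior.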

Type
Original Article
Copyright
© Cambridge University Press 2019 

