Evaluating the learning and performance characteristics of self-organizing systems with different task features

Hao Ji; Yan Jin

doi:10.1017/S089006042100024X

Published online by Cambridge University Press: 27 December 2021

Hao Ji: Affiliation:
Department of Aerospace and Mechanical Engineering, University of Southern California, 3650 McClintock Avenue, OHE 400, Los Angeles, CA 90089-1453, USA
Yan Jin*: Affiliation:
Department of Aerospace and Mechanical Engineering, University of Southern California, 3650 McClintock Avenue, OHE 400, Los Angeles, CA 90089-1453, USA
*: Author for correspondence: Yan Jin, E-mail: [email protected]

Abstract

Self-organizing systems (SOS) are developed to perform complex tasks adaptively in unforeseen situations. Predefining rules for self-organizing agents can be challenging, especially for highly complex tasks and changing environments. Our previous work introduced a multiagent reinforcement learning (RL) model as a design approach to the rule generation problem of SOS, and a deep multiagent RL algorithm was devised to train agents to acquire task and self-organizing knowledge. However, the simulation was based on one specific task environment. The sensitivity of SOS to reward functions and the systematic evaluation of SOS designed with multiagent RL remain open issues. In this paper, we introduce a rotation reward function to regulate agent behaviors during training and test the effect of different weights of this reward on SOS performance in two case studies: box-pushing and T-shape assembly. Additionally, we propose three metrics to evaluate the SOS: learning stability, quality of learned knowledge, and scalability. Results show that, depending on the type of task, designers may choose appropriate weights of the rotation reward to obtain the full potential of agents’ learning capability. Good learning stability and quality of knowledge can be achieved within an optimal range of team sizes. Scaling up to larger team sizes yields better performance than scaling down.
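
The abstract does not state how the rotation reward enters the training signal. As a minimal sketch, assuming it is added to a task reward as a weighted term, the weight sweep it describes could look like the following. The names `RewardTerms`, `combined_reward`, and `sweep_rotation_weights`, the additive form, and the numeric values are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass
class RewardTerms:
    """Hypothetical per-step reward components (illustrative names only)."""
    task: float      # assumed: reward for progress on the task itself
    rotation: float  # assumed: reward (or penalty) tied to object rotation


def combined_reward(terms: RewardTerms, rotation_weight: float) -> float:
    """Assumed additive form: task reward plus a weighted rotation term."""
    return terms.task + rotation_weight * terms.rotation


def sweep_rotation_weights(
    terms: RewardTerms, weights: Iterable[float]
) -> Dict[float, float]:
    """Evaluate the combined reward for each candidate rotation weight."""
    return {w: combined_reward(terms, w) for w in weights}


if __name__ == "__main__":
    # Made-up values: a step that makes progress but rotates the object.
    step = RewardTerms(task=1.0, rotation=-0.5)
    for weight, reward in sweep_rotation_weights(step, [0.0, 1.0, 2.0]).items():
        print(f"rotation_weight={weight}: reward={reward}")
```

Under this assumed form, a weight of zero recovers the task reward alone, which gives a baseline against which other weights can be compared for each task type.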

Keywords

Complex system; deep Q-learning; robustness; scalability; self-organizing system

Type: Research Article
Information: AI EDAM, Volume 35, Issue 4, November 2021, pp. 404–422

DOI: https://doi.org/10.1017/S089006042100024X
Copyright: Copyright © The Author(s), 2021. Published by Cambridge University Press
