
Domain adaptation-based transfer learning using adversarial networks

Published online by Cambridge University Press: 26 February 2020

Farzaneh Shoeleh
Affiliation:
University of New Brunswick, Fredericton, New Brunswick, Canada e-mails: [email protected], [email protected]
Mohammad Mehdi Yadollahi
Affiliation:
University of New Brunswick, Fredericton, New Brunswick, Canada e-mails: [email protected], [email protected]
Masoud Asadpour
Affiliation:
University of Tehran, Tehran, Iran e-mail: [email protected]

Abstract

Machine learning techniques commonly make the implicit assumption that each new task bears no relation to the tasks learned before, so tasks are often addressed independently. However, in some domains, particularly reinforcement learning (RL), this assumption is frequently incorrect: tasks in the same or a similar domain tend to be related. Even when tasks differ in their specifics, they may share general similarities, such as common skills, that make them related. In this paper, a novel domain adaptation-based method using adversarial networks is proposed to perform transfer learning in RL problems. The proposed method incorporates skills previously learned in a source task to speed up learning on a new target task, providing generalization not only within a task but also across different but related tasks. Experimental results demonstrate the effectiveness of the method on RL problems.
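
To make the adversarial ingredient of the abstract concrete, the sketch below aligns state features drawn from a source and a target task with a domain-adversarial network in the style of Ganin and Lempitsky's gradient-reversal approach. This is a minimal illustrative sketch, not the authors' method: the PyTorch framework choice, the GradReverse helper, the network sizes, the 4-dimensional toy states, and the training loop are all assumptions made for illustration.

    # Minimal sketch of domain-adversarial feature alignment between states
    # sampled from a source RL task and a target RL task. All names, sizes,
    # and data below are illustrative assumptions, not the paper's code.
    import torch
    import torch.nn as nn

    class GradReverse(torch.autograd.Function):
        """Identity on the forward pass; reverses and scales gradients on backward."""
        @staticmethod
        def forward(ctx, x, lam):
            ctx.lam = lam
            return x.view_as(x)

        @staticmethod
        def backward(ctx, grad_output):
            # Negate the gradient so the feature extractor is trained to
            # *confuse* the domain classifier (adversarial objective).
            return -ctx.lam * grad_output, None

    # Shared feature extractor and domain classifier (sizes are arbitrary).
    feature_net = nn.Sequential(nn.Linear(4, 64), nn.ReLU(), nn.Linear(64, 32))
    domain_clf = nn.Sequential(nn.Linear(32, 16), nn.ReLU(), nn.Linear(16, 1))
    opt = torch.optim.Adam(
        list(feature_net.parameters()) + list(domain_clf.parameters()), lr=1e-3
    )
    bce = nn.BCEWithLogitsLoss()

    # Toy stand-ins for 4-dimensional states from the two tasks; the target
    # distribution is shifted to mimic a related but different task.
    source_states = torch.randn(128, 4)
    target_states = torch.randn(128, 4) + 0.5

    for step in range(100):
        states = torch.cat([source_states, target_states])
        domains = torch.cat([torch.zeros(128, 1), torch.ones(128, 1)])  # 0=source, 1=target
        feats = feature_net(states)
        # Gradient reversal makes feature_net maximize the domain loss,
        # pushing source and target features toward a shared representation.
        logits = domain_clf(GradReverse.apply(feats, 1.0))
        loss = bce(logits, domains)
        opt.zero_grad()
        loss.backward()
        opt.step()

Once the two tasks share a feature space in this fashion, skills (policies or options) learned on the source task can, in principle, be reused on the target task; how the paper combines this alignment with skill transfer is detailed in the full text.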

Type
Research Article
Copyright
© Cambridge University Press, 2020

