Learn multi-step object sorting tasks through deep reinforcement learning

Jiatong Bao; Guoqing Zhang; Yi Peng; Zhiyu Shao; Aiguo Song

doi:10.1017/S0263574722000650

Published online by Cambridge University Press: 06 May 2022

Jiatong Bao*: Affiliation:
School of Electrical, Energy and Power Engineering, Yangzhou University, Yangzhou 225000, China; School of Instrument Science and Engineering, Southeast University, Nanjing 210000, China
Guoqing Zhang: Affiliation:
School of Electrical, Energy and Power Engineering, Yangzhou University, Yangzhou 225000, China
Yi Peng: Affiliation:
School of Electrical, Energy and Power Engineering, Yangzhou University, Yangzhou 225000, China
Zhiyu Shao: Affiliation:
School of Electrical, Energy and Power Engineering, Yangzhou University, Yangzhou 225000, China
Aiguo Song: Affiliation:
School of Instrument Science and Engineering, Southeast University, Nanjing 210000, China
*Corresponding author. E-mail: [email protected]

Abstract

Robotic systems in manufacturing are usually controlled to perform specific actions repetitively. Traditional control methods are domain-dependent and model-dependent, and they require considerable human effort. They cannot meet the new requirements for generality and flexibility in areas such as intelligent manufacturing and customized production. This paper develops a general model-free approach that enables robots to perform multi-step object sorting tasks through deep reinforcement learning. Taking projected heightmap images from different time steps as input, without extra high-level image analysis or understanding, critic models are designed to produce a pixel-wise Q value map for each type of action. This work is a new attempt to apply pixel-wise Q value-based critic networks to multi-step sorting tasks that involve many types of actions and complex action constraints. Experimental validation on simulated and realistic object sorting tasks demonstrates the effectiveness of the proposed approach. Qualitative results (videos), code for the simulated and realistic experiments, and pre-trained models are available at https://github.com/JiatongBao/DRLSorting
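
As an illustration only, the sketch below shows one common way pixel-wise Q value maps can be turned into an action: take the maximum Q value over all pixels of all admissible action types. It is not taken from the paper or its repository. The function name `select_action`, the action names "grasp" and "place", the `allowed` mask used to stand in for action constraints, and the toy Q values are all assumptions made for this example.

```python
from typing import Dict, List, Optional, Tuple

# One Q value per heightmap pixel (rows x columns); toy stand-in for a critic output.
QMap = List[List[float]]


def select_action(
    q_maps: Dict[str, QMap],
    allowed: Optional[Dict[str, bool]] = None,
) -> Tuple[str, int, int, float]:
    """Return (action_type, row, col, q_value) for the highest Q value.

    `allowed` is a hypothetical encoding of action constraints:
    action types mapped to False are skipped entirely.
    """
    best: Optional[Tuple[str, int, int, float]] = None
    for action, q_map in q_maps.items():
        if allowed is not None and not allowed.get(action, True):
            continue  # this action type is not admissible in the current state
        for r, row in enumerate(q_map):
            for c, q in enumerate(row):
                if best is None or q > best[3]:
                    best = (action, r, c, q)
    if best is None:
        raise ValueError("no admissible action")
    return best


# Toy 2x2 Q maps for two assumed action types.
q_maps = {
    "grasp": [[0.1, 0.7], [0.2, 0.3]],
    "place": [[0.9, 0.0], [0.4, 0.5]],
}

# Assumed constraint: nothing is held yet, so "place" is masked out.
choice = select_action(q_maps, allowed={"grasp": True, "place": False})
```

In this toy setting the chosen pixel would mark where the robot acts and the chosen map would decide which action it performs. How the paper actually encodes constraints and orientations is not stated in the abstract.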

Keywords

object sorting; deep reinforcement learning; vision-based robotic manipulation

Type: Research Article
Information: Robotica, Volume 40, Issue 11, November 2022, pp. 3878-3894

DOI: https://doi.org/10.1017/S0263574722000650
Copyright: © The Author(s), 2022. Published by Cambridge University Press
