Improvement of Deep Reinforcement Learning Using Curriculum in Game Environment

Document Type: Original Articles

Authors

1 PhD Student, Department of Artificial Intelligence, School of Electrical and Computer Engineering, University of Semnan, Semnan, Iran

2 Assistant Professor, Department of Software Engineering, School of Electrical and Computer Engineering, University of Semnan, Semnan, Iran

3 Assistant Professor, Department of Software Engineering, School of Electrical and Computer Engineering, University of Semnan, Semnan, Iran

10.22122/jrrs.v15i1.3446

Abstract

Introduction: Deep curriculum learning is an approach to training intelligent agents in which the agent first learns simple tasks and then progressively more difficult ones. In this study, we propose a new framework for applying deep curriculum learning to defense-based games, in particular Dragon Cave.

Materials and Methods: A deep reinforcement learning approach combined with curriculum learning was used to train an intelligent agent in the game Dragon Cave. The curriculum started with simple tasks and gradually moved to harder ones. Using Proximal Policy Optimization, intelligent agents were trained in two settings: once in an environment with curriculum learning and once in an environment without it. The trained agents then played the game in the same environment.

Results: An improvement in the agent's performance was observed with deep curriculum reinforcement learning.

Conclusion: It seems that deep curriculum reinforcement learning increases the speed and quality of intelligent agent training in the complex environments of strategy games.
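The easy-to-hard task ordering described under Materials and Methods can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class name `CurriculumScheduler`, the level names, the reward threshold, and the averaging window are all hypothetical, and the rule "advance when the recent mean reward reaches a threshold" is an assumed promotion criterion that the abstract does not specify. The policy update itself (Proximal Policy Optimization in the study) is left out; only the task-scheduling logic is shown.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class CurriculumScheduler:
    """Step through tasks ordered from easy to hard (hypothetical helper)."""

    levels: list          # task identifiers, easiest first (assumed names)
    threshold: float      # mean reward required to advance (assumed criterion)
    window: int = 5       # number of recent episodes that are averaged
    index: int = 0        # position of the current task in `levels`
    _recent: deque = field(default_factory=deque)

    @property
    def current(self):
        """Return the task the agent should train on next."""
        return self.levels[self.index]

    def report(self, episode_reward: float) -> None:
        """Record one episode reward and advance once the agent is ready."""
        self._recent.append(episode_reward)
        if len(self._recent) > self.window:
            self._recent.popleft()

        window_full = len(self._recent) == self.window
        mean_reward = sum(self._recent) / len(self._recent)
        on_last_level = self.index == len(self.levels) - 1

        if window_full and mean_reward >= self.threshold and not on_last_level:
            self.index += 1
            self._recent.clear()  # restart the statistics on the harder task


if __name__ == "__main__":
    scheduler = CurriculumScheduler(
        levels=["easy", "medium", "hard"], threshold=0.8, window=3
    )
    # In a real training loop these rewards would come from agent rollouts.
    for reward in [0.9, 0.9, 0.9]:
        scheduler.report(reward)
    print(scheduler.current)
```

A baseline without curriculum learning would correspond to training on the final level only, which allows the two agents to be compared in the same environment afterwards.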
