Free English article download: XCS with opponent modelling for concurrent reinforcement learners - 2020
  • XCS with opponent modelling for concurrent reinforcement learners

    Publication year:

    2020


    English title:

    XCS with opponent modelling for concurrent reinforcement learners


    Persian translation of the title:

    XCS با مدل سازی حریف برای یادگیرنده تقویتی همزمان


    Source:

    Sciencedirect - Elsevier - Neurocomputing, 399 (2020) 449-466. doi:10.1016/j.neucom.2020.02.118


    Authors:

    Hao Chen, Chang Wang, Jian Huang*, Jiangtao Kong, Hanqiang Deng


    English abstract:

    Reinforcement learning (RL) of optimal policies against an opponent agent that also has learning capability is still challenging in Markov games. A variety of algorithms have been proposed for solving this problem, such as the traditional Q-learning-based RL (QbRL) algorithms as well as the state-of-the-art neural-network-based RL (NNbRL) algorithms. However, the QbRL approaches have poor generalization capability for complex problems with non-stationary opponents, while the policies learned by NNbRL algorithms lack explainability and transparency. In this paper, we propose an algorithm X-OMQ(λ) that integrates the eXtended Classifier System (XCS) with opponent modelling for concurrent reinforcement learners in zero-sum Markov games. The algorithm can learn general, accurate, and interpretable action selection rules and allows policy optimization using the genetic algorithm (GA). Besides, the X-OMQ(λ) agent optimizes the established opponent model while simultaneously learning to select actions in a goal-directed manner. In addition, we use the eligibility trace mechanism to further speed up the learning process. In the reinforcement component, not only the classifiers in the action set are updated, but other relevant classifiers are also updated in a certain proportion. We demonstrate the performance of the proposed algorithm on the hunter-prey problem and two adversarial soccer scenarios where the opponent is allowed to learn with several benchmark QbRL and NNbRL algorithms. The results show that our method achieves learning performance similar to the NNbRL algorithms while requiring no prior knowledge of the opponent or the environment. Moreover, the learned action selection rules are interpretable while retaining generalization capability.
    Keywords: Opponent modelling | XCS | Markov games | Reinforcement learning
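
    A minimal sketch of the opponent-modelling idea the abstract describes: in a zero-sum Markov game, the agent keeps an empirical model of the opponent's per-state action frequencies and backs up its values against the expected opponent behaviour. This is not the authors' X-OMQ(λ) implementation (which replaces the table below with XCS classifiers refined by a GA and sped up with eligibility traces); all class and method names here are hypothetical and shown for illustration only.

```python
# Hypothetical sketch of opponent-modelling Q-learning for a zero-sum Markov game.
# Not the authors' X-OMQ(lambda) code; names and structure are assumptions.
from collections import defaultdict
import random

class OpponentModellingQ:
    def __init__(self, actions, opp_actions, alpha=0.1, gamma=0.95, epsilon=0.1):
        self.actions = actions              # agent's own action set
        self.opp_actions = opp_actions      # opponent's action set
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.q = defaultdict(float)         # Q[(state, own_action, opp_action)]
        self.opp_counts = defaultdict(lambda: defaultdict(int))  # opponent action counts per state

    def opp_probs(self, state):
        # Empirical distribution over the opponent's actions in this state
        # (uniform before any observation has been made).
        counts = self.opp_counts[state]
        total = sum(counts.values())
        if total == 0:
            return {o: 1.0 / len(self.opp_actions) for o in self.opp_actions}
        return {o: counts[o] / total for o in self.opp_actions}

    def value(self, state):
        # Value of the best own action against the modelled opponent behaviour.
        probs = self.opp_probs(state)
        return max(
            sum(probs[o] * self.q[(state, a, o)] for o in self.opp_actions)
            for a in self.actions
        )

    def select_action(self, state):
        # Epsilon-greedy selection against the expected opponent behaviour.
        if random.random() < self.epsilon:
            return random.choice(self.actions)
        probs = self.opp_probs(state)
        return max(
            self.actions,
            key=lambda a: sum(probs[o] * self.q[(state, a, o)] for o in self.opp_actions),
        )

    def update(self, state, action, opp_action, reward, next_state):
        # Record the observed opponent action, then do a one-step Q backup.
        self.opp_counts[state][opp_action] += 1
        target = reward + self.gamma * self.value(next_state)
        key = (state, action, opp_action)
        self.q[key] += self.alpha * (target - self.q[key])
```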


    Level: Intermediate
    Number of pages (English PDF): 18
    File size: 4553 KB

    Price: Free


    Additional notes:




