Free English paper download: Advising reinforcement learning toward scaling agents in continuous control environments with sparse rewards - 2020
  • Advising reinforcement learning toward scaling agents in continuous control environments with sparse rewards

    Year of publication:

    2020


    English title of the paper:

    Advising reinforcement learning toward scaling agents in continuous control environments with sparse rewards


    Persian translation of the paper title:

    توصیه به یادگیری تقویتی به سمت عوامل مقیاس گذاری در محیط کنترل مداوم با جوایز ناچیز


    Source:

    Sciencedirect - Elsevier - Engineering Applications of Artificial Intelligence, 90 (2020) 103515. doi:10.1016/j.engappai.2020.103515


    Authors:

    Hailin Ren, Pinhas Ben-Tzvi


    English abstract:

    This paper extends the successful teacher–student framework for reinforcement learning to continuous control environments with sparse rewards. Furthermore, the proposed advising framework is designed for the scaling agents problem, wherein the student policy is trained to control multiple agents while the teacher policy is well trained for a single agent. Existing research on teacher–student frameworks has focused on discrete control domains. Moreover, it relies on similar target and source environments and therefore does not allow for scaling the agents. In contrast, in this work the agents face a scaling agents problem in which the value functions of the source and target tasks converge at different rates. Existing concepts from the teacher–student framework, including early advising, importance of advising, and mistake correction, are adapted to meet the new challenges, with a modified heuristic used to decide when to teach. The performance of the proposed algorithm was evaluated in a case study of pushing, and picking and placing, objects with a dual-arm manipulation system. The teacher policy was trained in a simulated scenario consisting of a single arm. The student policy was trained to handle the dual-arm manipulation system in simulation under the advice of the teacher agent. The trained student policy was then validated experimentally using two Quanser Mico arms. The effects of varying parameters on student performance in the advising framework were also analyzed and discussed. The results showed that the proposed advising framework expedited the training process and achieved the desired scaling within a limited advising budget.
    Keywords: Reinforcement learning | Advising framework | Continuous control | Sparse reward | Multi-agent


    Level: Intermediate
    Number of pages in the English PDF file: 12
    File size: 2282 KB

    Price: Free

