English title of the article:
Reinforcement learning for quadrupedal locomotion with design of continual–hierarchical curriculum
Article title translated from Persian:
Reinforcement learning for four-legged locomotion with design of a continual-hierarchical curriculum
ScienceDirect - Elsevier - Engineering Applications of Artificial Intelligence, 95 (2020) 103869. doi:10.1016/j.engappai.2020.103869
Taisuke Kobayashi ∗, Toshiki Sugino
End-to-end reinforcement learning is a promising approach to enable robots to acquire complicated skills.
However, it requires numerous samples to succeed, and collecting a sufficient number of samples is often
difficult. To accelerate learning in the field of robotics, knowledge gathered
from robotics engineering and previously learned tasks must be fully exploited. Specifically, we propose using
a sample-efficient curriculum to establish quadrupedal robot control in which the walking and turning tasks
are divided into two hierarchical layers, and a robot learns them incrementally from lower to upper layers. To
develop such a curriculum, two core components are designed. The first is a fractal design of neural networks in
reservoir computing, which allocates the tasks to be learned to respective modules of the fractal networks.
This mitigates the problem of catastrophic forgetting in neural networks and achieves the capability of
continual learning. The second is hierarchical task decomposition according to robotics knowledge
for controlling legged robots. Owing to the combination of these two components, the proposed curriculum
enables a robot to tune the lower layer even when the upper layer is optimized. With the proposed design
implemented, we confirm that a quadrupedal robot in a dynamical simulator succeeds in learning skills
hierarchically according to the given curriculum, starting from moving its legs and ending with walking and
turning, whereas the conventional curriculums considered fail to achieve such results.
Keywords: Continual learning | Curriculum learning | Hierarchical learning | Reservoir computing | Fractal network
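
The idea of allocating each task to its own module, so that learning an upper-layer task does not overwrite what a lower layer has learned, can be illustrated with a toy sketch. This is not the authors' implementation. It assumes two plain echo-state reservoirs stacked in sequence (not a fractal network), supervised ridge-regression readouts instead of reinforcement learning, and made-up sinusoidal signals in place of robot data. It also freezes the lower module during the second stage, whereas the abstract states that the proposed curriculum still allows the lower layer to be tuned. All function and variable names are hypothetical.

```python
import numpy as np

def make_reservoir(n_in, n_res, rng, spectral_radius=0.9):
    """Create fixed random input and recurrent weights for one reservoir module."""
    w_in = rng.uniform(-0.5, 0.5, size=(n_res, n_in))
    w = rng.uniform(-0.5, 0.5, size=(n_res, n_res))
    # Rescale so the largest eigenvalue magnitude equals spectral_radius.
    w *= spectral_radius / np.max(np.abs(np.linalg.eigvals(w)))
    return w_in, w

def run_reservoir(w_in, w, inputs):
    """Drive the reservoir with a (T, n_in) sequence; return (T, n_res) states."""
    state = np.zeros(w.shape[0])
    states = []
    for u in inputs:
        state = np.tanh(w_in @ u + w @ state)
        states.append(state)
    return np.array(states)

def fit_readout(states, targets, ridge=1e-4):
    """Ridge-regression readout; these are the only trained weights."""
    a = states.T @ states + ridge * np.eye(states.shape[1])
    return np.linalg.solve(a, states.T @ targets)

rng = np.random.default_rng(0)
t = np.linspace(0, 8 * np.pi, 400)
command = np.sin(t)[:, None]           # toy input signal (assumption)
lower_target = np.cos(t)[:, None]      # toy "lower-layer" target (assumption)
upper_target = np.sin(2 * t)[:, None]  # toy "upper-layer" target (assumption)

# Stage 1: train only the lower module's readout.
lower_in, lower_w = make_reservoir(1, 50, rng)
lower_states = run_reservoir(lower_in, lower_w, command)
lower_readout = fit_readout(lower_states, lower_target)
lower_readout_before = lower_readout.copy()

# Stage 2: train the upper module, which is driven by the lower module's states.
# The lower module is left untouched, so its learned behavior cannot be overwritten.
upper_in, upper_w = make_reservoir(50, 50, rng)
upper_states = run_reservoir(upper_in, upper_w, lower_states)
upper_readout = fit_readout(upper_states, upper_target)

lower_error = np.mean((lower_states @ lower_readout - lower_target) ** 2)
upper_error = np.mean((upper_states @ upper_readout - upper_target) ** 2)
print("lower readout unchanged:", np.array_equal(lower_readout, lower_readout_before))
```

Because each stage writes only to its own readout weights, the second stage has no path by which to degrade the first task. That is the property this sketch is meant to show; it makes no claim about sample efficiency or locomotion performance.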