Download and view articles related to reinforcement learning :: Page 1
Download the best ISI articles with Persian translation
Search results - reinforcement learning

Number of articles found: 5
No. Title Type
1 A self-learning genetic algorithm based on reinforcement learning for flexible job-shop scheduling problem
Publication year: 2020
As an important branch of production scheduling, the flexible job-shop scheduling problem (FJSP) is difficult to solve and has been proven to be NP-hard. Many intelligent algorithms have been proposed to solve FJSP, but their key parameters cannot be adjusted dynamically and effectively during the calculation process, so solution efficiency and quality fail to meet production requirements. Therefore, a self-learning genetic algorithm (SLGA) is proposed in this paper, in which a genetic algorithm (GA) is adopted as the basic optimization method and its key parameters are intelligently adjusted based on reinforcement learning (RL). Firstly, the self-learning model is analyzed and constructed in SLGA: the SARSA algorithm and the Q-Learning algorithm are applied as the learning methods at the initial and later stages of optimization, respectively, and the conversion condition is designed. Secondly, the state determination method and reward method are designed for RL in the GA environment. Finally, the learning effect and performance of SLGA in solving FJSP are compared with other algorithms using two groups of benchmark data instances with different scales. Experimental results show that the proposed SLGA significantly outperforms its competitors in solving FJSP.
Keywords: Flexible job-shop scheduling problem (FJSP) | Self-learning genetic algorithm (SLGA) | Genetic algorithm (GA) | Reinforcement learning (RL)
English article
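The abstract above describes using SARSA in the early stage of optimization and Q-Learning later. A minimal tabular sketch of that idea follows. It is an illustration under assumptions, not the paper's implementation: the names `hybrid_update`, `switch_step`, and `ACTIONS` are hypothetical, and the fixed step threshold stands in for the paper's conversion condition, which the abstract does not specify.

```python
from collections import defaultdict


def sarsa_update(q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """On-policy: bootstrap from the action actually selected in s_next."""
    target = r + gamma * q[(s_next, a_next)]
    q[(s, a)] += alpha * (target - q[(s, a)])


def q_learning_update(q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    """Off-policy: bootstrap from the best-valued action in s_next."""
    target = r + gamma * max(q[(s_next, b)] for b in actions)
    q[(s, a)] += alpha * (target - q[(s, a)])


def hybrid_update(q, step, switch_step, s, a, r, s_next, a_next, actions):
    """Use SARSA before `switch_step` and Q-learning from then on.

    `switch_step` is a hypothetical stand-in for the paper's conversion
    condition, which the abstract does not describe.
    """
    if step < switch_step:
        sarsa_update(q, s, a, r, s_next, a_next)
    else:
        q_learning_update(q, s, a, r, s_next, actions)


# Hypothetical actions: candidate (crossover rate, mutation rate) settings.
ACTIONS = [(0.6, 0.01), (0.8, 0.05), (0.9, 0.10)]
q_table = defaultdict(float)  # maps (state, action) -> estimated value
```

In a GA loop, each generation would map the population to a state, pick one of `ACTIONS` (for example epsilon-greedily), run the generation with those rates, and call `hybrid_update` with a reward derived from the fitness change.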
2 An adaptive deep reinforcement learning approach for MIMO PID control of mobile robots
Publication year: 2020
Intelligent control systems are being developed for the control of plants with complex dynamics. However, the simplicity of the PID (proportional–integral–derivative) controller keeps it widely used in industrial applications and robotics. This paper proposes an intelligent control system based on a deep reinforcement learning approach for self-adaptive multiple PID controllers for mobile robots. The proposed hybrid control strategy uses an actor–critic structure; it receives only low-level dynamic information as input and simultaneously estimates the multiple parameters, or gains, of the PID controllers. The proposed approach was tested in several simulated environments and on a real-time robotic platform, showing its feasibility for the low-level control of mobile robots. The simulation and experimental results show that the approach can compensate for, or even adapt to, changes in uncertain environments, providing a model-free, unsupervised solution. A comparative study against other adaptive methods for tuning multiple PIDs is also presented, showing successful performance of the approach.
Keywords: Reinforcement learning | Adaptive control | Policy gradient | Mobile robots | Multi-platforms
English article
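To make the idea of externally supplied PID gains concrete, here is a minimal discrete PID controller whose gains can be replaced at every step. This is a sketch under assumptions, not the paper's controller: the class name `PID`, the method `set_gains`, and the numeric gains are hypothetical, and the actor–critic network that would produce the gains is not shown.

```python
class PID:
    """Discrete PID controller with gains that can be changed at run time."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.dt = dt
        self.integral = 0.0    # running integral of the error
        self.prev_error = 0.0  # error from the previous step

    def set_gains(self, kp, ki, kd):
        """Overwrite the gains, e.g. with values proposed by a learned policy."""
        self.kp, self.ki, self.kd = kp, ki, kd

    def step(self, error):
        """Return the control output for the current tracking error."""
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


# Hypothetical two-channel (MIMO) setup: one PID per controlled variable.
controllers = [PID(2.0, 0.5, 0.1, dt=0.1), PID(1.5, 0.2, 0.05, dt=0.1)]
```

In an adaptive scheme of the kind the abstract describes, each control cycle would call `set_gains` on every controller with the policy's latest output before calling `step`.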
3 Integrating reinforcement learning and skyline computing for adaptive service composition
Publication year: 2020
In service computing, combining multiple services through service composition to address complex user requirements has become a popular research topic. QoS-aware service composition aims to find the optimal composition scheme with the QoS attributes that best match user requirements. However, certain QoS attributes may continuously change in a dynamic service environment, so service composition methods need to be adaptive. Furthermore, the large number of candidate services poses a key challenge for service composition, where existing service composition approaches based on reinforcement learning (RL) suffer from low efficiency. To deal with the problems above, in this paper, a new service composition approach is proposed which combines RL with skyline computing, where the latter is used for reducing the search space and computational complexity. A WSC-MDP model is proposed to solve the large-scale service composition within a dynamically changing environment. To verify the proposed method, a series of comparative experiments are conducted, and the experimental results demonstrate the effectiveness, scalability and adaptability of the proposed approach.
Keywords: Service composition | QoS | Reinforcement learning | Skyline computing | Adaptability
English article
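The abstract above uses skyline computing to shrink the set of candidate services before RL searches over them. A minimal sketch of a skyline (Pareto-optimal) filter follows. It assumes every QoS attribute is "lower is better" (for example response time and cost); the function names and sample values are hypothetical and are not taken from the paper.

```python
def dominates(a, b):
    """True if `a` is no worse than `b` on every attribute and strictly
    better on at least one (lower values are assumed to be better)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def skyline(candidates):
    """Keep only the candidates that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


# Hypothetical candidates as (response time in ms, cost) pairs.
services = [(100, 0.02), (80, 0.05), (120, 0.01), (110, 0.03), (80, 0.06)]
pruned = skyline(services)  # dominated services are dropped
```

This quadratic scan is the simplest form of the filter. An RL agent would then choose among `pruned` only, which is how the search space gets smaller.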
4 Motion control of a space manipulator using fuzzy sliding mode control with reinforcement learning
Publication year: 2020
Free-flying space manipulators present challenges in controlling the motions of both the spacecraft bus and the manipulator, because of the highly coupled system dynamics and the unknown space environment disturbances. Although sliding mode controllers are robust to unknown disturbances and system uncertainties, the chattering effect could affect the pointing accuracy and the lifetime of the actuators. This paper first introduces the dynamics of a CuBot, which is a 3-rigid-link manipulator based on the CubeSat platform. To maintain the robustness while reducing the chattering effect, an innovative reinforcement learning based fuzzy adaptive sliding mode controller is proposed. The switching gain is updated to estimate the lumped upper bound of the system uncertainties and the unknown disturbances, and then a new fuzzy logic adaptive law is applied to the switching gain to decrease the chattering effects. On top of that, the fuzzy logic rules are tuned by an innovative modified reinforcement learning mechanism to achieve better tracking performance. Uniformly ultimately bounded tracking errors are guaranteed by the proposed control scheme, and its effectiveness is validated by the simulation results.
Keywords: CubeSat | Fuzzy logic inference | Reinforcement learning | Sliding mode control | Space manipulator
English article
5 Load balancing and handover joint optimization in LTE networks using fuzzy logic and reinforcement learning
Publication year: 2015 - Pages in the English PDF file: 14 - Pages in the Persian DOC file: 45
With the growth of mobile networks, operators must devote significant effort to network management. As a result, self-organizing networks (SONs) have become increasingly important for raising the level of automated operation in cellular technologies. In this context, load balancing (LB) and handover optimization (HOO) have been identified by the industry as key self-organizing mechanisms for radio access networks (RANs). However, most efforts have focused on developing a stand-alone entity for each self-organizing mechanism, which runs in parallel with other entities, and on designing coordination structures to stabilize the network as a whole. Given the importance of LB and HOO, this paper proposes a unified self-management mechanism based on fuzzy logic and reinforcement learning. Specifically, the proposed algorithm modifies the handover parameters to optimize the main key performance indicators related to LB and HOO. The results show that the proposed scheme effectively performs better than independent entities operating simultaneously in the network.
Keywords: Load balancing | Handover | Self-organizing networks | Long-Term Evolution | Fuzzy logic | Reinforcement learning
Translated article