Optimal carbon storage reservoir management through deep reinforcement learning (2020)
Model-based optimization plays a central role in energy system design and management. The complexity and high dimensionality of many process-level models, especially those used for geosystem energy exploration and utilization, often lead to formidable computational costs when the dimension of the decision space is also large. This work adopts elements of recently advanced deep learning techniques to solve a sequential decision-making problem in applied geosystem management. Specifically, a deep reinforcement learning framework was formed for optimal multiperiod planning, in which a deep Q-learning network (DQN) agent was trained to maximize rewards by learning from high-dimensional inputs and from exploitation of its past experiences. To expedite computation, deep multitask learning was used to approximate high-dimensional, multistate transition functions. Both DQN and deep multitask learning are pattern based. As a demonstration, the framework was applied to optimal carbon sequestration reservoir planning using two different types of management strategies: monitoring only and brine extraction. Both strategies are designed to mitigate potential risks due to pressure buildup. Results show that the DQN agent can identify the optimal policies that maximize the reward under given risk and cost constraints. Experiments also show that the knowledge the agent gained from interacting with one environment is largely preserved when deploying the same agent in other similar environments.
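The Q-learning loop the abstract describes (an agent maximizing reward from high-dimensional inputs and replayed past experiences) can be sketched in miniature. Everything below is hypothetical: five discretized pressure states, two actions (0 = monitor only, 1 = extract brine), and a toy reward trading pressure-buildup risk against extraction cost; the paper's deep network and multitask surrogate are replaced by a plain Q-table with experience replay.

```python
import random
from collections import deque

import numpy as np

random.seed(0)
rng = np.random.default_rng(0)

# Hypothetical toy problem: 5 discretized pressure states, 2 actions.
# The tables stand in for the reservoir simulator and its deep surrogate.
N_STATES, N_ACTIONS = 5, 2
Q = np.zeros((N_STATES, N_ACTIONS))
replay = deque(maxlen=500)            # experience replay buffer
alpha, gamma, eps = 0.1, 0.95, 0.2    # learning rate, discount, exploration

def step(s, a):
    """Toy dynamics: extraction relieves pressure but costs reward."""
    s_next = max(0, s - 1) if a == 1 else min(N_STATES - 1, s + 1)
    reward = (-1.0 if s_next == N_STATES - 1 else 0.0) - 0.1 * a
    return s_next, reward

s = 0
for t in range(2000):
    a = int(rng.integers(N_ACTIONS)) if rng.random() < eps else int(np.argmax(Q[s]))
    s_next, r = step(s, a)
    replay.append((s, a, r, s_next))
    # Q-learning update over a minibatch sampled from past experience
    for bs, ba, br, bs2 in random.sample(list(replay), min(len(replay), 8)):
        Q[bs, ba] += alpha * (br + gamma * Q[bs2].max() - Q[bs, ba])
    s = s_next
```

With the risk penalty dominating the extraction cost, the learned Q-values should come to favor brine extraction in the high-pressure states.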
Keywords: Reinforcement learning | Multistage decision-making | Deep autoregressive model | Deep Q network | Surrogate modeling | Markov decision process | Geological carbon sequestration
Towards optimal control of air handling units using deep reinforcement learning and recurrent neural network (2020)
A new generation of smart stormwater systems promises to reduce the need for new construction by enhancing the performance of the existing infrastructure through real-time control. Smart stormwater systems dynamically adapt their response to individual storms by controlling distributed assets, such as valves, gates, and pumps. This paper introduces a real-time control approach based on Reinforcement Learning (RL), which has emerged as a state-of-the-art methodology for autonomous control in the artificial intelligence community. Using a Deep Neural Network, an RL-based controller learns a control strategy by interacting with the system it controls, effectively trying various control strategies until converging on those that achieve a desired objective. This paper formulates and implements an RL algorithm for the real-time control of urban stormwater systems. This algorithm trains an RL agent to control valves in a distributed stormwater system across thousands of simulated storm scenarios, seeking to achieve water level and flow set-points in the system. The algorithm is first evaluated for the control of an individual stormwater basin, after which it is adapted to the control of multiple basins in a larger watershed (4 km²). The results indicate that RL can very effectively control individual sites. Performance is highly sensitive to the reward formulation of the RL agent. Generally, more explicit guidance led to better control performance, and more rapid and stable convergence of the learning process. While the control of multiple distributed sites also shows promise in reducing flooding and peak flows, the complexity of controlling larger systems comes with a number of caveats. The RL controller's performance is very sensitive to the formulation of the Deep Neural Network and requires a significant amount of computational resources to achieve a reasonable performance enhancement.
Overall, the controlled system significantly outperforms the uncontrolled system, especially across storms of high intensity and duration. A frank discussion is provided, which should allow the benefits and drawbacks of RL to be considered when implementing it for the real-time control of stormwater systems. An open-source implementation of the full simulation environment and control algorithms is also provided.
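The paper's observation that "more explicit guidance led to better control performance" is essentially a statement about reward shaping. A minimal illustration, with hypothetical depth/flow units and thresholds rather than the paper's actual formulation:

```python
def sparse_reward(depth, flow, depth_max=2.0, flow_target=0.5):
    """Sparse formulation: penalize only outright failure (flooding)."""
    return -10.0 if depth > depth_max else 0.0

def shaped_reward(depth, flow, depth_max=2.0, flow_target=0.5):
    """More explicit guidance: continuous penalties for missing the flow
    set-point and for carrying high water levels, plus the flood penalty."""
    flood_penalty = -10.0 if depth > depth_max else 0.0
    tracking_penalty = -abs(flow - flow_target)   # distance to flow set-point
    headroom_penalty = -0.1 * depth / depth_max   # prefer drawn-down basins
    return flood_penalty + tracking_penalty + headroom_penalty
```

The shaped variant gives the agent a gradient to follow long before any basin actually floods, which is one plausible reading of why more explicit reward formulations converged faster and more stably.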
Keywords: Real-time control | Reinforcement learning | Smart stormwater systems
Reinforcement learning based adaptive power pinch analysis for energy management of stand-alone hybrid energy storage systems considering uncertainty (2020)
Hybrid energy storage systems (HESS) involve synergies between multiple energy storage technologies with complementary operating features aimed at enhancing the reliability of intermittent renewable energy sources (RES). Nevertheless, coordinating HESS through optimized energy management strategies (EMS) introduces complexity. The latter has been previously addressed by the authors through a systems-level graphical EMS via Power Pinch Analysis (PoPA). Although of proven efficiency, accounting for uncertainty with PoPA has been an issue, due to the assumption of perfect day-ahead (DA) generation and load profile forecasts. This paper proposes three adaptive PoPA-based EMS aimed at negating the stochastic variability of load demand and RES. Each method has its own merits, such as reduced computational complexity or improved accuracy, depending on the probability density function of the uncertainty. The first and simplest adaptive scheme is based on a receding-horizon model predictive control framework. The second employs a Kalman filter, whereas the third is based on a machine learning algorithm. The three methods are assessed on a real isolated HESS microgrid built in Greece. In validation against the DA PoPA, the proposed methods all performed better with regard to violations of the energy storage operating constraints and reduced the carbon emission footprint.
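The second adaptive scheme, a Kalman filter correcting the day-ahead profiles, can be sketched for a single channel. The scalar bias model below (DA forecast plus a slowly varying offset) is an assumption made for illustration; the paper's filter design may differ:

```python
import numpy as np

def adaptive_forecast(da_forecast, measurements, Q=0.05, R=0.2):
    """Track the bias between the day-ahead (DA) forecast and live
    measurements with a scalar Kalman filter, returning the
    bias-corrected profile step by step."""
    b, P = 0.0, 1.0                      # bias estimate and its variance
    corrected = []
    for f, z in zip(da_forecast, measurements):
        P += Q                           # predict: bias assumed near-constant
        K = P / (P + R)                  # Kalman gain
        b += K * ((z - f) - b)           # update bias from the innovation
        P *= (1 - K)
        corrected.append(f + b)          # corrected estimate at this step
    return np.array(corrected)
```

Fed a systematically biased forecast and live measurements, the corrected profile converges toward the measured values within a few steps, which is the behavior an adaptive PoPA needs from its inputs.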
Keywords: Hybrid energy storage systems | Energy management strategies | Model predictive control | Kalman filter | Reinforcement learning
Rule-interposing deep reinforcement learning based energy management strategy for power-split hybrid electric vehicle (2020)
The optimization and training processes of a deep reinforcement learning (DRL) based energy management strategy (EMS) can be very slow and resource-intensive. In this paper, an improved energy management framework that embeds expert knowledge into the deep deterministic policy gradient (DDPG) algorithm is proposed. Incorporating the battery characteristics and the optimal brake-specific fuel consumption (BSFC) curve of hybrid electric vehicles (HEVs), we address the multi-objective energy management optimization problem with a large space of control variables. By incorporating this prior knowledge, the proposed framework not only accelerates the learning process but also achieves better fuel economy, making the energy management system relatively stable. The experimental results show that the proposed EMS outperforms the one without prior knowledge as well as other state-of-the-art deep reinforcement learning approaches. In addition, the proposed approach can be easily generalized to other types of HEV EMSs.
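The idea of interposing expert rules between the learned policy and the plant can be illustrated with a hypothetical rule layer. The SOC window and optimal-BSFC band numbers below are invented for the sketch, and the DDPG actor itself is omitted; only the rule interposition is shown:

```python
def interpose_rules(raw_engine_power, soc, p_demand,
                    soc_min=0.4, soc_max=0.8, p_opt_bsfc=(15.0, 35.0)):
    """Hypothetical rule layer between a DDPG actor and the powertrain:
    expert knowledge (battery SOC window, optimal-BSFC operating band)
    clips or overrides the learned continuous action."""
    p = raw_engine_power
    lo, hi = p_opt_bsfc
    # Rule 1: keep a running engine inside its efficient BSFC band
    if p > 0:
        p = min(max(p, lo), hi)
    # Rule 2: protect the battery; charging/discharging rules take priority
    if soc < soc_min:
        p = max(p, p_demand)          # engine covers demand and charges battery
    elif soc > soc_max:
        p = min(p, p_demand)          # favour battery discharge
    return p
```

Because the rules prune infeasible or wasteful actions before they reach the plant, the actor never has to learn them from scratch, which is one way such prior knowledge can both speed up training and stabilize the resulting EMS.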
Keywords: Energy management strategy | Hybrid electric vehicle | Expert knowledge | Deep deterministic policy gradient | Continuous action space
Deep reinforcement learning based energy management for a hybrid electric vehicle (2020)
This research proposes a reinforcement learning-based algorithm and a deep reinforcement learning-based algorithm for energy management of a series hybrid electric tracked vehicle. Firstly, the powertrain model of the series hybrid electric tracked vehicle (SHETV) is constructed, and the corresponding energy management formulation is established. Subsequently, a new variant of the reinforcement learning (RL) method Dyna, namely Dyna-H, is developed by combining a heuristic planning step with the Dyna agent and is applied to energy management control for the SHETV. Its rapidity and optimality are validated by comparison with dynamic programming (DP) and the conventional Dyna method. Facing the "curse of dimensionality" in RL, a novel deep Q-learning (DQL) algorithm is designed for energy management control, which uses a new optimization method (AMSGrad) to update the weights of the neural network. The proposed deep reinforcement learning control system is then trained and verified on a realistic high-precision driving condition, and is compared with the benchmark DP method and the traditional DQL method. Results show that the proposed deep reinforcement learning method achieves faster training speed and lower fuel consumption than the traditional DQL policy, and its fuel economy closely approaches the global optimum. Furthermore, the adaptability of the proposed method is confirmed on another driving schedule.
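Dyna-style agents interleave real experience with planning updates drawn from a learned model. The tabular toy below reads the heuristic as "plan the state-action pairs with the largest current TD error first"; that reading, and the six-state chain environment, are assumptions for illustration, not the paper's SHETV setup:

```python
import numpy as np

rng = np.random.default_rng(1)

N_S, N_A = 6, 2                   # toy chain; action 1 moves toward the goal
GOAL = N_S - 1
Q = np.zeros((N_S, N_A))
model = {}                        # learned model: (s, a) -> (r, s')
alpha, gamma, eps, n_plan = 0.5, 0.9, 0.1, 10

def env(s, a):
    s2 = min(GOAL, s + 1) if a == 1 else max(0, s - 1)
    return (1.0 if s2 == GOAL else 0.0), s2

def greedy(s):
    best = np.flatnonzero(Q[s] == Q[s].max())
    return int(rng.choice(best))  # break ties randomly

def td_err(sa):
    r, s2 = model[sa]
    return abs(r + gamma * Q[s2].max() - Q[sa])

for episode in range(100):
    s = 0
    while s != GOAL:
        a = int(rng.integers(N_A)) if rng.random() < eps else greedy(s)
        r, s2 = env(s, a)
        Q[s, a] += alpha * (r + gamma * Q[s2].max() - Q[s, a])
        model[(s, a)] = (r, s2)
        # Heuristic planning: replay the pairs with the largest TD error
        for sa in sorted(model, key=td_err, reverse=True)[:n_plan]:
            pr, ps2 = model[sa]
            Q[sa] += alpha * (pr + gamma * Q[ps2].max() - Q[sa])
        s = s2
```

The planning loop is what gives Dyna-style methods their rapidity: value information from the goal propagates back through the model in a handful of episodes instead of requiring many real environment steps.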
Keywords: Hybrid electric tracked vehicle | Energy management | Dyna-H | Deep reinforcement learning | AMSGrad optimizer
A new hybrid ensemble deep reinforcement learning model for wind speed short term forecasting (2020)
Wind speed forecasting is a promising solution to improve the efficiency of energy utilization. In this study, a novel hybrid wind speed forecasting model is proposed. The modeling process consists of three stages. In stage I, the empirical wavelet transform reduces the non-stationarity of the original wind speed data by decomposing it into several sub-series. In stage II, three kinds of deep networks are used to build the forecasting model and compute predictions for each sub-series. In stage III, a reinforcement learning method combines the three deep networks, and the forecasts of the sub-series are aggregated to obtain the final result. Comparing all predictions over three different types of wind speed series leads to two conclusions: (a) the proposed reinforcement learning based ensemble method is effective in integrating the three deep networks and works better than traditional optimization-based ensemble methods; (b) the proposed ensemble deep reinforcement learning based wind speed prediction model achieves accurate results in all cases and provides the best accuracy compared with sixteen alternative models and three state-of-the-art models.
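Stage III, combining the three networks' predictions with a learning-based weighting, can be read in a bandit-like way: reinforce the models that currently predict best. The exponential-update scheme below is one hypothetical realization of that idea, not the paper's exact method:

```python
import numpy as np

def rl_ensemble(preds, actual, lr=0.5):
    """Combine forecasters with weights that are reinforced online:
    models with smaller absolute error at step t earn higher reward
    and a larger share of the next step's blend."""
    w = np.ones(preds.shape[0]) / preds.shape[0]       # start uniform
    combined = []
    for t in range(preds.shape[1]):
        combined.append(float(w @ preds[:, t]))
        reward = -np.abs(preds[:, t] - actual[t])      # per-model reward
        w = w * np.exp(lr * (reward - reward.mean()))  # reinforce better models
        w = w / w.sum()
    return np.array(combined), w
```

On a toy series where one forecaster is unbiased and the others carry constant offsets, the weights concentrate on the unbiased model after a few steps.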
Keywords: Wind speed forecasting | Ensemble deep reinforcement learning | Empirical wavelet transform | Hybrid wind speed forecasting model
Modified deep learning and reinforcement learning for an incentive-based demand response model (2020)
Incentive-based demand response (DR) programs can induce end users (EUs) to reduce electricity demand during peak periods through rewards. In this study, an incentive-based DR program with modified deep learning and reinforcement learning is proposed. A modified deep learning model based on a recurrent neural network (MDL-RNN) is first proposed to identify future uncertainties in the environment by forecasting the day-ahead wholesale electricity price, photovoltaic (PV) power output, and power load. Reinforcement learning (RL) is then utilized to explore the optimal incentive rate at each hour that maximizes the profits of both energy service providers (ESPs) and EUs. The results show that the proposed modified deep learning model achieves more accurate forecasts than comparable methods, supporting the development of incentive-based DR programs under uncertain environments. Meanwhile, the optimized incentive rates increase the total profits of ESPs and EUs while reducing peak electricity demand. A short-term DR program was developed for the peak demand period, and the experimental results show that peak electricity demand can be reduced by 17%. This contributes to mitigating supply-demand imbalance and enhancing power system security.
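The RL search for the hourly incentive rate can be sketched as a bandit over discretized rates. The quadratic profit model below is invented; its optimum sits at an interior rate, mimicking the trade-off between curtailment gained and incentive paid:

```python
import numpy as np

rng = np.random.default_rng(2)

rates = np.linspace(0.0, 1.0, 11)   # candidate hourly incentive rates
Q = np.zeros(len(rates))
counts = np.zeros(len(rates))

def total_profit(rate):
    """Invented stand-in for the joint ESP + EU profit model: higher
    incentives shed more peak load but cost the provider more, so the
    noisy payoff 10*rate*(0.8 - rate) peaks at an interior rate (0.4)."""
    return 10.0 * rate * (0.8 - rate) + rng.normal(0.0, 0.1)

eps = 0.1
for t in range(3000):
    i = int(rng.integers(len(rates))) if rng.random() < eps else int(np.argmax(Q))
    counts[i] += 1
    Q[i] += (total_profit(rates[i]) - Q[i]) / counts[i]  # incremental mean

best_rate = float(rates[np.argmax(Q)])
```

The epsilon-greedy learner settles on a rate near the interior optimum without ever seeing the profit function's closed form, which is the appeal of RL when the true ESP/EU response is only observable, not known.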
Keywords: Demand response | Modified deep learning | Reinforcement learning | Smart grid
Truck scheduling in a multi-door cross-docking center with partial unloading: Reinforcement learning-based simulated annealing approaches (2020)
In this paper, a truck scheduling problem at a cross-docking center is investigated in which inbound trucks are also used as outbound trucks. Moreover, inbound trucks do not need to unload and reload the demand of the allocated destination, i.e., they can be partially unloaded. The problem is modeled as a mixed-integer program to find the optimal dock-door and destination assignments as well as the schedule of trucks that minimizes the makespan. Due to the model's complexity, a hybrid heuristic-simulated annealing algorithm is developed. A number of generic and tailor-made neighborhood search structures are also developed to efficiently search the solution space. Moreover, reinforcement learning methods are applied to intelligently learn which neighborhood search structures are more suitable in different situations. Finally, the numerical study shows that partial unloading of compound trucks has a crucial impact on makespan reduction.
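The idea of letting RL pick among neighborhood search structures inside simulated annealing can be sketched generically. The reward definition (magnitude of an accepted improvement) and the bandit-style scoring below are assumptions for illustration, shown here on a trivial continuous objective rather than the truck scheduling model:

```python
import math
import random

random.seed(0)

def sa_with_rl_operators(x0, cost, operators, iters=2000, T0=10.0, eps=0.2):
    """Simulated annealing where a bandit-style learner chooses the
    neighborhood operator: operators whose moves are accepted and improve
    the solution earn reward and are selected more often."""
    x, T = x0, T0
    scores = [0.0] * len(operators)   # running value of each operator
    counts = [1] * len(operators)
    for _ in range(iters):
        k = (random.randrange(len(operators)) if random.random() < eps
             else max(range(len(operators)), key=lambda i: scores[i]))
        x_new = operators[k](x)
        delta = cost(x_new) - cost(x)
        accepted = delta < 0 or random.random() < math.exp(-delta / T)
        if accepted:
            x = x_new
        reward = max(0.0, -delta) if accepted else 0.0  # reward improvement
        counts[k] += 1
        scores[k] += (reward - scores[k]) / counts[k]
        T *= 0.995                    # geometric cooling

    return x, scores

# Usage: minimize (x - 3)^2 from x = 10 with a coarse and a fine operator
ops = [lambda v: v + random.uniform(-2.0, 2.0),
       lambda v: v + random.uniform(-0.1, 0.1)]
best_x, op_scores = sa_with_rl_operators(10.0, lambda v: (v - 3.0) ** 2, ops)
```

In a scheduling setting the operators would be the paper's generic and tailor-made neighborhood moves, and the learner shifts selection toward whichever move family is paying off at the current stage of the search.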
Keywords: Logistics | Cross docking | Truck scheduling | Simulated annealing | Reinforcement learning
A deep reinforcement learning approach for real-time sensor-driven decision making and predictive analytics (2020)
The increased complexity of sensor-intensive systems with expensive subsystems and costly repairs and failures calls for efficient real-time control and decision-making policies. Deep reinforcement learning has demonstrated great potential in addressing highly complex control and decision-making problems. Despite its potential to derive real-time policies from real-time data for dynamic systems, it has rarely been used for sensor-driven maintenance problems. In this paper, we propose two novel decision-making methods in which reinforcement learning and particle filtering are utilized for (i) deriving real-time maintenance policies and (ii) estimating the remaining useful life of sensor-monitored degrading systems. The proposed framework introduces a new direction with many potential opportunities for system monitoring. To demonstrate the effectiveness of the proposed methods, numerical experiments are provided on a set of simulated data and a turbofan engine dataset provided by NASA.
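The particle-filtering half of the framework can be sketched for a scalar degradation signal. The linear-drift model, noise levels, and failure threshold below are hypothetical, and the RL maintenance policy built on top of the filtered state is omitted:

```python
import numpy as np

rng = np.random.default_rng(3)

def particle_filter_rul(observations, n_particles=1000,
                        drift=0.1, q=0.02, r=0.05, threshold=1.0):
    """Particles track a linearly drifting degradation state; each noisy
    observation reweights and resamples them, and remaining useful life
    is the projected time for the particles to reach the failure threshold."""
    x = rng.normal(0.0, 0.1, n_particles)                 # initial states
    for z in observations:
        x = x + drift + rng.normal(0.0, q, n_particles)   # propagate
        w = np.exp(-0.5 * ((z - x) / r) ** 2)             # likelihood weights
        w = w / w.sum()
        x = x[rng.choice(n_particles, n_particles, p=w)]  # resample
    rul = np.maximum(threshold - x, 0.0) / drift          # steps to threshold
    return float(np.median(rul))
```

The surviving particle cloud gives not just a point RUL but a full distribution, which is what a downstream maintenance policy would condition on.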
Keywords: Particle filters | Deep reinforcement learning | Real-time control | Decision-making | Remaining useful life estimation
Deep reinforcement learning based AGVs real-time scheduling with mixed rule for flexible shop floor in industry 4.0 (2020)
Driven by recent advances in Industry 4.0 and industrial artificial intelligence, Automated Guided Vehicles (AGVs) have been widely used on flexible shop floors for material handling. However, great challenges arising from the high dynamics, complexity, and uncertainty of the shop-floor environment still exist in AGV real-time scheduling. To address these challenges, an adaptive deep reinforcement learning (DRL) based AGV real-time scheduling approach with a mixed rule is proposed for the flexible shop floor to minimize the makespan and delay ratio. Firstly, the AGV real-time scheduling problem is formulated as a Markov Decision Process (MDP) in which the state representation, action representation, reward function, and optimal mixed rule policy are described in detail. Then a novel deep Q-network (DQN) method is developed to achieve the optimal mixed rule policy, with which suitable dispatching rules and AGVs can be selected for scheduling in each state. Finally, a case study based on a real-world flexible shop floor is presented, and the results validate the feasibility and effectiveness of the proposed approach.
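The "mixed rule" idea, learning which dispatching rule to apply in which shop-floor state, can be sketched with a small Q-table over a toy simulator. The state buckets, rule list, and reward model below are invented for illustration; the paper's DQN over a rich state representation is reduced here to a table:

```python
import numpy as np

rng = np.random.default_rng(4)

RULES = ["FIFO", "NEAREST_AGV", "EARLIEST_DUE_DATE"]
N_STATES = 9                      # invented coarse shop-floor congestion states
Q = np.zeros((N_STATES, len(RULES)))

def simulate_dispatch(state, rule_idx):
    """Invented stand-in for the shop-floor simulator: each congestion
    regime secretly favours one rule (reward 1); others return noise."""
    return 1.0 if rule_idx == state % len(RULES) else float(rng.normal(0.0, 0.3))

alpha, eps = 0.1, 0.2
for t in range(6000):
    s = int(rng.integers(N_STATES))
    a = int(rng.integers(len(RULES))) if rng.random() < eps else int(np.argmax(Q[s]))
    Q[s, a] += alpha * (simulate_dispatch(s, a) - Q[s, a])  # bandit-style update

policy = [RULES[int(a)] for a in np.argmax(Q, axis=1)]
```

The learned policy maps each state bucket to a dispatching rule, which is the essence of a mixed rule: no single rule is best everywhere, but a state-conditioned selection among rules can be.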
Keywords: Automated guided vehicles | Real-time scheduling | Deep reinforcement learning | Industry 4.0