Row | Title | Type |
---|---|---|
1 |
Deep Reinforcement Learning With Quantum-Inspired Experience Replay
(2022) In this article, a novel training paradigm inspired
by quantum computation is proposed for deep reinforcement
learning (DRL) with experience replay. In contrast to the traditional experience replay mechanism in DRL, the proposed DRL
with quantum-inspired experience replay (DRL-QER) adaptively
chooses experiences from the replay buffer according to the
complexity and the replayed times of each experience (also
called transition), to achieve a balance between exploration and
exploitation. In DRL-QER, transitions are first formulated in
quantum representations and then the preparation operation
and depreciation operation are performed on the transitions.
In this process, the preparation operation reflects the relationship between the temporal-difference errors (TD-errors) and the
importance of the experiences, while the depreciation operation is
taken into account to ensure the diversity of the transitions. The
experimental results on Atari 2600 games show that DRL-QER
outperforms state-of-the-art algorithms, such as DRL-PER and
DCRL on most of these games with improved training efficiency
and is also applicable to such memory-based DRL approaches
as double network and dueling network.
Index Terms: Deep reinforcement learning (DRL) | quantum computation | quantum-inspired experience replay (QER) | quantum reinforcement learning. |
English article |
2 |
DQRA: Deep Quantum Routing Agent for Entanglement Routing in Quantum Networks
(2022) Quantum routing plays a key role in the development of the next-generation network system. In
particular, an entangled routing path can be constructed with the help of quantum entanglement and swapping
among particles (e.g., photons) associated with nodes in the network. From another side of computing,
machine learning has achieved numerous breakthrough successes in various application domains, including
networking. Despite its advantages and capabilities, machine learning is not as much utilized in quantum
networking as in other areas. To bridge this gap, in this article, we propose a novel quantum routing model
for quantum networks that employs machine learning architectures to construct the routing path for the
maximum number of demands (source–destination pairs) within a time window. Specifically, we present a
deep reinforcement routing scheme that is called Deep Quantum Routing Agent (DQRA). In short, DQRA
utilizes an empirically designed deep neural network that observes the current network states to accommodate
the network’s demands, which are then connected by a qubit-preserved shortest path algorithm. The training
process of DQRA is guided by a reward function that aims toward maximizing the number of accommodated
requests in each routing window. Our experimental study shows that, on average, DQRA is able to maintain a
rate of successfully routed requests at above 80% in a qubit-limited grid network and approximately 60% in
extreme conditions, i.e., each node can be a repeater exactly once in a window. Furthermore, we show that the
model complexity and the computational time of DQRA are polynomial in terms of the sizes of the quantum
networks.
INDEX TERMS: Deep learning | deep reinforcement learning (DRL) | machine learning | next-generation network | quantum network routing | quantum networks. |
English article |
3 |
Resource Management for Edge Intelligence (EI)-Assisted IoV Using Quantum-Inspired Reinforcement Learning
(2022) Recent developments in the Internet of Vehicles
(IoV) enable interconnected vehicles to support ubiquitous
services. Various emerging service applications are promising to
increase the Quality of Experience (QoE) of users. On-board
computation tasks generated by these applications have heavily
overloaded the resource-constrained vehicles, forcing them to offload
on-board tasks to other edge intelligence (EI)-assisted servers.
However, excessive task offloading can lead to severe competition
for communication and computation resources among vehicles,
thereby increasing the processing latency, energy consumption,
and system cost. To address these problems, we investigate
the transmission-awareness and computing-sense uplink resource
management problem and formulate it as a time-varying Markov
decision process. Considering the total delay, energy consumption, and cost, quantum-inspired reinforcement learning (QRL)
is proposed to develop an intelligence-oriented edge offloading
strategy. Specifically, the vehicle can flexibly choose the network
access mode and offloading strategy through two different radio
interfaces to offload tasks to multiaccess edge computing (MEC)
servers through WiFi and cloud servers through 5G. The objective of this joint optimization is to maintain a self-adaptive
balance between these two aspects. Simulation results show that
the proposed algorithm can significantly reduce the transmission
latency and computation delay.
Index Terms: Cloud computing | edge intelligence (EI) | Internet of Vehicles (IoV) | multiaccess edge computing (MEC) | quantum-inspired reinforcement learning (QRL) |
English article |
4 |
Resource Allocation in Time Slotted Channel Hopping (TSCH) Networks Based on Phasic Policy Gradient Reinforcement Learning
(2022) The concept of the Industrial Internet of Things (IIoT) is gaining prominence due to its low-cost solutions and improved productivity of manufacturing processes. To address the ultra-high
reliability and ultra-low power communication requirements of IIoT networks, Time Slotted
Channel Hopping (TSCH) behavioral mode has been introduced in IEEE 802.15.4e standard.
Scheduling the packet transmissions in IIoT networks is a difficult task owing to the limited
resources and dynamic topology. In IEEE 802.15.4e TSCH, the design of the schedule is open
to implementation. In this paper, we propose a phasic policy gradient (PPG) based TSCH
schedule learning algorithm. We construct the utility function that accounts for the throughput
and energy efficiency of the TSCH network. The proposed PPG based scheduling algorithm
overcomes the drawbacks of totally distributed and totally centralized deep reinforcement
learning-based scheduling algorithms by employing the actor–critic policy gradient method that
learns the scheduling algorithm in two phases, namely policy phase and auxiliary phase. In
this method, we show that the schedule converges quickly compared to any other actor–critic
method and also improves the system throughput performance by 58% compared to the minimal
scheduling function, a default TSCH schedule.
Keywords: Industrial internet of things | IEEE 802.15.4e | Time slotted channel hopping | Deep reinforcement learning | Actor–critic policy gradient methods | Phasic policy gradient |
English article |
5 |
Attention-based model and deep reinforcement learning for distribution of event processing tasks
(2022) Event processing is the cornerstone of the dynamic and responsive Internet of Things (IoT).
Recent approaches in this area are based on representational state transfer (REST) principles,
which allow event processing tasks to be placed at any device that follows the same principles.
However, the tasks should be properly distributed among edge devices to ensure fair resource
utilization and guarantee seamless execution. This article investigates the use of deep learning
to fairly distribute the tasks. An attention-based neural network model is proposed to generate
efficient load balancing solutions under different scenarios. The proposed model is based on
the Transformer and Pointer Network architectures, and is trained by an advantage actor–critic reinforcement learning algorithm. The model is designed to scale to the number of
event processing tasks and the number of edge devices, with no need for hyperparameter
re-tuning or even retraining. Extensive experimental results show that the proposed model
outperforms conventional heuristics in many key performance indicators. The generic design
and the obtained results show that the proposed model can potentially be applied to several
other load balancing problem variations, which makes the proposal an attractive option to be
used in real-world scenarios due to its scalability and efficiency.
Keywords: Web of Things (WoT) | Representational state transfer (REST) | Application programming interfaces (APIs) | Edge computing | Load balancing | Resource placement | Deep reinforcement learning | Transformer model | Pointer networks | Actor–critic |
English article |
6 |
Deep Q learning based secure routing approach for OppIoT networks
(2022) Opportunistic IoT (OppIoT) networks are a branch of IoT where humans and machines
collaborate to form a network for sharing data. The broad spectrum of devices and the ad-hoc
nature of connections further aggravate the problem of network and data security. Traditional
approaches, such as trust-based or cryptographic approaches, fail to preemptively secure
these networks. Machine learning (ML) approaches, mainly deep reinforcement learning (DRL)
methods can prove to be very effective in ensuring the security of the network as they
are profoundly capable of solving complex and dynamic problems. Deep Q-learning (DQL)
incorporates a deep neural network into the Q-learning process for dealing with high-dimensional
data. This paper proposes a routing approach for OppIoT, DQNSec, based on DQL for securing the
network against attacks viz. sinkhole, hello flood and distributed denial of service attack. The
actor–critic approach of DQL is utilized and OppIoT is modeled as a Markov decision process
(MDP). Extensive simulations prove the efficiency of DQNSec in comparison to other ML based
routing protocols, viz. RFCSec, RLProph, CAML and MLProph.
Keywords: OppIoT | Reinforcement learning | Deep learning | Deep Q-learning | Markov decision process | Sinkhole attack | Hello flood attack | Distributed denial of service attack |
English article |
7 |
Curriculum-Based Deep Reinforcement Learning for Quantum Control
(2022) Deep reinforcement learning (DRL) has been recognized as an efficient technique to design optimal strategies for
different complex systems without prior knowledge of the control
landscape. To achieve a fast and precise control for quantum
systems, we propose a novel DRL approach by constructing a
curriculum consisting of a set of intermediate tasks defined by
fidelity thresholds, where the tasks in a curriculum can be
statically determined before the learning process or dynamically
generated during the learning process. By transferring knowledge
between two successive tasks and sequencing tasks according to
their difficulties, the proposed curriculum-based DRL (CDRL)
method enables the agent to focus on easy tasks in the early
stage, then move onto difficult tasks, and eventually approach
the final task. Numerical comparison with the traditional methods
[gradient method (GD), genetic algorithm (GA), and several
other DRL methods] demonstrates that CDRL exhibits improved
control performance for quantum systems and also provides an
efficient way to identify optimal strategies with few control pulses.
Index Terms: Curriculum learning | deep reinforcement learning (DRL) | quantum control. |
English article |
8 |
Zero shot augmentation learning in internet of biometric things for health signal processing
(2021) In recent years, the number of Internet of Things (IoT) devices has increased rapidly. The Internet of Biometric Things (IoBT) can process biometrics and health signals, and it will greatly extend the range of biometric applications. The analysis of health signals in the IoBT can use computer-aided diagnosis techniques. However, most of the existing computer-aided diagnosis methods are developed for common diseases and are not suitable for rare diseases. Zero shot learning is a potential method for the computer-aided diagnosis of rare diseases because it can identify objects of unknown categories. However, the existing zero shot learning methods are based on attribute learning and rely on an attribute dataset. There is no attribute dataset for health signal processing. Therefore, the existing zero shot learning methods are not suitable for health signal processing. Based on the above background, we propose a zero shot augmentation learning model (ZSAL) in the IoBT for health signal processing. First, an expert doctor identifies the contour of a lesion and selects a background image without a lesion. Second, the computer automatically generates virtual images using zero shot augmentation technology. Finally, the generated virtual dataset is used to train a convolutional classifier, and then we apply the classifier to the computer-aided diagnosis of actual medical images. The experiment shows the efficiency and effectiveness of our method. © 2021 Elsevier B.V. All rights reserved. Keywords: Internet of biometric things | Zero shot learning | Data augmentation | Health signal processing |
English article |
9 |
Actor–Critic Reinforcement Learning and Application in Developing Computer-Vision-Based Interface Tracking
(2021) This paper synchronizes control theory with computer vision by formalizing object tracking as a sequential decision-making process. A
reinforcement learning (RL) agent successfully tracks an interface between two liquids, which is often a critical variable to track in many
chemical, petrochemical, metallurgical, and oil industries. This method utilizes fewer than 100 images for creating an environment, from which
the agent generates its own data without the need for expert knowledge. Unlike supervised learning (SL) methods that rely on a huge number of
parameters, this approach requires far fewer parameters, which naturally reduces its maintenance cost. Besides its frugal nature, the agent is
robust to environmental uncertainties such as occlusion, intensity changes, and excessive noise. From a closed-loop control context, an interface
location-based deviation is chosen as the optimization goal during training. The methodology showcases RL for real-time object-tracking
applications in the oil sands industry. Along with a presentation of the interface tracking problem, this paper provides a detailed review of one of
the most effective RL methodologies: actor–critic policy.
Keywords: Interface tracking | Object tracking | Occlusion | Reinforcement learning | Uniform manifold approximation and projection |
English article |
10 |
Optimal carbon storage reservoir management through deep reinforcement learning
(2020) Model-based optimization plays a central role in energy system design and management. The complexity and
high-dimensionality of many process-level models, especially those used for geosystem energy exploration
and utilization, often lead to formidable computational costs when the dimension of decision space is also
large. This work adopts elements of recently advanced deep learning techniques to solve a sequential decision-making
problem in applied geosystem management. Specifically, a deep reinforcement learning framework was
formed for optimal multiperiod planning, in which a deep Q-learning network (DQN) agent was trained to
maximize rewards by learning from high-dimensional inputs and from exploitation of its past experiences. To
expedite computation, deep multitask learning was used to approximate high-dimensional, multistate transition
functions. Both DQN and deep multitask learning are pattern based. As a demonstration, the framework was
applied to optimal carbon sequestration reservoir planning using two different types of management strategies:
monitoring only and brine extraction. Both strategies are designed to mitigate potential risks due to pressure
buildup. Results show that the DQN agent can identify the optimal policies to maximize the reward for given
risk and cost constraints. Experiments also show that knowledge the agent gained from interacting with one
environment is largely preserved when deploying the same agent in other similar environments. Keywords: Reinforcement learning | Multistage decision-making | Deep autoregressive model | Deep Q network | Surrogate modeling | Markov decision process | Geological carbon sequestration |
English article |
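
Several entries in this list (for example rows 1, 6 and 10) build on Q-learning combined with experience replay. As a rough illustration only, and not the method of any listed paper, the sketch below runs tabular Q-learning with a uniformly sampled replay buffer on a toy chain environment. The environment, the function name `q_learning_with_replay`, and all hyperparameter values are assumptions made for this example; the papers use deep networks and, in row 1, a non-uniform sampling scheme that is not reproduced here.

```python
import random
from collections import deque


def q_learning_with_replay(n_states=5, episodes=200, gamma=0.9, alpha=0.5,
                           epsilon=0.2, batch_size=8, seed=0):
    """Tabular Q-learning with uniform experience replay on a toy chain.

    Hypothetical environment: states 0..n_states-1, action 0 moves left
    (clamped at state 0), action 1 moves right. Entering the last state
    yields reward 1 and ends the episode.
    """
    rng = random.Random(seed)
    goal = n_states - 1
    q = [[0.0, 0.0] for _ in range(n_states)]  # q[state][action]
    buffer = deque(maxlen=1000)                # replay buffer of transitions

    for _ in range(episodes):
        state = 0
        for _ in range(50):  # cap on steps per episode
            # Epsilon-greedy action selection with random tie-breaking.
            if rng.random() < epsilon:
                action = rng.randrange(2)
            else:
                best = max(q[state])
                action = rng.choice([a for a in (0, 1) if q[state][a] == best])

            next_state = max(0, state - 1) if action == 0 else state + 1
            done = next_state == goal
            reward = 1.0 if done else 0.0
            buffer.append((state, action, reward, next_state, done))

            # Replay: update Q from a uniformly sampled batch of past transitions.
            batch = rng.sample(list(buffer), min(batch_size, len(buffer)))
            for s, a, r, s2, d in batch:
                target = r if d else r + gamma * max(q[s2])
                q[s][a] += alpha * (target - q[s][a])  # TD-error step

            state = next_state
            if done:
                break
    return q
```

Calling `q_learning_with_replay()` returns the learned Q-table. In this toy setting the "move right" action should end up with the higher value in every non-terminal state. Prioritized variants replace the uniform `rng.sample` call with sampling weighted by TD-error.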