Publication year:
2020
Article title (English):
Adaptive early classification of temporal sequences using deep reinforcement learning
Article title (Persian translation):
طبقه بندی اولیه انطباقی توالی های زمانی با استفاده از یادگیری تقویتی عمیق
Source:
Sciencedirect - Elsevier - Knowledge-Based Systems, 190 (2020) 105290. doi:10.1016/j.knosys.2019.105290
Authors:
Coralie Martinez, Emmanuel Ramasso, Guillaume Perrin, Michèle Rombaut
Abstract:
In this article, we address the problem of early classification (EC) of temporal sequences with adaptive
prediction times. We frame EC as a sequential decision making problem and we define a partially
observable Markov decision process (POMDP) fitting the competitive objectives of classification
earliness and accuracy. We solve the POMDP by training an agent for EC with deep reinforcement
learning (DRL). The agent learns to make adaptive decisions between classifying incomplete sequences
now or delaying its prediction to gather more measurements. We adapt an existing DRL algorithm for
batch and online learning of the agent’s action value function with a deep neural network. We propose
strategies of prioritized sampling, prioritized storing and random episode initialization to address the
fact that the agent’s memory is unbalanced due to (1): all but one of its actions terminate the process
and thus (2): actions of classification are less frequent than the action of delay. In experiments, we
show improvements in accuracy induced by our specific adaptation of the algorithm used for online
learning of the agent’s action value function. Moreover, we compare two definitions of the POMDP
based on delay reward shaping against reward discounting. Finally, we demonstrate that a static naive
deep neural network, i.e. trained to classify at static times, is less efficient in terms of accuracy against
speed than the equivalent network trained with adaptive decision making capabilities.
Keywords: Early classification | Adaptive prediction time | Deep reinforcement learning | Temporal sequences | Double DQN | Trade-off between accuracy vs. speed
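Illustrative sketch (not from the paper): the abstract describes an agent that, at each time step, either classifies the incomplete sequence or delays to gather one more measurement, with its action value function learned by Double DQN. The Python below is a minimal sketch of that decision loop and of the standard Double DQN target. The action encoding, the reward values (+1, -1, and a small delay penalty), and the names `run_episode`, `double_dqn_target`, and `toy_q` are assumptions made for illustration; the authors' actual reward definitions and network are not given in this listing.

```python
from typing import Callable, List, Sequence, Tuple

DELAY = 0  # hypothetical action index: wait for one more measurement


def run_episode(
    sequence: Sequence[float],
    label: int,
    q_values: Callable[[Sequence[float]], List[float]],
    delay_reward: float = -0.01,  # assumed shaping penalty, not from the paper
) -> Tuple[int, int, float]:
    """Return (predicted_class, prediction_time, total_reward)."""
    total = 0.0
    for t in range(1, len(sequence) + 1):
        q = q_values(sequence[:t])  # the agent only sees the prefix so far
        last_step = t == len(sequence)
        # At the last measurement the agent can no longer delay.
        candidates = range(1, len(q)) if last_step else range(len(q))
        action = max(candidates, key=lambda a: q[a])
        if action == DELAY:
            total += delay_reward
            continue
        predicted = action - 1  # actions 1..K map to classes 0..K-1
        total += 1.0 if predicted == label else -1.0  # assumed rewards
        return predicted, t, total  # any classification ends the episode
    raise ValueError("empty sequence")


def double_dqn_target(
    reward: float,
    done: bool,
    q_online_next: List[float],
    q_target_next: List[float],
    gamma: float = 0.99,
) -> float:
    """Standard Double DQN target: the online network selects the next
    action and the target network evaluates it. In this setting only the
    delay action leads to a non-terminal next state."""
    if done:
        return reward
    best = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    return reward + gamma * q_target_next[best]


def toy_q(prefix: Sequence[float]) -> List[float]:
    """Hypothetical stand-in for a trained network: confidence grows with
    the number of measurements and the magnitude of their mean."""
    mean = sum(prefix) / len(prefix)
    confidence = abs(mean) * len(prefix)
    # [delay, class 0 (negative mean), class 1 (positive mean)]
    return [1.0, confidence if mean < 0 else 0.0, confidence if mean >= 0 else 0.0]


if __name__ == "__main__":
    print(run_episode([0.2, 0.5, 0.6, 0.4], label=1, q_values=toy_q))
```

With the toy value function, the agent delays while its confidence is below the fixed value of delaying and classifies as soon as it is not, so the prediction time adapts to each sequence instead of being fixed in advance.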