1. Wireless control using reinforcement learning for practical web QoE (2020)

Wireless networks show several challenges not found in wired networks, due to the dynamics of data
transmission. Besides, home wireless networks are managed by non-technical people, and providers do not
implement full management services because of the difficulties of manually managing thousands of devices.
Thus, automatic management mechanisms are desirable. However, such control mechanisms are hard to
achieve in practice because we do not always have a model of the process to be controlled, or the behavior
of the environment is dynamic. Thus, the control must adapt to changing conditions, and it is necessary to
identify the quality of the control executed from the perspective of the user of the network service. This article
proposes a control loop for transmission power and channel selection, based on Software Defined Networking
and Reinforcement Learning (RL), and capable of improving Web Quality of Experience metrics, thus benefiting
the user. We evaluate a prototype in which some Access Points are controlled by a single controller or by
independent controllers. The control loop uses the predicted Mean Opinion Score (MOS) as a reward, thus the
system needs to classify the web traffic. We propose a semi-supervised learning method to classify websites
into three classes (light, average, and heavy) that group pages by their complexity, i.e., the number and size
of their page elements. These classes define the MOS predictor used by the control loop. The proposed website
classifier achieves an average score of 87% ± 1%, classifying 500 unlabeled examples with only fifteen known
examples, with a sub-second runtime. Further, the RL control loop achieves higher Mean Opinion Score (up
to 167% in our best result) than the baselines. The page load time of clients browsing heavy web sites is
improved by up to 6.6x.

Keywords: Wireless network | Software defined network | Reinforcement learning | Q-Learning | Multi-armed bandit | Quality of Experience
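The few-shot page classification step described above can be illustrated with a toy sketch. The paper's actual semi-supervised method is not reproduced here; this stand-in uses plain k-NN seeded with fifteen labelled pages plus one round of pseudo-labelling (self-training), and the page features and class ranges are invented for illustration.

```python
import math
import random

def make_page(cls, rng):
    # Invented feature generator: (element count / 100, total size in MB).
    # The ranges below are assumptions, not measurements from the paper.
    lo, hi = {"light": (5, 20), "average": (30, 60), "heavy": (80, 150)}[cls]
    n = rng.uniform(lo, hi)
    return (n / 100.0, n * rng.uniform(0.01, 0.02))

def knn_label(labeled, x, k=3):
    # Majority vote among the k nearest labelled pages.
    nearest = sorted(labeled, key=lambda item: math.dist(item[0], x))[:k]
    votes = {}
    for _, cls in nearest:
        votes[cls] = votes.get(cls, 0) + 1
    return max(votes, key=votes.get)

rng = random.Random(0)
classes = ["light", "average", "heavy"]
seeds = [(make_page(c, rng), c) for c in classes for _ in range(5)]  # 15 known

# Self-training round: pseudo-label ~500 unlabelled pages and absorb them.
unlabeled = [make_page(c, rng) for c in classes for _ in range(167)]
labeled = seeds + [(x, knn_label(seeds, x)) for x in unlabeled]

test = [(make_page(c, rng), c) for c in classes for _ in range(50)]
accuracy = sum(knn_label(labeled, x) == c for x, c in test) / len(test)
print(f"accuracy: {accuracy:.0%}")
```

With classes this well separated, a handful of seed labels is enough; the value of the pseudo-labelling round grows as class boundaries get noisier.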
Type: English article
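The MOS-driven control loop can itself be sketched as a simple ε-greedy multi-armed bandit over (channel, transmit-power) arms. Everything below is illustrative: the channels, power levels, and MOS values are invented, and the paper's SDN-based controller with Q-Learning variants is richer than this sketch.

```python
import random

# Invented ground truth: mean predicted MOS for each (channel, power) arm.
TRUE_MOS = {
    (1, "low"): 2.8, (1, "high"): 3.1,
    (6, "low"): 3.3, (6, "high"): 4.2,
    (11, "low"): 2.5, (11, "high"): 3.6,
}
ACTIONS = list(TRUE_MOS)

def eps_greedy(steps=2000, eps=0.1, rng=None):
    # ε-greedy bandit: pick an arm, observe a noisy MOS reward,
    # and update that arm's running-mean estimate.
    rng = rng or random.Random(7)
    q = {a: 0.0 for a in ACTIONS}
    n = {a: 0 for a in ACTIONS}
    for _ in range(steps):
        a = rng.choice(ACTIONS) if rng.random() < eps else max(ACTIONS, key=q.get)
        reward = TRUE_MOS[a] + rng.gauss(0, 0.3)  # noisy predicted MOS
        n[a] += 1
        q[a] += (reward - q[a]) / n[a]  # incremental mean
    return max(ACTIONS, key=q.get)

print("learned best arm:", eps_greedy())
```

Using the predicted MOS as the reward signal is what ties the controller's decisions to user-perceived quality rather than to link-level metrics.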
2. Efficient crowdsourcing of unknown experts using bounded multi-armed bandits (2014)

Increasingly, organisations flexibly outsource work on a temporary basis to a global
audience of workers. This so-called crowdsourcing has been applied successfully to a
range of tasks, from translating text and annotating images, to collecting information
during crisis situations and hiring skilled workers to build complex software. While
traditionally these tasks have been small and could be completed by non-professionals,
organisations are now starting to crowdsource larger, more complex tasks to experts
in their respective fields. These tasks include, for example, software development and
testing, web design and product marketing. While this emerging expert crowdsourcing offers
flexibility and potentially lower costs, it also raises new challenges, as workers can be
highly heterogeneous, both in their costs and in the quality of the work they produce.
Specifically, the utility of each outsourced task is uncertain and can vary significantly
between distinct workers and even between subsequent tasks assigned to the same
worker. Furthermore, in realistic settings, workers have limits on the amount of work
they can perform and the employer will have a fixed budget for paying workers. Given
this uncertainty and the relevant constraints, the objective of the employer is to assign
tasks to workers in order to maximise the overall utility achieved. To formalise this
expert crowdsourcing problem, we introduce a novel multi-armed bandit (MAB) model, the
bounded MAB. Furthermore, we develop an algorithm to solve it efficiently, called bounded
ε-first, which proceeds in two stages: exploration and exploitation. During exploration, it
first uses εB of its total budget B to learn estimates of the workers’ quality characteristics.
Then, during exploitation, it uses the remaining (1 − ε)B to maximise the total utility
based on those estimates. Using this technique allows us to derive an O(B^(2/3)) upper bound
on its performance regret (i.e., the expected difference in utility between our algorithm
and the optimum), which means that as the budget B increases, the regret per unit of budget tends to 0. In
addition to this theoretical advance, we apply our algorithm to real-world data from oDesk,
a prominent expert crowdsourcing site. Using data from real projects, including historic
project budgets, expert costs and quality ratings, we show that our algorithm outperforms
existing crowdsourcing methods by up to 300%, while achieving up to 95% of a hypothetical
optimum with full information.
Keywords: Crowdsourcing | Machine learning | Multi-armed bandits | Budget limitation

Type: English article
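The two-stage bounded ε-first procedure can be sketched as follows. The worker costs, qualities, task limits, and budget below are invented for illustration; the paper's algorithm and its regret analysis contain details this sketch omits.

```python
import random

class Worker:
    def __init__(self, cost, true_quality, limit):
        self.cost = cost                  # known payment per task
        self.true_quality = true_quality  # unknown expected utility per task
        self.limit = limit                # cap on tasks this worker can do
        self.done = 0

    def work(self, rng):
        # Noisy observed utility of one completed task.
        return max(0.0, rng.gauss(self.true_quality, 0.5))

def bounded_eps_first(workers, budget, eps=0.2, rng=None):
    rng = rng or random.Random(1)
    est = {w: (0.0, 0) for w in workers}  # (mean utility estimate, samples)
    spent = total = 0.0
    # Exploration: spend εB sampling workers uniformly (round robin).
    progressed = True
    while progressed:
        progressed = False
        for w in workers:
            if w.done < w.limit and spent + w.cost <= eps * budget:
                u = w.work(rng)
                spent += w.cost; total += u; w.done += 1
                m, n = est[w]
                est[w] = (m + (u - m) / (n + 1), n + 1)
                progressed = True
    # Exploitation: spend the remaining (1 − ε)B greedily by estimated
    # utility density (estimate / cost), respecting limits and the budget.
    for w in sorted(workers, key=lambda w: est[w][0] / w.cost, reverse=True):
        while w.done < w.limit and spent + w.cost <= budget:
            spent += w.cost; total += w.work(rng); w.done += 1
    return total, spent

workers = [Worker(cost=1.0, true_quality=1.0, limit=40),
           Worker(cost=2.0, true_quality=3.0, limit=40),
           Worker(cost=1.0, true_quality=2.0, limit=40)]
utility, spent = bounded_eps_first(workers, budget=60.0)
print(f"total utility {utility:.1f} for spend {spent:.1f} of budget 60")
```

The per-worker task limits are what distinguish the bounded MAB from the standard budget-limited bandit: the greedy exploitation stage must spread spending across several workers once the densest worker's limit is exhausted.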