Download and view articles related to Multi-armed bandit :: Page 1

Search results - Multi-armed bandit

Number of articles found: 2
No. Title Type
1 Wireless control using reinforcement learning for practical web QoE
Year: 2020
Wireless networks pose several challenges not found in wired networks, due to the dynamics of data transmission. In addition, home wireless networks are managed by non-technical people, and providers do not implement full management services because of the difficulty of manually managing thousands of devices. Automatic management mechanisms are therefore desirable. However, such control mechanisms are hard to achieve in practice because a model of the process to be controlled is not always available, or the behavior of the environment is dynamic. The control must therefore adapt to changing conditions, and the quality of the control must be assessed from the perspective of the user of the network service. This article proposes a control loop for transmission power and channel selection, based on Software Defined Networking and Reinforcement Learning (RL), and capable of improving Web Quality of Experience metrics, thus benefiting the user. We evaluate a prototype in which some Access Points are controlled by a single controller or by independent controllers. The control loop uses the predicted Mean Opinion Score (MOS) as a reward, so the system needs to classify the web traffic. We propose a semi-supervised learning method to classify web sites into three classes (light, average and heavy), grouping pages by their complexity, i.e. the number and size of page elements. These classes define the MOS predictor used by the control loop. The proposed web site classifier achieves an average score of 87% ± 1%, classifying 500 unlabeled examples with only fifteen known examples, with a sub-second runtime. Further, the RL control loop achieves a higher Mean Opinion Score than the baselines (up to 167% in our best result). The page load time of clients browsing heavy web sites is improved by up to 6.6x.
Keywords: Wireless network | Software defined network | Reinforcement learning | Q-Learning | Multi-armed bandit | Quality of Experience
English article
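The reward-driven control loop described in entry 1 can be illustrated with a minimal epsilon-greedy multi-armed bandit sketch. This is not the paper's controller: the abstract does not state the exploration strategy, so epsilon-greedy is an assumption, and the channel and power values, the `BASE_MOS` table, and the bounded noise are all hypothetical stand-ins for a real MOS predictor.

```python
import random

# Hypothetical arms: (channel, transmission power in dBm) pairs.
ARMS = [(1, 10), (1, 20), (6, 10), (6, 20), (11, 10), (11, 20)]

# Hypothetical mean predicted MOS per arm (scale 1 to 5), for simulation only.
BASE_MOS = {(1, 10): 2.0, (1, 20): 2.6, (6, 10): 3.2,
            (6, 20): 4.4, (11, 10): 1.4, (11, 20): 3.8}

def epsilon_greedy_control(arms, reward_fn, steps, epsilon, rng):
    """Pick one arm per step, using the observed reward as feedback."""
    counts = {arm: 0 for arm in arms}
    values = {arm: 0.0 for arm in arms}  # running mean reward per arm
    for t in range(steps):
        if t < len(arms):
            arm = arms[t]                             # try every arm once
        elif rng.random() < epsilon:
            arm = rng.choice(arms)                    # explore
        else:
            arm = max(arms, key=lambda a: values[a])  # exploit best estimate
        reward = reward_fn(arm)
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]
    return values, counts

noise_rng = random.Random(0)

def simulated_mos(arm):
    # Stand-in for the MOS predictor: table value plus bounded noise.
    return BASE_MOS[arm] + noise_rng.uniform(-0.2, 0.2)

values, counts = epsilon_greedy_control(ARMS, simulated_mos, 300, 0.1,
                                        random.Random(1))
best_arm = max(values, key=values.get)
print(best_arm, counts[best_arm])
```

In this simulation the noise is smaller than the gap between arms, so the highest estimate always belongs to the arm with the highest table value. A real deployment would face noisier, non-stationary rewards.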
2 Efficient crowdsourcing of unknown experts using bounded multi-armed bandits
Year: 2014
Increasingly, organisations flexibly outsource work on a temporary basis to a global audience of workers. This so-called crowdsourcing has been applied successfully to a range of tasks, from translating text and annotating images, to collecting information during crisis situations and hiring skilled workers to build complex software. While traditionally these tasks have been small and could be completed by non-professionals, organisations are now starting to crowdsource larger, more complex tasks to experts in their respective fields. These tasks include, for example, software development and testing, web design and product marketing. While this emerging expert crowdsourcing offers flexibility and potentially lower costs, it also raises new challenges, as workers can be highly heterogeneous, both in their costs and in the quality of the work they produce. Specifically, the utility of each outsourced task is uncertain and can vary significantly between distinct workers and even between subsequent tasks assigned to the same worker. Furthermore, in realistic settings, workers have limits on the amount of work they can perform and the employer will have a fixed budget for paying workers. Given this uncertainty and the relevant constraints, the objective of the employer is to assign tasks to workers in order to maximise the overall utility achieved. To formalise this expert crowdsourcing problem, we introduce a novel multi-armed bandit (MAB) model, the bounded MAB. Furthermore, we develop an algorithm to solve it efficiently, called bounded ε-first, which proceeds in two stages: exploration and exploitation. During exploration, it first uses εB of its total budget B to learn estimates of the workers' quality characteristics. Then, during exploitation, it uses the remaining (1 − ε)B to maximise the total utility based on those estimates. Using this technique allows us to derive an O(B^{2/3}) upper bound on its performance regret (i.e., the expected difference in utility between our algorithm and the optimum), which means that as the budget B increases, the regret tends to 0. In addition to this theoretical advance, we apply our algorithm to real-world data from oDesk, a prominent expert crowdsourcing site. Using data from real projects, including historic project budgets, expert costs and quality ratings, we show that our algorithm outperforms existing crowdsourcing methods by up to 300%, while achieving up to 95% of a hypothetical optimum with full information.
Keywords: Crowdsourcing | Machine learning | Multi-armed bandits | Budget limitation
English article
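The two-stage budget split described in entry 2 (εB for exploration, the remaining (1 − ε)B for exploitation) can be sketched as follows. This is a simplified illustration, not the authors' implementation: the `Worker` fields, the round-robin exploration order, the greedy utility-per-cost rule used in exploitation, and the simulated bounded noise are all assumptions made for the example.

```python
import random
from dataclasses import dataclass

@dataclass
class Worker:
    cost: float          # price per task (known to the employer)
    limit: int           # maximum number of tasks the worker accepts
    mean_utility: float  # true mean utility per task (simulation only)

def bounded_epsilon_first(workers, budget, epsilon, rng):
    n = len(workers)
    pulls = [0] * n
    total_reward = [0.0] * n
    spent = 0.0
    utility = 0.0

    def pull(i):
        nonlocal spent, utility
        # Simulated task outcome: true mean plus bounded noise (assumption).
        reward = workers[i].mean_utility + rng.uniform(-0.05, 0.05)
        pulls[i] += 1
        total_reward[i] += reward
        spent += workers[i].cost
        utility += reward

    # Stage 1, exploration: round-robin over workers within epsilon * budget.
    explore_budget = epsilon * budget
    progressed = True
    while progressed:
        progressed = False
        for i, w in enumerate(workers):
            if pulls[i] < w.limit and spent + w.cost <= explore_budget:
                pull(i)
                progressed = True

    # Stage 2, exploitation: rank workers once by estimated utility per
    # unit cost, then fill each worker up to its limit while budget remains.
    def density(i):
        if pulls[i] == 0:
            return 0.0
        return (total_reward[i] / pulls[i]) / workers[i].cost

    for i in sorted(range(n), key=density, reverse=True):
        while pulls[i] < workers[i].limit and spent + workers[i].cost <= budget:
            pull(i)
    return utility, spent, pulls

workers = [Worker(1.0, 50, 0.9), Worker(2.0, 50, 1.0), Worker(1.0, 50, 0.2)]
utility, spent, pulls = bounded_epsilon_first(workers, 60.0, 0.1,
                                              random.Random(0))
print(pulls, spent)  # [50, 4, 2] 60.0
```

With these hypothetical workers, exploration spends 6 of the 60 budget units, and exploitation then fills the first worker to its limit before moving to the next-best estimated utility per cost.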