A Joint Power Efficient Server and Network Consolidation approach for virtualized data centers
Persian title (translated): A joint power-efficient server and network consolidation approach for virtualized data centers - 2018
Cloud computing and virtualization are enabling technologies for designing energy-aware resource management mechanisms in virtualized data centers. Indeed, one of the main challenges of big data centers is to decrease power consumption, both to cut costs and to reduce the environmental impact. To this end, Virtual Machine (VM) consolidation is often used to smartly reallocate the VMs, with the objective of reducing power consumption by exploiting VM live migration. The consolidation problem consists in finding the set of migrations that keeps powered on only the minimum number of servers needed to host all the VMs. However, most of the proposed consolidation approaches do not consider the network-related consumption, which represents about 10–20% of the total energy consumed by IT equipment in real data centers. This paper proposes a novel joint server and network consolidation model that takes into account the power efficiency of both the switches forwarding the traffic and the servers hosting the VMs. It powers down switch ports and routes traffic along the most energy-efficient path towards the least energy-consuming server under QoS constraints. Since the model is complex, a fast Simulated Annealing based Resource Consolidation algorithm (SARC) is proposed. Our numerical results demonstrate that our approach is able to save on average 50% of the network-related power consumption compared to a network-unaware consolidation.
Keywords: Cloud, Virtualization, Power, Green computing, Simulated annealing
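As a rough illustration of the consolidation idea in the abstract above, the following Python sketch applies plain simulated annealing to a toy placement problem: assign VMs to servers so that as few servers as possible stay active. It is not the authors' SARC algorithm. It ignores switch power, routing and QoS, and every name, load value and parameter here (`anneal`, `vm_load`, `capacity`, the cooling schedule) is an assumption made for the example.

```python
import math
import random


def active_servers(assignment):
    """Count the distinct servers that host at least one VM."""
    return len(set(assignment))


def is_feasible(assignment, vm_load, capacity, n_servers):
    """Check that no server exceeds its (hypothetical) capacity."""
    used = [0.0] * n_servers
    for vm, server in enumerate(assignment):
        used[server] += vm_load[vm]
    return all(u <= capacity for u in used)


def anneal(vm_load, n_servers, capacity, steps=5000, t0=1.0, cooling=0.999, seed=0):
    """Toy simulated annealing: minimise the number of active servers.

    Assumes n_servers >= len(vm_load) and every single VM fits on a server,
    so the initial one-VM-per-server placement is feasible.
    """
    rng = random.Random(seed)
    current = list(range(len(vm_load)))  # start: one VM per server
    best = current[:]
    temp = t0
    for _ in range(steps):
        # Neighbour move: migrate one random VM to a random server.
        candidate = current[:]
        candidate[rng.randrange(len(vm_load))] = rng.randrange(n_servers)
        if is_feasible(candidate, vm_load, capacity, n_servers):
            delta = active_servers(candidate) - active_servers(current)
            # Always accept improvements; accept worse moves with a
            # probability that shrinks as the temperature drops.
            if delta <= 0 or rng.random() < math.exp(-delta / temp):
                current = candidate
                if active_servers(current) < active_servers(best):
                    best = current[:]
        temp *= cooling
    return best


loads = [0.5, 0.5, 0.5, 0.5]  # hypothetical normalised VM loads
plan = anneal(loads, n_servers=4, capacity=1.0)
print(plan, active_servers(plan))
```

A joint model such as the one the paper describes would extend the cost function with the power drawn by switches and ports on the chosen paths; here the cost is only the server count.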
A new architecture of Internet of Things and big data ecosystem for secured smart healthcare monitoring and alerting system
Persian title (translated): A new architecture of Internet of Things and big data ecosystem for secured smart healthcare monitoring and alerting system - 2018
Wearable medical devices with sensors continuously generate enormous amounts of data, often called big data, which mix structured and unstructured data. Due to the complexity of the data, it is difficult to process and analyze it to find valuable information that can be useful in decision making. In addition, data security is a key requirement in healthcare big data systems. To overcome these issues, this paper proposes a new architecture for the implementation of IoT to store and process scalable sensor data (big data) for health care applications. The proposed architecture consists of two main sub-architectures, namely the Meta Fog-Redirection (MF-R) architecture and the Grouping and Choosing (GC) architecture. The MF-R architecture uses big data technologies such as Apache Pig and Apache HBase for the collection and storage of the sensor data generated by different sensor devices. The proposed GC architecture is used to secure the integration of fog computing with cloud computing. This architecture also uses a key management service and a data categorization function (Sensitive, Critical and Normal) to provide security services. The framework also uses a MapReduce-based prediction model to predict heart diseases. Performance evaluation parameters such as throughput, sensitivity, accuracy, and f-measure are calculated to demonstrate the efficiency of the proposed architecture as well as the prediction model.
Keywords: Wireless sensor networks, Internet of Things, Big data analytics, Cloud computing and health care
Bridging data-capacity gap in big data storage
Persian title (translated): Bridging the data-capacity gap in big data storage - 2018
Big data is being produced at an aggressive rate, and with the merger of Cloud computing and IoT, the huge volumes of data generated are increasingly challenging the storage capacity of data centres. This has led to a growing data-capacity gap in big data storage. Unfortunately, the limitations faced by current storage technologies have severely handicapped their potential to meet the storage demand of big data. Consequently, storage technologies with higher storage density, throughput and lifetime have been researched to overcome this gap. In this paper, we first introduce the working principles of three such emerging storage technologies, and justify their inclusion in the study based on the tremendous advances they have seen in the recent past. These storage technologies are Optical data storage, DNA data storage and Holographic data storage. We then evaluate the recent advances in storage density, throughput and lifetime of these emerging storage technologies, and compare them with the trends and advances in prevailing storage technologies. We finally discuss the implications of their adoption, evaluate their prospects, and highlight the challenges they face in bridging the data-capacity gap in big data storage.
Keywords: Big data, Data-capacity gap, Optical storage, DNA storage, Holographic storage, Magnetic storage
Cost optimization for deadline-aware scheduling of big-data processing jobs on clouds
Persian title (translated): Cost optimization for deadline-aware scheduling of big-data processing jobs on clouds - 2018
Cloud computing has been widely regarded as a capable solution for big data processing. Nowadays cloud service providers usually offer users virtual machines with various combinations of configurations and prices. As this new service scheme emerges, the problem of choosing the cost-minimized combination under a deadline constraint is becoming more complex for users. The complexity of determining the cost-minimized combination may result from different causes: the characteristics of user applications, and the providers' settings for the configuration and pricing of virtual machines. In this paper, we propose a variety of algorithms to help users schedule their big data processing workflow applications on clouds so that the cost is minimized and the deadline constraints are satisfied. The proposed algorithms were evaluated by extensive simulation experiments with diverse experimental settings.
Keywords: Big-data, Scheduling, Cost-efficient, Cloud computing
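To make the "cheapest combination under a deadline" problem from the abstract above concrete, here is a minimal Python sketch for the simplest possible case: a single job and a single VM type chosen from a small catalogue. It does not reproduce the paper's algorithms. The VM names, prices, speed-ups, and the assumption that runtime scales inversely with speed-up and is billed without rounding are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class VmType:
    name: str
    price_per_hour: float  # hypothetical price
    speedup: float         # speed relative to a baseline machine


def cheapest_under_deadline(vm_types, baseline_hours, deadline_hours):
    """Return (vm, cost) for the cheapest VM type that meets the deadline.

    Returns (None, math.inf) when no VM type is fast enough.
    """
    best, best_cost = None, math.inf
    for vm in vm_types:
        runtime = baseline_hours / vm.speedup  # assumed linear scaling
        if runtime > deadline_hours:
            continue  # violates the deadline constraint
        cost = runtime * vm.price_per_hour
        if cost < best_cost:
            best, best_cost = vm, cost
    return best, best_cost


catalogue = [
    VmType("small", 0.10, 1.0),
    VmType("medium", 0.25, 2.0),
    VmType("large", 0.60, 4.0),
]
choice, cost = cheapest_under_deadline(catalogue, baseline_hours=10, deadline_hours=6)
print(choice.name, cost)
```

In this toy catalogue the slowest machine is the cheapest per job but misses the deadline, which is the kind of trade-off that makes the choice non-trivial once workflows contain many dependent tasks.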
Fault-diagnosis for reciprocating compressors using big data and machine learning
Persian title (translated): Fault diagnosis for reciprocating compressors using big data and machine learning - 2018
Reciprocating compressors are widely used in the petroleum industry. A small fault in a reciprocating compressor may cause serious issues in operation. Traditional regular maintenance and fault diagnosis solutions cannot efficiently detect potential faults in reciprocating compressors. This paper proposes a fault-diagnosis system for reciprocating compressors. It applies machine-learning techniques to data analysis and fault diagnosis. The raw data is denoised first. Then the denoised data is sparse coded to train a dictionary. Based on the learned dictionary, potential faults are finally recognized and classified by a support vector machine (SVM). The system is evaluated using 5-year operation data collected from an offshore oil corporation in a cloud environment. The collected data is evenly divided into two halves. One half is used for training, and the other half is used for testing. The results demonstrate that the proposed system can efficiently diagnose potential faults in compressors with more than 80% accuracy, which represents a better result than the current practice.
Keywords: Reciprocating compressor, Big data, Cloud computing, Deep learning, RPCA, SVM
Graph grammars according to the type of input and manipulated data: A survey
Persian title (translated): Graph grammars according to the type of input and manipulated data: A survey - 2018
Graph grammars, which generate graphs, are a generalization of Chomsky grammars, which generate strings. During the last decades there has been a remarkable development of graph grammars. Due to their wide diversity of applications, graph grammars have received particular attention from many scientists and researchers. There have been applications of graph grammars in several areas such as pattern recognition, database systems, biological developments in organisms, semantics of programming languages, compiler construction, software development environments, etc. In the literature, some surveys have studied and classified graph grammars according to criteria such as parallel or sequential applicability of rules, embedding mechanism, type of generated graphs, etc. In addition, as data play an increasingly important role in different domains, we survey in this paper the vast field of graph grammars by classifying them according to three criteria: the number of manipulated data (single or multiple types), the nature of data (structured or unstructured), and finally the kind of data (images, graphs, patterns, etc.). In particular, we consider that a graph grammar is well defined by five components instead of four, namely: the type of generated graphs (TG), a start graph (Z), a set of production rules (P), a set of additional specifications of the rules (A), and the criterion that we additionally consider, which is the type of input and manipulated data (TD). This proposed formalism, especially with the added fifth component, may serve to overcome some issues related to the Big Data and Cloud Computing domains.
Keywords: Graph grammar, Type of input and manipulated data, Type of generated graph, Big Data, Cloud computing, Application
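The five-component definition (TG, Z, P, A, TD) given in the abstract above can be pictured as a simple record. The Python sketch below is only one possible encoding, written for illustration: the class and field names, the string-valued TG and TD, and the dictionary used for A are assumptions, not the survey's notation.

```python
from dataclasses import dataclass, field


@dataclass
class Graph:
    nodes: set = field(default_factory=set)
    edges: set = field(default_factory=set)  # (source, target) pairs


@dataclass
class Production:
    left: Graph   # subgraph to be matched
    right: Graph  # subgraph that replaces the match


@dataclass
class GraphGrammar:
    generated_type: str                                  # TG: type of generated graphs
    start: Graph                                         # Z: start graph
    productions: list = field(default_factory=list)      # P: production rules
    specifications: dict = field(default_factory=dict)   # A: additional rule specifications
    data_type: str = "graphs"                            # TD: type of input and manipulated data


grammar = GraphGrammar(
    generated_type="directed",
    start=Graph(nodes={"S"}),
    productions=[
        Production(
            left=Graph(nodes={"S"}),
            right=Graph(nodes={"a", "b"}, edges={("a", "b")}),
        )
    ],
    specifications={"application": "sequential"},
    data_type="structured, single type",
)
print(grammar.generated_type, len(grammar.productions), grammar.data_type)
```

The record only stores the components; applying a production (matching the left side and embedding the right side) is deliberately left out, since the embedding mechanism is one of the points on which the surveyed grammars differ.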
IoV distributed architecture for real-time traffic data analytics
Persian title (translated): IoV distributed architecture for real-time traffic data analytics - 2018
In this paper, we present the necessary premises for the deployment of the Internet of Vehicles (IoV) integrating Big Data analytics of road network traffic measurements of the city of Mohammedia, Morocco. We introduce an architecture based on three main layers: the IoV, Fog Computing and Cloud Computing layers. We focus specifically on the Fog Computing layer, in which we develop a framework for collecting and processing, in real time, events generated by intelligent vehicles, as well as for visualizing the traffic state on each road section. Furthermore, we consider the deployment and testing of the proposed framework using events retrieved from a Vanets-type micro simulation. Finally, we present and discuss the first results obtained, as well as the advantages and limitations of the proposed architecture.
Keywords: IoV, Big Data analytics, Fog computing, Real-time data analytics, Traffic control
Fall detection system for elderly people using IoT and Big Data
Persian title (translated): Fall detection system for elderly people using IoT and Big Data - 2018
Falls represent a major public health risk worldwide for elderly people. A fall not assisted in time can cause functional impairment in an elder and a significant decrease in his or her mobility, independence and quality of life. In that sense, the present work proposes an innovative IoT-based system for detecting falls of elderly people in indoor environments, which takes advantage of low-power wireless sensor networks, smart devices, big data and cloud computing. For this purpose, a 3D-axis accelerometer embedded into a wearable 6LowPAN device is used, which is responsible for collecting data from the movements of elderly people in real time. To provide high efficiency in fall detection, the sensor readings are processed and analyzed using a decision-tree-based Big Data model running on a Smart IoT Gateway. If a fall is detected, an alert is activated and the system reacts automatically by sending notifications to the groups responsible for the care of the elderly people. Finally, the system provides services built on the cloud. From a medical perspective, there is a storage service that enables healthcare professionals to access fall data for further analysis. In addition, the system provides a service that leverages this data to create a new machine learning model each time a fall is detected. The results of experiments have shown high success rates in fall detection in terms of accuracy, precision and gain.
Keywords: Fall detection; Internet-of-Things; Big Data; 6LowPAN; wearable sensor; Smart IoT Gateway; decision tree learning algorithm; accelerometer; elderly people.
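The detection step in the abstract above (accelerometer readings classified on a gateway) can be sketched with a hand-written two-level decision rule. This Python example is a stand-in, not the paper's learned decision-tree model: the feature (acceleration magnitude in g), the thresholds `impact_g` and `still_tol_g`, and the "impact followed by stillness" rule are assumptions chosen for illustration.

```python
import math


def magnitude(sample):
    """Euclidean norm of one (x, y, z) accelerometer sample, in g."""
    x, y, z = sample
    return math.sqrt(x * x + y * y + z * z)


def looks_like_fall(window, impact_g=2.5, still_tol_g=0.2):
    """Hand-written two-level decision rule over a window of samples.

    Level 1: is there an impact peak of at least impact_g?
    Level 2: do the last (up to three) readings after the peak stay
             near 1 g, i.e. the wearer is lying still?
    Thresholds are hypothetical, not taken from the paper.
    """
    mags = [magnitude(s) for s in window]
    peak = max(range(len(mags)), key=mags.__getitem__)
    if mags[peak] < impact_g:
        return False
    after = mags[peak + 1:]
    if not after:
        return False  # nothing observed after the impact yet
    return all(abs(m - 1.0) <= still_tol_g for m in after[-3:])


standing = [(0.0, 0.0, 1.0)] * 10
fall = [(0.0, 0.0, 1.0)] * 3 + [(0.0, 0.0, 3.0)] + [(1.0, 0.0, 0.0)] * 4
print(looks_like_fall(standing), looks_like_fall(fall))
```

A learned decision tree would pick its own features and split points from labelled data; the fixed thresholds here only show the shape of the decision.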
A course on big data analytics
Persian title (translated): A course on big data analytics - 2018
This report details a course on big data analytics designed for undergraduate junior and senior computer science students. The course is heavily focused on projects and writing code for big data processing. It is designed to help students learn parallel and distributed computing frameworks and techniques commonly used in industry. The curriculum includes a progression of projects requiring increasingly sophisticated big data processing, ranging from data preprocessing with Linux tools, to distributed processing with Hadoop MapReduce and Spark, to database queries with Hive and Google’s BigQuery. We discuss hardware infrastructure and experimentally evaluate the cost/benefit of an on-premise server versus Amazon’s Elastic MapReduce. Finally, we showcase outcomes of our course in terms of student engagement and anonymous student feedback.
Keywords: Curriculum, Undergraduate education, Big data, Cloud computing
SLA based healthcare big data analysis and computing in cloud network
Persian title (translated): SLA based healthcare big data analysis and computing in cloud network - 2018
Large volumes of multi-structured, low-latency patient data are generated in healthcare services, and processing and analyzing them within the Service Level Agreement (SLA) is a challenging task. In this paper, a Parallel Semi-Naive Bayes (PSNB) based probabilistic method is used to process healthcare big data in the cloud for future health condition prediction. To improve the accuracy of the PSNB method, a Modified Conjunctive Attribute (MCA) algorithm is proposed for reducing the dimension. The emergency condition of the patient is considered by setting a global priority among the patients, and an Optimal Data Distribution (ODD) algorithm is proposed to position both batch and streaming patient data on the Spark nodes. Further, a Dynamic Job Scheduling (DJS) algorithm is designed to schedule the jobs efficiently to the most suitable nodes for processing the data, taking the SLA into account. Our proposed PSNB algorithm provides a better accuracy of 87.8% for both batch and streaming data, which is 12.8% higher than the original Naive Bayes (NB) algorithm, and can conveniently be employed in various patient monitoring applications.
Keywords: Big Data, cloud computing, healthcare, Spark