Download and view articles related to MapReduce :: Page 1
Download the best ISI articles with Persian translation


Search results - MapReduce

Number of articles found: 100
Row | Title | Type
1 Pivot-based approximate k-NN similarity joins for big high-dimensional data (2020)
Given an appropriate similarity model, the k-nearest neighbor similarity join represents a useful yet costly operator for data mining, data analysis and data exploration applications. The time to evaluate the operator depends on the size of datasets, data distribution and the dimensionality of data representations. For vast volumes of high-dimensional data, only distributed and approximate approaches make the joins practically feasible. In this paper, we investigate and evaluate the performance of multiple MapReduce-based approximate k-NN similarity join approaches on two leading Big Data systems, Apache Hadoop and Spark. Focusing on the metric space approach relying on reference dataset objects (pivots), this paper investigates distributed similarity join techniques with and without approximation guarantees and also proposes high-dimensional extensions to previously proposed algorithms. The paper describes the design guidelines, algorithmic details, and key theoretical underpinnings of the compared approaches and also presents the empirical performance evaluation, approximation precision, and scalability properties of the implemented algorithms. Moreover, the Spark source code of all these algorithms has been made publicly available. Key findings of the experimental analysis are that randomly initialized pivot-based methods perform well with big high-dimensional data and that, in general, the selection of the best algorithm depends on the desired levels of approximation guarantee, precision and execution time. (See the sketch after this entry.)
Keywords: Hadoop | Spark | MapReduce | k-NN | Approximate similarity join | High-dimensional data
English article
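To make the pivot-based idea in the abstract above concrete, here is a minimal, single-machine Python sketch of an approximate k-NN self-join: objects are routed to their nearest randomly chosen pivot (the "map" side) and nearest neighbours are then searched only within each pivot's partition (the "reduce" side). The function name, parameters and toy data are illustrative assumptions, not code from the paper, which targets Hadoop and Spark.

```python
import random
from math import dist  # Euclidean distance, Python 3.8+

def approx_knn_self_join(data, num_pivots=4, k=3, seed=0):
    """Approximate k-NN self-join: partition objects by nearest pivot, then join locally."""
    rng = random.Random(seed)
    pivots = rng.sample(data, num_pivots)              # randomly initialized pivots
    partitions = [[] for _ in range(num_pivots)]
    for x in data:                                     # "map": route each object to its closest pivot
        nearest = min(range(num_pivots), key=lambda i: dist(x, pivots[i]))
        partitions[nearest].append(x)
    result = {}
    for members in partitions:                         # "reduce": brute-force k-NN inside each partition
        for x in members:
            result[x] = sorted((y for y in members if y != x),
                               key=lambda y: dist(x, y))[:k]
    return result

points = [(random.random(), random.random()) for _ in range(200)]
knn = approx_knn_self_join(points, num_pivots=5, k=3)
print(len(knn), "objects joined; sample:", points[0], "->", knn[points[0]])
```

Because neighbours are only searched inside one partition, objects near a partition boundary may miss some true neighbours; that is the approximation the paper quantifies and bounds.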
2 MapReduce based tipping point scheduler for parallel image processing (2020)
Nowadays, Big Data image processing is very much in demand due to its proven success in the fields of business information systems, medical science and social media. However, as the days pass, the computation of Big Data images is becoming more complex, which ultimately results in complex resource management and higher task execution times. Researchers have been using a combination of CPU and GPU based computing to cut down the execution time; however, when it comes to scaling of compute nodes, the combination of CPU and GPU based computing still remains a challenge due to the high communication cost factor. In order to tackle this issue, the Map-Reduce framework has come out to be a viable option, as its workflow optimization can be enhanced by changing its underlying job scheduling mechanism. This paper presents a comparative study of job scheduling algorithms which could be deployed over various Big Data based image processing applications and also proposes a tipping point scheduling algorithm to optimize the workflow for job execution on multiple nodes. The evaluation of the proposed scheduling algorithm is done by implementing a parallel image segmentation algorithm to detect lung tumors on image datasets of up to 3 GB. In terms of performance, comprising task execution time and throughput, the proposed tipping point scheduler has come out to be the best scheduler, followed by the Map-Reduce based Fair scheduler. The proposed tipping point scheduler is 1.14 times better than the Map-Reduce based Fair scheduler and 1.33 times better than the Map-Reduce based FIFO scheduler in terms of task execution time and throughput. In terms of speedup comparison between a single node and multiple nodes, the proposed tipping point scheduler attained a speedup of 4.5x for the multi-node architecture. (See the sketch after this entry.)
Keywords: Job scheduler | Workflow optimization | Map-Reduce | Tipping point scheduler | Parallel image segmentation | Lung tumor
English article
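The tipping point scheduler itself is not specified in the abstract, so the sketch below only illustrates the two Map-Reduce baselines it is compared against: a FIFO scheduler that runs jobs strictly in arrival order, and a heavily simplified Fair scheduler modelled as a static split of the nodes between concurrent jobs. The simulation model and all numbers are illustrative assumptions, not the paper's setup.

```python
def run_on(task_durations, num_nodes, start=0.0):
    """Greedily assign tasks to the earliest-free node; return the finish time of the job."""
    nodes = [start] * num_nodes
    for d in task_durations:
        i = nodes.index(min(nodes))
        nodes[i] += d
    return max(nodes)

job_a = [4, 4, 4, 4]      # a long job (e.g. large image tiles), illustrative durations
job_b = [1, 1, 1, 1]      # a short job submitted at the same time

# FIFO: job A occupies both nodes first, job B only starts after A has finished.
a_done = run_on(job_a, num_nodes=2)
b_done = run_on(job_b, num_nodes=2, start=a_done)
print("FIFO :", {"A": a_done, "B": b_done})

# Fair (static split): each job gets one of the two nodes for its whole lifetime,
# so the short job finishes early at the cost of the long one.
print("Fair :", {"A": run_on(job_a, num_nodes=1), "B": run_on(job_b, num_nodes=1)})
```

The trade-off this exposes (short-job latency versus long-job completion time) is the kind of behaviour a scheduler comparison like the one in the paper measures via task execution time and throughput.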
3 Distributed mining of high utility time interval sequential patterns using MapReduce approach (2020)
High Utility Sequential Pattern mining (HUSP) algorithms aim to find all the high utility sequences in a sequence database. Due to the large explosion of data, a few distributed algorithms have recently been designed for mining HUSPs based on the MapReduce framework. However, existing HUSP algorithms such as USpan, HUS-Span and BigHUSP are able to predict only the order of items; they do not predict the time between the items, that is, they do not include the time intervals between successive items. But in a real-world scenario, time interval patterns provide more valuable information than conventional high utility sequential patterns. Therefore, we propose a distributed high utility time interval sequential pattern mining (DHUTISP) algorithm using the MapReduce approach that is suitable for big data. DHUTISP creates a novel time interval utility linked list data structure (TIUL) to efficiently calculate the utility of the resulting patterns. Moreover, two utility upper bounds, namely, the remaining utility upper bound (RUUB) and the co-occurrence utility upper bound (CUUB), are proposed to prune unpromising candidates. We conducted various experiments to prove the efficiency of the proposed algorithm over both distributed and non-distributed approaches. The experimental results show the efficiency of DHUTISP over state-of-the-art algorithms, namely, BigHUSP, AHUS-P, PUSOM and UTMining_A. (See the sketch after this entry.)
Keywords: Big data | High utility itemset mining | High utility sequential pattern mining | Time interval sequential pattern mining | MapReduce framework
English article
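As a rough illustration of utility computation and upper-bound pruning in high utility sequential pattern mining, the sketch below scores candidate patterns against a toy quantitative sequence database and prunes candidates whose sequence-weighted utility (a classic over-estimate) falls below a minimum utility threshold. It uses a greedy first-occurrence match and a generic bound rather than DHUTISP's TIUL structure or its RUUB/CUUB bounds; names, data and the threshold are illustrative.

```python
# Toy quantitative sequence database: each sequence is a list of (item, utility) events.
db = [
    [("a", 5), ("b", 2), ("c", 8)],
    [("a", 3), ("c", 1)],
    [("b", 4), ("c", 6), ("a", 2)],
]

def pattern_utility(pattern, seq):
    """Utility of an ordered pattern in one sequence, using a greedy first-occurrence match
    (a real HUSP miner would maximize over all occurrences)."""
    total, pos = 0, 0
    for wanted in pattern:
        while pos < len(seq) and seq[pos][0] != wanted:
            pos += 1
        if pos == len(seq):
            return 0                      # pattern does not occur in this sequence
        total += seq[pos][1]
        pos += 1
    return total

def upper_bound(pattern, db):
    """Sequence-weighted utility: sum of whole-sequence utilities over sequences
    containing the pattern (an over-estimate, hence safe for pruning)."""
    return sum(sum(u for _, u in seq) for seq in db if pattern_utility(pattern, seq) > 0)

min_util = 15
for candidate in [("a",), ("a", "c"), ("b", "a")]:
    bound = upper_bound(candidate, db)
    if bound < min_util:
        print(candidate, "pruned: bound", bound, "< min_util")
    else:
        print(candidate, "utility", sum(pattern_utility(candidate, s) for s in db), "bound", bound)
```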
4 BAMHealthCloud: A biometric authentication and data management system for healthcare data in cloud (2020)
Advancements in the healthcare industry have given rise to security threats to the ever growing e-medical data. The healthcare data management system records patients' data in different formats such as text, numbers, pictures and videos, leading to data which is big and unstructured. Also, hospitals may have several branches in different geographical locations. Sometimes, for research purposes, there is a need to integrate patients' health data stored at different locations. In view of this, a cloud-based healthcare management system can be an effective solution for efficient health care data management. But the major concern of a cloud-based healthcare system is the security aspect. It includes theft of identity, tax fraud, bank fraud, insurance fraud, medical fraud and defamation of high profile patients. Hence, secure data access and retrieval is needed in order to protect critical medical records in the healthcare management system. A biometric based authentication mechanism is suitable in this scenario since it overcomes the limitations of token theft and forgotten passwords in the conventional token id-password mechanism used for providing security. It also has a high accuracy rate for secure data access and retrieval. In the present paper, a cloud-based system for management of healthcare data, BAMHealthCloud, is proposed, which ensures the security of e-medical data access through behavioral biometric signature-based authentication. Training of the signature samples for authentication has been performed in parallel on the Hadoop MapReduce framework using a Resilient Backpropagation neural network. From rigorous experiments, it can be concluded that the system achieves a speedup of 9 times, an equal error rate (EER) of 0.12, a sensitivity of 0.98 and a specificity of 0.95. Performance comparison of the system with other state-of-the-art algorithms shows that the proposed system performs better than the existing systems in the literature. (See the sketch after this entry.)
Keywords: Biometric | Authentication | Healthcare | Cloud | Healthcare cloud | Hadoop
English article
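The abstract reports an equal error rate, sensitivity and specificity for the signature verifier; the sketch below shows how these quantities are typically derived from genuine and impostor match scores (sensitivity = 1 - FRR, specificity = 1 - FAR). The scores and the threshold sweep are illustrative assumptions, not data from the paper.

```python
def far_frr(genuine, impostor, threshold):
    """Accept when score >= threshold.  FAR: impostors accepted; FRR: genuine users rejected.
    Sensitivity = 1 - FRR, specificity = 1 - FAR."""
    far = sum(s >= threshold for s in impostor) / len(impostor)
    frr = sum(s < threshold for s in genuine) / len(genuine)
    return far, frr

def equal_error_rate(genuine, impostor):
    """Sweep candidate thresholds and return the one where FAR and FRR are closest."""
    best = None
    for t in sorted(set(genuine) | set(impostor)):
        far, frr = far_frr(genuine, impostor, t)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, t, (far + frr) / 2)
    return best[1], best[2]                   # threshold and approximate EER

# Toy matcher scores (higher = more similar); purely illustrative values.
genuine_scores  = [0.91, 0.84, 0.88, 0.95, 0.70, 0.77]
impostor_scores = [0.30, 0.45, 0.52, 0.61, 0.72, 0.40]
t, eer = equal_error_rate(genuine_scores, impostor_scores)
print("threshold:", t, "EER ~", round(eer, 3))
```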
5 A new MapReduce solution for associative classification to handle scalability and skewness in vertical data structure (2020)
Associative classification is a promising methodology in information mining that uses association rule discovery procedures to build the classifier. However, such classifiers have some limitations: they are not able to handle big data because of memory constraints, high time complexity, load imbalance and data skewness. Data skewness occurs invariably when big data analytics comes into the picture and affects the efficiency of an approach. This paper presents a MapReduce solution for associative classification with respect to the vertical data layout. To handle these problems we propose two algorithms, MR-MCAR-F (MapReduce-Multi Class Associative Classifier-MapReduce fast algorithm) and MR-MCAR-L (MapReduce-Multi Class Associative Classifier Load parallel frequent pattern growth algorithm). A MapReduce solution for Tid lists and database coverage is also proposed. We use three types of pruning techniques, viz. database coverage, global and distributed pruning. The proposed approaches have been compared with the latest approaches from the literature in terms of accuracy, computation time and data skewness. The existing scalable approaches cannot handle skewness, while our proposed method handles it in a very effective manner. All the experiments have been performed on six datasets extracted from UCI repositories on the Hadoop framework. The proposed algorithms are scalable solutions for associative classification to handle big data and data skewness. (See the sketch after this entry.)
Keywords: Associative classification | Scalability | Data skewness | Load balancing | Big data | Hadoop
English article
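A minimal sketch of the vertical data layout the abstract refers to: each item keeps a Tid list (the set of transaction ids containing it), and the support of any itemset or class association rule is the size of the intersection of the relevant Tid lists. The toy transactions and the rule are illustrative; the paper's MR-MCAR algorithms distribute this computation over MapReduce.

```python
from functools import reduce

transactions = {                       # toy labelled transactions
    1: {"a", "b", "c", "yes"},
    2: {"a", "c", "yes"},
    3: {"b", "c", "no"},
    4: {"a", "b", "yes"},
}

# Vertical layout: every item maps to its Tid list (the set of transactions containing it).
tid = {}
for t, items in transactions.items():
    for item in items:
        tid.setdefault(item, set()).add(t)

def support(itemset):
    """Support count of an itemset = size of the intersection of its Tid lists."""
    return len(reduce(set.intersection, (tid[i] for i in itemset)))

# A class association rule {a, c} -> yes evaluated straight from the Tid lists.
antecedent, label = {"a", "c"}, "yes"
rule_support = support(antecedent | {label})
confidence = rule_support / support(antecedent)
print("support:", rule_support, "confidence:", confidence)
```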
6 A big-data oriented recommendation method based on multi-objective optimization (2019)
Due to its successful application in recommender systems, collaborative filtering (CF) has become a hot research topic in data mining. For traditional CF-based recommender systems, the accuracy of recommendation results can be guaranteed while diversity is lost. An ideal recommender system should deliver both accurate and diverse recommendations. Faced with this accuracy-diversity dilemma, we propose a novel recommendation method based on the MapReduce framework. In the MapReduce framework, a block computational technique is used to shorten the operational time, and an improved collaborative filtering model is refined with a novel similarity computation process that considers many factors. By translating the procedure of generating personalized recommendation results into a multi-objective optimization problem, the multiple conflicts between accuracy and diversity are well handled. The experimental results demonstrate that our method outperforms other state-of-the-art methods. (See the sketch after this entry.)
Keywords: Recommender systems | Multi-objective optimization | MapReduce | Accuracy | Diversity
English article
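A small sketch of the accuracy-diversity trade-off mentioned in the abstract: item-item cosine similarities are computed from a toy rating matrix, and candidate items are scored by a weighted blend of similarity to the user's liked items (accuracy) and dissimilarity to items already recommended (diversity). The scalarized blend and the alpha weight are simplifying assumptions; the paper instead formulates this as a genuine multi-objective optimization over MapReduce.

```python
from math import sqrt

ratings = {                                    # user -> {item: rating}, toy data
    "u1": {"i1": 5, "i2": 3},
    "u2": {"i1": 4, "i2": 5, "i3": 4},
    "u3": {"i2": 2, "i3": 5, "i4": 4},
    "u4": {"i1": 5, "i4": 3},
}

# Item profiles: item -> {user: rating}
items = {}
for user, rated in ratings.items():
    for item, r in rated.items():
        items.setdefault(item, {})[user] = r

def cosine(a, b):
    """Cosine similarity between two item profiles (dicts of user -> rating)."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    num = sum(a[u] * b[u] for u in common)
    return num / (sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values())))

alpha = 0.7                                    # accuracy vs. diversity weight (illustrative)
seen = ratings["u1"]
liked = [i for i, r in seen.items() if r >= 4]

recommended = []
candidates = set(items) - set(seen)
while candidates and len(recommended) < 2:
    def score(c):
        accuracy = sum(cosine(items[c], items[i]) for i in liked) / len(liked)
        diversity = 1.0 if not recommended else 1 - max(cosine(items[c], items[r]) for r in recommended)
        return alpha * accuracy + (1 - alpha) * diversity
    best = max(candidates, key=score)
    recommended.append(best)
    candidates.remove(best)
print("recommendations for u1:", recommended)
```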
7 PUMA: Parallel subspace clustering of categorical data using multi-attribute weights (2019)
There are two main reasons why traditional clustering schemes are incompetent for high-dimensional categorical data. First, traditional methods usually represent each cluster by all dimensions without difference; and second, traditional clustering methods rely only on an individual dimension of projection as an attribute's weight, ignoring relevance among attributes. We solve these two problems with a MapReduce-based subspace clustering algorithm (called PUMA) using multi-attribute weights. The attribute subspaces are constructed in PUMA by calculating an attribute-value weight based on the co-occurrence probability of attribute values among different dimensions. PUMA obtains sub-clusters corresponding to respective attribute subspaces from each computing node in parallel. Lastly, PUMA measures various scale clusters by applying the hierarchical clustering method to iteratively merge sub-clusters. We implement PUMA on a 24-node Hadoop cluster. Experimental results reveal that using multi-attribute weights with subspace clustering can achieve better clustering accuracy on both synthetic and real-world high-dimensional datasets. Experimental results also show that PUMA achieves high performance in terms of extensibility, scalability and nearly linear speedup with respect to the number of nodes. Additionally, experimental results demonstrate that PUMA is reasonable, effective, and practical for expert systems such as knowledge acquisition, word sense disambiguation, automatic abstracting and recommender systems. (See the sketch after this entry.)
Keywords: Parallel subspace clustering | Multi-attribute weights | High dimension | Categorical data | MapReduce
English article
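A simplified illustration of an attribute-value weight derived from co-occurrence across dimensions, the idea PUMA builds its attribute subspaces on: for each categorical value, we average, over the other attributes, the empirical probability of the most frequent co-occurring value. The exact weighting formula and the toy records are assumptions, not PUMA's definition.

```python
from collections import Counter

# Toy categorical dataset: each record is a dict of attribute -> value.
records = [
    {"colour": "red",  "shape": "circle", "size": "small"},
    {"colour": "red",  "shape": "circle", "size": "large"},
    {"colour": "blue", "shape": "square", "size": "small"},
    {"colour": "red",  "shape": "square", "size": "small"},
]

def value_weight(attr, value, records):
    """Simplified co-occurrence weight of one attribute value: for each other attribute,
    take the empirical probability of its most frequent value among records holding
    (attr = value), then average those probabilities."""
    subset = [r for r in records if r[attr] == value]
    others = [a for a in records[0] if a != attr]
    probs = []
    for other in others:
        counts = Counter(r[other] for r in subset)
        probs.append(counts.most_common(1)[0][1] / len(subset))
    return sum(probs) / len(probs)

for attr in records[0]:
    for value in sorted({r[attr] for r in records}):
        print(f"{attr}={value}: weight {value_weight(attr, value, records):.2f}")
```

Values whose presence makes the rest of the record more predictable get higher weights, which is the kind of signal a subspace clusterer can use to decide which dimensions matter for a cluster.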
8 Toward modeling and optimization of features selection in Big Data based social Internet of Things (2018)
The growing gap between users and Big Data analytics requires innovative tools that address the challenges posed by big data volume, variety, and velocity. It has become computationally inefficient to analyze and select features from such massive volumes of data. Moreover, advancements in the field of Big Data applications and data science pose additional challenges, where the selection of appropriate features and a High-Performance Computing (HPC) solution have become key issues and have attracted attention in recent years. Therefore, keeping in view the needs above, there is a requirement for a system that can efficiently select features and analyze a stream of Big Data within its requirements. Hence, this paper presents a system architecture that selects features by using the Artificial Bee Colony (ABC) algorithm. Moreover, a Kalman filter is used in the Hadoop ecosystem for the removal of noise. Furthermore, traditional MapReduce with ABC is used to enhance processing efficiency. A complete four-tier architecture is also proposed that efficiently aggregates the data, eliminates unnecessary data, and analyzes the data with the proposed Hadoop-based ABC algorithm. To check the efficiency of the proposed algorithms exploited in the proposed system architecture, we have implemented our proposed system using Hadoop and MapReduce with the ABC algorithm. The ABC algorithm is used to select features, whereas MapReduce is supported by a parallel algorithm that efficiently processes a huge volume of data sets. The system is implemented using the MapReduce tool on top of the Hadoop parallel nodes in near real time. Moreover, the proposed system is compared with swarm approaches and is evaluated in terms of efficiency, accuracy and throughput by using ten different data sets. The results show that the proposed system is more scalable and efficient in selecting features. (See the sketch after this entry.)
Keywords: SIoT | Big Data | ABC algorithm | Feature selection
English article
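A dependency-free sketch of Artificial Bee Colony search over binary feature masks, the selection mechanism named in the abstract: employed bees perturb their food source (flip one feature), onlookers revisit sources in proportion to fitness, and exhausted sources are abandoned by scouts. The toy fitness function and all parameters are illustrative assumptions; the paper evaluates feature subsets on real data sets inside Hadoop/MapReduce.

```python
import random

def abc_feature_selection(n_features, fitness, n_sources=6, limit=3, iters=30, seed=1):
    """Tiny Artificial Bee Colony sketch: a food source is a binary feature mask,
    and bees explore by flipping a single feature at a time."""
    rng = random.Random(seed)
    sources = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(n_sources)]
    trials = [0] * n_sources

    def try_improve(i):
        cand = sources[i][:]
        cand[rng.randrange(n_features)] ^= 1            # neighbour: flip one feature
        if fitness(cand) > fitness(sources[i]):
            sources[i], trials[i] = cand, 0
        else:
            trials[i] += 1

    for _ in range(iters):
        for i in range(n_sources):                      # employed bee phase
            try_improve(i)
        fits = [max(fitness(s), 0.0) for s in sources]  # clamp for roulette selection
        total = sum(fits) or 1.0
        for _ in range(n_sources):                      # onlooker phase: fitness-proportional choice
            r, acc = rng.uniform(0, total), 0.0
            for i, f in enumerate(fits):
                acc += f
                if acc >= r:
                    try_improve(i)
                    break
        for i in range(n_sources):                      # scout phase: abandon exhausted sources
            if trials[i] > limit:
                sources[i] = [rng.randint(0, 1) for _ in range(n_features)]
                trials[i] = 0
    return max(sources, key=fitness)

# Toy fitness: features 0 and 2 are "relevant", every selected feature carries a small penalty.
relevant = {0, 2}
toy_fitness = lambda mask: sum(mask[i] for i in relevant) - 0.1 * sum(mask)
print("selected mask:", abc_feature_selection(n_features=6, fitness=toy_fitness))
```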
9 A new architecture of Internet of Things and big data ecosystem for secured smart healthcare monitoring and alerting system (2018)
Wearable medical devices with sensors continuously generate enormous data, often called big data, mixed with structured and unstructured data. Due to the complexity of the data, it is difficult to process and analyze the big data for finding valuable information that can be useful in decision making. On the other hand, data security is a key requirement in a healthcare big data system. In order to overcome these issues, this paper proposes a new architecture for the implementation of IoT to store and process scalable sensor data (big data) for health care applications. The proposed architecture consists of two main sub-architectures, namely, Meta Fog-Redirection (MF-R) and Grouping and Choosing (GC) architecture. The MF-R architecture uses big data technologies such as Apache Pig and Apache HBase for collection and storage of the sensor data (big data) generated from different sensor devices. The proposed GC architecture is used for securing the integration of fog computing with cloud computing. This architecture also uses a key management service and a data categorization function (Sensitive, Critical and Normal) for providing security services. The framework also uses a MapReduce based prediction model to predict heart disease. Performance evaluation parameters such as throughput, sensitivity, accuracy, and f-measure are calculated to prove the efficiency of the proposed architecture as well as the prediction model. (See the sketch after this entry.)
Keywords: Wireless sensor networks | Internet of Things | Big data analytics | Cloud computing and health care
English article
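The abstract lists throughput, sensitivity, accuracy and f-measure as the evaluation parameters; the sketch below simply shows how these are computed from a confusion matrix and a processing time. All numbers are illustrative, not results from the paper.

```python
def evaluate(tp, tn, fp, fn, records, seconds):
    """Confusion-matrix metrics named in the abstract, plus a simple throughput figure."""
    sensitivity = tp / (tp + fn)                      # recall / true positive rate
    precision   = tp / (tp + fp)
    accuracy    = (tp + tn) / (tp + tn + fp + fn)
    f_measure   = 2 * precision * sensitivity / (precision + sensitivity)
    throughput  = records / seconds                   # records processed per second
    return {"sensitivity": round(sensitivity, 3), "accuracy": round(accuracy, 3),
            "f_measure": round(f_measure, 3), "throughput": round(throughput, 1)}

# Illustrative numbers only, not results from the paper.
print(evaluate(tp=420, tn=460, fp=40, fn=80, records=1_000_000, seconds=125.0))
```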
10 A novel adaptive e-learning model based on Big Data by using competence-based knowledge and social learner activities (2018)
The e-learning paradigm is becoming one of the most important educational methods and a decisive factor for learning and for making learning relevant. However, most existing e-learning platforms offer traditional e-learning systems in which learners access the same evaluation and learning content. In response, Big Data technology in the proposed adaptive e-learning model allows new approaches and new learning strategies to be considered. In this paper, we propose an adaptive e-learning model for providing the most suitable learning content for each learner. This model is based on two levels of adaptive e-learning. The first level involves two steps: (1) determining the relevant future educational objectives through an adequate learner e-assessment method using a MapReduce-based Genetic Algorithm, and (2) generating an adaptive learning path for each learner using a MapReduce-based Ant Colony Optimization algorithm. In the second level, we propose MapReduce-based Social Network Analysis for determining learner motivation and social productivity in order to assign a specific learning rhythm to each learner. Finally, the experimental results show that the presented algorithms implemented in a Big Data environment converge much better than their traditional counterparts. Also, this work describes how Big Data technology transforms the e-learning paradigm. (See the sketch after this entry.)
Keywords: Adaptive e-learning | Big data | MapReduce | Genetic algorithm | Personalized learning path | Ant colony optimization algorithms | Social networks analysis | Motivation and productivity | Learning content
English article
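A toy sketch of why a genetic algorithm maps naturally onto MapReduce, as used in this entry for learner e-assessment and learning-path generation: fitness evaluation of each chromosome is independent, so it is the part Hadoop mappers would execute in parallel, while selection and recombination play the role of the reduce side. The bit-string encoding, the target profile and all parameters are illustrative assumptions, not the paper's design.

```python
import random

rng = random.Random(42)
TARGET = [1, 0, 1, 1, 0, 1, 0, 1]     # toy "ideal" set of learning objects for one learner

def fitness(chromosome):
    """Toy objective: how closely a candidate learning path matches the learner profile."""
    return sum(c == t for c, t in zip(chromosome, TARGET))

def crossover(a, b):
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]

def mutate(c, rate=0.1):
    return [bit ^ (rng.random() < rate) for bit in c]

population = [[rng.randint(0, 1) for _ in TARGET] for _ in range(20)]
for generation in range(30):
    # "Map" step: every fitness evaluation is independent, so this is what Hadoop
    # mappers would run in parallel; the plain built-in map() stands in here.
    scored = sorted(zip(map(fitness, population), population), reverse=True)
    parents = [c for _, c in scored[:10]]              # selection (the "reduce" side)
    population = parents + [mutate(crossover(rng.choice(parents), rng.choice(parents)))
                            for _ in range(10)]

best = max(population, key=fitness)
print("best path:", best, "fitness:", fitness(best))
```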