No. | Title | Type |
---|---|---|
1 |
Pivot-based approximate k-NN similarity joins for big high-dimensional data
(2020) Given an appropriate similarity model, the k-nearest neighbor similarity join represents a useful
yet costly operator for data mining, data analysis and data exploration applications. The time to
evaluate the operator depends on the size of datasets, data distribution and the dimensionality of
data representations. For vast volumes of high-dimensional data, only distributed and approximate
approaches make the joins practically feasible. In this paper, we investigate and evaluate the performance
of multiple MapReduce-based approximate k-NN similarity join approaches on two leading
Big Data systems, Apache Hadoop and Spark. Focusing on the metric space approach relying on
reference dataset objects (pivots), this paper investigates distributed similarity join techniques with
and without approximation guarantees and also proposes high-dimensional extensions to previously
proposed algorithms. The paper describes the design guidelines, algorithmic details, and key theoretical
underpinnings of the compared approaches and also presents the empirical performance evaluation,
approximation precision, and scalability properties of the implemented algorithms. Moreover, the
Spark source code of all these algorithms has been made publicly available. Key findings of the
experimental analysis are that randomly initialized pivot-based methods perform well with big high-dimensional
data and that, in general, the selection of the best algorithm depends on the desired levels
of approximation guarantee, precision and execution time.
Keywords: Hadoop | Spark | MapReduce | k-NN | Approximate similarity join | High-dimensional data |
English article |
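The pivot-based partitioning at the core of these joins can be sketched as follows. This is an illustrative toy, not the paper's Spark implementation; the data, pivot count and k are hypothetical. Objects are routed to their nearest randomly initialized pivot (the map side), and an exact k-NN join runs inside each partition only (the reduce side), which is where the approximation comes from.

```python
import math
import random

def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def pivot_partition(objects, pivots):
    """Map-phase sketch: route each object to its nearest pivot's partition."""
    parts = {i: [] for i in range(len(pivots))}
    for obj in objects:
        nearest = min(range(len(pivots)), key=lambda i: dist(obj, pivots[i]))
        parts[nearest].append(obj)
    return parts

def local_knn_join(partition, k):
    """Reduce-phase sketch: exact k-NN computed inside one partition only."""
    result = {}
    for q in partition:
        others = [o for o in partition if o is not q]
        others.sort(key=lambda o: dist(q, o))
        result[tuple(q)] = [tuple(o) for o in others[:k]]
    return result

random.seed(0)
data = [[random.random() for _ in range(4)] for _ in range(40)]
pivots = random.sample(data, 3)  # randomly initialized pivots
parts = pivot_partition(data, pivots)
joined = {}
for p in parts.values():
    joined.update(local_knn_join(p, k=2))
print(len(joined))  # every object gets an (approximate) 2-NN list
```

Objects whose true neighbors land in another partition are missed, which is the gap the compared variants with approximation guarantees address.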
2 |
MapReduce based tipping point scheduler for parallel image processing
(2020) Nowadays, Big Data image processing is very much in need due to its proven success in the fields of business information systems, medical science and social media. However, as the days pass by, the computation of Big Data images is becoming more complex, which ultimately results in complex resource management and higher task execution time. Researchers have been using a combination of CPU- and GPU-based computing to cut down the execution time; however, when it comes to scaling of compute nodes, the combination of CPU- and GPU-based computing still remains a challenge due to the high communication cost factor. In order to tackle this issue, the Map-Reduce framework has come out to be a viable option, as its workflow optimization can be enhanced by changing its underlying job scheduling mechanism. This paper presents a comparative study of job scheduling algorithms which could be deployed over various Big Data based image processing applications and also proposes a tipping point scheduling algorithm to optimize the workflow for job execution on multiple nodes. The evaluation of the proposed scheduling algorithm is done by implementing a parallel image segmentation algorithm to detect lung tumors on an image dataset of up to 3 GB. In terms of performance, comprising task execution time and throughput, the proposed tipping point scheduler has come out to be the best scheduler, followed by the Map-Reduce based Fair scheduler. The proposed tipping point scheduler is 1.14 times better than the Map-Reduce based Fair scheduler and 1.33 times better than the Map-Reduce based FIFO scheduler in terms of task execution time and throughput. In terms of speedup comparison between single node and multiple nodes, the proposed tipping point scheduler attained a speedup of 4.5X for the multi-node architecture.
Keywords: Job scheduler | Workflow optimization | Map-Reduce | Tipping point scheduler | Parallel image segmentation | Lung tumor |
English article |
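The abstract does not specify the tipping point algorithm itself, but the two Map-Reduce baselines it compares against can be contrasted with a minimal single-slot simulation (job names and task durations here are hypothetical):

```python
from collections import deque

# Hypothetical jobs: name -> list of task durations (time units).
jobs = {"J1": [4, 4, 4, 4], "J2": [1, 1], "J3": [2, 2]}

def fifo_completion(jobs):
    """FIFO: run jobs in submission order; record each job's finish time."""
    t, finish = 0, {}
    for name, tasks in jobs.items():
        t += sum(tasks)
        finish[name] = t
    return finish

def fair_completion(jobs):
    """Fair: round-robin one task per job per pass, so short jobs finish early."""
    queues = {n: deque(ts) for n, ts in jobs.items()}
    t, finish = 0, {}
    while queues:
        for name in list(queues):
            t += queues[name].popleft()
            if not queues[name]:
                finish[name] = t
                del queues[name]
    return finish

print(fifo_completion(jobs))
print(fair_completion(jobs))
```

Under FIFO the long job delays everything behind it, while fair round-robin lets short jobs finish early; this is the trade-off space in which the paper's scheduler comparison operates.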
3 |
Distributed mining of high utility time interval sequential patterns using mapreduce approach
(2020) High Utility Sequential Pattern mining (HUSP) algorithms aim to find all the high utility sequences in a sequence database. Due to the large explosion of data, a few distributed algorithms have recently been designed for mining HUSPs based on the MapReduce framework. However, existing HUSP algorithms such as USpan, HUS-Span and BigHUSP are able to predict only the order of items; they do not predict the time between the items, that is, they do not include the time intervals between successive items. But in a real-world scenario, time interval patterns provide more valuable information than conventional high utility sequential patterns. Therefore, we propose a distributed high utility time interval sequential pattern mining (DHUTISP) algorithm using the MapReduce approach that is suitable for big data. DHUTISP creates a novel time interval utility linked list data structure (TIUL) to efficiently calculate the utility of the resulting patterns. Moreover, two utility upper bounds, namely the remaining utility upper bound (RUUB) and the co-occurrence utility upper bound (CUUB), are proposed to prune unpromising candidates. We conducted various experiments to prove the efficiency of the proposed algorithm over both distributed and non-distributed approaches. The experimental results show the efficiency of DHUTISP over state-of-the-art algorithms, namely BigHUSP, AHUS-P, PUSOM and UTMining_A.
Keywords: Big data | High utility itemset mining | High utility sequential pattern mining | Time interval sequential pattern mining | Mapreduce framework |
English article |
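The extra information that time interval patterns carry over plain high utility sequences can be shown with a toy example (item profits and the sequence are hypothetical; this is not the TIUL structure itself):

```python
# Hypothetical unit profits per item.
UNIT_PROFIT = {"a": 3, "b": 1, "c": 5}

def sequence_utility(events):
    """Utility of one sequence occurrence: sum of quantity * unit profit."""
    return sum(qty * UNIT_PROFIT[item] for _, item, qty in events)

def time_intervals(events):
    """Gaps between successive items: the extra information DHUTISP mines."""
    times = [t for t, _, _ in events]
    return [t2 - t1 for t1, t2 in zip(times, times[1:])]

# One sequence occurrence: (timestamp, item, quantity) triples.
seq = [(0, "a", 2), (3, "b", 4), (7, "c", 1)]
print(sequence_utility(seq))  # 2*3 + 4*1 + 1*5 = 15
print(time_intervals(seq))    # [3, 4]
```

A conventional HUSP miner would keep only the order a -> b -> c and the utility 15; the interval list [3, 4] is what the time-interval extension preserves.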
4 |
BAMHealthCloud: A biometric authentication and data management system for healthcare data in cloud
(2020) Advancements in the healthcare industry have given rise to security threats to the ever-growing e-medical
data. The healthcare data management system records patient’s data in different formats such
as text, numeric, pictures and videos leading to data which is big and unstructured. Also, hospitals
may have several branches in different geographical locations. Sometimes, for research purposes, there
is a need to integrate patients’ health data stored at different locations. In view of this, a cloud-based
healthcare management system can be an effective solution for efficient health care data management.
But the major concern of a cloud-based healthcare system is the security aspect. Threats include identity theft,
tax fraud, bank fraud, insurance fraud, medical fraud and defamation of high-profile patients.
Hence, secure data access and retrieval are needed in order to protect critical medical records
in the healthcare management system. A biometric-based authentication mechanism is suitable in this scenario,
since it overcomes the limitations of token theft and forgotten passwords of the conventional token id-password
mechanism used for providing security. It also has a high accuracy rate for secure data access and
retrieval. In the present paper, a cloud-based system for management of healthcare data
BAMHealthCloud is proposed, which ensures the security of e-medical data access through a behavioral
biometric signature-based authentication. Training of the signature samples for authentication purpose
has been performed in parallel on Hadoop MapReduce framework using Resilient Backpropagation neural
network. From rigorous experiments, it can be concluded that the system achieves a speedup of 9 times, an equal
error rate (EER) of 0.12, a sensitivity of 0.98 and a specificity of 0.95. Performance comparison of the system
with other state-of-the-art algorithms shows that the proposed system performs better than the existing
systems in the literature.
Keywords: Biometric | Authentication | Healthcare | Cloud | Healthcare cloud | Hadoop |
English article |
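The reported metrics (EER, sensitivity, specificity) can be computed from verifier match scores as in this illustrative sketch; the score values are hypothetical and unrelated to the paper's signature data:

```python
# Hypothetical match scores from a signature verifier: higher = more likely genuine.
genuine = [0.91, 0.85, 0.78, 0.88, 0.60, 0.95]
impostor = [0.20, 0.35, 0.10, 0.55, 0.40, 0.65]

def rates(threshold):
    """Sensitivity = genuine accepted; specificity = impostors rejected."""
    tp = sum(s >= threshold for s in genuine)
    tn = sum(s < threshold for s in impostor)
    sens = tp / len(genuine)
    spec = tn / len(impostor)
    far = 1 - spec  # false acceptance rate (impostors let in)
    frr = 1 - sens  # false rejection rate (genuine users locked out)
    return sens, spec, far, frr

def eer():
    """Sweep thresholds; EER is where FAR and FRR are closest."""
    best = min((abs(rates(t / 100)[2] - rates(t / 100)[3]), t / 100)
               for t in range(101))
    t = best[1]
    _, _, far, frr = rates(t)
    return (far + frr) / 2, t

print(rates(0.6))
print(eer())
```

At an operating threshold the system reports sensitivity and specificity; the EER summarizes the whole score distribution in a single threshold-free number.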
5 |
A new MapReduce solution for associative classification to handle scalability and skewness in vertical data structure
(2020) Associative classification is a promising methodology in information mining that uses association
rule discovery procedures to build the classifier. However, they have some limitations: they are not able
to handle big data, as they have memory constraints, high time complexity, load imbalance and data
skewness. Data skewness invariably occurs when big data analytics comes into the picture and affects the
efficiency of an approach. This paper presents the MapReduce solution for associative classification
with respect to a vertical data layout. To handle these problems, we have proposed two algorithms: MR-MCAR-F
(MapReduce-Multi Class Associative Classifier-MapReduce fast algorithm) and MR-MCAR-L
(MapReduce-Multi Class Associative Classifier Load parallel frequent pattern growth algorithm). Also,
in this paper, a MapReduce solution for the Tid List and database coverage has been proposed. We have used
three types of pruning techniques, viz. database coverage, global and distributed pruning. The proposed
approaches have been compared with the latest approaches from the literature in terms of accuracy,
computation time and data skewness. The existing scalable approaches cannot handle skewness, while
our proposed method handles it in a very effective manner. All the experiments have been performed
on six datasets extracted from UCI repositories on the Hadoop framework. The proposed
algorithms are scalable solutions for associative classification to handle big data and data skewness.
Keywords: Associative classification | Scalability | Data skewness | Load balancing | Big data | Hadoop |
English article |
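The vertical data layout central to these algorithms can be sketched in a few lines: each item keeps a Tid list, and support counting becomes a set intersection instead of a database rescan (the transaction data is hypothetical):

```python
# Hypothetical toy transaction database in horizontal form.
transactions = {
    1: {"milk", "bread"},
    2: {"milk", "eggs"},
    3: {"bread", "eggs", "milk"},
    4: {"eggs"},
}

def to_vertical(db):
    """Vertical layout: item -> Tid list (set of transaction ids)."""
    tidlists = {}
    for tid, items in db.items():
        for item in items:
            tidlists.setdefault(item, set()).add(tid)
    return tidlists

def support(itemset, tidlists):
    """Support via Tid-list intersection: no rescan of the raw data."""
    tids = set.intersection(*(tidlists[i] for i in itemset))
    return len(tids)

v = to_vertical(transactions)
print(support({"milk", "bread"}, v))  # transactions 1 and 3 -> 2
```

Skew arises when some Tid lists are much longer than others, which is why the paper pairs the vertical layout with load balancing across reducers.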
6 |
A big-data oriented recommendation method based on multi-objective optimization
(2019) Due to its successful application in recommender systems, collaborative filtering (CF) has become a
hot research topic in data mining. For traditional CF-based recommender systems, the accuracy of
recommendation results can be guaranteed, while diversity is lost. An ideal recommender
system should be built with both accurate and diverse performance. Faced with the accuracy–diversity
dilemma, we propose a novel recommendation method based on the MapReduce framework. In the MapReduce
framework, a block computational technique is used to shorten the operational time. An
improved collaborative filtering model is refined with a novel similarity computational process which
considers many factors. By translating the procedure of generating personalized recommendation
results into a multi-objective optimization problem, the multiple conflicts between accuracy and
diversity are well handled. The experimental results demonstrate that our method outperforms other
state-of-the-art methods.
Keywords: Recommender systems | Multi-objective optimization | MapReduce | Accuracy | Diversity |
English article |
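Casting recommendation as a multi-objective problem over accuracy and diversity can be illustrated with a toy Pareto filter (ratings, similarities and candidate lists are hypothetical; the paper's actual model and optimizer are more involved):

```python
from itertools import combinations

# Hypothetical predicted ratings and item-item similarities for one user.
RATING = {"i1": 4.5, "i2": 4.0, "i3": 3.5, "i4": 4.4}
SIM = {frozenset(p): s for p, s in [
    (("i1", "i2"), 0.9), (("i1", "i3"), 0.1), (("i1", "i4"), 0.8),
    (("i2", "i3"), 0.2), (("i2", "i4"), 0.7), (("i3", "i4"), 0.3),
]}

def accuracy(items):
    """Mean predicted rating of the recommended list."""
    return sum(RATING[i] for i in items) / len(items)

def diversity(items):
    """Mean pairwise dissimilarity of the recommended list."""
    pairs = list(combinations(items, 2))
    return sum(1 - SIM[frozenset(p)] for p in pairs) / len(pairs)

def pareto_front(candidates):
    """Keep lists not dominated on (accuracy, diversity) by any other list."""
    scored = [(accuracy(c), diversity(c), c) for c in candidates]
    front = []
    for a, d, c in scored:
        if not any(a2 >= a and d2 >= d and (a2, d2) != (a, d)
                   for a2, d2, _ in scored):
            front.append(c)
    return front

candidates = [["i1", "i2"], ["i1", "i4"], ["i1", "i3"]]
print(pareto_front(candidates))
```

The dominated list (high accuracy but near-duplicate items) is filtered out, leaving the accuracy-leaning and diversity-leaning lists as the trade-off a multi-objective optimizer would explore.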
7 |
PUMA: Parallel subspace clustering of categorical data using multi-attribute weights
(2019) There are two main reasons why traditional clustering schemes are incompetent for high-dimensional categorical data. First, traditional methods usually represent each cluster by all dimensions without difference; and second, traditional clustering methods rely only on an individual dimension of projection as an attribute’s weight, ignoring relevance among attributes. We solve these two problems with a MapReduce-based subspace clustering algorithm (called PUMA) using multi-attribute weights. The attribute subspaces are constructed in PUMA by calculating an attribute-value weight based on the co-occurrence probability of attribute values among different dimensions. PUMA obtains sub-clusters corresponding to respective attribute subspaces from each computing node in parallel. Lastly, PUMA measures various scale clusters by applying the hierarchical clustering method to iteratively merge sub-clusters. We implement PUMA on a 24-node Hadoop cluster. Experimental results reveal that using multi-attribute weights with subspace clustering can achieve better clustering accuracy on both synthetic and real-world high-dimensional datasets. Experimental results also show that PUMA achieves high performance in terms of extensibility, scalability and nearly linear speedup with respect to the number of nodes. Additionally, experimental results demonstrate that PUMA is reasonable, effective, and practical for expert systems such as knowledge acquisition, word sense disambiguation, automatic abstracting and recommender systems.
Keywords: Parallel subspace clustering | Multi-attribute weights | High dimension | Categorical data | MapReduce |
English article |
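One way to read "attribute-value weight based on the co-occurrence probability of attribute values among different dimensions" is sketched below; the exact formula is an assumption for illustration, as is the toy data (PUMA's actual weighting may differ):

```python
# Hypothetical categorical records: (color, shape, size).
records = [
    ("red", "circle", "small"),
    ("red", "circle", "large"),
    ("red", "square", "small"),
    ("blue", "square", "large"),
]

def cooccurrence_weight(records, dim, value):
    """Weight of one attribute value: mean probability that it co-occurs
    with each value observed in the other dimensions (illustrative formula)."""
    rows = [r for r in records if r[dim] == value]
    if not rows:
        return 0.0
    probs = []
    for other in range(len(records[0])):
        if other == dim:
            continue
        for v in {r[other] for r in rows}:
            both = sum(1 for r in rows if r[other] == v)
            total = sum(1 for r in records if r[other] == v)
            probs.append(both / total)
    return sum(probs) / len(probs)

print(round(cooccurrence_weight(records, 0, "red"), 3))
```

Values that co-occur consistently with values in other dimensions get high weight, so the subspace construction can favor dimensions that are mutually relevant rather than treating all dimensions equally.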
8 |
Toward modeling and optimization of features selection in Big Data based social Internet of Things
(2018) The growing gap between users and Big Data analytics requires innovative tools that address
the challenges posed by big data volume, variety, and velocity. It becomes computationally
inefficient to analyze and select features from such a massive volume of data. Moreover, advancements
in the field of Big Data applications and data science pose additional challenges, where the selection of
appropriate features and a High-Performance Computing (HPC) solution has become a key issue and has
attracted attention in recent years. Therefore, keeping the above needs in view, there is a requirement for
a system that can efficiently select features and analyze a stream of Big Data within its requirements.
Hence, this paper presents a system architecture that selects features by using Artificial Bee Colony (ABC).
Moreover, a Kalman filter is used in the Hadoop ecosystem for noise removal. Furthermore,
traditional MapReduce combined with ABC is used to enhance processing efficiency. Moreover, a complete
four-tier architecture is also proposed that efficiently aggregates the data, eliminates unnecessary data, and
analyzes the data with the proposed Hadoop-based ABC algorithm. To check the efficiency of the proposed
algorithms exploited in the proposed system architecture, we have implemented our proposed system
using Hadoop and MapReduce with the ABC algorithm. The ABC algorithm is used to select features, whereas
MapReduce is supported by a parallel algorithm that efficiently processes a huge volume of data. The
system is implemented using the MapReduce tool on top of Hadoop parallel nodes in near real
time. Moreover, the proposed system is compared with swarm approaches and is evaluated regarding
efficiency, accuracy and throughput using ten different data sets. The results show that the proposed
system is more scalable and efficient in selecting features.
Keywords: SIoT | Big Data | ABC algorithm | Feature selection |
English article |
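A minimal sequential sketch of ABC-style feature selection follows; the real system runs this over MapReduce, and the fitness function, data and parameters here are hypothetical. Food sources are feature subsets, and each employed bee greedily flips one feature in or out of its subset:

```python
import random

random.seed(1)

N_FEATURES = 8
INFORMATIVE = {0, 2, 5}  # hypothetical: only these features carry signal

def fitness(subset):
    """Toy objective: reward informative features, penalize subset size."""
    return 2 * len(subset & INFORMATIVE) - 0.5 * len(subset)

def neighbor(subset):
    """Employed-bee move: flip one randomly chosen feature in/out."""
    f = random.randrange(N_FEATURES)
    return subset ^ {f}

def abc_select(n_bees=5, iters=300):
    # Each food source is a candidate feature subset.
    sources = [set(random.sample(range(N_FEATURES), 3)) for _ in range(n_bees)]
    for _ in range(iters):
        for i, s in enumerate(sources):
            cand = neighbor(s)
            if fitness(cand) > fitness(s):  # greedy acceptance, as in ABC
                sources[i] = cand
    return max(sources, key=fitness)

best = abc_select()
print(sorted(best))
```

With this toy objective the search converges on exactly the informative features; a full ABC additionally includes onlooker and scout bee phases, omitted here for brevity.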
9 |
A new architecture of Internet of Things and big data ecosystem for secured smart healthcare monitoring and alerting system
(2018) Wearable medical devices with sensors continuously generate enormous data, which is often called
big data mixed with structured and unstructured data. Due to the complexity of the data, it is difficult
to process and analyze the big data for finding valuable information that can be useful in decision
making. On the other hand, data security is a key requirement in healthcare big data system. In order
to overcome this issue, this paper proposes a new architecture for the implementation of IoT to store and
process scalable sensor data (big data) for health care applications. The proposed architecture consists
of two main sub architectures, namely, Meta Fog-Redirection (MF-R) and Grouping and Choosing (GC)
architecture. MF-R architecture uses big data technologies such as Apache Pig and Apache HBase for
collection and storage of the sensor data (big data) generated from different sensor devices. The proposed
GC architecture is used for securing integration of fog computing with cloud computing. This architecture
also uses a key management service and a data categorization function (Sensitive, Critical and Normal) for
providing security services. The framework also uses a MapReduce-based prediction model to predict
the heart diseases. Performance evaluation parameters such as throughput, sensitivity, accuracy, and
f-measure are calculated to prove the efficiency of the proposed architecture as well as the prediction
model.
Keywords: Wireless sensor networks | Internet of Things | Big data analytics | Cloud computing and health care |
English article |
10 |
A novel adaptive e-learning model based on Big Data by using competence-based knowledge and social learner activities
(2018) The e-learning paradigm is becoming one of the most important educational methods and a decisive
factor for learning and for making learning relevant. However, most existing e-learning platforms
offer a traditional e-learning system in which learners access the same evaluation and learning content.
In response, Big Data technology in the proposed adaptive e-learning model makes it possible to consider
new approaches and new learning strategies. In this paper, we propose an adaptive e-learning model
for providing the most suitable learning content for each learner. This model is based on two levels of
adaptive e-learning. The first level involves two steps: (1) determining the relevant future educational
objectives through an adequate learner e-assessment method using a MapReduce-based Genetic Algorithm,
and (2) generating an adaptive learning path for each learner using the MapReduce-based Ant Colony
Optimization algorithm. In the second level, we propose MapReduce-based Social Networks Analysis for
determining the learner motivation and social productivity in order to assign a specific learning rhythm
to each learner. Finally, the experimental results show that the presented algorithms, implemented in a
Big Data environment, converge much better than implementations based on traditional concurrent
approaches. This work also provides a key benefit in that it describes how Big Data technology transforms
the e-learning paradigm.
Keywords: Adaptive e-learning | Big data | MapReduce | Genetic algorithm | Personalized learning path | Ant colony optimization algorithms | Social networks analysis | Motivation and productivity | Learning content |
English article |
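The Genetic Algorithm step of the first level can be illustrated with a small sequential GA, where fitness evaluation is the naturally map-parallel part (the target profile, bit-string encoding and parameters are hypothetical, not the paper's actual e-assessment model):

```python
import random

random.seed(7)

TARGET = [1, 0, 1, 1, 0, 1, 0, 0]  # hypothetical ideal learner objective profile

def fitness(chrom):
    """Map-side work in a MapReduce GA: score one chromosome independently."""
    return sum(c == t for c, t in zip(chrom, TARGET))

def evolve(pop_size=20, generations=30):
    pop = [[random.randint(0, 1) for _ in TARGET] for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(pop, key=fitness, reverse=True)  # reduce-side selection
        parents = scored[: pop_size // 2]                # elitism: keep best half
        children = []
        while len(children) < pop_size - len(parents):
            a, b = random.sample(parents, 2)
            cut = random.randrange(1, len(TARGET))
            child = a[:cut] + b[cut:]                    # one-point crossover
            if random.random() < 0.1:                    # mutation
                i = random.randrange(len(TARGET))
                child[i] ^= 1
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)

best = evolve()
print(fitness(best))
```

Because each chromosome's fitness is independent, the evaluation loop is exactly the part a MapReduce implementation would distribute across mappers, with selection and recombination done at the reducer.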