Download and view articles related to data clustering :: Page 1

Search results: data clustering

Number of articles found: 12
No. Title Type
1 A Cryptographic Ensemble for secure third party data analysis: Collaborative data clustering without data owner participation (2019)
This paper introduces the twin concepts Cryptographic Ensembles and Global Encrypted Distance Matrices (GEDMs), designed to provide a solution to outsourced secure collaborative data clustering. The cryptographic ensemble comprises: Homomorphic Encryption (HE) to preserve raw data privacy, while supporting data analytics; and Multi-User Order Preserving Encryption (MUOPE) to preserve the privacy of the GEDM. Clustering can therefore be conducted over encrypted datasets without requiring decryption or the involvement of data owners once encryption has taken place, all with no loss of accuracy. The GEDM concept is applicable to large scale collaborative data mining applications that feature horizontal data partitioning. In the paper DBSCAN clustering is adopted for illustrative and evaluation purposes. The results demonstrate that the proposed solution is both efficient and accurate while maintaining data privacy.
Keywords: Data mining as a service | Privacy preserving data mining | Security | Data outsourcing
English article
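One way to see why clustering over order-preserved distances can match plaintext clustering is that DBSCAN only ever compares distances against a threshold. The sketch below is a minimal illustration under loud assumptions: `toy_order_preserving` is a hypothetical strictly increasing map standing in for an order-preserving scheme, it is not MUOPE and provides no security, and the points, threshold, and function names are invented for the example.

```python
def dbscan_from_distances(dist, eps, min_pts):
    """Plain DBSCAN over a precomputed distance matrix; label -1 means noise."""
    n = len(dist)
    labels = [None] * n              # None = not visited yet
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        neighbors = [j for j in range(n) if dist[i][j] <= eps]
        if len(neighbors) < min_pts:
            labels[i] = -1           # provisional noise, may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point becomes border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = [k for k in range(n) if dist[j][k] <= eps]
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)
    return labels


def toy_order_preserving(d):
    """Hypothetical stand-in: strictly increasing on non-negative integers."""
    return 3 * d ** 3 + 7


# Integer points and squared distances keep every comparison exact.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
dist = [[(ax - bx) ** 2 + (ay - by) ** 2 for (bx, by) in points] for (ax, ay) in points]
eps_sq, min_pts = 2, 3

plain_labels = dbscan_from_distances(dist, eps_sq, min_pts)
masked_dist = [[toy_order_preserving(d) for d in row] for row in dist]
masked_labels = dbscan_from_distances(masked_dist, toy_order_preserving(eps_sq), min_pts)
print(plain_labels, masked_labels)
```

Because the map preserves every `<=` comparison, both runs take the same branches and return the same labels; the threshold has to be transformed with the same map as the distances.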
2 Clustering of multi-view relational data based on particle swarm optimization (2019)
Clustering of multi-view data has received increasing attention since it explores multiple views of data sets aiming at improving clustering accuracy. Particle Swarm Optimization (PSO) is a well-known population-based meta-heuristic successfully used in cluster analysis. This paper introduces two hybrid clustering methods for multi-view relational data. These hybrid methods combine PSO and hard clustering algorithms based on multiple dissimilarity matrices. These methods take advantage of the global convergence ability of PSO and the local exploitation of hard clustering algorithms in the position update step, aiming to improve the balance between exploitation and exploration processes. Moreover, the paper provides adapted versions of 11 fitness functions suitable for vector data aiming at dealing with multi-view relational data. Two performance criteria were used to evaluate the clustering quality using the two proposed methods over eleven real-world data sets including image and document data sets. Among new findings, it was observed that the top three fitness functions are Silhouette index, Xu index and Intra-cluster homogeneity. The performance of the proposed algorithms was compared with previous single and multi-view relational clustering algorithms. The results show that the proposed methods significantly outperformed the other algorithms in the majority of cases. The results reinforce the importance of the application of techniques such as PSO-based clustering algorithms in the field of expert systems and machine learning. Such application enhances classification accuracy and cluster compactness. Besides, the proposed algorithms can be useful tools in content-based image retrieval systems, providing good categorizations and automatically learning relevance weights for each cluster of images and sets of views.
Keywords: PSO | Cluster analysis | Multi-view clustering | Relational data
English article
3 An efficient fuzzy c-means approach based on canonical polyadic decomposition for clustering big data in IoT (2018)
This work concerns mining smart data from the big data collected in the Internet of Things, which attempts to better human life by integrating physical devices into the information space. As one of the most important clustering techniques for drilling smart data, the fuzzy c-means algorithm (FCM) assigns each object to multiple groups by calculating a membership matrix. However, each big data object has a large number of attributes, posing a remarkable challenge to FCM for real-time clustering of IoT big data. In this paper, we propose an efficient fuzzy c-means approach based on the tensor canonical polyadic decomposition for clustering big data in the Internet of Things. In the presented scheme, the traditional fuzzy c-means algorithm is converted to the high-order tensor fuzzy c-means algorithm (HOFCM) via a bijection function. Furthermore, the tensor canonical polyadic decomposition is utilized to reduce the attributes of every object to enhance the clustering efficiency. Finally, extensive experiments are conducted to compare the developed scheme with the traditional fuzzy c-means algorithm on two large IoT datasets, sWSN and eGSAD, regarding clustering accuracy and clustering efficiency. The results show that the developed scheme achieves significantly higher clustering efficiency with a slight drop in clustering accuracy compared with the traditional algorithm, suggesting the potential of the developed scheme for drilling smart data from IoT big data.
Keywords: Big data, Internet of Things, Smart data, Fuzzy c-means algorithm, Canonical polyadic decomposition
English article
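For readers unfamiliar with the membership matrix mentioned in the abstract, the following is a minimal sketch of one iteration of the textbook fuzzy c-means updates. It is the standard FCM, not the tensor-based HOFCM the paper proposes; the function names, the fuzzifier default `m=2.0`, and the sample points are assumptions made for illustration.

```python
import math


def fcm_memberships(points, centers, m=2.0):
    """Return U, where U[i][j] is the degree to which point i belongs to cluster j.

    Standard update: u_ij = 1 / sum_k (d_ij / d_ik) ** (2 / (m - 1)).
    """
    power = 2.0 / (m - 1.0)
    U = []
    for p in points:
        # A small floor avoids division by zero when a point sits on a center.
        d = [max(math.dist(p, c), 1e-12) for c in centers]
        U.append([1.0 / sum((d_j / d_k) ** power for d_k in d) for d_j in d])
    return U


def fcm_centers(points, U, m=2.0):
    """Recompute each center as the mean of all points weighted by u_ij ** m."""
    dims = len(points[0])
    centers = []
    for j in range(len(U[0])):
        w = [row[j] ** m for row in U]
        total = sum(w)
        centers.append(tuple(
            sum(wi * p[k] for wi, p in zip(w, points)) / total for k in range(dims)
        ))
    return centers


points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
centers = [(0.0, 0.0), (5.0, 5.0)]          # hypothetical starting centers
U = fcm_memberships(points, centers)
new_centers = fcm_centers(points, U)
print(U)
print(new_centers)
```

Each row of `U` sums to 1, which is what lets a single object belong to several groups at once with different degrees.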
4 Clustering big IoT data by metaheuristic optimized mini-batch and parallel partition-based DGC in Hadoop (2018)
Clustering algorithms are an important branch of the data mining family and have been applied widely in IoT applications such as finding similar sensing patterns, detecting outliers, and segmenting large behavioral groups in real time. Traditional full-batch k-means for clustering IoT big data is confronted by large-scale storage and high computational complexity problems. To overcome the latency inherited from full-batch k-means, two big data processing methods are often used. The first method is to use small batches as the input data to multiple computers to reduce the computation effort. However, depending on the sensed data, which may be heterogeneously fused from different sources in an IoT network, the size of each mini-batch may vary in each iteration of the clustering process. When these input data are subject to clustering, their centers can shift drastically, which affects the final clustering results. The second method is parallel computing, which decreases the runtime while the overall computational effort remains the same. Furthermore, some centroid-based clustering algorithms such as k-means converge easily to local optima. In light of this, this paper proposes a new partitioned clustering method optimized by a metaheuristic for the IoT big data environment. The method has three main activities. First, a sample of the dataset is partitioned into mini-batches. Next, the centroids of the mini-batches are adjusted. Finally, the mini-batches are collated to form clusters so that the quality of the clusters is maximized. How the positions of the centroids are optimally tuned across the mini-batches is governed by a metaheuristic called Dynamic Group Optimization. The data are processed in parallel in Hadoop. Extensive experiments are conducted to investigate the performance. The results show that the proposed method is a promising tool for clustering fused IoT data efficiently.
Keywords: Metaheuristic | Partitioning | Clustering | Hadoop | IoT data | Data fusion
English article
5 A parallel metaheuristic data clustering framework for cloud (2018)
High-performance data analytics for the Internet of Things (IoT) has been a promising research subject in recent years because traditional data mining algorithms may not be applicable to IoT big data. One of the main reasons is that the data to be analyzed may exceed the storage size of a single machine. Another critical problem when analyzing data from an IoT system is that the computation cost of data analysis tasks may be too high for a single computer system. This paper therefore presents an efficient data clustering framework for metaheuristic algorithms in a cloud computing environment, which explains how to divide the mining tasks of a mining algorithm among different nodes (i.e., the Map process) and then aggregate the mining results from these nodes (i.e., the Reduce process). We further attempted to use the proposed framework to implement data clustering algorithms (e.g., k-means, genetic k-means, and particle swarm optimization) on a standalone system and on Spark. The experimental results show that the performance of the proposed framework makes it useful for developing data clustering algorithms in a cloud computing environment.
Keywords: Metaheuristic algorithm | Internet of things | Data clustering problem
English article
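The Map and Reduce split described in the abstract can be pictured with a single k-means iteration. The sketch below is an assumption-laden illustration in plain Python, not the paper's framework and not Spark code: each list in `partitions` stands for the data held by one hypothetical node, and `map_partition` and `reduce_partials` are invented names.

```python
def map_partition(partition, centroids):
    """Map step: one node assigns its points and emits partial sums and counts."""
    partial = {}
    for p in partition:
        # Index of the nearest centroid by squared Euclidean distance.
        j = min(
            range(len(centroids)),
            key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
        )
        s, n = partial.get(j, ([0.0] * len(p), 0))
        partial[j] = ([x + y for x, y in zip(s, p)], n + 1)
    return partial


def reduce_partials(partials, centroids):
    """Reduce step: merge partial sums from all nodes into new centroids."""
    totals = {}
    for partial in partials:
        for j, (s, n) in partial.items():
            ts, tn = totals.get(j, ([0.0] * len(s), 0))
            totals[j] = ([x + y for x, y in zip(ts, s)], tn + n)
    # A centroid that attracted no points keeps its previous position.
    return [
        tuple(x / totals[j][1] for x in totals[j][0]) if j in totals else centroids[j]
        for j in range(len(centroids))
    ]


partitions = [
    [(0.0, 0.0), (10.0, 10.0)],   # data on hypothetical node A
    [(0.0, 2.0), (10.0, 12.0)],   # data on hypothetical node B
]
centroids = [(1.0, 1.0), (9.0, 9.0)]
partials = [map_partition(part, centroids) for part in partitions]
new_centroids = reduce_partials(partials, centroids)
print(new_centroids)
```

Only the per-cluster sums and counts travel between nodes, so no single machine needs to hold the full dataset.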
6 Review on mining data from multiple data sources (2018)
In this paper, we review recent progress in the area of mining data from multiple data sources. The advancement of information communication technology has generated a large amount of data from different sources, which may be stored in different geographical locations. Mining data from multiple data sources to extract useful information is considered a very challenging task in the field of data mining, especially in the current big data era. The methods of mining multiple data sources can be divided mainly into four groups: (i) pattern analysis, (ii) multiple data source classification, (iii) multiple data source clustering, and (iv) multiple data source fusion. The main purpose of this review is to systematically explore the ideas behind current multiple data source mining methods and to consolidate recent research results in this field.
Keywords: Multiple data source mining | Pattern analysis | Data classification | Data clustering | Data fusion
English article
7 Data mining and clustering in chemical process databases for monitoring and knowledge discovery (2018)
Modern chemical plants maintain large historical databases recording past sensor measurements, which advanced process monitoring techniques analyze to help plant operators and engineers interpret the meaning of live trends in databases. However, many of the best process monitoring methods require data organized into groups before training is possible. In practice, such organization rarely exists, and the time required to create classified training data is an obstacle to the use of advanced process monitoring strategies. Data mining and knowledge discovery techniques drawn from computer science literature can help engineers find fault states in historical databases and group them together with little detailed knowledge of the process. This study evaluates how several data clustering and feature extraction techniques work together to reveal useful trends in industrial chemical process data. Two studies on an industrial-scale separation tower and the Tennessee Eastman process simulation demonstrate data clustering and feature extraction effectively revealing significant process trends from high-dimensional, multivariate data. Process knowledge and supervised clustering metrics are used to evaluate the cluster results against true labels in the data and to compare the performance of different combinations of dimensionality reduction and data clustering approaches.
Keywords: Data mining | Data clustering | Dimensionality reduction | Knowledge discovery
English article
9 Secure weighted possibilistic c-means algorithm on cloud for clustering big data (2018)
The weighted possibilistic c-means algorithm is an important soft clustering technique for big data analytics with cloud computing. However, private data will be disclosed when the raw data is directly uploaded to the cloud for efficient clustering. In this paper, a secure weighted possibilistic c-means algorithm based on the BGV encryption scheme is proposed for big data clustering on the cloud. Specifically, BGV is used to encrypt the raw data for privacy preservation on the cloud. Furthermore, Taylor's theorem is used to approximate the functions for calculating the weight value of each object and for updating the membership matrix and the cluster centers as polynomial functions that include only addition and multiplication operations, so that the weighted possibilistic c-means algorithm can be securely and correctly performed on the encrypted data in the cloud. Finally, the presented scheme is evaluated on two big datasets, eGSAD and sWSN, by comparison with the traditional weighted possibilistic c-means method in terms of effectiveness, efficiency and scalability. The results show that the presented scheme performs more efficiently than the traditional weighted possibilistic c-means algorithm and achieves good scalability on the cloud for big data clustering.
Keywords: Big data | Possibilistic c-means algorithm | Cloud computing | BGV
English article
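The idea of replacing a function with a polynomial that needs only addition and multiplication can be sketched as follows. This is an illustration under assumptions, not the paper's construction: the abstract does not say which functions are approximated, so the exponential is a hypothetical choice, the arithmetic runs on ordinary floats rather than BGV ciphertexts, and `taylor_exp_coeffs` and `horner` are invented names.

```python
import math


def taylor_exp_coeffs(degree):
    """Coefficients 1/k! of exp(x) around 0, computed in the clear ahead of time."""
    return [1.0 / math.factorial(k) for k in range(degree + 1)]


def horner(coeffs, x):
    """Evaluate the polynomial at x using only additions and multiplications."""
    acc = 0.0
    for c in reversed(coeffs):
        acc = acc * x + c
    return acc


coeffs = taylor_exp_coeffs(10)
approx = horner(coeffs, -0.5)
exact = math.exp(-0.5)
print(approx, exact, abs(approx - exact))
```

The approximation is only good near the expansion point, so in practice the degree and the input range would have to be chosen together.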
10 A novel data clustering algorithm based on modified gravitational search algorithm (2017)
Data clustering is a popular analysis tool for data statistics in many fields such as pattern recognition, data mining, machine learning, image analysis, and bioinformatics. The aim of data clustering is to represent large datasets by a smaller number of prototypes or clusters, which brings simplicity in modeling data and thus plays a central role in the process of knowledge discovery and data mining. In this paper, a novel data clustering algorithm based on a modified Gravitational Search Algorithm is proposed, called the Bird Flock Gravitational Search Algorithm (BFGSA). The BFGSA introduces a new mechanism into GSA to add diversity, inspired by the collective response behavior of birds. This mechanism performs its diversity enhancement through three main steps: initialization, identification of the nearest neighbors, and orientation change. The initialization generates candidate populations for the second step, and the orientation change updates the position of objects based on the nearest neighbors. Due to the collective response mechanism, the BFGSA explores a wider range of the search space and thus escapes suboptimal solutions. The performance of the proposed algorithm is evaluated on 13 real benchmark datasets from the well-known UCI Machine Learning Repository. Its performance is compared with the standard GSA, the Artificial Bee Colony (ABC), the Particle Swarm Optimization (PSO), the Firefly Algorithm (FA), K-means, and four other clustering algorithms from the literature. The simulation results indicate that the BFGSA can effectively be used for data clustering.
Keywords: Gravitational search algorithm | Learning algorithm | Collective behavior | Data clustering | Clustering Validation | Nature-inspired algorithm
English article