English title of the article:
Performance evaluation of a cost-sensitive differential evolution classifier using spark – Imbalanced binary classification
Persian title of the article (translated):
Performance evaluation of a cost-sensitive differential evolution classifier using Spark: imbalanced binary classification
ScienceDirect - Elsevier - Journal of Computational Science, 40 (2020) 101065: doi:10.1016/j.jocs.2019.101065
Jamil Al-Sawwa∗, Simone A. Ludwig
Abstract: Nowadays, the amount of data that has been collected or generated in many sectors has been growing exponentially because of the rapid development of technologies such as the Internet of Things (IoT). Additionally, the nature of this data is imbalanced. The need for extracting valuable information for decision support from this data poses a challenge to the scientific community to find a solution to cope with large imbalanced data. In previous work, our cost-sensitive differential evolution classification algorithm showed efficient performance for handling highly imbalanced data sets. However, our algorithm shows inefficient performance when applied to big data sets, thus lacking to scale with data size increases. In this paper, we design and implement a parallel version of our cost-sensitive differential evolution classifier using the Apache Spark framework (SCDE). The aim is to handle large and binary imbalanced data. The main idea of the algorithm is to find the optimal centroid for each target label using differential evolution by minimizing the total misclassification cost and then assign unlabeled data points to the closest centroid. Our experiments include a real data set that is based on intrusion detection in order to evaluate our algorithm’s scalability and performance. The experimental results show that SCDE efficiently handles imbalanced binary data and scales very well with data size increases. Moreover, the speedup and scaleup results that are obtained by SCDE are close to linear.
Keywords: Differential evolution | Classification | Cost-sensitive | Apache Spark | Imbalanced data set | Big data analytics | Intrusion detection
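
The main idea stated in the abstract (one centroid per target label, found by differential evolution that minimizes the total misclassification cost, followed by nearest-centroid assignment) can be illustrated with a small single-machine sketch. This is not the authors' SCDE implementation and does not use Spark. The DE/rand/1/bin variant, the squared Euclidean distance, the per-class cost values, and all function and parameter names (`nearest_label`, `total_cost`, `differential_evolution`, `f`, `cr`) are assumptions made for illustration only.

```python
import random


def nearest_label(point, centroids):
    """Return the index (0 or 1) of the centroid closest to `point`.

    Squared Euclidean distance is an assumption; ties go to label 0.
    """
    dists = [sum((p - c) ** 2 for p, c in zip(point, cen)) for cen in centroids]
    return dists.index(min(dists))


def total_cost(vector, data, labels, costs):
    """Total misclassification cost of one candidate solution.

    `vector` stores the two centroids back to back. `costs[y]` is the
    (assumed) penalty for misclassifying a point whose true label is y.
    """
    dim = len(data[0])
    centroids = [vector[:dim], vector[dim:]]
    return sum(costs[y] for x, y in zip(data, labels)
               if nearest_label(x, centroids) != y)


def differential_evolution(data, labels, costs, pop_size=20,
                           generations=50, f=0.5, cr=0.9, seed=0):
    """DE/rand/1/bin search for two centroids (illustrative, pop_size >= 4)."""
    rng = random.Random(seed)
    dim = len(data[0])
    # Search bounds per feature, repeated once for each of the two centroids.
    bounds = [(min(x[j] for x in data), max(x[j] for x in data))
              for j in range(dim)] * 2
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    fitness = [total_cost(v, data, labels, costs) for v in pop]

    for _ in range(generations):
        for i in range(pop_size):
            # Three distinct individuals, all different from the target i.
            a, b, c = rng.sample([k for k in range(pop_size) if k != i], 3)
            j_rand = rng.randrange(2 * dim)  # forces at least one mutated gene
            trial = []
            for j in range(2 * dim):
                if rng.random() < cr or j == j_rand:
                    v = pop[a][j] + f * (pop[b][j] - pop[c][j])
                    v = min(max(v, bounds[j][0]), bounds[j][1])  # clip to bounds
                else:
                    v = pop[i][j]
                trial.append(v)
            trial_cost = total_cost(trial, data, labels, costs)
            # Greedy selection: keep the trial if it is no worse.
            if trial_cost <= fitness[i]:
                pop[i], fitness[i] = trial, trial_cost

    best = fitness.index(min(fitness))
    return [pop[best][:dim], pop[best][dim:]], fitness[best]


# Toy imbalanced data: ten majority points (label 0), two minority points (label 1).
toy_data = [(0.1 * i, 0.1 * i) for i in range(10)] + [(5.0, 5.0), (5.2, 4.9)]
toy_labels = [0] * 10 + [1] * 2
toy_costs = {0: 1.0, 1: 5.0}  # assumed: minority errors cost five times more

centroids, cost = differential_evolution(toy_data, toy_labels, toy_costs)
predicted = nearest_label((4.8, 5.1), centroids)  # label for an unlabeled point
print(centroids, cost, predicted)
```

Weighting minority-class errors more heavily is what makes the objective cost-sensitive: a candidate pair of centroids that sacrifices the rare class pays a larger penalty than one that misplaces a few majority points. In a parallel setting such as the one the abstract describes, the expensive part is the cost evaluation over all data points, which is the natural candidate for distribution; how SCDE actually partitions that work is not specified in the abstract.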