A dynamic classification unit for online segmentation of big data via small data buffers
2020
In many segmentation processes, we assign new cases according to a model that was built on the basis of past cases. As long as the new cases are “similar enough” to the past cases, segmentation proceeds normally. However, when a new case is substantially different from the known cases, a reexamination of the previously created segments is required. The reexamination may result in the creation of new segments or in the updating of the existing ones. In this paper, we assume that in big and dynamic data environments it is not possible to reexamine all past data and, therefore, we suggest using small groups of selected cases, stored in small data buffers, as an alternative to the collection of all past data. We present an incremental dynamic classifier that supports real-time unsupervised segmentation in big and dynamic data environments. In order to reduce the computational effort of unsupervised clustering in such environments, the suggested model performs calculations only on the relevant data buffers that store the relevant representative cases. In addition, the suggested model can serve as a dynamic classification unit (DCU) that can act as an autonomous agent, as well as collaborate with other DCUs. The evaluation is presented by comparing three approaches: static, dynamic, and incremental dynamic.
Keywords: Incremental dynamic classifier | Dynamic segmentation | Incremental data analysis | Cluster analysis | Classification | Big data
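The buffer idea in the abstract above can be illustrated with a minimal sketch. It is not the paper's DCU: the class name `BufferedSegmenter`, the Euclidean distance, the fixed `threshold`, and the `buffer_size` cap are all assumptions made for illustration. Each segment keeps only a small bounded buffer of recent cases, and a new case that is not "similar enough" to any segment opens a new one.

```python
from collections import deque
from math import dist


class BufferedSegmenter:
    """Toy incremental segmenter: each segment keeps only a small buffer of cases."""

    def __init__(self, threshold, buffer_size=5):
        self.threshold = threshold      # hypothetical "similar enough" radius
        self.buffer_size = buffer_size  # hypothetical cap on stored cases per segment
        self.buffers = []               # one bounded deque of cases per segment

    @staticmethod
    def _centroid(buffer):
        n = len(buffer)
        return tuple(sum(coords) / n for coords in zip(*buffer))

    def assign(self, case):
        """Return the segment index for `case`, creating a new segment if needed."""
        best, best_d = None, float("inf")
        for i, buf in enumerate(self.buffers):
            d = dist(case, self._centroid(buf))
            if d < best_d:
                best, best_d = i, d
        if best is None or best_d > self.threshold:
            # The case differs too much from every known segment: open a new one.
            self.buffers.append(deque([case], maxlen=self.buffer_size))
            return len(self.buffers) - 1
        self.buffers[best].append(case)  # the oldest case drops out when the buffer is full
        return best


seg = BufferedSegmenter(threshold=2.0, buffer_size=3)
stream = [(0, 0), (0.5, 0.2), (10, 10), (10.4, 9.8), (0.1, 0.3)]
labels = [seg.assign(p) for p in stream]
print(labels)
```

Only the buffers are consulted when a new case arrives, so the cost per case depends on the number of segments and the buffer size, not on the amount of past data.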
Rapid discrimination of Salvia miltiorrhiza according to their geographical regions by laser induced breakdown spectroscopy (LIBS) and particle swarm optimization-kernel extreme learning machine (PSO-KELM)
2020
Laser-induced breakdown spectroscopy (LIBS) coupled with a particle swarm optimization-kernel extreme learning machine (PSO-KELM) method was developed for the classification and identification of six types of Salvia miltiorrhiza samples from different regions. The spectral data of 15 Salvia miltiorrhiza samples were collected with a LIBS spectrometer. An unsupervised classification model based on principal component analysis (PCA) was first employed to classify Salvia miltiorrhiza from different regions. The results showed that only the samples from Gansu and Sichuan Provinces could be easily distinguished, while samples from the other regions were harder to classify with PCA. A supervised classification model based on KELM was then developed, and two methods, random forest (RF) and PSO, were used for variable selection to eliminate uninformative variables and improve the classification ability of the KELM model. The results showed that the PSO-KELM model gave the better classification result, with a classification accuracy of 94.87%. Compared with the particle swarm optimization-least squares support vector machine (PSO-LSSVM) and PSO-RF models, the PSO-KELM model had the best classification performance. Overall, the results demonstrate that LIBS combined with PSO-KELM is a promising method for the classification and identification of Salvia miltiorrhiza samples from different regions.
Keywords: Laser-induced breakdown spectroscopy | Particle swarm optimization | Kernel extreme learning machine | Salvia miltiorrhiza | Classification
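For readers unfamiliar with the two building blocks, the textbook forms of the KELM output function and the PSO update are sketched below. These are the standard formulations, not equations taken from the paper; the symbols ($C$ for the regularization constant, $T$ for the training targets, $\omega$, $c_1$, $c_2$ for the PSO coefficients, $z_i$ for a particle position) are notation chosen here.

```latex
% KELM output for a new spectrum x, given N training spectra x_1, ..., x_N
f(x) = \begin{bmatrix} K(x, x_1) & \cdots & K(x, x_N) \end{bmatrix}
       \left( \frac{I}{C} + \Omega \right)^{-1} T,
\qquad \Omega_{ij} = K(x_i, x_j)

% PSO update for particle i with position z_i, velocity v_i,
% personal best p_i, global best g, and r_1, r_2 drawn uniformly from [0, 1]
v_i \leftarrow \omega v_i + c_1 r_1 (p_i - z_i) + c_2 r_2 (g - z_i),
\qquad z_i \leftarrow z_i + v_i
```

In a variable-selection setting such as the one described, a particle position would typically encode which spectral variables are kept, with the KELM classification accuracy as the fitness; the exact encoding used in the paper is not stated in the abstract.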
Relationships between absenteeism, conservation group membership, and land management among family forest owners
2020
Absentee landowners, or those who do not live on their forestland, own approximately 117 million acres of private forestland in the U.S. Thus, their land management decisions and activities influence the flow of forest-based goods and services. We explore the question of whether absentee family forest owners are less active land managers than resident landowners and whether membership in conservation organizations is associated with higher levels of land management activity by absentee owners. To examine these questions, we administered a mail survey to randomly selected family forest landowners in Indiana. While we found some support for the contention that absentee owners are less active forestland managers than resident owners, we also found they are not necessarily inactive landowners. We found absentee owners were less likely than resident owners to have: inspected their forestland for invasive plants, pulled or cut invasive plants, used herbicides to kill invasive plants, reduced fire hazard, or grazed livestock. Absentee owners were more likely to be enrolled in the Indiana Classified Forest and Wildlands Program, a preferential forest property tax program. Absentee owners who are members of a conservation organization were more likely than absentee non-member owners to have undertaken a variety of land management activities, including: undertaking wildlife habitat improvement projects, inspecting their forestland for invasive plants, pulling or cutting invasive plants, enrolling in the Indiana Classified Forest and Wildlands Program, and obtaining a management plan.
Keywords: Forest landowner association | Forest property tax | Indiana Classified Forest and Wildlands | Invasive plant | Non-industrial private forest landowner (NIPF) | Resident landowner
A different sleep apnea classification system with neural network based on the acceleration signals
2020
Background and objective: Sleep apnea syndrome is characterized by an abnormal pause in breathing or a reduction in airflow during sleep. The literature reports that it affects approximately 2% of middle-aged women and 4% of middle-aged men. This study is of particular importance for elderly, disabled, and pediatric sleep apnea patients. Methods: In this study, a new diagnostic method is developed to detect apnea events using a microelectromechanical system (MEMS) based acceleration sensor. The sensor records acceleration by measuring the movements of the diaphragm along three axes during respiration. Measurements are taken simultaneously with a medical spirometer (Fukuda Sangyo) to test the validity of the results. An artificial neural network (ANN) model was designed to detect apnea events. For the number of neurons in the hidden layer, the values 1, 3, 5, 10, 18, 20, and 25 were tried, and the network with three hidden neurons, which gave the most suitable result, was selected. The resulting ANN has three layers: two neurons in the input layer, three in the hidden layer, and two in the output layer. Results: The study group consisted of 5 patients with different characteristics (age, height, and body weight) and different grades of sleep apnea (SA). A total of 12,723 acceleration (ACC) data points along the X, Y, and Z axes were recorded from the 5 patients for apnea event training and detection. The ACC data measured from one of the patients (called H1) are used to train the ANN. During the training phase, the mean squared error (MSE) is used to calculate the fitness value of the apnea event. Apnea events are then detected successfully for the other patients using the ANN trained only on H1’s ACC data. Conclusions: A sleep apnea event detection system that applies an ANN directly to acceleration values is presented.
Measurements are performed simultaneously by the MEMS-based accelerometer and an industrial spirometer. A total of 12,723 acceleration data points are measured from 5 different patients. The best result was reached at 7,000 iterations (iteration counts up to 10,000 were tried in steps of 1,000). Only 605 data points from the H1 measurements are used to train the ANN; all data, including the H2, H3, H4, and H5 measurements, are then used to check its performance. The MSE performance benchmark shows that the trained ANN successfully detects apnea events. One contribution of this study to the literature is that only ACC data are used in the ANN training step. After being trained on one patient, the ANN system can monitor apnea events online for other patients.
Keywords: Sleep apnea | Acceleration sensor | Acceleration data | Artificial neural network | Medical decision making
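The 2-3-2 network shape and the MSE criterion described above can be sketched as follows. This is only a forward pass with random, untrained weights; the sigmoid activation, the weight initialization, the input values, and the one-hot target are assumptions, since the abstract does not specify them, and no training loop is shown.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def make_layer(n_in, n_out, rng):
    # Each neuron holds n_in weights followed by one bias term.
    return [[rng.uniform(-1, 1) for _ in range(n_in + 1)] for _ in range(n_out)]


def forward_layer(layer, inputs):
    return [
        sigmoid(sum(w * x for w, x in zip(neuron[:-1], inputs)) + neuron[-1])
        for neuron in layer
    ]


def forward(network, inputs):
    for layer in network:
        inputs = forward_layer(layer, inputs)
    return inputs


def mse(predicted, target):
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)


rng = random.Random(0)
# Two inputs, three hidden neurons, two outputs, as in the abstract.
network = [make_layer(2, 3, rng), make_layer(3, 2, rng)]

sample = [0.12, -0.40]  # made-up input values, not real accelerometer data
target = [1.0, 0.0]     # made-up one-hot label ("apnea" vs. "no apnea")
output = forward(network, sample)
error = mse(output, target)
print(output, error)
```

Training would adjust the weights to reduce `error` over the training set, which in the study consists of data from patient H1 only.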
Column generation based heuristic for learning classification trees
2020
This paper explores the use of Column Generation (CG) techniques in constructing univariate binary decision trees for classification tasks. We propose a novel Integer Linear Programming (ILP) formulation, based on root-to-leaf paths in decision trees. The model is solved via a Column Generation based heuristic. To speed up the heuristic, we restrict the instance data by considering a subset of decision splits, sampled from the solutions of the well-known CART algorithm. Extensive numerical experiments show that our approach is competitive with the state-of-the-art ILP-based algorithms. In particular, the proposed approach is capable of handling big data sets with tens of thousands of data rows. Moreover, for large data sets, it finds solutions competitive with CART.
Keywords: Machine learning | Decision trees | Column generation | Classification | CART | Integer linear programming
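To make "univariate binary decision splits" concrete, the sketch below scores candidate threshold splits on a single feature with Gini impurity, the criterion commonly used by CART for classification. It does not reproduce the paper's ILP or its column generation heuristic; the function names and the toy data are illustrative only.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(values, labels):
    """Return (threshold, weighted_gini) of the best split `value <= threshold`."""
    n = len(values)
    best = (None, float("inf"))
    distinct = sorted(set(values))
    # Candidate thresholds are midpoints between consecutive distinct values.
    for lo, hi in zip(distinct, distinct[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (t, score)
    return best


threshold, score = best_split([1.0, 2.0, 3.0, 4.0], ["a", "a", "b", "b"])
print(threshold, score)
```

A set of such candidate splits, one list per feature, is the kind of restricted split pool that an optimization model could then select from.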
Using multi-features to partition users for friends recommendation in location based social network
2020
Friend recommendation is an important feature of social network applications that helps people make new friends and expand their social circles. However, the user-location and user-user information in location-based social networks are both very sparse, which poses a major challenge for recommendation. In this paper, a new multi-feature SVM-based friend recommendation model (MF-SVM) is proposed, which treats friend recommendation as a binary classification problem to tackle this challenge. We extract three features for each user, each with a new method. Kernel density estimation and information entropy are used to smooth the check-in data and highlight the activity level of users in order to extract a spatial-temporal feature. The social feature is then extracted by considering the diversity of common friends. Next, a new topic model that improves on LDA is proposed, which considers both user reviews and the corresponding service descriptions to extract a textual feature. Finally, these features are used to train the SVM, and the model predicts whether two users have a friend link. Experiments on real-world datasets demonstrate that the proposed method outperforms state-of-the-art friend recommendation methods under different types of evaluation metrics.
Keywords: Friend recommendation | Binary classification | SVM | Multi-feature
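The two tools named for the spatial-temporal feature, kernel density estimation and information entropy, can be sketched in a few lines. This is a generic illustration rather than the MF-SVM feature extraction: the Gaussian kernel, the `bandwidth` value, and the toy check-in data are assumptions.

```python
import math
from collections import Counter


def shannon_entropy(checkins):
    """Entropy (in bits) of a user's check-ins over distinct places."""
    counts = Counter(checkins)
    n = len(checkins)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def gaussian_kde(points, bandwidth):
    """Return a smoothed 1-D density function estimated from `points`."""
    norm = len(points) * bandwidth * math.sqrt(2 * math.pi)

    def density(x):
        return sum(math.exp(-0.5 * ((x - p) / bandwidth) ** 2) for p in points) / norm

    return density


# Made-up data: places visited and check-in hours for one user.
places = ["cafe", "cafe", "gym", "park"]
hours = [8.0, 9.0, 9.5, 18.0]

place_entropy = shannon_entropy(places)
hour_density = gaussian_kde(hours, bandwidth=1.0)
print(place_entropy, hour_density(9.0), hour_density(20.0))
```

A higher entropy indicates activity spread over more places, while the density turns sparse check-in times into a smooth profile that can be compared between users.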
A taxonomy of AI techniques for 6G communication networks
2020
With the 6G Flagship program launched by the University of Oulu, Finland, aiming at full adoption of 6G by 2030, many institutes worldwide have started to explore the issues and challenges of 6G communication networks. 6G offers ultra-high reliability and massive ultra-low latency while opening the door to many applications that are not viable with today’s 4G and 5G communication standards. Current 5G technology has security and privacy issues that limit the applications in which it can be used. In such an environment, we believe that AI can offer efficient solutions to these issues with low communication overhead. With a focus on these issues, this paper presents a comprehensive survey of AI-enabled 6G communication technology, which can be used in a wide range of future applications. We explore how AI can be integrated into different applications such as object localization, UAV communication, surveillance, and security and privacy preservation. Finally, we discuss a use case that shows the adoption of AI techniques in an intelligent transport system.
Keywords: Artificial Intelligence | 6G | Communication networks | Mobile edge computing | Intelligent transportation system
Simultaneous feature weighting and parameter determination of Neural Networks using Ant Lion Optimization for the classification of breast cancer
2020
In this paper, feature weighting is used to develop an effective computer-aided diagnosis system for breast cancer. Feature weighting is employed because it improves classification performance more than feature subset selection does. Specifically, a wrapper method based on the Ant Lion Optimization algorithm is presented that searches simultaneously for the best feature weights and the best parameter values of a multilayer neural network. The number of hidden neurons and the backpropagation training algorithm serve as the neural network parameters. The performance of the proposed approach is evaluated on three breast cancer datasets. The data are first normalized using the tanh method to reduce the effects of dominant features and outliers. The results show that the proposed wrapper method attains higher accuracy than existing techniques. The high classification performance obtained validates the work, which has the potential to become an alternative to other well-known techniques.
Keywords: Antlion optimization | Breast cancer | Feature weighting | Neural Networks
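A minimal sketch of the two preprocessing ideas mentioned above, tanh normalization and feature weighting, is given below. The normalization uses the common tanh-estimator form with a 0.01 scale factor and the plain mean and standard deviation; whether the paper uses exactly this variant is an assumption, and the weights shown are placeholders rather than values found by Ant Lion Optimization.

```python
import math
from statistics import mean, pstdev


def tanh_normalize(column):
    """Map a feature column into (0, 1) with a tanh squashing of z-scores."""
    mu, sigma = mean(column), pstdev(column)
    return [0.5 * (math.tanh(0.01 * (x - mu) / sigma) + 1.0) for x in column]


def apply_weights(row, weights):
    """Scale each feature of one sample by its weight in [0, 1]."""
    return [w * x for w, x in zip(weights, row)]


# Made-up feature column with one outlier.
column = [1.0, 2.0, 3.0, 100.0]
normalized = tanh_normalize(column)

# Placeholder weights; a wrapper method would search for these values.
weighted_row = apply_weights([0.5, 0.8, 0.2], weights=[1.0, 0.5, 0.0])
print(normalized, weighted_row)
```

A weight of zero removes a feature entirely, so feature subset selection is the special case in which every weight is either 0 or 1.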
AI-based identification of low-frequency debris flow catchments in the Bailong River basin, China
2020
Debris flow is a major geohazard in mountainous regions and poses a significant threat to life and property. The damage caused by debris flows has increased with the expansion of human settlements and activity into the mountainous regions of China. With regard to debris flow risk, previously unrecognized low-frequency debris flow catchments constitute an especially significant threat. According to our investigation, only about 500 of the more than 2000 catchments in the Bailong River basin have debris flow records. The main purpose of this paper is to introduce a new methodology using Artificial Intelligence (AI) that can simultaneously take as input parameters related to geomorphological conditions and material conditions to better distinguish low-frequency debris flow catchments (LFDs) from medium-high frequency debris flow catchments (MHFDs). A total of 449 prototypical debris flow catchments, 15 parameters, and 9 commonly used learning machines were used to build identification models. The debris flow catchments are divided into 4 cases (LO1-LO4) based on different sample ratios of LFDs to MHFDs, and each case is input into each classifier in turn. Based on model evaluation, the CHAID model in case LO2 performs best; it uses only five parameters (formation lithology index, land use index, vegetation coverage index, drainage density, and landslide density index) to predict LFDs. The results indicate that, compared with MHFDs, LFDs are mainly distributed in areas with fewer landslides and better vegetation coverage. However, the distribution of LFDs is concentrated at FLI (formation lithology index) = 4, which is the weak lithology area. The tree classifier seems to be better at classifying fluvial processes. The model developed in this paper can help to quickly find LFDs in similar areas and to assess the risk of debris flows.
Keywords: Low-frequency debris flow | Artificial Intelligence | Classification machine | Bailong River basin, China
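CHAID (Chi-squared Automatic Interaction Detection) chooses its splits with chi-square tests of independence between a candidate predictor and the class label. The sketch below computes only that test statistic for a contingency table; it is not a CHAID implementation, and the counts are made up rather than taken from the Bailong River data.

```python
def chi_square(table):
    """Pearson chi-square statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


# Made-up counts: rows are two classes of a predictor (e.g. low vs. high landslide density),
# columns are the two catchment types (LFD, MHFD).
table = [[10, 20],
         [20, 10]]
stat = chi_square(table)
print(stat)
```

A larger statistic means the predictor separates the two catchment types more strongly, which is what makes it a candidate for a split.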
Opportunities for fraudsters: When would profitable milk adulterations go unnoticed by common, standardized FTIR measurements?
2020
Milk is regarded as one of the top food products susceptible to adulteration where its valuable components are specifically identified as high-risk indicators for milk fraud. The current study explores the impact of common milk adulterants on the apparent compositional parameters of milk from the Dutch market as measured by standardized Fourier transform infrared (FTIR) spectroscopy. More precisely, it examines the detectability of these adulterants at various concentration levels using the compositional parameters individually, in a univariate manner, and together in a multivariate approach. In this study we used measured boundaries but also more practical variance-adjusted boundaries to set thresholds for detection of adulteration. The potential economic impact of these adulterations under a milk payment scheme is also evaluated. Twenty-four substances were used to produce various categories of milk adulterations, each at four concentration levels. These substances comprised five protein-rich adulterants, five nitrogen-based adulterants, seven carbohydrate-based adulterants, six preservatives and water, resulting in a set of 360 samples to be analysed. The results showed that the addition of protein-rich adulterants, as well as dicyandiamide and melamine, increased the apparent protein content, while the addition of carbohydrate-based adulterants, whey protein isolate, and skimmed milk powder, increased the apparent lactose content. When considering the compositional parameters univariately, especially protein- and nitrogen-based adulterants did not raise a flag of unusual apparent concentrations at lower concentration levels. Addition of preservatives also went unnoticed. The multivariate approach did not improve the level of detection. Regarding the potential profit of milk adulteration, whey protein and corn starch seem particularly interesting. 
Combining the artificial inflation of valuable components, the resulting potential profit, and the gaps in detection, it appears that the whey protein isolates deserve particular attention when thinking like a criminal.
Keywords: Fourier transform infrared | Milk adulteration | Milk composition | Milkoscan measurements | One class classification | Profitability
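The univariate detection logic described above can be sketched as a simple range check per compositional parameter. The abstract does not define how the "variance-adjusted boundaries" are computed, so the mean ± k·SD form used here, the value of `k`, and all reference and sample values are assumptions for illustration.

```python
from statistics import mean, stdev


def measured_bounds(reference):
    """Boundaries taken directly from the observed range of authentic samples."""
    return min(reference), max(reference)


def variance_adjusted_bounds(reference, k=3.0):
    """Hypothetical variance-adjusted boundaries: mean plus or minus k standard deviations."""
    mu, sd = mean(reference), stdev(reference)
    return mu - k * sd, mu + k * sd


def flagged(sample, bounds):
    """Return the set of parameters whose value falls outside its boundaries."""
    return {
        name for name, value in sample.items()
        if not bounds[name][0] <= value <= bounds[name][1]
    }


# Made-up apparent composition values (g/100 g) for authentic reference milk.
reference = {
    "protein": [3.4, 3.5, 3.6, 3.5],
    "lactose": [4.5, 4.6, 4.4, 4.5],
}
sample = {"protein": 3.58, "lactose": 5.2}  # made-up suspect sample

strict = {name: measured_bounds(vals) for name, vals in reference.items()}
relaxed = {name: variance_adjusted_bounds(vals) for name, vals in reference.items()}
print(flagged(sample, strict), flagged(sample, relaxed))
```

In this toy case the raised protein value stays inside both sets of boundaries and goes unnoticed, which mirrors the abstract's point that adulterants can raise an apparent parameter without triggering a flag.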