Designing a general type-2 fuzzy expert system for diagnosis of depression
طراحی سیستم تخصصی فازی نوع 2 برای تشخیص افسردگی-2019
Depression is a common and important mental disorder that affects the quality of human life. Because people with depression are not aware of their disorder and sometimes suffer from physical symptoms such as chronic pain, they consult a physician instead of a psychologist. Hence, the physician's diagnosis is not always correct. In other words, misdiagnosis may occur when a mental disorder is mislabeled as a physical disease. Delay in diagnosing depression may have irreversible outcomes such as suicide. Therefore, the most challenging aspect of depression diagnosis is to limit time loss while preserving accuracy. In this paper, a novel general type-2 fuzzy expert system for depression diagnosis was developed with two main objectives: system accuracy and diagnosis time. The proposed system might serve as a helpful guideline for physicians to direct patients toward a psychologist by asking them 15 questions. The proposed general type-2 expert system has five steps. In the first step, we generate general type-2 membership functions using the zSlices method and the interval agreement approach (IAA). In the second step, fuzzy rules are extracted from data gathered from a hospital, and we briefly extend the Mendel method. Approximate reasoning is applied in the third step. In the fourth step, we solve a multi-objective problem to minimize time and maximize accuracy using the MOEA/D method. To minimize time, feature selection is applied; for this we use the MIFS (Mutual Information Feature Selection) method and briefly extend it. In the final step, we choose an appropriate solution from the obtained Pareto front (PF). The proposed general type-2 expert system has been tested and evaluated to show its performance. This intelligent system is able to diagnose depression accurately within a suitable time.
Keywords: Depression | Computing with words (CWW) | General type-2 fuzzy sets | zSlices | MOEA/D algorithm | Feature selection | Beck Depression Inventory-II test (BDI-II) | Adaptive system | Expert system
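The feature-selection step in the abstract above relies on MIFS. As a minimal sketch, the standard greedy MIFS criterion (relevance to the class minus beta times redundancy with the features already chosen) could look like the following for discrete answers. It does not reproduce the authors' extension, and the names `mutual_information`, `mifs`, `beta`, and the feature keys are illustrative assumptions rather than anything taken from the paper.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    px, py = Counter(xs), Counter(ys)
    pxy = Counter(zip(xs, ys))
    return sum(
        (c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )


def mifs(features, labels, k, beta=0.5):
    """Greedy MIFS: repeatedly pick the feature with the best
    relevance-minus-redundancy score.

    features: dict mapping a (hypothetical) feature name to its values.
    labels:   class label per sample.
    k:        number of features to keep.
    beta:     weight of the redundancy penalty (assumed value).
    """
    remaining = set(features)
    selected = []
    while remaining and len(selected) < k:
        def score(name):
            relevance = mutual_information(features[name], labels)
            redundancy = sum(
                mutual_information(features[name], features[s])
                for s in selected
            )
            return relevance - beta * redundancy

        # sorted() makes tie-breaking deterministic (alphabetical).
        best = max(sorted(remaining), key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

With `beta = 0` the procedure ranks features by relevance alone, so an exact duplicate of a strong question is selected again; a larger `beta` pushes the selection toward questions that add new information, which is how a shorter questionnaire can keep most of its diagnostic value.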
Machine learning-based coronary artery disease diagnosis: A comprehensive review
تشخیص بیماری عروق کرونر مبتنی بر یادگیری ماشین: یک مرور جامع-2019
Coronary artery disease (CAD) is the most common cardiovascular disease (CVD) and often leads to a heart attack. It annually causes millions of deaths and billions of dollars in financial losses worldwide. Angiography, which is invasive and risky, is the standard procedure for diagnosing CAD. Alternatively, machine learning (ML) techniques have been widely used in the literature as fast, affordable, and noninvasive approaches for CAD detection. The results that have been published on ML-based CAD diagnosis differ substantially in terms of the analyzed datasets, sample sizes, features, location of data collection, performance metrics, and applied ML techniques. Due to these fundamental differences, achievements in the literature cannot be generalized. This paper conducts a comprehensive and multifaceted review of all relevant studies that were published between 1992 and 2019 for ML-based CAD diagnosis. The impacts of various factors, such as dataset characteristics (geographical location, sample size, features, and the stenosis of each coronary artery) and applied ML techniques (feature selection, performance metrics, and method) are investigated in detail. Finally, the important challenges and shortcomings of ML-based CAD diagnosis are discussed.
Keywords: CAD diagnosis | Machine learning | Data mining | Feature selection
A new machine learning technique for an accurate diagnosis of coronary artery disease
یک روش جدید یادگیری ماشین برای تشخیص دقیق بیماری عروق کرونر-2019
Background and objective: Coronary artery disease (CAD) is one of the most common diseases around the world. An early and accurate diagnosis of CAD allows timely administration of appropriate treatment and helps to reduce mortality. Herein, we describe an innovative machine learning methodology that enables accurate detection of CAD and apply it to data collected from Iranian patients. Methods: We first tested ten traditional machine learning algorithms, and then the three best-performing algorithms (three types of SVM) were used in the rest of the study. To improve the performance of these algorithms, data preprocessing with normalization was carried out. Moreover, a genetic algorithm and particle swarm optimization, coupled with stratified 10-fold cross-validation, were used twice: for optimization of classifier parameters and for parallel selection of features. Results: The presented approach enhanced the performance of all traditional machine learning algorithms used in this study. We also introduced a new optimization technique called the N2Genetic optimizer (a new genetic training). Our experiments demonstrated that N2Genetic-nuSVM achieved an accuracy of 93.08% and an F1-score of 91.51% when predicting CAD outcomes among the patients included in the well-known Z-Alizadeh Sani dataset. These results are competitive with and comparable to the best results in the field. Conclusions: We showed that machine learning techniques optimized by the proposed approach can lead to highly accurate models intended for both clinical and research use.
Keywords: Coronary artery disease (CAD) | Machine learning | Normalization | Genetic algorithm | Particle swarm optimization | Feature selection | Classification
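The Methods part of the abstract above evaluates every candidate configuration with stratified 10-fold cross-validation. The sketch below shows one way such folds might be built so that each fold keeps roughly the class ratio of the whole dataset. The function names, the `seed` parameter, and the round-robin assignment are assumptions made for illustration, not the authors' implementation.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=10, seed=0):
    """Split sample indices into k folds with similar class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)

    folds = [[] for _ in range(k)]
    offset = 0
    for y in sorted(by_class):
        idxs = by_class[y]
        rng.shuffle(idxs)
        # Deal the indices of this class round-robin across the folds;
        # the offset keeps fold sizes balanced across classes.
        for j, idx in enumerate(idxs):
            folds[(offset + j) % k].append(idx)
        offset += len(idxs)
    return folds


def train_test_splits(labels, k=10, seed=0):
    """Yield (train_indices, test_indices), one pair per fold."""
    folds = stratified_folds(labels, k, seed)
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test
```

An optimizer such as a genetic algorithm would then score each candidate (a parameter setting or a feature subset) by its average performance over the `k` test folds, which is less sensitive to one lucky split than a single hold-out set.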
Machine learning methods for MRI biomarkers analysis of pediatric posterior fossa tumors
روشهای یادگیری ماشین برای تحلیل نشانگرهای زیستی MRI از تومورهای حفره ای کودکان-2019
Medical imaging technologies provide an increasing number of opportunities for disease prediction and prognosis. Specifically, imaging biomarkers can quantify entire tumor phenotypes to enhance prediction. Machine learning technology can be explored to mine and analyze these biomarkers and to establish predictive models for clinical applications. Several studies have applied various machine learning methods to imaging-biomarker-based clinical prediction of different diseases. Here we seek to evaluate different machine learning methods for pediatric posterior fossa tumor prediction. We present a machine learning based framework for analyzing magnetic resonance imaging biomarkers of two kinds of pediatric posterior fossa tumors. In detail, three feature extraction methods are used to obtain 300 imaging biomarkers. Ten feature selection methods and 11 classifiers are evaluated by their quantified predictive performance and stability; the importance consistency of features and the influence of the experimental factors are also analyzed. Our results demonstrate that the CFS feature selection method (accuracy: 83.85 ± 5.51%, stability: [0.84, 0.06]) and the SVM classifier (accuracy: 85.38 ± 3.47%, RSD: 4.77%) show relatively better performance than the others and should be preferred. Among all the biomarkers, 17 texture features seem to be more important. Multifactor analysis results indicate that the choice of classifier accounts for the largest share of the variability in performance (37.25%). The machine learning based framework is efficient for biomarker analysis of pediatric posterior fossa tumors and could provide valuable references and decision support for assisted clinical diagnosis.
Keywords: Pediatric posterior fossa tumor | Magnetic resonance imaging | Biomarker | Machine learning | Feature selection | Classifier
Determining relevant biomarkers for prediction of breast cancer using anthropometric and clinical features: A comparative investigation in machine learning paradigm
تعیین نشانگرهای زیستی مربوط به پیش بینی سرطان پستان با استفاده از خصوصیات آنتروپومتریک و کلینیکی: بررسی مقایسه ای در پارادایم یادگیری ماشین-2019
Early detection of breast cancer plays a crucial role in the planning and outcome of the associated treatment. The purpose of this article is threefold: (i) to investigate whether clinical features obtained using routine blood analysis combined with anthropometric measurements can be utilized for predicting breast cancer using machine learning techniques; (ii) to explore the role of various machine learning components such as feature selection, data division protocols and classification in determining suitable biomarkers for breast cancer prediction; and (iii) to evaluate a recent database of clinical and anthropometric measurements acquired from normal individuals and individuals suffering from breast cancer. A database consisting of anthropometric and clinical attributes is used in the experiments. Various feature selection and statistical significance analysis methods are used to determine the relevance of the features. Furthermore, popular classifiers such as kernel-based support vector machine (SVM), Naïve Bayesian, linear discriminant, quadratic discriminant, logistic regression, K-nearest neighbor (K-NN) and random forest were implemented and evaluated for breast cancer risk prediction using these features. Results of the feature selection techniques indicate that, among the nine features considered in this study, glucose, age and resistin are the most relevant and effective biomarkers for breast cancer prediction. Further, when these three features are used for classification, the medium K-NN classifier achieves the highest classification accuracy of 92.105%, followed by the medium Gaussian SVM, which achieves a classification accuracy of 83.684%, under the hold-out data division protocol.
Keywords: Breast cancer biomarkers | Machine learning | Expert systems | Clinical features | Feature selection
A recommender system for component-based applications using machine learning techniques
یک سیستم توصیه گر برای برنامه های کاربردی مبتنی بر مؤلفه با استفاده از تکنیک های یادگیری ماشین-2019
Software designers are striving to create software that adapts to their users’ requirements. To this end, the development of component-based interfaces that users can compound and customize according to their needs is increasing. However, the success of these applications is highly dependent on the users’ ability to locate the components useful for them, because there are often too many to choose from. We propose an approach to address the problem of suggesting the most suitable components for each user at each moment, by creating a recommender system using intelligent data analysis methods. Once we have gathered the interaction data and built a dataset, we address the problem of transforming an original dataset from a real component-based application to an optimized dataset to apply machine learning algorithms through the application of feature engineering techniques and feature selection methods. Moreover, many aspects, such as contextual information, the use of the application across several devices with many forms of interaction, or the passage of time (components are added or removed over time), are taken into consideration. Once the dataset is optimized, several machine learning algorithms are applied to create recommendation systems. A series of experiments that create recommendation models are conducted applying several machine learning algorithms to the optimized dataset (before and after applying feature selection methods) to determine which recommender model obtains a higher accuracy. Thus, through the deployment of the recommendation system that has better results, the likelihood of success of a component-based application is increased by allowing users to find the most suitable components for them, enhancing their user experience and the application engagement.
Keywords: Machine learning | Recommender systems | Feature engineering | Feature selection | Component-based interfaces | Interaction information acquisition
Direct marketing campaigns in retail banking with the use of deep learning and random forests
کمپین های بازاریابی مستقیم در بانکداری خرده فروشی با استفاده از یادگیری عمیق و جنگل های تصادفی-2019
Credit products are a crucial part of the business of banks and other financial institutions. A novel approach to predicting willingness to take a personal loan, based on a time series representation of customer data, is shown. The proposed testing procedure, based on a moving window, allows detection of complex, sequential, time-based dependencies between particular transactions. Moreover, this approach reduces noise by eliminating irrelevant dependencies that would occur due to the lack of time dimension analysis. A system for identifying customers interested in credit products, based on classification with random forests and deep neural networks, is proposed. The promising results of empirical studies indicate that the system is able to extract significant patterns from customers' historical transfer and transactional data and predict credit purchase likelihood. Our approach, including the testing method, is not limited to the banking sector and can be easily transferred and implemented as a general-purpose direct marketing campaign system.
Keywords: Consumer credit | Retail banking | Direct marketing | Marketing campaigns | Database marketing | Random forest | Deep learning | Deep belief networks | Data mining | Time series | Feature selection | Boruta algorithm
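The abstract above mentions a moving-window procedure over customers' transaction histories but gives no details. As a rough illustration only, a moving window over a time-ordered sequence can be expressed as pairs of (past observations, following observations); the function name, the `window` and `horizon` parameters, and the toy series are assumptions and are not taken from the paper.

```python
def moving_windows(series, window, horizon=1):
    """Yield (history, future) pairs from a time-ordered sequence.

    history: `window` consecutive observations.
    future:  the `horizon` observations that immediately follow them.
    """
    last_start = len(series) - window - horizon
    for start in range(last_start + 1):
        history = series[start:start + window]
        future = series[start + window:start + window + horizon]
        yield history, future


# Hypothetical monthly transaction counts for one customer.
monthly_counts = [3, 5, 4, 8, 9, 7]
for history, future in moving_windows(monthly_counts, window=3):
    print(history, "->", future)
```

Because every evaluation pair only looks forward in time, a model tested this way is never scored on data that precedes what it was given, which is the usual reason for preferring such a scheme over random shuffling of time series.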
Comparison of machine learning classifiers for differentiation of grade 1 from higher gradings in meningioma: A multicenter radiomics study
مقایسه طبقه بندی کننده های یادگیری ماشین برای تمایز درجه 1 از درجه های بالاتر در مننژیوما: یک مطالعه رادیومتری چند متری-2019
Background and purpose: Advanced imaging analysis for the prediction of tumor biology and modelling of clinically relevant parameters using computed imaging features is part of the emerging field of radiomics research. Here we test the hypothesis that a machine learning approach can distinguish grade 1 from higher gradings in meningioma patients using radiomics features derived from a heterogeneous multicenter dataset of multi-parametric MRI. Methods: A total of 138 patients from 5 international centers who underwent MRI prior to surgical resection of intracranial meningiomas were included. Segmentation was performed manually on co-registered multi-parametric MR images using apparent diffusion coefficient (ADC) maps, T1-weighted (T1), post-contrast T1-weighted (T1c), subtraction maps (Sub, T1c – T1), T2-weighted fluid-attenuated inversion recovery (FLAIR) and T2-weighted (T2) images. Feature selection was performed and, using cross-validation to separate training from testing data, four machine learning classifiers were scored on combinations of MRI modalities: random forest (RF), extreme gradient boosting (XGBoost), support vector machine (SVM) and multilayer perceptron (MLP). Results: The best AUC of 0.97 (1.0 and 0.97 for sensitivity and specificity) was observed for the combination of ADC, ADC of the peritumoral edema, T1, T1c, Sub and FLAIR-derived features using only 16 of the 10,914 possible features and XGBoost. Conclusions: Machine learning using radiomics features derived from multi-parametric MRI is capable of high AUC scores with high sensitivity and specificity in classifying meningiomas between low and higher gradings despite heterogeneous protocols across different centers. Feature selection can be performed effectively even when extracting a large amount of data for radiomics fingerprinting.
Keywords: Random forest | Support vector machine | Multilayer perceptron | XGBoost | Machine learning | Meningioma | Grading | Feature selection
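The Results above are reported as AUC, sensitivity and specificity. For readers less familiar with these metrics, the following sketch computes them from scratch for a binary problem. It assumes label 1 marks the positive class (here, a higher-grade tumor) and uses the pairwise-ranking definition of AUC; the function names are illustrative.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true, scores):
    """Area under the ROC curve as the fraction of (positive, negative)
    pairs in which the positive sample gets the higher score.
    Ties count as half a win."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))
```

Sensitivity and specificity depend on the decision threshold used to turn scores into predictions, whereas AUC summarizes ranking quality over all thresholds, which is why studies commonly report all three together.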
Design and field implementation of an impact detection system using committees of neural networks
طراحی و اجرای میدانی یک سیستم تشخیص ضربه با استفاده از کمیته های شبکه های عصبی-2019
Many critical societal functions depend on uninterrupted service of civil engineering infrastructure. Railroads represent important infrastructure components of the transportation sector and provide both passenger and freight services. Railroad bridges over roadways are susceptible to impacts from overheight vehicles and equipment, which may damage bridge girders or supports and must be investigated after each event. One method of monitoring for vehicle-bridge collisions utilizes accelerometers to monitor for abnormal bridge vibrations corresponding to abnormal activity. Passing trains under normal operating conditions frequently produce significant bridge responses that have similar response characteristics to bridge strikes, but do not need to be investigated. This paper presents an expert system which comprises committees of artificial neural networks trained to interrogate data collected from accelerometers mounted on the bridge, assess the nature of the acceleration signal, and classify the event as either a passing train or a potentially damaging impact. This system is trained using acceleration time histories from accelerometers installed on 8 low-clearance rail bridges; no finite element model simulations were used for network training or data stream creation. The presented system accurately detects and classifies impacts with average impact detection performance ranging from 91–100% with average false positive rates limited to 0.00–0.75%.
Keywords: Bridge impacts | Impact detection | Signal classification | Feature selection | Artificial neural networks
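The abstract above describes committees of neural networks that jointly classify an acceleration event, without stating how the members' outputs are combined. One common choice, assumed here purely for illustration, is a majority vote. In the sketch below the committee members are hypothetical peak-threshold rules standing in for trained networks, and the names `make_member` and `committee_vote` are not from the paper.

```python
from collections import Counter


def make_member(threshold):
    """Stand-in for a trained network: flags an impact when the peak
    absolute acceleration exceeds a (hypothetical) threshold."""
    def classify(signal):
        peak = max(abs(v) for v in signal)
        return "impact" if peak > threshold else "train"
    return classify


def committee_vote(members, signal):
    """Return the majority label and the fraction of members agreeing."""
    votes = Counter(member(signal) for member in members)
    label, count = votes.most_common(1)[0]
    return label, count / len(members)


# An odd number of members avoids tied votes in a two-class problem.
committee = [make_member(t) for t in (1.0, 2.0, 3.0)]
label, agreement = committee_vote(committee, [0.1, 2.5, -0.3])
print(label, round(agreement, 2))
```

The agreement fraction gives a simple confidence signal: events on which the committee is split could be routed to a human inspector rather than classified automatically.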
Hybrid fast unsupervised feature selection for high-dimensional data
انتخاب ویژگی بدون نظارت هیبریدی سریع برای داده های با ابعاد بالا-2019
The emergence of the “curse of dimensionality” as a result of high-dimensional datasets deteriorates the capability of learning algorithms and also incurs high memory and computational costs. Feature selection, which discards redundant and irrelevant features, is a crucial machine learning technique for reducing the dimensionality of these datasets, which in turn improves the performance of the learning algorithm. Feature selection has been extensively applied in many application areas relevant to expert and intelligent systems, such as data mining and machine learning. Although many algorithms have been developed so far, they remain unsatisfactory when confronted with high-dimensional data. This paper presented a new hybrid filter-based feature selection algorithm based on a combination of clustering and the modified Binary Ant System (BAS), called FSCBAS, to overcome the search space and high-dimensional data processing challenges efficiently. This model provided both global and local search capabilities between and within clusters. In the proposed method, inspired by the genetic algorithm and simulated annealing, a damped mutation strategy was introduced that avoided falling into local optima, and a new redundancy reduction policy, adopted to estimate the correlation between the selected features, further improved the algorithm. The proposed method can be applied in many expert system applications, such as microarray data processing, text classification and image processing in high-dimensional data, to handle the high dimensionality of the feature space and improve classification performance simultaneously. The performance of the proposed algorithm was compared to that of state-of-the-art feature selection algorithms using different classifiers on real-world datasets. The experimental results confirmed that the proposed method reduced computational complexity significantly and achieved better performance than the other feature selection methods.
Keywords: Feature selection | High-dimensional data | Binary ant system | Clustering | Mutation