Identifying influential factors distinguishing recidivists among offender patients with a diagnosis of schizophrenia via machine learning algorithms
[Translated from Persian] Identifying factors that influence the detection of recidivism among offender patients diagnosed with schizophrenia via machine learning algorithms, 2020
Purpose: There is a lack of research on predictors of criminal recidivism in offender patients diagnosed with schizophrenia. Methods: 653 potential predictor variables were analyzed in a set of 344 offender patients with a diagnosis of schizophrenia (209 reconvicted) using machine learning algorithms. As a novel methodological approach, null hypothesis significance testing (NHST), backward selection, logistic regression, trees, support vector machines (SVM), and naive Bayes were used for preselecting variables. Subsequently, the variables identified as most influential were used to build and evaluate the machine learning algorithms. Results: The two final models (with/without imputation) predicted criminal recidivism with an accuracy of 81.7 % and 70.6 % and a predictive power (area under the curve, AUC) of 0.89 and 0.76 based on the following predictors: prescription of amisulpride prior to reoffending, suspended sentencing to imprisonment, legal complaints filed by relatives/therapists/public authorities, recent legal issues, number of offences leading to forensic treatment, anxiety upon discharge, being single, violence toward the care team and constant breaking of rules during treatment, illegal opioid use, the Middle East as place of birth, and time span since the last psychiatric inpatient treatment. Conclusion: The results provide new insight into possible factors influencing persistent offending in a specific subgroup of patients with a schizophrenic spectrum disorder.
Keywords: Criminal justice | Criminal recidivism | Machine learning | Offender | Schizophrenia
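The two metrics reported in the abstract above, accuracy and area under the ROC curve (AUC), can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for the example.

```python
import numpy as np

def accuracy(y_true, y_score, threshold=0.5):
    """Share of cases whose thresholded score matches the true label."""
    y_true = np.asarray(y_true)
    y_pred = (np.asarray(y_score, dtype=float) >= threshold).astype(int)
    return float(np.mean(y_pred == y_true))

def roc_auc(y_true, y_score):
    """AUC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as one half)."""
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score, dtype=float)
    pos = y_score[y_true == 1]
    neg = y_score[y_true == 0]
    diff = pos[:, None] - neg[None, :]          # all positive/negative pairs
    wins = np.sum(diff > 0) + 0.5 * np.sum(diff == 0)
    return float(wins / (pos.size * neg.size))

# Toy data (hypothetical): 1 = reconvicted, 0 = not reconvicted.
y_true = [1, 1, 1, 0, 0, 0]
y_score = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]

acc = accuracy(y_true, y_score)
auc = roc_auc(y_true, y_score)
print(f"accuracy = {acc:.3f}, AUC = {auc:.3f}")
```

Accuracy depends on the chosen threshold, whereas AUC summarizes ranking quality over all thresholds, which is why the two figures can differ noticeably for the same model.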
Big data analytics for financial Market volatility forecast based on support vector machine
[Translated from Persian] Big data analysis for forecasting financial market volatility based on the support vector machine, 2020
High-frequency data offer rich material and broad prospects for in-depth research into financial market behavior. However, far fewer problems in high-frequency data research have been solved than remain open, and the research value of such data is greatly reduced while these problems stay unsolved. Volatility is an important measure of market risk, and researching and forecasting the volatility of high-frequency data is of great significance to investors, government regulators, and capital markets. To this end, the short-term volatility of high-frequency data is predicted by modelling its jump volatility.
Keywords: Big data | Financial market | Volatility | Support vector machine
Development of a chemometric-assisted spectrophotometric method for quantitative simultaneous determination of Amlodipine and Valsartan in commercial tablet
[Translated from Persian] Development of a chemometrics-assisted spectrophotometric method for the simultaneous quantitative determination of amlodipine and valsartan in a commercial tablet, 2020
In this study, two antihypertensive drugs, Amlodipine (AML) and Valsartan (VAL), were determined simultaneously in synthetic mixtures and in the Valzomix tablet. For this purpose, a chemometric-assisted spectrophotometric method was developed that requires no prior preparation. Artificial intelligence techniques, including an artificial neural network (ANN) and a least squares support vector machine (LS-SVM), were proposed as the chemometric procedures. A feed-forward back-propagation neural network (FFBP-NN) was applied with two different training algorithms: Levenberg–Marquardt (LM) and gradient descent with momentum and adaptive learning rate backpropagation (GDX). To select the best model, several numbers of layers and neurons were investigated. The results revealed that layer = 5 with 6 neurons and layer = 2 with 10 neurons had the lowest mean square error (MSE) (1.41 × 10^−24 and 1.16 × 10^−23) for AML and VAL, respectively. In the LS-SVM method, the gamma (γ) and sigma (σ) parameters were optimized. The values obtained for γ and σ were 50, 30 and 40, 40, with a root mean square error (RMSE) of 0.4290 and 0.5598 for AML and VAL, respectively. The pharmaceutical formulation was analyzed with the chemometric methods and with high-performance liquid chromatography (HPLC) as a reference technique. The results were compared statistically using a one-way analysis of variance (ANOVA) test. There were no significant differences between them, and the proposed method was satisfactory for estimating the components of the Valzomix tablet.
Keywords: Spectrophotometry | Amlodipine | Valsartan | Artificial neural network | Least squares support vector machine
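The roles of the two LS-SVM parameters mentioned above can be sketched in a few lines: γ weights the fit to the training data against smoothness, and σ sets the width of the Gaussian (RBF) kernel. This is a minimal sketch of textbook LS-SVM regression, not the authors' implementation. The kernel convention exp(−d²/(2σ²)), the function names, and the synthetic sine data are assumptions made for illustration.

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def lssvm_fit(X, y, gamma, sigma):
    """Solve the LS-SVM linear system
        [0   1^T        ] [b    ]   [0]
        [1   K + I/gamma] [alpha] = [y]
    and return the bias b and the dual weights alpha."""
    n = X.shape[0]
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = rbf_kernel(X, X, sigma) + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], y))
    solution = np.linalg.solve(A, rhs)
    return solution[0], solution[1:]

def lssvm_predict(X_train, b, alpha, X_new, sigma):
    """Prediction f(x) = sum_i alpha_i * K(x, x_i) + b."""
    return rbf_kernel(X_new, X_train, sigma) @ alpha + b

# Synthetic stand-in for calibration data (not spectra from the study).
X = np.linspace(0.0, 1.0, 30).reshape(-1, 1)
y = np.sin(2.0 * np.pi * X[:, 0])

b, alpha = lssvm_fit(X, y, gamma=50.0, sigma=0.2)
pred = lssvm_predict(X, b, alpha, X, sigma=0.2)
rmse = float(np.sqrt(np.mean((pred - y) ** 2)))
print(f"training RMSE = {rmse:.4f}")
```

Because training reduces to one linear solve, tuning γ and σ is usually done by repeating this fit over a grid of candidate values and keeping the pair with the lowest validation RMSE.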
Prediction of the ground temperature with ANN, LS-SVM and fuzzy LS-SVM for GSHP application
[Translated from Persian] Prediction of ground temperature with neural networks, LS-SVM, and fuzzy LS-SVM for GSHP use, 2020
Ground source heat pump (GSHP) systems have received increasing attention for their energy-saving and environmentally friendly properties. Knowing the undisturbed ground temperature is a prerequisite for designing a GSHP system. The conventional way to obtain ground temperature data is to bury temperature sensors underground, but this approach is usually time-consuming and costly and is prone to technical difficulties. The rapid development of intelligent computation algorithms offers solutions to many difficult practical problems. Based on a large set of ground temperature measurements from two 100 m deep boreholes located in Chongqing, ground temperature prediction models based on an artificial neural network (ANN) and a least squares support vector machine (LS-SVM) are established. Two kinds of validation, holdout validation and k-fold validation, are then conducted on both models. Furthermore, a new method that combines fuzzy theory with LS-SVM is proposed to reduce the heavy computational burden of the LS-SVM model. A comparison with the two models above shows that the newly proposed model not only markedly improves calculation speed but also improves prediction accuracy, and it is especially superior to the single LS-SVM model.
Keywords: Ground temperature | Fuzzy | Support vector machine | Ground source heat pump
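The difference between the two validation schemes named in the abstract, holdout and k-fold, can be sketched as below. This is an illustrative sketch only: the depth and temperature values are synthetic, the quadratic polynomial fit is a deliberately simple stand-in for the ANN and LS-SVM models, and the 20 % test share and k = 5 are assumed settings, not values from the paper.

```python
import numpy as np

def holdout_split(n, test_fraction, rng):
    """One random split into training and test indices."""
    idx = rng.permutation(n)
    n_test = int(round(n * test_fraction))
    return idx[n_test:], idx[:n_test]

def kfold_splits(n, k, rng):
    """Yield k (train, test) index pairs; every sample is tested once."""
    folds = np.array_split(rng.permutation(n), k)
    for i in range(k):
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train, folds[i]

def rmse_of_fit(x, y, train, test, degree=2):
    """Fit a polynomial on the training part, score RMSE on the test part."""
    coeffs = np.polyfit(x[train], y[train], degree)
    pred = np.polyval(coeffs, x[test])
    return float(np.sqrt(np.mean((pred - y[test]) ** 2)))

rng = np.random.default_rng(0)
depth = np.linspace(0.0, 100.0, 200)                            # metres
temp = 18.0 + 0.02 * depth + rng.normal(0.0, 0.05, depth.size)  # synthetic

# Holdout: a single split, so the estimate depends on that one split.
train, test = holdout_split(depth.size, 0.2, rng)
holdout_rmse = rmse_of_fit(depth, temp, train, test)

# k-fold: average over k splits for a more stable estimate.
fold_rmses = [rmse_of_fit(depth, temp, tr, te)
              for tr, te in kfold_splits(depth.size, 5, rng)]
kfold_rmse = float(np.mean(fold_rmses))

print(f"holdout RMSE = {holdout_rmse:.4f}, 5-fold RMSE = {kfold_rmse:.4f}")
```

k-fold validation costs k model fits instead of one, which matters for models with a heavy training burden such as the LS-SVM discussed above.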
Multiple AI model integration strategy: Application to saturated hydraulic conductivity prediction from easily available soil properties
[Translated from Persian] Multiple artificial intelligence model integration strategy: application to predicting saturated hydraulic conductivity from easily available soil properties, 2020
A multiple model integration scheme driven by an artificial neural network (ANN), termed MM-ANN, was developed and tested to improve the prediction accuracy of saturated soil hydraulic conductivity (Ks) in the Tabriz plain, an arid region of Iran. Soil parameters such as silt, clay, organic matter (OM), bulk density (BD), pH, and electrical conductivity (EC) were used as model inputs to predict soil Ks. Standalone models, including multivariate adaptive regression splines (MARS), the M5 model tree (M5Tree), support vector machine (SVM), and extreme learning machine (ELM), were also implemented for comparison with the MM-ANN predictions. Based on several performance indicators, such as the Nash–Sutcliffe efficiency (NSE), the results showed that the calibrated MM-ANN model, which integrates the predictions of the MARS, M5Tree, SVM, and ELM models using all the soil parameters in this study as inputs, provided superior soil Ks estimates. The proposed hybrid model (MM-ANN) emerged as a reliable intelligence model for assessing soil hydraulic conductivity, with an NSE of 0.939 and 0.917 during training and testing, respectively. Accurate prediction of field-scale soil hydraulic conductivity is crucial from the viewpoint of agricultural sustainability and management.
Keywords: Saturated hydraulic conductivity | Extreme learning machine | Multiple model strategy | Multivariate adaptive regression splines | M5Tree | Support vector machine | Prediction
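The Nash–Sutcliffe efficiency used above, and the general idea of feeding the outputs of several base models into a second-stage model, can be sketched as follows. This is a hedged illustration: the integrator here is an ordinary least-squares combiner chosen for brevity, not the ANN used in the study, and the observations and base-model predictions are synthetic.

```python
import numpy as np

def nse(observed, predicted):
    """Nash-Sutcliffe efficiency: 1 - SSE / variance of the observations.
    1.0 is a perfect fit; 0.0 is no better than predicting the mean."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    sse = np.sum((observed - predicted) ** 2)
    return float(1.0 - sse / np.sum((observed - observed.mean()) ** 2))

rng = np.random.default_rng(0)
obs = rng.uniform(1.0, 10.0, size=100)          # synthetic "observed Ks"

# Two hypothetical base models with different error characteristics.
base_a = obs + rng.normal(0.0, 1.0, obs.size)
base_b = 0.8 * obs + rng.normal(0.0, 0.8, obs.size)

# Second stage: learn how to combine the base predictions (with intercept).
design = np.column_stack([np.ones_like(obs), base_a, base_b])
weights, *_ = np.linalg.lstsq(design, obs, rcond=None)
combined = design @ weights

nse_a, nse_b, nse_combined = nse(obs, base_a), nse(obs, base_b), nse(obs, combined)
print(f"NSE base A = {nse_a:.3f}, base B = {nse_b:.3f}, combined = {nse_combined:.3f}")
```

On the data used to fit the combiner, the combined NSE cannot fall below that of either base model, so a fair comparison needs a separate test set, as in the training and testing figures reported in the abstract.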
AI-based optimization of PEM fuel cell catalyst layers for maximum power density via data-driven surrogate modeling
[Translated from Persian] AI-based optimization of PEM fuel cell catalyst layers for maximum power density via data-driven surrogate modeling, 2020
The catalyst layer (CL) is the core electrochemical reaction region of proton exchange membrane fuel cells (PEMFCs), and its composition directly determines PEMFC output performance. Existing experimental and modeling methods are still insufficient for deep optimization of the CL composition. This work develops a novel artificial intelligence (AI) framework that combines a data-driven surrogate model with a stochastic optimization algorithm to achieve multi-variable global optimization of the maximum power density of PEMFCs. Simulation results of a three-dimensional computational fluid dynamics (CFD) PEMFC model coupled with the CL agglomerate model constitute the database, which is then used to train the data-driven surrogate model based on the Support Vector Machine (SVM), a typical AI algorithm. On the test set, the squared correlation coefficient (R-square) and mean percentage error are 0.9908 and 3.3375%, respectively. The surrogate model demonstrates accuracy comparable to the physical model with much greater computational efficiency: it calculates one polarization curve within one second, whereas the physical CFD model may require hundreds of processor-hours. The surrogate model is then fed into a Genetic Algorithm (GA) to obtain the optimal CL composition. For verification, the optimal CL composition is returned to the physical model, and the percentage error between the maximum power densities predicted by the surrogate model and simulated by the physical model under the optimal CL composition is only 1.3950%. The results indicate that the proposed framework can guide the multi-variable optimization of complex systems.
Keywords: Proton exchange membrane fuel cell | Catalyst layer composition | Agglomerate model | Data-driven surrogate model | Stochastic optimization algorithm
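The workflow described above (sample an expensive model, train a cheap surrogate, optimize the surrogate with a genetic algorithm, then check the optimum against the expensive model) can be sketched on a toy problem. Everything concrete here is an assumption for illustration: `expensive_model` is a made-up two-variable function standing in for the CFD simulation, the surrogate is a quadratic least-squares fit rather than an SVM, and the GA settings are arbitrary.

```python
import numpy as np

def expensive_model(x):
    """Hypothetical stand-in for a costly simulation; maximum at (0.3, 0.6)."""
    return 1.0 - (x[..., 0] - 0.3) ** 2 - (x[..., 1] - 0.6) ** 2

def features(x):
    """Quadratic feature map used by the surrogate."""
    x1, x2 = x[..., 0], x[..., 1]
    return np.stack([np.ones_like(x1), x1, x2, x1 * x2, x1 ** 2, x2 ** 2], axis=-1)

rng = np.random.default_rng(0)

# Step 1: build the database from a limited number of expensive runs.
X_train = rng.uniform(0.0, 1.0, size=(50, 2))
y_train = expensive_model(X_train)

# Step 2: train the surrogate (here: least squares on quadratic features).
coef, *_ = np.linalg.lstsq(features(X_train), y_train, rcond=None)

def surrogate(x):
    return features(x) @ coef

# Step 3: maximize the surrogate with a simple genetic algorithm.
def genetic_maximize(fitness, rng, pop_size=40, generations=60, sigma=0.05):
    pop = rng.uniform(0.0, 1.0, size=(pop_size, 2))
    for _ in range(generations):
        fit = fitness(pop)
        elite = pop[np.argmax(fit)].copy()            # keep the best unchanged
        a = rng.integers(0, pop_size, size=pop_size)  # binary tournament
        b = rng.integers(0, pop_size, size=pop_size)
        parents = np.where((fit[a] > fit[b])[:, None], pop[a], pop[b])
        partners = parents[rng.permutation(pop_size)]
        w = rng.uniform(0.0, 1.0, size=(pop_size, 1))
        children = w * parents + (1.0 - w) * partners       # blend crossover
        children += rng.normal(0.0, sigma, children.shape)  # mutation
        pop = np.clip(children, 0.0, 1.0)
        pop[0] = elite
    return pop[np.argmax(fitness(pop))]

best = genetic_maximize(surrogate, rng)

# Step 4: verify the surrogate optimum against the expensive model.
predicted = float(surrogate(best))
simulated = float(expensive_model(best))
percent_error = abs(predicted - simulated) / abs(simulated) * 100.0
print(f"optimum = {best}, percentage error = {percent_error:.6f}%")
```

The GA evaluates the surrogate thousands of times, which is affordable only because each surrogate call is cheap; the expensive model is called once more at the end purely for verification.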
Predicting and explaining corruption across countries: A machine learning approach
[Translated from Persian] Predicting and explaining corruption across countries: a machine learning approach, 2020
In the era of Big Data, Analytics, and Data Science, corruption is still ubiquitous and is perceived as one of the major challenges of modern societies. A large body of academic studies has attempted to identify and explain the potential causes and consequences of corruption, at varying levels of granularity, mostly through theoretical lenses using correlations and regression-based statistical analyses. The present study approaches the phenomenon from a predictive analytics perspective, employing contemporary machine learning techniques to discover the most important predictors of corruption perception based on enriched/enhanced nonlinear models with a high level of predictive accuracy. Specifically, within the multiclass classification setting employed herein, the Random Forest (an ensemble-type machine learning algorithm) is found to be the most accurate prediction/classification model, followed by Support Vector Machines and Artificial Neural Networks. From a practical standpoint, the enhanced predictive power of machine learning algorithms coupled with a multi-source database revealed the most relevant corruption-related information, contributing to the related body of knowledge and generating actionable insights for administrators, scholars, citizens, and politicians. The variable importance results indicated that government integrity, property rights, judicial effectiveness, and education index are the most influential factors in defining the corruption level of significance.
Keywords: Corruption perception | Machine learning | Predictive modeling | Random forest | Society policies and regulations | Government integrity | Social development
Predicting academic performance of students from VLE big data using deep learning models
[Translated from Persian] Predicting students' academic performance from VLE big data using deep learning models, 2020
The abundance of accessible educational data, supported by technology-enhanced learning platforms, provides opportunities to mine the learning behavior of students, address their issues, optimize the educational environment, and enable data-driven decision making. Virtual learning environments complement the learning analytics paradigm by providing datasets for analysing and reporting the learning process of students and how it is reflected in, and contributes to, their performance. This study deploys a deep artificial neural network on a set of unique handcrafted features, extracted from virtual learning environment clickstream data, to predict at-risk students and provide measures for early intervention in such cases. The results show that the proposed model achieves a classification accuracy of 84%–93%. We show that a deep artificial neural network outperforms the baseline logistic regression and support vector machine models: logistic regression achieves an accuracy of 79.82%–85.60%, and the support vector machine achieves 79.95%–89.14%. In line with existing studies, our findings demonstrate that including legacy data and assessment-related data affects the model significantly. Students interested in accessing the content of previous lectures are observed to perform better. The study intends to assist institutes in formulating a framework for pedagogical support, facilitating the higher education decision-making process towards sustainable education.
Keywords: Learning analytics | Predicting success | Educational data | Machine learning | Deep learning | Virtual learning environments (VLE)
The impact of entrepreneurship orientation on project performance: A machine learning approach
[Translated from Persian] The impact of entrepreneurial orientation on project performance: a machine learning approach, 2020
Recent studies in project management have shown the important role of entrepreneurship orientation of the individuals in project performance. Although identifying the role of entrepreneurship orientation as a critical success factor in project performance has been considered as an important issue, it is also important to develop a measurement system for predicting performance based on the degree of an individual’s entrepreneurial orientation. In this study, we use predictive analytics by proposing a machine learning approach to predict individuals’ project performance based on measures of several aspects of entrepreneurial orientation and entrepreneurial attitude of the individuals. We investigated this relationship using a sample of 185 observations and a range of machine learning algorithms including lasso, ridge, support vector machines, neural networks, and random forest. Our results showed that the best method for predicting project performance is lasso. After identifying the best predictive model, we then used the Bayesian Information Criterion and the Akaike Information Criterion to identify the most significant factors. Our results identify all three aspects of entrepreneurial attitude (social self-efficacy, appearance self-efficacy, and comparativeness) and one aspect of entrepreneurial orientation (proactiveness) as the most important factors. This study contributes to the relationship between entrepreneurship skills and project performance and provides insights into the application of emerging tools in data science and machine learning in operations management and project management research.
Keywords: Project performance | Entrepreneurship orientation | Machine learning | Supervised learning | Predictive analytics
Internet of energy-based demand response management scheme for smart homes and PHEVs using SVM
[Translated from Persian] Internet of Energy-based demand response management scheme for smart homes and PHEVs using SVM, 2020
The use of information and communication technology (ICT) in the power sector has led to the emergence of the smart grid (SG). The connected loads in an SG are able to communicate their consumption data to the grid using ICT and thus form a large Internet of Energy (IoE) network. However, various issues, such as an increasing demand-supply gap, grid instability, and deteriorating quality of service, persist in this network and degrade its performance. These issues can be handled efficiently by managing the demand response (DR) of different types of loads. For this purpose, cloud computing can be leveraged to gather the data generated in the IoE network and perform analytics to manage DR. Working in this direction, this paper presents a novel scheme to handle the DR of smart homes (SHs) and plug-in hybrid electric vehicles (PHEVs). The proposed scheme analyzes the demand of these users at the cloud server in order to flatten the overall load profile of the grid. The scheme is divided into two hierarchical stages. In the first stage, the residential and PHEV users whose demands can be regulated are identified. This task is achieved with a binary-class support vector machine (SVM) that uses a Gaussian kernel function to classify these users. In the second stage, the load in SHs is curtailed on the basis of a pre-defined rule base after analyzing the consumption data of various devices, whereas PHEVs are managed by controlling their charging rates. The efficacy of the proposed scheme has been tested on PJM benchmark data and the Open Energy Information dataset. The simulation results show that the proposed scheme is effective in maintaining the overall load profile of the SG by managing the DR of SH and PHEV users.
Keywords: Data analytics | Demand response | Plug-in hybrid electric vehicles | Smart grid | Smart homes | Support vector machine
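The first stage described above, a binary SVM with a Gaussian (RBF) kernel that separates users whose demand can be regulated from those whose demand cannot, can be sketched with scikit-learn's `SVC`. This is a minimal sketch under stated assumptions: the two features, the synthetic clusters, and the settings `C=1.0` and `gamma="scale"` are hypothetical and are not taken from the paper or from the PJM or Open Energy Information data.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Hypothetical features per user: [daily consumption, share of shiftable load].
# Class 1 = demand can be regulated, class 0 = demand cannot be regulated.
regulable = rng.normal(loc=[4.0, 4.0], scale=0.3, size=(60, 2))
fixed = rng.normal(loc=[1.0, 1.0], scale=0.3, size=(60, 2))
X = np.vstack([regulable, fixed])
y = np.concatenate([np.ones(60, dtype=int), np.zeros(60, dtype=int)])

# Shuffle, then keep the last 30 users aside for testing.
order = rng.permutation(len(y))
X, y = X[order], y[order]
X_train, y_train = X[:90], y[:90]
X_test, y_test = X[90:], y[90:]

# Binary SVM with a Gaussian (RBF) kernel.
classifier = SVC(kernel="rbf", C=1.0, gamma="scale")
classifier.fit(X_train, y_train)

test_accuracy = float(classifier.score(X_test, y_test))
selected_users = np.flatnonzero(classifier.predict(X_test) == 1)
print(f"test accuracy = {test_accuracy:.2f}, users passed to stage two = {selected_users.size}")
```

Only the users predicted as class 1 would be passed on to the second stage, where the rule-based curtailment and charging-rate control are applied.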