A dynamic classification unit for online segmentation of big data via small data buffers
A dynamic classification unit for online segmentation of big data via small data buffers-2020
In many segmentation processes, we assign new cases according to a model that was built on the basis of past cases. As long as the new cases are “similar enough” to the past cases, segmentation proceeds normally. However, when a new case is substantially different from the known cases, a reexamination of the previously created segments is required. The reexamination may result in the creation of new segments or in the updating of the existing ones. In this paper, we assume that in big and dynamic data environments it is not possible to reexamine all past data and, therefore, we suggest using small groups of selected cases, stored in small data buffers, as an alternative to the collection of all past data. We present an incremental dynamic classifier that supports real-time unsupervised segmentation in big and dynamic data environments. In order to reduce the computational effort of unsupervised clustering in such environments, the suggested model performs calculations only on the relevant data buffers that store the relevant representative cases. In addition, the suggested model can serve as a dynamic classification unit (DCU) that can act as an autonomous agent, as well as collaborate with other DCUs. The evaluation is presented by comparing three approaches: static, dynamic, and incremental dynamic.
Keywords: Incremental dynamic classifier | Dynamic segmentation | Incremental data analysis | Cluster analysis | Classification | Big data
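The buffer idea described in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' DCU: the class name `BufferedSegmenter`, the Euclidean distance test, the `threshold` cut-off and the fixed `buffer_size` are all hypothetical choices made for the example.

```python
import math
from collections import deque


class BufferedSegmenter:
    """Hypothetical sketch: each segment keeps only a small buffer of cases."""

    def __init__(self, threshold, buffer_size=5):
        self.threshold = threshold      # assumed cut-off for "similar enough"
        self.buffer_size = buffer_size  # assumed cap on stored cases per segment
        self.buffers = []               # one bounded deque of cases per segment

    @staticmethod
    def _centroid(buffer):
        dims = len(buffer[0])
        return [sum(case[d] for case in buffer) / len(buffer) for d in range(dims)]

    def assign(self, case):
        """Return the segment index for `case`, opening a new segment if needed."""
        best, best_dist = None, math.inf
        for idx, buffer in enumerate(self.buffers):
            # Only the small buffer is consulted, never the full history.
            dist = math.dist(case, self._centroid(buffer))
            if dist < best_dist:
                best, best_dist = idx, dist
        if best is None or best_dist > self.threshold:
            # The case is too different from every known segment: create one.
            self.buffers.append(deque([case], maxlen=self.buffer_size))
            return len(self.buffers) - 1
        # Update the existing segment; the oldest case drops out when full.
        self.buffers[best].append(case)
        return best


seg = BufferedSegmenter(threshold=2.0, buffer_size=3)
points = [(0.0, 0.0), (0.5, 0.5), (10.0, 10.0), (10.5, 9.5), (0.2, 0.1)]
labels = [seg.assign(p) for p in points]
print(labels)
```

The bounded `deque` keeps memory per segment constant, which is the property the abstract relies on when it argues that past data cannot be fully reexamined.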
Rapid discrimination of Salvia miltiorrhiza according to their geographical regions by laser induced breakdown spectroscopy (LIBS) and particle swarm optimization-kernel extreme learning machine (PSO-KELM)
Rapid discrimination of Salvia miltiorrhiza according to geographical region by laser-induced breakdown spectroscopy (LIBS) and particle swarm optimization-kernel extreme learning machine (PSO-KELM)-2020
Laser-induced breakdown spectroscopy (LIBS) coupled with a particle swarm optimization-kernel extreme learning machine (PSO-KELM) method was developed for the classification and identification of six types of Salvia miltiorrhiza samples from different regions. The spectral data of 15 Salvia miltiorrhiza samples were collected with a LIBS spectrometer. An unsupervised classification model based on principal component analysis (PCA) was first employed to classify Salvia miltiorrhiza from different regions. The results showed that only the samples from Gansu and Sichuan Provinces could be easily distinguished, while samples from the other regions were harder to classify with PCA. A supervised classification model based on KELM was then developed, and two methods, random forest (RF) and PSO, were used for variable selection to eliminate useless information and improve the classification ability of the KELM model. The results showed that the PSO-KELM model gave the better classification result, with a classification accuracy of 94.87%. Compared with the particle swarm optimization-least squares support vector machine (PSO-LSSVM) and PSO-RF models, the PSO-KELM model had the best classification performance. Overall, the results demonstrate that the LIBS technique combined with the PSO-KELM method is a promising approach for the classification and identification of Salvia miltiorrhiza samples from different regions.
Keywords: Laser-induced breakdown spectroscopy | Particle swarm optimization | Kernel extreme learning machine | Salvia miltiorrhiza | Classification
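For readers unfamiliar with KELM, the classifier at the core of the abstract above has a closed-form fit. The sketch below assumes the commonly used formulation, output weights = (K + I/C)^(-1) T with an RBF kernel, and runs on invented toy points rather than LIBS spectra. The PSO variable-selection step is omitted, and the function names and the values of `C` and `gamma` are illustrative assumptions.

```python
import numpy as np


def rbf_kernel(A, B, gamma):
    """RBF kernel matrix between the rows of A and the rows of B."""
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))


def kelm_fit(X, y, n_classes, C=10.0, gamma=0.5):
    """Solve (K + I/C) beta = T for the output weights beta."""
    T = np.eye(n_classes)[y]  # one-hot class targets
    K = rbf_kernel(X, X, gamma)
    return np.linalg.solve(K + np.eye(len(X)) / C, T)


def kelm_predict(X_train, beta, X_new, gamma=0.5):
    """Predict the class with the largest kernel-weighted score."""
    return (rbf_kernel(X_new, X_train, gamma) @ beta).argmax(axis=1)


# Toy, hand-made data (NOT spectra): two well-separated groups of points.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
              [5.0, 5.0], [5.0, 6.0], [6.0, 5.0]])
y = np.array([0, 0, 0, 1, 1, 1])

beta = kelm_fit(X, y, n_classes=2)
pred = kelm_predict(X, beta, np.array([[0.5, 0.5], [5.5, 5.5]]))
print(pred)
```

In a PSO-KELM setup, a particle swarm would typically search over quantities such as the selected variables or the `C` and `gamma` values; that search is left out here.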
Forensic analysis of condom traces: Chemical considerations and review of the literature
Forensic analysis of condom traces: Chemical considerations and review of the literature-2020
The analysis of condom traces has recently been added to the standard forensic examination protocol for sexual assault and rape cases. Several recent studies have thus focussed on the detection of condom components and the statistical classification of their chemical profiles, obtaining very promising results. The purpose of the present article is to critically review the literature regarding condom chemical analysis. A large analytical panel of both destructive and non-destructive methods has been proposed for the analysis of condom traces, each offering a completely different type of analysis and thus complementary information. However, few studies have considered these traces within a human matrix, which is necessary to establish an accessible protocol that allows forensic laboratories to perform this type of analysis. Additionally, issues remain concerning reproducibility, sensitivity, and the validation of analytical parameters. Considering that the demand for condom residue analysis is increasing, there is a definite need for further research on the forensic analysis of condom traces in order to offer quality services to the criminal justice system.
Keywords: Rape | Sexual assault | Protection | Analytical chemistry | Condom evidence
Refined composite multivariate multiscale symbolic dynamic entropy and its application to fault diagnosis of rotating machine
Refined composite multivariate multiscale symbolic dynamic entropy and its application to fault diagnosis of rotating machines-2020
Accurate and efficient identification of various fault categories, especially for big data and multisensor systems, is a challenge in rotating machinery fault diagnosis. For diagnosis problems with massive multivariate data, extracting discriminative and stable features with high efficiency is a significant step. This paper proposes a novel feature extraction method, called Refined Composite multivariate Multiscale Symbolic Dynamic Entropy (RCmvMSDE), based on refined composite analysis and multivariate multiscale symbolic dynamic entropy. Specifically, multivariate multiscale symbolic dynamic entropy can capture more identification information from multiple sensors with superior computational efficiency, while refined composite analysis guarantees its stability. The ability of the proposed method to measure the complexity of multivariate time series and to identify signals with different components is discussed on the basis of simulation analysis. Further, to verify the effectiveness of the proposed method on fault diagnosis tasks, a centrifugal pump dataset under a constant speed condition and a ball bearing dataset under a time-varying speed condition are used. Compared with existing methods, the proposed method improves the classification accuracy and F-score to 99.81% and 0.9981, respectively. Meanwhile, the proposed method saves at least half of the computational time. The results show that the proposed method is effective in improving both efficiency and classification accuracy when dealing with massive multivariate signals.
Keywords: Multivariate multiscale symbolic dynamic entropy | Random forest | Time-varying speed conditions | Fault diagnosis
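The "refined composite" step mentioned in the abstract above can be illustrated on a single channel. This is a deliberately simplified sketch and not the paper's RCmvMSDE: it coarse-grains a univariate series at every start offset, pools the symbol counts over all shifted series, and takes a plain Shannon entropy of the pooled distribution. The equal-width symbolization and all function names are assumptions made for the example.

```python
import math
from collections import Counter


def coarse_grain(series, scale, offset=0):
    """Average non-overlapping windows of length `scale`, starting at `offset`."""
    usable = (len(series) - offset) // scale
    return [
        sum(series[offset + j * scale: offset + (j + 1) * scale]) / scale
        for j in range(usable)
    ]


def symbolize(series, n_symbols, lo, hi):
    """Map values to integer symbols using equal-width bins on [lo, hi]."""
    width = (hi - lo) / n_symbols or 1.0  # guard against a constant series
    return [min(int((v - lo) / width), n_symbols - 1) for v in series]


def refined_composite_entropy(series, scale, n_symbols=4):
    """Shannon entropy of symbol counts pooled over all `scale` offsets."""
    lo, hi = min(series), max(series)
    counts = Counter()
    for offset in range(scale):
        grained = coarse_grain(series, scale, offset)
        counts.update(symbolize(grained, n_symbols, lo, hi))
    total = sum(counts.values())
    return -sum(c / total * math.log(c / total) for c in counts.values())


alternating = [0.0, 1.0] * 8
h_scale1 = refined_composite_entropy(alternating, scale=1, n_symbols=2)
h_scale2 = refined_composite_entropy(alternating, scale=2, n_symbols=2)
print(h_scale1, h_scale2)
```

Pooling over all offsets is what makes the estimate less sensitive to where the first window happens to start, which is the stability argument made in the abstract.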
A model for big spatial rural data infrastructure in Turkey: Sensor-driven and integrative approach
A model for big spatial rural data infrastructure in Turkey: Sensor-driven and integrative approach-2020
A Spatial Data Infrastructure (SDI) enables effective spatial data flow between providers and users for their prospective land use analyses. An SDI providing soil and land use inventories is crucial for optimizing the sustainable management of agricultural, meadow and forest lands. In an SDI where datasets are static, it is not possible to make quick decisions about land use. Therefore, SDIs must be enhanced with online data flow and the capability to store large volumes of data. This necessity brings the concepts of the Internet of Things (IoT) and Big Data (BD) into the discussion. Turkey needs to establish an SDI to monitor and manage its rural lands. Even though Turkish decision-makers and scientists have established solid national SDI standards, a conceptual model for rural areas has not yet been developed. In accordance with international agreements, this model should adopt the INSPIRE Directive and Land Parcel Identification System (LPIS) standards. Several pieces of Turkish legislation govern land use planning, land classification and restrictions for rural lands. In particular, the Soil Protection and Land Use Law (SPLUL) requires the use of a large number and variety of land use parameters, which should be available in a big rural SDI. Moreover, this model should be enhanced with IoT, which enables the use of smart sensors to collect data for monitoring natural occurrences and other parameters that may help to classify lands. This study focuses on a conceptual model of a Turkish big rural SDI design that combines sensor usage and attribute datasets for all sorts of rural lands. The article first reviews Turkish rural reforms, current efforts toward a national SDI, and sensor-driven agricultural monitoring. The suggested model integrates rural land use types, such as agricultural lands, meadowlands and forest lands. 
During the design process, available data sets and current legislation for Turkish rural lands are taken into consideration. This schema is associated with food security databases (organic and good farming practices), non-agricultural land use applications and local/European subsidies in order to monitor the agricultural parcels on which these practices are implemented. To provide a standard visualization of this conceptual schema, Unified Modeling Language (UML) class diagrams are used, and a supplementary data dictionary is prepared to give clear definitions of the attributes, data types and code lists used in the model. This conceptual model supports the LPIS, the ISO 19156 International Standard (Geographic Information: Observations and Measurements) catalogue and the INSPIRE data theme specifications because Turkey is negotiating accession to the EU; however, it also provides a local understanding that enables rural lands to be managed holistically in line with sustainable development goals. It suggests expanding the sensor variety of the Turkish agricultural monitoring project (TARBIL) and specifies a rural theme for the Turkish National SDI (TUCBS).
Keywords: Spatial data infrastructures | Big data | Internet of things | Rural land use | INSPIRE | LPIS
Improving high-tech enterprise innovation in big data environment: A combinative view of internal and external governance
Improving high-tech enterprise innovation in a big data environment: A combinative view of internal and external governance-2020
The emergence of big data brings both opportunities and challenges to high-tech enterprises. Maintaining competitive advantages and improving innovation performance are important for enterprises in a big data environment. Apart from organizational learning ability and the use of advanced technology, corporate governance also plays an important role in an enterprise's innovation practice. This article combines insights from internal and external governance and explores how managerial power and network centrality affect an enterprise's innovation performance in a big data environment. Considering the differences among regional big data environments (strong/weak), the paper also analyzes the two types of environment separately. The findings show that managerial power has a significant positive impact on innovation performance, that managerial power can enhance an enterprise's centrality in the network, and that an enterprise located in a central network position has more advantages in obtaining resources, which significantly improves its innovation performance. Network centrality plays a mediating role between managerial power and innovation performance. Further analysis finds that the positive effects of managerial power and network centrality are more significant in a strong big data environment. These findings enrich the research on high-tech enterprise innovation from a combinative governance view and contribute to the literature on enterprise innovation in a big data environment.
Keywords: Big data environment | High-tech enterprises | Innovation performance | Managerial power | Network centrality
Big Data Analytics for Venture Capital Application: Towards Innovation Performance Improvement
Big data analytics for venture capital applications: Towards innovation performance improvement-2020
Using panel data on Chinese enterprises, this paper analyzes the influence of venture capital on innovation performance. The number of patent applications and patent quality (invention patent applications, number of effective patents, number of International Patent Classification (IPC) codes, and patent claims) are used to measure the innovation performance of enterprises. The regression results show that venture capital significantly promotes innovation performance; for industries with higher dependence on external financing and high technology intensity, and for areas with better protection of property rights, venture capital promotes innovation performance more significantly. The paper further distinguishes the characteristics of venture capital institutions and finds that the promotion effect of non-state-owned venture capital on innovation performance is significantly greater than that of state-owned venture capital, and that venture capital institutions with a high reputation and high network capital play a more significant role in promoting innovation performance.
Keywords: Data panel model | Big data | Innovation performance
Does government information release really matter in regulating contagion-evolution of negative emotion during public emergencies? From the perspective of cognitive big data analytics
Does government information release matter in regulating the contagion-evolution of negative emotion during public emergencies? From the perspective of cognitive big data analytics-2020
The breeding and spreading of negative emotion in public emergencies pose severe challenges to social governance. Traditional government information release strategies ignore the mechanism by which negative emotion evolves. Focusing on information release policies from the perspective of the government during public emergency events, and using cognitive big data analytics, our research applies a deep learning method to the construction of a news framing framework and explores how the government information release strategy influences the contagion-evolution of negative emotion. In particular, this paper first uses Word2Vec, cosine similarity between word vectors and the SO-PMI algorithm to build a public emergencies-oriented emotional lexicon; it then proposes an emotion computing method based on dependency parsing, and designs an emotion binary tree and dependency-based emotion calculation rules; finally, through an experiment, it shows that the proposed emotional lexicon has wider coverage and higher accuracy than existing ones, and it performs an emotion evolution analysis of an actual public event based on the emotional lexicon, using the proposed emotion computing method. The empirical results show that the algorithm is feasible and effective. The experimental results show that this model can effectively conduct fine-grained emotion computing and improve the accuracy and computational efficiency of sentiment classification. The final empirical analysis found that, due to such defects as slow speed, non-transparent content, poor penitence and weak department coordination, the existing government information release strategies had a significant negative impact on the contagion-evolution of anxiety and disgust, and could not regulate negative emotions effectively. These research results provide theoretical implications and technical support for social governance. 
They could also help to establish a negative emotion management mode and construct a new pattern of public opinion guidance.
Keywords: Government information release | Cognitive big data analytics | E-government | Sentiment analysis | Public emergency events
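The SO-PMI step named in the abstract above scores a candidate word by how much more strongly it co-occurs with positive seed words than with negative ones. The sketch below assumes the usual definition, SO-PMI(w) = sum of PMI(w, p) over positive seeds minus the sum of PMI(w, n) over negative seeds, with PMI(a, b) = log2(P(a, b) / (P(a) P(b))) estimated from document-level co-occurrence. The toy documents and seed words are invented, and treating a zero co-occurrence as PMI = 0 is a simplification chosen for the example.

```python
import math


def so_pmi(word, docs, pos_seeds, neg_seeds):
    """Semantic orientation of `word` from document-level co-occurrence."""
    n = len(docs)

    def prob(*terms):
        # Fraction of documents that contain every term in `terms`.
        return sum(all(t in d for t in terms) for d in docs) / n

    def pmi(a, b):
        joint = prob(a, b)
        # Simplification: no co-occurrence contributes 0 instead of -infinity.
        return math.log2(joint / (prob(a) * prob(b))) if joint > 0 else 0.0

    positive = sum(pmi(word, s) for s in pos_seeds)
    negative = sum(pmi(word, s) for s in neg_seeds)
    return positive - negative


# Invented toy corpus: each document is a set of tokens.
docs = [
    {"rescue", "relief", "good"},
    {"rescue", "good"},
    {"rumor", "panic", "bad"},
    {"rumor", "bad"},
]

score_rescue = so_pmi("rescue", docs, pos_seeds=["good"], neg_seeds=["bad"])
score_rumor = so_pmi("rumor", docs, pos_seeds=["good"], neg_seeds=["bad"])
print(score_rescue, score_rumor)
```

A positive score suggests adding the word to the positive side of the lexicon and a negative score to the negative side; in practice a smoothing scheme would replace the zero-count shortcut used here.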
Thorough state-of-the-art analysis of electric and hybrid vehicle powertrains: Topologies and integrated energy management strategies
Thorough state-of-the-art analysis of electric and hybrid vehicle powertrains: Topologies and integrated energy management strategies-2020
Hybrid and electric vehicles have been demonstrated as auspicious solutions for ensuring improvements in fuel saving and emission reductions. From the system design perspective, there are numerous indicators affecting the performance of such vehicles, in which the powertrain type, component configuration, and energy management strategy (EMS) play a key role. Achieving an energy-efficient powertrain requires tackling several conflicting control objectives such as the drivability, fuel economy, reduced emissions, and battery state of charge preservation, which make the EMS the most crucial aspect of powertrain system design. Accordingly, in the present study, various powertrain systems and topologies of (plug-in) hybrid electric vehicles and full-electric vehicles are assessed. In addition, EMSs as applied in the literature are systematically surveyed for a qualitative investigation, classification, and comparison of existing approaches in terms of the principles, advantages, and drawbacks through a comprehensive review. Furthermore, potential challenges considering the gaps in research are addressed, and directives paving the way toward further development of powertrains and EMSs in all respects are thoroughly provided.
Keywords: Plug-in hybrid electric vehicle | Full-electric vehicle | Energy management strategy optimisation | Online EMS | Offline EMS | Optimal control strategy
A different sleep apnea classification system with neural network based on the acceleration signals
A different sleep apnea classification system with a neural network based on acceleration signals-2020
Background and objective: The apnea syndrome is characterized by an abnormal pause in breathing or a reduction in airflow during sleep. The literature reports that it affects approximately 2% of middle-aged women and 4% of middle-aged men. This study is of vital importance, especially for elderly, disabled, and pediatric sleep apnea patients. Methods: In this study, a new diagnostic method is developed to detect the apnea event by using a microelectromechanical system (MEMS) based acceleration sensor. The sensor records acceleration values by measuring the movements of the diaphragm along three axes during respiration. The measurements are carried out simultaneously with a medical spirometer (Fukuda Sangyo) to test the validity of the results. An artificial neural network (ANN) model was designed to determine the apnea event. For the number of neurons in the hidden layer, values of 1, 3, 5, 10, 18, 20, and 25 were tried, and the network with three hidden neurons, which gave the most suitable result, was selected. The designed ANN has three layers: two neurons in the input layer, three neurons in the hidden layer, and two neurons in the output layer. Results: A study group of 5 patients with different characteristics (age, height, and body weight) was formed. The patients in the study group have sleep apnea (SA) of different grades. A total of 12,723 acceleration (ACC) data points along the X, Y, and Z axes were recorded from the 5 patients for apnea event training and detection. The measured ACC data from one of the patients (called H1) are used to train an ANN. During the training phase, the mean squared error (MSE) is used to calculate the fitness value of the apnea event. The apnea event is then detected successfully for the other patients by using the ANN trained only with H1's ACC data. Conclusions: A sleep apnea event detection system that applies an ANN directly to acceleration values is presented. 
Measurements are performed by the MEMS-based accelerometer and an industrial spirometer simultaneously. A total of 12,723 acceleration data points are measured from 5 different patients. The best result was reached at 7,000 iterations (iteration counts up to 10,000 were tried in steps of 1,000). Only 605 data points from the H1 measurements are used to train the ANN, and then all data, including the H2, H3, H4, and H5 measurements, are used to check the performance of the ANN. The MSE performance benchmark shows that the trained ANN successfully detects apnea events. One of the contributions of this study to the literature is that only ACC data are used in the ANN training step. After training on one patient, the ANN system can monitor the apnea event situation online for others.
Keywords: Sleep apnea | Acceleration sensor | Acceleration data | Artificial neural network | Medical decision making
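The network shape given in the abstract above (two inputs, three hidden neurons, two outputs, trained against an MSE criterion) can be sketched as follows. This is a sketch under assumptions rather than the authors' model: the abstract does not state the input features, activation functions or optimizer, so the sigmoid units, plain gradient descent, learning rate and iteration count are illustrative choices, and the data are synthetic stand-ins, not patient measurements.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data (NOT patient data): 2 features per sample and a
# one-hot target with 2 columns, matching the 2-3-2 layout in the abstract.
X = rng.normal(size=(200, 2))
labels = (X[:, 0] + X[:, 1] > 0).astype(int)
T = np.eye(2)[labels]

# 2 -> 3 -> 2 network parameters.
W1 = rng.normal(scale=0.5, size=(2, 3))
b1 = np.zeros(3)
W2 = rng.normal(scale=0.5, size=(3, 2))
b2 = np.zeros(2)


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def forward(inputs):
    """Return hidden activations and network outputs."""
    hidden = sigmoid(inputs @ W1 + b1)
    outputs = sigmoid(hidden @ W2 + b2)
    return hidden, outputs


def mse(outputs, targets):
    return float(np.mean((outputs - targets) ** 2))


initial_mse = mse(forward(X)[1], T)

learning_rate = 0.5  # assumed value, not taken from the abstract
for _ in range(2000):
    H, Y = forward(X)
    # Gradient of the MSE with respect to the output pre-activations.
    dZ2 = 2.0 * (Y - T) / T.size * Y * (1.0 - Y)
    # Backpropagate to the hidden pre-activations before updating W2.
    dZ1 = (dZ2 @ W2.T) * H * (1.0 - H)
    W2 -= learning_rate * (H.T @ dZ2)
    b2 -= learning_rate * dZ2.sum(axis=0)
    W1 -= learning_rate * (X.T @ dZ1)
    b1 -= learning_rate * dZ1.sum(axis=0)

final_mse = mse(forward(X)[1], T)
print(initial_mse, final_mse)
```

Training on one subset and checking the MSE on the rest would mirror the evaluation described in the abstract, where data from one patient are used for training and the other patients for testing.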