Adaptation of the idea of concept drift to some behavioral biometrics: Preliminary studies
Adapting the idea of concept drift to some behavioral biometrics: Preliminary studies (2021)
In this paper we present a novel strategy that utilizes concept drift to improve some biometric procedures. The proposed method can be applied whenever behavioral signals change and those changes need to be detected. From a security point of view, this is important because detecting a change and responding to it appropriately should alter the operation of the biometric system. For example, this makes it possible to distinguish legitimate from illegitimate users. Experiments performed on real biometric signals have demonstrated that the proposed techniques could be introduced into existing professional biometric systems based on behavioral features.
Keywords: Concept drift | Biometrics | Classifiers | Ensemble of classifiers
Dynamic imbalanced business credit evaluation based on Learn++ with sliding time window and weight sampling and FCM with multiple kernels
Dynamic imbalanced business credit evaluation based on Learn++ with a sliding time window and weight sampling, and FCM with multiple kernels (2020)
A good model of business credit evaluation is an important tool for risk management. Although a dynamic imbalanced data flow is more consistent with the form in which financial data are actually collected, existing studies seldom treat financial data in this form. This paper proposes a new ensemble model for dynamic imbalanced business credit evaluation based on an improved Learn++ and fuzzy c-means (FCM). To handle dynamic imbalanced financial data, Learn++ is improved by using a sliding time window (STW) and weight sampling (WS); the resulting method is termed Learn++.STW-WS. STW groups data with the same concept into the same dataset to address concept drift, which is characteristic of dynamic data. Additionally, WS redistributes the weights of samples from different classes to address class imbalance. To meet the demand of Learn++.STW-WS for an accurate base classifier, FCM is improved with multiple kernels (MK) and is designated MK-FCM. Several kernel functions are combined by the mean method to construct MK, and MK is used to improve how FCM computes distances among points. This new ensemble model can therefore handle dynamic data and imbalanced classes at the same time. In the empirical study, financial data from Chinese listed companies are used to evaluate business credit risk, and related models are used for comparative analysis. The experimental results demonstrate the good performance of the new ensemble model in handling dynamic imbalanced financial data.
Keywords: Business credit evaluation | Dynamic imbalanced financial data | Ensemble model | Learn++ | Fuzzy c-means
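The multiple-kernel idea described in the abstract above can be illustrated with a minimal sketch. It assumes that "the mean method" means an unweighted average of kernel values, and it uses the standard kernel-induced squared distance K(x,x) - 2K(x,y) + K(y,y). The choice of a Gaussian and a polynomial kernel, their parameters, and all function names are hypothetical and are not taken from the paper.

```python
import math

def gaussian_kernel(x, y, sigma=1.0):
    # Gaussian (RBF) kernel; sigma is an assumed bandwidth.
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq / (2.0 * sigma ** 2))

def polynomial_kernel(x, y, degree=2, c=1.0):
    # Polynomial kernel; degree and offset c are assumed values.
    return (sum(a * b for a, b in zip(x, y)) + c) ** degree

def mean_kernel(x, y, kernels):
    # Combine several kernels by taking their unweighted mean.
    return sum(k(x, y) for k in kernels) / len(kernels)

def kernel_distance_sq(x, y, kernels):
    # Squared distance induced by the combined kernel in feature space.
    return (mean_kernel(x, x, kernels)
            - 2.0 * mean_kernel(x, y, kernels)
            + mean_kernel(y, y, kernels))

KERNELS = [gaussian_kernel, polynomial_kernel]
a, b = (0.0, 0.0), (1.0, 1.0)
d_same = kernel_distance_sq(a, a, KERNELS)  # distance of a point to itself
d_diff = kernel_distance_sq(a, b, KERNELS)  # distance between distinct points
print(d_same, d_diff)
```

In a fuzzy c-means loop, a distance of this kind would stand in for the Euclidean distance between a point and a cluster centre; how MK-FCM updates the centres is not described in the abstract and is left out here.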
Financial portfolio optimization with online deep reinforcement learning and restricted stacked autoencoder-DeepBreath
Financial portfolio optimization with online deep reinforcement learning and restricted stacked autoencoder: DeepBreath (2020)
The process of continuously reallocating funds into financial assets, aiming to increase the expected return on investment while minimizing risk, is known as portfolio management. In this paper, a portfolio management framework is developed based on a deep reinforcement learning framework called DeepBreath. The DeepBreath methodology combines a restricted stacked autoencoder and a convolutional neural network (CNN) into an integrated framework. The restricted stacked autoencoder is employed for dimensionality reduction and feature selection, thus ensuring that only the most informative abstract features are retained. The CNN is used to learn and enforce the investment policy, which consists of reallocating the various assets in order to increase the expected return on investment. The framework consists of both offline and online learning strategies: the former is required to train the CNN, while the latter handles concept drifts, i.e. changes in the data distribution resulting from unforeseen circumstances. These are based on passive concept drift detection and online stochastic batching. Settlement risk may arise when a delay between the acquisition of an asset and its payment causes the terms of a contract not to be delivered. To tackle this challenging issue, a blockchain is employed. Finally, the performance of the DeepBreath framework is tested with four test sets over three distinct investment periods. The results show that the return on investment achieved by our approach outperforms current expert investment strategies while minimizing the market risk.
Keywords: Portfolio management | Deep reinforcement learning | Restricted stacked autoencoder | Online learning | Settlement risk | Blockchain
A non-canonical hybrid metaheuristic approach to adaptive data stream classification
A non-canonical hybrid metaheuristic approach to adaptive data stream classification (2020)
Data stream classification techniques have recently played an important role in big data analytics due to their diverse applications (e.g. fraud and intrusion detection, forecasting and healthcare monitoring systems) and the growing number of real-world data stream generators (e.g. IoT devices and sensors, websites and social network feeds). Streaming data is often prone to evolution over time. In this context, the main challenge for computational models is to adapt to changes, known as concept drifts, using data mining and optimisation techniques. We present a novel ensemble technique called RED-PSO that seamlessly adapts to different concept drifts in non-stationary data stream classification tasks. RED-PSO is based on a three-layer architecture that produces classification types of different sizes, each created by randomly selecting a certain percentage of features from the pool of features of the target data stream. An evolutionary algorithm, namely Replicator Dynamics (RD), is used to seamlessly adapt to different concept drifts; it allows well-performing types to grow and poorly performing ones to shrink in size. In addition, the selected feature combinations in all classification types are optimised using a non-canonical version of the Particle Swarm Optimisation (PSO) technique for each layer individually. PSO allows the types in each layer to move towards local (within the same type) and global (across all types) optima with a specified velocity. A set of experiments is conducted to compare the performance of the proposed method with state-of-the-art algorithms using real-world and synthetic data streams in immediate and delayed prequential evaluation settings. The results show a favourable performance of our method in different environments.
Keywords: Ensemble learning | Data stream mining | Concept drifts | Bio-inspired algorithms | Non-stationary environments | Particle swarm optimisation | Replicator dynamics
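The role that Replicator Dynamics plays in the abstract above (well-performing types grow, poorly performing ones shrink) can be sketched with the textbook discrete-time replicator update, in which each share is scaled by its fitness relative to the population average. Treating classification accuracy as the fitness, and the specific accuracy values below, are assumptions made for illustration; the abstract does not specify how RED-PSO measures fitness or resizes its types.

```python
def replicator_step(shares, fitness):
    # One discrete-time replicator update:
    # new_share_i = share_i * fitness_i / (population-average fitness).
    avg = sum(s * f for s, f in zip(shares, fitness))
    if avg == 0:
        return list(shares)  # no type has any fitness; leave shares unchanged
    return [s * f / avg for s, f in zip(shares, fitness)]

# Three hypothetical classifier types with equal initial shares.
shares = [1 / 3, 1 / 3, 1 / 3]
accuracy = [0.9, 0.6, 0.3]  # assumed fitness values, not from the paper

for _ in range(5):
    shares = replicator_step(shares, accuracy)

print([round(s, 3) for s in shares])
```

The shares always sum to one, so growth of an accurate type comes directly at the expense of the weaker ones. After a concept drift the accuracies would change, and the same update would shift the shares towards whichever types cope best with the new concept.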
Machine learning based concept drift detection for predictive maintenance
Machine learning based concept drift detection for predictive maintenance (2019)
In this work we present a machine learning based approach for detecting drifting behavior, so-called concept drift, in continuous data streams. The motivation for this contribution originates from the intensively investigated topic of Predictive Maintenance (PdM), which refers to a proactive way of triggering servicing actions for industrial machinery. The aim of this maintenance strategy is to identify wear and tear, and the consequent malfunctioning, by analyzing in real time the condition monitoring data recorded by sensor-equipped machinery. Recent developments in this area have shown potential to save time and material by preventing breakdowns and improving the overall predictability of industrial processes. However, due to the lack of high-quality monitoring data and limited experience concerning the applicability of analysis methods, real-world implementations of Predictive Maintenance are still rare. Within this contribution, we present a method to detect concept drift in data streams as a potential indication of defective system behavior and describe initial tests on synthetic data sets. Furthermore, we present a real-world case study with industrial radial fans and discuss promising results gained from applying the approach in this setting.
Keywords: Predictive maintenance | Machine learning | Concept drift detection | Time series regression | Industrial radial fans
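The abstract above does not spell out its detector, so the following sketch swaps in the Page-Hinkley test, a classical change detector for data streams, purely to illustrate what flagging a drift in a monitored signal can look like. The monitored value could be, for example, the residual of a regression model on sensor readings. The class name, the parameter values `delta` and `threshold`, and the synthetic stream are all assumptions.

```python
class PageHinkley:
    """Page-Hinkley test for detecting an upward shift in a stream's mean."""

    def __init__(self, delta=0.005, threshold=5.0):
        self.delta = delta          # tolerated magnitude of change (assumed)
        self.threshold = threshold  # alarm threshold, often called lambda (assumed)
        self.n = 0
        self.mean = 0.0
        self.cum = 0.0              # cumulative deviation from the running mean
        self.min_cum = 0.0          # smallest cumulative deviation seen so far

    def update(self, x):
        # Update the running mean, then the cumulative deviation statistic.
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.cum += x - self.mean - self.delta
        self.min_cum = min(self.min_cum, self.cum)
        # Signal drift when the statistic rises far above its minimum.
        return self.cum - self.min_cum > self.threshold

# Synthetic stream: a stable signal followed by an abrupt shift in level.
stream = [0.0] * 50 + [1.0] * 50
detector = PageHinkley()
drift_at = None
for i, value in enumerate(stream):
    if detector.update(value):
        drift_at = i
        break

print("drift flagged at index", drift_at)
```

A larger `threshold` reduces false alarms at the cost of a longer detection delay, which is the usual trade-off when such a detector is tuned for maintenance alerts.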
LSC: Online auto-update smart contracts for fortifying blockchain-based log systems
LSC: Online auto-update smart contracts for fortifying blockchain-based log systems (2019)
Smart contracts allow verifiable operations to be executed in blockchains, bringing new possibilities for trust establishment in trustless scenarios. However, smart contracts are cumbersome when used as security mechanisms for two reasons: they have limited power and are inert to changes. To mitigate these two problems, we propose LSC, a framework for online auto-update smart contracts in blockchain-based log systems, to enable self-adaptive log anomaly detection via smart contracts. Time-varying log anomaly detection patterns are extracted by self-adaptive machine learning log anomaly analysis and are continuously fed to the contracts. The framework allows smart contracts to be automatically updated to express the patterns in low-cost ways. The anomaly detection strategies for audit log systems are shared and collaboratively enforced amongst network nodes to defend against targeted detection evasion. We provide a plain prototype as a proof of the feasibility and efficiency of LSC in log systems.
Keywords: Smart contracts | Anomaly detection | Blockchain security | Security dynamics | Concept drift
The Gradual Resampling Ensemble for mining imbalanced data streams with concept drift
The Gradual Resampling Ensemble for mining imbalanced data streams with concept drift (2018)
Knowledge extraction from data streams has received increasing interest in recent years. However, most existing studies assume that the class distribution of data streams is relatively balanced. Reacting to concept drifts is more difficult if a data stream is class imbalanced. Current oversampling methods generally absorb previously received minority examples selectively into the current minority set by evaluating the similarity between past minority examples and the current minority set. However, the similarity evaluation is easily affected by data difficulty factors. Meanwhile, these oversampling techniques ignore the majority class distribution, thus risking class overlap. To overcome these issues, we propose an ensemble classifier called Gradual Resampling Ensemble (GRE). GRE can handle data streams that exhibit both concept drifts and class imbalance. On the one hand, a selective resampling method, which avoids drifting data, is applied to select a part of the previous minority examples for amplifying the current minority set. The disjuncts are discovered by DBSCAN clustering, so the influence of small disjuncts and outliers on the similarity evaluation can be avoided. Only minority examples with a low probability of overlapping with the current majority set are selected for resampling the current minority set. On the other hand, previous component classifiers are updated using the latest instances. Thus, the ensemble can quickly adapt to a new condition, regardless of the type of concept drift. Through the gradual oversampling of previous chunks using the current minority events, the class distribution of past chunks can be balanced. Favorable results in comparison to other algorithms suggest that GRE can maintain good performance on the minority class without sacrificing majority class performance.
Keywords: Concept drift | Data stream mining | Ensemble classifier | Class imbalance
Countering the concept-drift problems in big data by an incrementally optimized stream mining model
Countering the concept-drift problems in big data by an incrementally optimized stream mining model (2015)
Article history: Received 7 March 2014; revised 29 May 2014; accepted 6 July 2014; available online 22 July 2014.
Mining the potential value hidden behind big data has been a popular research topic around the world. In an infinite big data scenario, the underlying distribution of newly arrived data may differ from that of older data. This phenomenon, the so-called concept-drift problem, is common in big data mining. In the past decade, decision tree induction has used multi-tree learning, with alternative trees, to detect the drift. However, multi-tree algorithms consume more computing resources than a single tree. This paper proposes a single tree with an optimized node-splitting mechanism to detect the drift in a test-then-train tree-building process. In the experiments, we compare the performance of the new method with some state-of-the-art single-tree and multi-tree algorithms. The results show that the new algorithm achieves good accuracy with a more compact model and lower memory use than the others. © 2014 Elsevier Inc. All rights reserved.
Keywords: Concept drift | Data stream mining | Very fast decision tree
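The keywords above name the very fast decision tree (VFDT), whose standard node-splitting rule relies on the Hoeffding bound. The sketch below shows only that standard rule, as background; the optimized node-splitting mechanism proposed in the paper is not described in the abstract and is not reproduced here. The function names and example numbers are hypothetical.

```python
import math

def hoeffding_bound(value_range, delta, n):
    # With probability 1 - delta, the true mean of a variable with the given
    # range lies within this epsilon of the mean observed over n samples.
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))

def should_split(best_gain, second_gain, value_range, delta, n):
    # Standard VFDT rule: split once the gap between the two best attributes
    # exceeds the Hoeffding bound for the number of samples seen at the leaf.
    return (best_gain - second_gain) > hoeffding_bound(value_range, delta, n)

# Assumed example: information gain for a two-class problem (range 1.0),
# with the best attribute ahead of the runner-up by 0.10.
few = should_split(0.30, 0.20, value_range=1.0, delta=1e-7, n=200)
many = should_split(0.30, 0.20, value_range=1.0, delta=1e-7, n=5000)
print(few, many)
```

Because the bound shrinks as more examples reach a leaf, the same gap in gain that is inconclusive early on can justify a split later. This is what lets a single tree grow incrementally from a stream.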