Semi-supervised gear fault diagnosis using raw vibration signal based on deep learning (2019)
In the aerospace industry, gears are the most common parts of a mechanical transmission system. Gear pitting faults can cause the transmission system to fail and lead to safety disasters. Diagnosing the gear pitting condition directly from the raw vibration signal remains a challenging problem. In this paper, a novel method named augmented deep sparse autoencoder (ADSAE) is proposed. The method can diagnose gear pitting faults from relatively little raw vibration data. It builds on the theory of pitting fault diagnosis and combines data augmentation with the deep sparse autoencoder algorithm for the diagnosis of gear wear faults. The effectiveness of the proposed method is validated by experiments on six types of gear pitting conditions. The results show that the ADSAE method effectively increases the generalization ability and robustness of the network while achieving very high accuracy. The method can distinguish different gear pitting conditions and shows a clear trend with the severity of gear wear faults. The results obtained by the ADSAE method are compared with those obtained by other common deep learning methods. This paper provides insight into gear fault diagnosis based on deep learning and has potential practical application value.
Keywords: Deep learning | Gear pitting diagnosis | Gear teeth | Raw vibration signal | Semi-supervised learning | Sparse autoencoder
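The ADSAE abstract names two ingredients, data augmentation and a sparsity constraint, which the following minimal Python sketch illustrates. It assumes overlapping window slicing as the augmentation and the standard KL-divergence penalty as the sparsity term; the abstract specifies neither, and the function names, window sizes, and target sparsity `rho` are illustrative only.

```python
import math


def sliding_windows(signal, window, stride):
    """Slice one long recording into overlapping fixed-length segments.

    Assumed augmentation: a stride smaller than the window yields more
    training segments from the same raw recording.
    """
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    last_start = len(signal) - window
    return [signal[i:i + window] for i in range(0, last_start + 1, stride)]


def kl_sparsity_penalty(mean_activations, rho=0.05):
    """Sum of KL(rho || rho_hat_j) over hidden units.

    mean_activations holds the average activation of each hidden unit,
    each expected to lie in (0, 1). rho is a hypothetical target sparsity.
    """
    total = 0.0
    for rho_hat in mean_activations:
        # Clamp to avoid log(0) for saturated units.
        rho_hat = min(max(rho_hat, 1e-8), 1.0 - 1e-8)
        total += rho * math.log(rho / rho_hat)
        total += (1.0 - rho) * math.log((1.0 - rho) / (1.0 - rho_hat))
    return total
```

The penalty is zero when every hidden unit's mean activation equals `rho` and grows as units become more active, which is what pushes an autoencoder toward sparse codes.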
Energy consumption modelling using deep learning embedded semi-supervised learning (2019)
Reduction of energy consumption in the steel industry is a global issue that governments are actively taking measures to address. A steel plant can manage its energy better if the consumption can be modelled and predicted. The existing methods used for energy consumption modelling depend on the quantity of labelled data. However, if labelled energy consumption data is scarce, modelling and prediction become difficult. The purpose of this study is to establish an energy value prediction model through a big data-driven approach. Because labelled energy data is often limited and expensive to obtain, while unlabelled data is abundant in real-world industry, a semi-supervised learning approach, i.e., deep learning embedded semi-supervised learning (DLeSSL), is proposed to tackle the issue. With DLeSSL, unlabelled data can be labelled and compensated using a semi-supervised learning approach that has a deep learning technique embedded, so as to expand the labelled data set. An experimental study using a large amount of furnace energy consumption data shows the merits of the proposed approach. Results derived using the proposed method reveal that deep learning (DLeSSL based) outperforms deep learning (supervised) and deep learning (label propagation based) when the labelled data is limited. In addition, the effect of the size of the labelled and unlabelled data on performance is also reported.
Keywords: Energy modelling | Intelligent manufacturing | Deep learning | Semi-supervised learning | Data mining
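The idea of expanding a small labelled set by labelling unlabelled data can be sketched as follows. This is not the DLeSSL procedure itself: a k-nearest-neighbour averaging regressor on one-dimensional inputs stands in for the embedded deep learning model, and all names and values are hypothetical.

```python
def knn_predict(train, x, k=2):
    """Average the targets of the k training points closest to x.

    train is a list of (input, target) pairs with scalar inputs.
    """
    nearest = sorted(train, key=lambda pair: abs(pair[0] - x))[:k]
    return sum(target for _, target in nearest) / len(nearest)


def expand_with_pseudo_labels(labelled, unlabelled, k=2):
    """Label each unlabelled input with the current model and append it.

    Returns a new list; the original labelled set is left unchanged.
    """
    expanded = list(labelled)
    for x in unlabelled:
        expanded.append((x, knn_predict(labelled, x, k)))
    return expanded


# Hypothetical data: two labelled readings and one unlabelled input.
labelled = [(0.0, 0.0), (10.0, 10.0)]
expanded = expand_with_pseudo_labels(labelled, [5.0])
```

A model refitted on `expanded` sees three points instead of two, so its predictions between the original readings can change; whether that helps depends on how accurate the pseudo-labels are.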
Semi-Supervised Learning Based Big Data-Driven Anomaly Detection in Mobile Wireless Networks (2018)
With rising capacity demand in mobile networks, the infrastructure is becoming increasingly dense and complex. This results in the collection of larger amounts of raw data (big data), which is generated at different levels of the network architecture and is typically underutilized. To unleash its full value, innovative machine learning algorithms are needed to extract valuable insights that can be used to improve the overall network's performance. Additionally, a major challenge for network operators is to cope with an increasing number of complete (or partial) cell outages while simultaneously reducing operational expenditure. This paper contributes towards the aforementioned problems by exploiting big data generated from the core network of 4G LTE-A to detect anomalous network behavior. We present a semi-supervised statistical anomaly detection technique to identify in time: first, regions of unusually low user activity indicating a sleeping cell, which is a special case of cell outage; and second, areas of unusually high user traffic, where special action such as additional resource allocation or a fault avoidance solution may be needed. The results demonstrate that the proposed method can be used for timely and reliable anomaly detection in current and future cellular networks.
Keywords: 5G; 4G LTE-A; anomaly detection; call detail record; machine learning; big data analytics; network behavior analysis; sleeping cell
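The two anomaly classes described above, unusually low and unusually high activity, can be illustrated with a simple z-score rule fitted on normal data only. This is an assumed stand-in, not the paper's technique; the threshold of 3 standard deviations and the activity counts are hypothetical.

```python
from statistics import mean, stdev


def fit_baseline(normal_counts):
    """Estimate mean and standard deviation from known-normal activity."""
    mu = mean(normal_counts)
    sigma = stdev(normal_counts)
    if sigma == 0:
        raise ValueError("baseline has zero variance")
    return mu, sigma


def classify_activity(count, baseline, threshold=3.0):
    """Return 'low', 'high', or 'normal' for one activity count."""
    mu, sigma = baseline
    z = (count - mu) / sigma
    if z < -threshold:
        return "low"   # candidate sleeping cell
    if z > threshold:
        return "high"  # candidate traffic surge
    return "normal"


# Hypothetical per-cell activity counts observed under normal operation.
baseline = fit_baseline([100, 102, 98, 101, 99])
```

Fitting on normal observations only is what makes the rule semi-supervised in the loose sense used here: no labelled anomalies are needed to set the baseline.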
A rejection inference technique based on contrastive pessimistic likelihood estimation for P2P lending (2018)
The majority of current credit-scoring models are built solely on accepted samples and thus cause sample bias. Sample bias is particularly severe in the peer-to-peer (P2P) lending domain due to its comparatively high rejection rate. Reject inference solves sample bias by inferring the possible outcomes of rejected samples and incorporating them into credit score modeling. This study addresses the problem of reject inference in a specific P2P lending domain from the perspective of semi-supervised learning. A novel reject inference method (CPLE-LightGBM) is proposed by combining the contrastive pessimistic likelihood estimation framework and an advanced gradient boosting decision tree classifier (LightGBM). The performance of the proposed CPLE-LightGBM method is validated on multiple datasets, and results demonstrate the efficiency of our proposal. Analysis of the influence of rejection rate on predictive accuracy reveals the usefulness of sampling in rejected datasets.
Keywords: Big data applications | Contrastive pessimistic likelihood | Credit scoring | Data analytics | Gradient boosting decision tree estimation | Machine learning | P2P lending | Reject inference | Semi-supervised learning
From big data to knowledge: A spatio-temporal approach to malware detection (2018)
The deployment of endpoint protection has gradually migrated from individual clients to remote cloud servers, an arrangement termed cloud based security service. This new paradigm of security defense produces a large amount of data and log files, and motivates data driven techniques for detecting malicious software. This paper conducts an empirical study on the log of a real cloud based security service to characterize the occurrence of executable files in end hosts, covering 124,782 benign and 113,305 malicious executable files that occurred on 165,549,417 end hosts. The end hosts and the timestamps at which an executable file occurs provide insights into the distribution of software in the wild from spatial and temporal perspectives, respectively. We also investigate the mechanisms behind these characterizations, and observe a preferential attachment process and periodicity in file occurrence on end hosts. The different occurrence patterns observed for benign and malicious files in end hosts inspire a new scalable approach to malware detection. We learn from the characterizations that associated files which share more spatial and temporal information are more likely to have the same label, either benign or malicious. Thus, we devise a graph based semi-supervised learning algorithm for real-time malware detection that takes into account the spatio-temporal information of the distribution of executable files. Experimental results demonstrate that our approach increases malware detection performance by 14.7% on average over previous techniques.
Keywords: Malware detection | Data-driven security analysis | File co-occurrence | Graph based semi-supervised learning | Content-agnostic
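The observation that strongly associated files tend to share a label is the premise of graph based label propagation, sketched below in Python. This is a generic propagation scheme with clamped seed labels, not the algorithm from the paper; the graph, the edge weights (standing in for spatio-temporal co-occurrence strength), and the file names are hypothetical.

```python
def propagate_labels(graph, seeds, iterations=20):
    """Spread seed scores over a weighted undirected graph.

    graph maps each node to a dict {neighbour: weight}; every neighbour
    must also appear as a key. seeds maps labelled nodes to +1.0
    (malicious) or -1.0 (benign); seed scores stay fixed.
    """
    scores = {node: seeds.get(node, 0.0) for node in graph}
    for _ in range(iterations):
        updated = {}
        for node, neighbours in graph.items():
            if node in seeds:
                updated[node] = seeds[node]
                continue
            total = sum(neighbours.values())
            if total == 0:
                updated[node] = 0.0
                continue
            # Weighted average of the neighbours' current scores.
            weighted = sum(w * scores[n] for n, w in neighbours.items())
            updated[node] = weighted / total
        scores = updated
    return scores


# Hypothetical co-occurrence graph: unknown.exe appears alongside
# bad.exe three times as strongly as alongside good.exe.
graph = {
    "bad.exe": {"unknown.exe": 3.0},
    "good.exe": {"unknown.exe": 1.0},
    "unknown.exe": {"bad.exe": 3.0, "good.exe": 1.0},
}
scores = propagate_labels(graph, {"bad.exe": 1.0, "good.exe": -1.0})
```

A positive final score for an unlabelled file suggests it sits closer to malicious seeds in the graph, a negative one closer to benign seeds.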
Semi-supervised learning for big social data analysis (2018)
In an era of social media and connectivity, web users are becoming increasingly enthusiastic about interacting, sharing, and working together through online collaborative media. More recently, this collective intelligence has spread to many different areas, with a growing impact on everyday life, such as in education, health, commerce and tourism, leading to an exponential growth in the size of the social Web. However, the distillation of knowledge from such unstructured big data is an extremely challenging task. Consequently, the semantic and multimodal contents of the Web today, whilst well suited for human use, are still barely accessible to machines. In this work, we explore the potential of a novel semi-supervised learning model based on the combined use of random projection scaling as part of a vector space model, and support vector machines to perform reasoning on a knowledge base. The latter is developed by merging a graph representation of commonsense with a linguistic resource for the lexical representation of affect. Comparative simulation results show a significant improvement in tasks such as emotion recognition and polarity detection, and pave the way for the development of future semi-supervised learning approaches to big social data analytics.
Keywords: Semi-supervised learning | Big social data analysis | Sentiment analysis
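The dimensionality reduction step mentioned above can be illustrated with a plain Gaussian random projection in Python. This is an assumed, generic variant rather than the specific random projection scaling used in the paper, and the classifier stage (a support vector machine) is omitted; the dimensions and seed are hypothetical.

```python
import math
import random


def make_projection(input_dim, output_dim, seed=0):
    """Build an output_dim x input_dim matrix of scaled Gaussian entries."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(output_dim)
    return [
        [rng.gauss(0.0, 1.0) * scale for _ in range(input_dim)]
        for _ in range(output_dim)
    ]


def project(matrix, vector):
    """Map a high-dimensional vector to the lower-dimensional space."""
    if any(len(row) != len(vector) for row in matrix):
        raise ValueError("vector length must match matrix width")
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


# Hypothetical sizes: reduce 50-dimensional concept vectors to 8 dimensions.
matrix = make_projection(input_dim=50, output_dim=8, seed=1)
reduced = project(matrix, [1.0] * 50)
```

The reduced vectors would then be passed to a downstream classifier; the projection is linear, so sums and scalings of input vectors are preserved in the reduced space.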