Characterizing Linux-based malware: Findings and recent trends
Persian title (translated): Characterizing Linux-based malware: Findings and recent trends (2020)
Malware targeting interconnected infrastructures has surged in recent years. A major factor driving this phenomenon is the proliferation of large networks of poorly secured IoT devices. This is exacerbated by the commoditization of the malware development industry, in which tools can be readily obtained in specialized hacking forums or underground markets. However, despite the great interest in targeting this infrastructure, there is little understanding of what the main features of this type of malware are, or the motives of the criminals behind it, apart from the classic denial of service attacks. This is vital to modern malware forensics, where analyses are required to measure the trustworthiness of files collected at large during an investigation, but also to confront challenges posed by tech-savvy criminals (e.g., Trojan Horse Defense). In this paper, we present a comprehensive characterization of Linux-based malware. Our study is tailored to IoT malware and it leverages automated techniques using both static and dynamic analysis to classify malware into related threats. By looking at the most representative dataset of Linux-based malware collected by the community to date, we are able to show that our system can accurately characterize known threats. As a key novelty, we use our system to investigate a number of threats unknown to the community. We do this in two steps. First, we identify known patterns within an unlabeled dataset using a classifier trained with the labeled dataset. Second, we combine our features with a custom distance function to discover new threats by clustering together similar samples. We further study each of the unknown clusters by using state-of-the-art reverse engineering and forensic techniques and our expertise as malware analysts. We provide an in-depth analysis of what the most recent unknown trends are through a number of case studies. 
Among other findings, we observe that: i) crypto-mining malware is permeating the IoT infrastructure, ii) the level of sophistication is increasing, and iii) there is a rapid proliferation of new variants with minimal investment in infrastructure.
Keywords: Malware forensics | IoT | Embedded systems | Data analytics | Machine learning | Expert systems
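The second step described in the abstract, grouping similar unlabeled samples with a distance function, can be sketched as follows. This is only a minimal illustration: the abstract does not specify the custom distance or the clustering algorithm, so the Jaccard distance, the single-linkage threshold rule, and all feature names below are assumptions.

```python
def jaccard_distance(a, b):
    """Jaccard distance between two feature sets (an assumed stand-in
    for the paper's custom distance function)."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def cluster_samples(features, threshold=0.5):
    """Group samples whose pairwise distance is below `threshold`.

    `features` maps a sample name to a set of extracted features.
    Samples are merged with a union-find structure, which amounts to
    single-linkage clustering cut at the threshold.
    """
    names = list(features)
    parent = {n: n for n in names}

    def find(n):
        # Follow parent links to the cluster representative.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if jaccard_distance(features[a], features[b]) < threshold:
                parent[find(a)] = find(b)

    clusters = {}
    for n in names:
        clusters.setdefault(find(n), set()).add(n)
    return sorted(clusters.values(), key=lambda c: sorted(c))


# Hypothetical static/dynamic features for three unlabeled samples.
samples = {
    "s1": {"connect", "fork", "miner_string"},
    "s2": {"connect", "fork", "miner_string", "chmod"},
    "s3": {"ptrace", "unlink"},
}
clusters = cluster_samples(samples, threshold=0.5)
```

Each resulting cluster would then be a candidate for the manual reverse engineering the abstract mentions.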
In this paper, a novel problem in transshipment networks is proposed. The main aims of this paper are to introduce the problem and to give useful tools for solving it both exactly and approximately. In a transshipment network it is important to decide which are the best paths between each pair of nodes. Representing the network by a graph, the union of these paths is a delivery subgraph of the original graph, which has all the nodes and some of the edges. Nodes in this subgraph that are adjacent to more than two nodes are called switches because, when sending the flow between any pair of nodes, the switches on the path must adequately direct it. Switches are facilities which direct flows among users. The installation of a switch involves the installation of adequate equipment and thus an allocation cost. Furthermore, traversing a switch also implies a service cost or allocation cost. The Switch Location Problem is defined as the problem of determining the delivery subgraph with the lowest total cost. Two of the three solution approaches that we propose, the exact and the math-heuristic ones, are decomposition algorithms based on articulation vertices. These two approaches could be embedded in expert systems for locating switches in transshipment networks. The results should help a decision maker to select the adequate approach depending on the shape and size of the network and also on the external time limit. Our results show that the exact approach is a valuable tool if the network has fewer than 1000 nodes. Two upsides of our heuristics are that they do not require special networks and that they give good solutions, gap-wise. The impact of this paper is twofold: it highlights the difficulty of adequately locating switches and it emphasizes the benefit of decomposition algorithms.
Keywords: Discrete location | Math-heuristic | Articulation vertex | Block-Cutpoint graph
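The two graph notions used in the abstract, switches (nodes adjacent to more than two nodes in the delivery subgraph) and articulation vertices, can be illustrated with a short sketch. The edge list is a hypothetical example, and the articulation search is the textbook depth-first low-link method, not necessarily the decomposition the authors implement.

```python
from collections import defaultdict


def build_graph(edges):
    """Return an undirected adjacency map from a list of (u, v) edges."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def find_switches(adj):
    """Nodes adjacent to more than two nodes in the delivery subgraph."""
    return {node for node, nbrs in adj.items() if len(nbrs) > 2}


def articulation_vertices(adj):
    """Vertices whose removal disconnects the graph (DFS low-link search)."""
    disc, low, cut = {}, {}, set()
    counter = [0]

    def dfs(u, parent):
        disc[u] = low[u] = counter[0]
        counter[0] += 1
        children = 0
        for v in adj[u]:
            if v == parent:
                continue
            if v in disc:
                # Back edge: u can reach an earlier vertex.
                low[u] = min(low[u], disc[v])
            else:
                children += 1
                dfs(v, u)
                low[u] = min(low[u], low[v])
                # Non-root u is a cut vertex if v cannot reach above u.
                if parent is not None and low[v] >= disc[u]:
                    cut.add(u)
        # The DFS root is a cut vertex only with two or more subtrees.
        if parent is None and children > 1:
            cut.add(u)

    for node in list(adj):
        if node not in disc:
            dfs(node, None)
    return cut


# Hypothetical delivery subgraph: a tree-like part plus one cycle d-e-f.
delivery = build_graph([("a", "b"), ("b", "c"), ("b", "d"),
                        ("d", "e"), ("e", "f"), ("d", "f")])
switches = find_switches(delivery)
cut_vertices = articulation_vertices(delivery)
```

In this toy graph both `b` and `d` are switches and also articulation vertices, so removing either splits the network into blocks that could be handled separately.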
On initial population generation in feature subset selection
Persian title (translated): On initial population generation in feature subset selection (2019)
Performance of evolutionary algorithms depends on many factors such as population size, number of generations, crossover or mutation probability, etc. Generating the initial population is one of the important steps in evolutionary algorithms. A poor initial population may unnecessarily increase the number of searches or it may cause the algorithm to converge at local optima. In this study, we aim to find a promising method for generating the initial population in the Feature Subset Selection (FSS) domain. FSS is not considered as an expert system by itself, yet it constitutes a significant step in many expert systems. It eliminates redundancy in data, which decreases training time and improves solution quality. To achieve our goal, we compare a total of five different initial population generation methods: Information Gain Ranking (IGR), a greedy approach and three types of random approaches. We evaluate these methods using a specialized Teaching Learning Based Optimization searching algorithm (MTLBO-MD), and three supervised learning classifiers: Logistic Regression, Support Vector Machines, and Extreme Learning Machine. In our experiments, we employ 12 publicly available datasets, mostly obtained from the well-known UCI Machine Learning Repository. According to their feature sizes and instance counts, we manually classify these datasets as small, medium, or large-sized. Experimental results indicate that all tested methods achieve similar solutions on small-sized datasets. For medium-sized and large-sized datasets, however, the IGR method provides a better starting point in terms of execution time and learning performance. Finally, when compared with other studies in the literature, the IGR method proves to be a viable option for initial population generation.
Keywords: Feature subset selection | Initial population | Multiobjective optimization
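To make the idea of Information Gain Ranking concrete, the sketch below ranks discrete features by information gain and seeds a population of feature masks from that ranking. The information gain formula is standard; the seeding rule (individual i keeps the top-k ranked features, with k cycling from 1 to the number of features) is an assumption for illustration, since the abstract does not state how IGR builds the population.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def information_gain(column, labels):
    """Label entropy minus conditional entropy given a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(column, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_features(rows, labels):
    """Feature indices sorted by decreasing information gain."""
    n_features = len(rows[0])
    gains = [information_gain([r[j] for r in rows], labels)
             for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: gains[j], reverse=True)


def igr_initial_population(rows, labels, pop_size):
    """Hypothetical seeding rule: individual i selects the top-k ranked
    features, where k cycles through 1..n_features."""
    ranking = rank_features(rows, labels)
    n_features = len(ranking)
    population = []
    for i in range(pop_size):
        k = 1 + i % n_features
        chosen = set(ranking[:k])
        population.append([1 if j in chosen else 0
                           for j in range(n_features)])
    return population


# Toy dataset: feature 0 determines the label, features 1 and 2 do not.
rows = [(0, 0, 1), (0, 1, 0), (1, 0, 1), (1, 1, 0)]
labels = [0, 0, 1, 1]
population = igr_initial_population(rows, labels, pop_size=4)
```

Every individual in this toy population includes feature 0, which is the kind of informed starting point the abstract contrasts with random initialization.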
Combining hierarchical clustering approaches using the PCA method
Persian title (translated): Combining hierarchical clustering approaches using the PCA method (2019)
In expert systems, data mining methods are algorithms that simulate humans’ problem-solving capabilities. Clustering methods, as unsupervised machine learning methods, are crucial approaches for grouping similar samples into the same categories. Applying different clustering algorithms to a given dataset produces clusters of different quality. Hence, many researchers have applied clustering combination methods to reduce the risk of choosing an inappropriate clustering algorithm. In these methods, the outputs of several clustering algorithms are combined. In these research works, the input hierarchical clusterings are transformed to descriptor matrices and their combination is achieved by aggregating their descriptor matrices. In previous works, only element-wise aggregation operators have been used and the relation between the elements of each descriptor matrix has been ignored. However, the value of each element of the descriptor matrix is meaningful in comparison with its other elements. The current study proposes a novel method of combining hierarchical clustering approaches based on principal component analysis (PCA). PCA as an aggregator allows considering all elements of the descriptor matrices. In the proposed approach, basic clusters are made and transformed to descriptor matrices. Then, a final matrix is extracted from the descriptor matrices using PCA. Next, a final dendrogram is constructed from the matrix that is used to summarize the results of the diverse clustering. The experimental results on popular available datasets show the superiority of the clustering accuracy of the proposed method over basic clustering methods such as single, average and centroid linkage and previously combined hierarchical clustering methods. In addition, statistical tests show that the proposed method significantly outperformed hierarchical clustering combination methods with element-wise averaging operators in almost all tested datasets.
Several experiments have also been conducted which confirm the robustness of the proposed method for its parameter setting.
Keywords: Clustering | Hierarchical clustering | Principal component analysis | PCA
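The aggregation step can be sketched in a few lines, under stated assumptions: each descriptor matrix is flattened to a vector of pairwise dissimilarities, the vectors from the different base clusterings are treated as PCA variables, and the combined descriptor is the projection onto the first principal component. The abstract does not give these details, so the flattening, the use of only the first component, and the power-iteration eigen-solver are all illustrative choices.

```python
def first_principal_component(columns, iterations=200):
    """Leading eigenvector of the covariance matrix of the given
    variables, found by power iteration. Each entry of `columns` is one
    flattened descriptor matrix (one value per pair of samples)."""
    m, p = len(columns), len(columns[0])
    means = [sum(c) / p for c in columns]
    centered = [[x - mu for x in c] for c, mu in zip(columns, means)]
    cov = [[sum(a * b for a, b in zip(ci, cj)) / (p - 1) for cj in centered]
           for ci in centered]
    vec = [1.0 / m ** 0.5] * m
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(m)) for i in range(m)]
        norm = sum(x * x for x in nxt) ** 0.5
        if norm == 0:
            break
        vec = [x / norm for x in nxt]
    # Fix the sign so that the loadings sum to a non-negative value.
    if sum(vec) < 0:
        vec = [-x for x in vec]
    return vec


def combine_descriptors(columns):
    """Project every pair's descriptor values onto the first component,
    giving one combined dissimilarity per pair of samples."""
    weights = first_principal_component(columns)
    p = len(columns[0])
    return [sum(w * c[k] for w, c in zip(weights, columns)) for k in range(p)]


# Hypothetical flattened descriptors from three base clusterings
# (four sample pairs each).
descriptors = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],
    [1.0, 2.0, 3.0, 4.0],
]
combined = combine_descriptors(descriptors)
```

A final dendrogram would then be built by running a hierarchical clustering on the combined dissimilarities. Unlike element-wise averaging, the component weights here depend on how the descriptors co-vary across all pairs.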
Double Q-PID algorithm for mobile robot control
Persian title (translated): Double Q-PID algorithm for mobile robot control (2019)
Many expert systems have been developed for self-adaptive PID controllers of mobile robots. However, the high computational requirements of the expert system layers, developed for the tuning of the PID controllers, still require previous expert knowledge and high efficiency in algorithmic and software execution for real-time applications. To address these problems, in this paper we propose an expert agent-based system, based on a reinforcement learning agent, for self-adapting multiple low-level PID controllers in mobile robots. For the formulation of the artificial expert agent, we develop an incremental model-free version of the double Q-learning algorithm for fast on-line adaptation of multiple low-level PID controllers. Fast learning and high on-line adaptability of the artificial expert agent are achieved by means of a proposed incremental active-learning exploration-exploitation procedure, for a non-uniform state space exploration, along with an experience replay mechanism for multiple value function updates in the double Q-learning algorithm. A comprehensive comparative simulation study and experiments on a real mobile robot demonstrate the high performance of the proposed algorithm for real-time simultaneous tuning of multiple adaptive low-level PID controllers of mobile robots in real-world conditions.
Keywords: Reinforcement learning | Double Q-learning | Incremental learning | Double Q-PID | Mobile robots | Multi-platforms
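For readers unfamiliar with double Q-learning, the sketch below shows the standard tabular update on which the approach builds. It is not the authors' incremental variant: the state and action encoding, the learning rate, and the idea that actions would correspond to adjustments of PID gains are assumptions made only for illustration.

```python
from collections import defaultdict


def double_q_update(q_a, q_b, state, action, reward, next_state, actions,
                    update_a, alpha=0.1, gamma=0.9):
    """One tabular double Q-learning step.

    The table being updated selects the greedy next action, while the
    other table evaluates it, which reduces the overestimation bias of
    ordinary Q-learning. `update_a` chooses which table learns on this
    step (normally decided by a fair coin flip).
    """
    learner, evaluator = (q_a, q_b) if update_a else (q_b, q_a)
    best_next = max(actions, key=lambda a: learner[(next_state, a)])
    target = reward + gamma * evaluator[(next_state, best_next)]
    learner[(state, action)] += alpha * (target - learner[(state, action)])


# Hypothetical action set: nudge a proportional gain up or down.
actions = ["kp_up", "kp_down"]
q_a = defaultdict(float)
q_b = defaultdict(float)
q_a[("s1", "kp_up")] = 5.0   # table A prefers "kp_up" in state s1
q_b[("s1", "kp_up")] = 2.0   # table B supplies the value estimate
double_q_update(q_a, q_b, "s0", "kp_up", reward=1.0, next_state="s1",
                actions=actions, update_a=True, alpha=0.5, gamma=0.9)
```

In a full agent, each step would flip a coin to set `update_a`, and the acting policy would typically be derived from the sum or average of the two tables.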
Analytical games for knowledge engineering of expert systems in support to Situational Awareness: The Reliability Game case study
Persian title (translated): Analytical games for knowledge engineering of expert systems in support of Situational Awareness: The Reliability Game case study (2019)
Knowledge Acquisition (KA) methods are of paramount importance in the design of intelligent systems. Research is ongoing to improve their effectiveness and efficiency. Analytical games appear to be a promising tool to support KA. In this paper we describe how analytical games could be used for Knowledge Engineering of Bayesian networks, through the presentation of the case study of the Reliability Game. This game has been developed with the aim of collecting data on the impact of meta-knowledge about sources of information upon human Situational Assessment in a maritime context. We describe the computational model obtained from the dataset and how the card positions, which reflect a player's belief, can be easily converted into subjective probabilities and used to learn latent constructs, such as the source reliability, by applying the Expectation-Maximisation algorithm.
Keywords: Source reliability | Expert knowledge | Knowledge acquisition | Bayesian networks | Parameter learning | Analytical game
A systematic survey of computer-aided diagnosis in medicine: Past and present developments
Persian title (translated): A systematic survey of computer-aided diagnosis in medicine: Past and present developments (2019)
Computer-aided diagnosis (CAD) in medicine is the result of a large amount of effort expended in the interface of medicine and computer science. As some CAD systems in medicine try to emulate the diagnostic decision-making process of medical experts, they can be considered as expert systems in medicine. Furthermore, CAD systems in medicine may process clinical data that can be complex and/or massive in size. They do so in order to infer new knowledge from data and use that knowledge to improve their diagnostic performance over time. Therefore, such systems can also be viewed as intelligent systems because they use a feedback mechanism to improve their performance over time. The main aim of the literature survey described in this paper is to provide a comprehensive overview of past and current CAD developments. This survey/review can be of significant value to researchers and professionals in medicine and computer science. There are already some reviews about specific aspects of CAD in medicine. However, this paper focuses on the entire spectrum of the capabilities of CAD systems in medicine. It also identifies the key developments that have led to today’s state-of-the-art in this area. It presents an extensive and systematic literature review of CAD in medicine, based on 251 carefully selected publications. While medicine and computer science have advanced dramatically in recent years, each area has also become profoundly more complex. This paper advocates that in order to further develop and improve CAD, it is required to have well-coordinated work among researchers and professionals in these two constituent fields. Finally, this survey helps to highlight areas where there are opportunities to make significant new contributions. This may profoundly impact future research in medicine and in select areas of computer science.
Keywords: Computer-aided diagnosis | Computer-aided detection | Expert and intelligent systems | Computerized signal analysis | Segmentation | Classification
Globally-biased BIRECT algorithm with local accelerators for expensive global optimization
Persian title (translated): Globally-biased BIRECT algorithm with local accelerators for expensive global optimization (2019)
In this paper, a black-box global optimization problem with expensive function evaluations is considered. This problem is challenging for numerical methods due to the practical limits on computational budget often required by intelligent systems. For its efficient solution, a new DIRECT-type hybrid technique is proposed. The new algorithm incorporates a novel sampling on diagonals and bisection strategy (instead of the trisection commonly used in existing DIRECT-type algorithms), embedded into the globally-biased framework, and enriched with three different local minimization strategies. The numerical results on a test set of almost 900 problems from the literature and on a real-life application regarding nonlinear regression show that the new approach effectively addresses well-known DIRECT weaknesses, has beneficial effects on the overall performance, and, on average, gives significantly better results compared to several DIRECT-type methods widely used in decision-making expert systems.
Keywords: Nonlinear global optimization | DIRECT-type algorithms | BIRECT algorithm | hybrid optimization algorithms | nonlinear regression
Analytic network process: Academic insights and perspectives analysis
Persian title (translated): Analytic network process: Academic insights and perspectives analysis (2019)
Diverse multi-criteria decision-making methods have been developed to address different complex decision-making problems, and the analytic network process has been found to be one of the most effective techniques. There is an increase in the quality and quantity of publications related to the analytic network process. This detailed overview describes the research status and development characteristics of analytic network process research and will be useful to researchers for future research directions. To achieve these goals, bibliometric techniques were used. In addition, past and present hotspots of analytic network process research were summarized, and future research trends were determined. The bibliometric analysis was carried out from various aspects including countries and regions, institutions, journals, authors, research areas, articles and author keywords, based on data harvested from the Web of Science database. There were 1485 analytic network process-related publications retrieved from the Web of Science. The results show that Expert Systems with Applications was the most productive journal publishing articles in analytic network process research (118); its number of publications has decreased dramatically since 2013, while Journal of Cleaner Production has shown an upward trend in recent years and ranks second with 47 publications. The most collaborative country is the United States. Taiwan takes a leading position in analytic network process research with 436 publications (29.36%), and National Chiao Tung University, which is located in Taiwan, produced the most articles and has gained the highest h-index (28). The major hot topics that employ the analytic network process are sustainability, environmental management and supply chain management. These topics may continue to attract more attention in the future.
Keywords: Analytic Network Process | Web of science | Bibliometrics | Hot topics | Sustainability | Environmental management | Supply chain management
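Two of the bibliometric quantities quoted in the abstract are simple to compute, and a short sketch may help readers reproduce them. The h-index function follows the standard definition (the largest h such that h publications have at least h citations each); the citation list in the example is made up, while the publication counts 436 and 1485 are taken from the abstract.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


def share_percent(part, total):
    """Percentage share of `part` in `total`, rounded to two decimals."""
    return round(100.0 * part / total, 2)


# Hypothetical citation counts for five papers.
example_h = h_index([10, 8, 5, 4, 3])

# Taiwan's share of the retrieved publications, using the abstract's counts.
taiwan_share = share_percent(436, 1485)
```

With the counts from the abstract, `taiwan_share` matches the reported 29.36%.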
Supervisory control strategies evaluated on a pilot Jameson flotation cell
Persian title (translated): Supervisory control strategies evaluated on a pilot Jameson flotation cell (2019)
An L-150 pilot Jameson flotation cell was instrumented and a distributed control system was developed. The parameters of a metallurgic phenomenological model were estimated from industrial data. A steady state simulator was built based on this nonlinear model. This hybrid system combines on-line measured operating variables with virtual variables, characterizing the feed. All these variables are fed on-line to a simulator to predict the characteristics of the concentrate and tailings. The expert system modifies the set points of the distributed control system, including two routines: expert feedback and feed forward control. Several cases for different feed conditions are discussed.
Keywords: Control | Flotation | Expert systems | Supervision | Jameson cell