Highway crash detection and risk estimation using deep learning
Highway crash detection and risk estimation using deep learning - 2020
Crash detection is essential for providing timely information to traffic management centers and the public to reduce the adverse effects of crashes. Prediction of crash risk is vital for avoiding secondary crashes and safeguarding highway traffic. For many years, researchers have explored several techniques for early and precise detection of crashes to aid in traffic incident management. With recent advancements in data collection techniques, abundant real-time traffic data is available for use. Big data infrastructure and machine learning algorithms can utilize this data to provide suitable solutions for the highway traffic safety system. This paper explores the feasibility of using deep learning models to detect crash occurrence and predict crash risk. Volume, speed, and sensor occupancy data collected from roadside radar sensors along Interstate 235 in Des Moines, IA, are used for this study. This real-world traffic data is used to design the feature set for the deep learning models for crash detection and crash risk prediction. The results show that a deep model has better crash detection performance than state-of-the-art shallow models and similar crash prediction performance. Additionally, a sensitivity analysis was conducted for crash risk prediction using data from 1 minute, 5 minutes, and 10 minutes prior to crash occurrence. It was observed that it is hard to predict the crash risk of a traffic condition 10 minutes prior to a crash.
Keywords: Crash detection | Crash prediction | Deep learning
Scientific Authors in a Changing World of Scholarly Communication: What Does the Future Hold?
Scientific authors in a changing world of scholarly communication: What does the future hold? - 2020
Scholarly communication in science, technology, and medicine has been organized around journal-based scientific publishing for the past 350 years. Scientific publishing has unique business models and includes stakeholders with conflicting interests—publishers, funders, libraries, and scholars who create, curate, and consume the literature. Massive growth and change in scholarly communication, coinciding with digitalization, have amplified stresses inherent in traditional scientific publishing, as evidenced by overwhelmed editors and reviewers, increased retraction rates, emergence of pseudo-journals, strained library budgets, and debates about the metrics of academic recognition for scholarly achievements. Simultaneously, several open access models are gaining traction and online technologies offer opportunities to augment traditional tasks of scientific publishing, develop integrated discovery services, and establish global and equitable scholarly communication through crowdsourcing, software development, big data management, and machine learning. These rapidly evolving developments raise financial, legal, and ethical dilemmas that require solutions, while successful strategies are difficult to predict. Key challenges and trends are reviewed from the authors’ perspective about how to engage the scholarly community in this multifaceted process.
KEYWORDS: Open access | Peer review | Predatory publishing | Preprint repository | Self-archiving
Automatic bad channel detection in implantable brain-computer interfaces using multimodal features based on local field potentials and spike signals
Automatic bad channel detection in implantable brain-computer interfaces using multimodal features based on local field potentials and spike signals - 2020
“Bad channels” in implantable multi-channel recordings complicate the precise quantitative description and analysis of neural signals, especially in the current “big data” era. In this paper, we combine multimodal features based on local field potentials (LFPs) and spike signals to detect bad channels automatically using machine learning. On the basis of 2632 pairs of LFP and spike recordings acquired from five pigeons, 12 multimodal features are used to quantify each channel’s temporal, frequency, phase, and firing-rate properties. We implement seven classifiers in the detection tasks, in which the synthetic minority oversampling technique (SMOTE) and Fisher weighted Euclidean distance sorting (FWEDS) are used to cope with the class imbalance problem. The two-dimensional scatterplots and classification results demonstrate that correlation coefficient, phase locking value, and coherence have good discriminability. For the multimodal features, almost all the classifiers obtain high accuracy and a high bad channel detection rate after the SMOTE operation, with the Random Forests classifier showing relatively better overall performance (accuracy: 0.9092 ± 0.0081, precision: 0.9123 ± 0.0100, and recall: 0.9057 ± 0.0121). The proposed approach can automatically detect bad channels based on multimodal features, and the results provide a valuable reference for larger datasets.
Keywords: Bad channel | Multimodal feature | LFP | Spike | Machine learning
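The SMOTE step mentioned in the abstract above can be illustrated with a minimal sketch. This is the generic technique only (interpolating between a minority sample and one of its nearest minority neighbours), not the authors' pipeline; the function name `smote`, the parameter `k`, and the two-dimensional feature values are hypothetical and chosen purely for illustration.

```python
import math
import random


def smote(minority, n_synthetic, k=3, seed=0):
    """Create n_synthetic points by interpolating between minority samples
    and their k nearest minority-class neighbours (Euclidean distance)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by distance to the chosen one.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Place the new point at a random position on the connecting segment.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical feature vectors for four "bad" channels (made-up values).
bad_channels = [[0.10, 0.20], [0.15, 0.25], [0.20, 0.10], [0.05, 0.30]]
extra = smote(bad_channels, n_synthetic=6)
print(len(extra))
```

Because every synthetic point lies on a segment between two real minority samples, the oversampled class stays inside the region the original samples span, which is the main difference from simply duplicating samples.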
An empirical case study on Indian consumers sentiment towards electric vehicles: A big data analytics approach
An empirical case study on Indian consumers' sentiment towards electric vehicles: A big data analytics approach - 2020
Today, climate change due to global warming is a significant concern to all of us. India's rate of greenhouse gas emissions is increasing day by day, placing India among the top ten emitters in the world. Air pollution is one of the significant contributors to the greenhouse effect. Transportation contributes about 10% of the air pollution in India. The Indian government is taking steps to reduce air pollution by encouraging the use of electric vehicles, but success depends on consumers' sentiment towards, perception of, and understanding of electric vehicles (EVs). This case study tried to capture the feelings, attitudes, and emotions of Indian consumers towards electric vehicles. The main objective of this study was to extract opinions valuable to prospective buyers (to know what is best for them), marketers (for determining what features should be advertised), and manufacturers (for deciding what features should be improved) using deep learning techniques (e.g., the Doc2Vec algorithm, recurrent neural networks (RNNs), and convolutional neural networks (CNNs)). Due to the very nature of social media data, a big data platform was chosen to analyze the sentiment towards EVs. Deep learning based techniques were preferred over traditional machine learning algorithms (support vector machine, logistic regression, decision tree, etc.) due to their superior text mining capabilities. Two years of data (2016 to 2018) were collected from different social media platforms for this case study. The results showed the efficiency of deep learning algorithms and found that the CNN yielded better results than the others. The proposed optimal model will help consumers, designers, and manufacturers in their decisions to choose, design, and manufacture EVs.
Keywords: Electric vehicles | Deep learning | Big data | Sentiment analysis | India
Data-driven software defined network attack detection: State-of-the-art and perspectives
Data-driven software defined network attack detection: State of the art and perspectives - 2020
SDN (Software Defined Network) has emerged as a revolutionary technology in networking, and a substantial amount of research has been dedicated to the security of SDNs to support their various applications. The paper first analyzes the state of the art of SDN security from a data perspective. Then some typical network attack detection (NAD) methods are surveyed, including machine learning based methods and statistical methods. After that, a novel tensor based network attack detection method named tensor principal component analysis (TPCA) is proposed to detect attacks. After surveying the latest data-driven SDN frameworks, a tensor based big data-driven SDN attack detection framework is proposed for SDN security. Finally, a case study is presented to verify the effectiveness of the proposed framework.
Keywords: Network attack detection | Data-driven | Tensor | Network security | Software defined network (SDN)
A fully scalable big data framework for Botnet detection based on network traffic analysis
A fully scalable big data framework for Botnet detection based on network traffic analysis - 2020
Many traditional Botnet detection methods have trouble scaling up to meet the needs of multi-Gbps networks. This scalability challenge is not limited to bottlenecks in the detection process; it affects all individual components of the Botnet detection system, including data gathering, storage, feature extraction, and analysis. In this paper, we propose a fully scalable big data framework that enables scaling for each individual component of Botnet detection. Our framework can be used with any Botnet detection method, including statistical methods, machine learning methods, and graph-based methods. Our experimental results show that the proposed framework successfully scales in live tests on a real network with 5 Gbps of traffic throughput and 50 million IP address visits. In addition, our run time scales logarithmically with respect to the volume of the input; for example, when the scale of the input data is multiplied by 4×, the total run time increases by only 31%. This is a significant improvement compared to schemes such as Botcluster, in which run time increases by 86% under similar scale conditions.
Keywords: Botnet detection | Big data | Hadoop | Spark | Machine learning | Scalability
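To put the two run-time figures quoted in the abstract above on a common footing, one rough option is to ask what exponent a power law (run time proportional to input size raised to some power) would imply for each. The power-law model is an assumption made here for comparison only; the abstract itself describes the scaling as logarithmic, and the function name `scaling_exponent` is hypothetical.

```python
import math


def scaling_exponent(input_factor, runtime_factor):
    """Exponent a in runtime ~ input**a implied by one pair of measurements.

    An exponent of 1 would mean linear scaling; values below 1 are sub-linear.
    """
    return math.log(runtime_factor) / math.log(input_factor)


# 4x more input, +31% run time (proposed framework, per the abstract).
proposed = scaling_exponent(4, 1.31)
# 4x more input, +86% run time (Botcluster, per the abstract).
botcluster = scaling_exponent(4, 1.86)

print(f"proposed framework: {proposed:.2f}")  # 0.19
print(f"Botcluster:         {botcluster:.2f}")  # 0.45
```

Under this simplified reading both schemes are sub-linear, with the proposed framework's implied exponent less than half of Botcluster's.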
A hybrid deep learning model for efficient intrusion detection in big data environment
A hybrid deep learning model for efficient intrusion detection in big data environment - 2020
The volume of network and Internet traffic is expanding daily, with data being created at the zettabyte to petabyte scale at an exceptionally high rate. These can be characterized as big data, because they are large in volume, variety, velocity, and veracity. Security threats to networks, the Internet, websites, and organizations are growing alongside this growth in usage. Detecting intrusions in such a big data environment is difficult. Various intrusion-detection systems (IDSs) using artificial intelligence or machine learning have been proposed for different types of network attacks, but most of these systems either cannot recognize unknown attacks or cannot respond to such attacks in real time. Deep learning models, recently applied to large-scale big data analysis, have shown remarkable performance in general but have not been examined for detection of intrusions in a big data environment. This paper proposes a hybrid deep learning model to efficiently detect network intrusions based on a convolutional neural network (CNN) and a weight-dropped, long short-term memory (WDLSTM) network. We use the deep CNN to extract meaningful features from IDS big data and WDLSTM to retain long-term dependencies among extracted features to prevent overfitting on recurrent connections. The proposed hybrid method was compared with traditional approaches in terms of performance on a publicly available dataset, demonstrating its satisfactory performance.
Keywords: Big data | Intrusion detection | Deep learning | Convolution neural network | Weight-dropped long short-term memory network
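As the term is usually used, "weight-dropped" refers to DropConnect applied to an LSTM's hidden-to-hidden weight matrix: individual recurrent weights, rather than activations, are randomly zeroed during training. The sketch below shows only that masking step on a plain nested list. It is not the paper's model; the function name `drop_recurrent_weights`, the rescaling of surviving weights by 1/(1-p), and the example matrix are assumptions made for illustration.

```python
import random


def drop_recurrent_weights(w_hh, p, rng):
    """Return a copy of the hidden-to-hidden matrix with each weight zeroed
    with probability p; survivors are scaled by 1/(1-p) (inverted dropout)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    scale = 1.0 / (1.0 - p)
    return [[0.0 if rng.random() < p else w * scale for w in row]
            for row in w_hh]


rng = random.Random(0)
# Hypothetical 3x3 recurrent weight matrix (made-up values).
w_hh = [[0.5, -0.2, 0.1],
        [0.3, 0.8, -0.4],
        [-0.6, 0.2, 0.7]]
dropped = drop_recurrent_weights(w_hh, p=0.5, rng=rng)
for row in dropped:
    print(row)
```

In a training loop a fresh mask would be drawn for each forward pass and the unmodified matrix used at inference time.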
Data imbalance in classification: Experimental evaluation
Data imbalance in classification: Experimental evaluation - 2020
The advent of Big Data has ushered in a new era of scientific breakthroughs. One of the common issues that affects raw data is the class imbalance problem, which refers to an imbalanced distribution of values of the response variable. This issue is present in fraud detection, network intrusion detection, medical diagnostics, and a number of other fields where negatively labeled instances significantly outnumber positively labeled instances. Modern machine learning techniques struggle to deal with imbalanced data because they focus on minimizing the error rate for the majority class while ignoring the minority class. The goal of our paper is to demonstrate the effects of class imbalance on classification models. Concretely, we study the impact of varying class imbalance ratios on classifier accuracy. By highlighting the precise nature of the relationship between the degree of class imbalance and the corresponding effects on classifier performance, we hope to help researchers better tackle the problem. To this end, we carry out extensive experiments using 10-fold cross validation on a large number of datasets. In particular, we determine that the relationship between the class imbalance ratio and the accuracy is convex.
Keywords: Classification | Class imbalance | Data analysis | Machine learning | Statistical analysis | Supervised learning
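The point above about minimizing majority-class error while ignoring the minority class can be made concrete with a small worked calculation. The sketch evaluates a trivial classifier that always predicts the negative class at several imbalance ratios; the ratios, counts, and function name `metrics` are illustrative assumptions, not data from the paper.

```python
def metrics(tp, fn, tn, fp):
    """Accuracy, minority (positive) recall, and balanced accuracy
    from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    recall = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return accuracy, recall, (recall + specificity) / 2


results = {}
for ratio in (1, 9, 99):
    n_pos, n_neg = 100, 100 * ratio
    # "Always negative" classifier: every positive is missed,
    # every negative is correct.
    results[ratio] = metrics(tp=0, fn=n_pos, tn=n_neg, fp=0)
    acc, rec, bal = results[ratio]
    print(f"{ratio}:1  accuracy={acc:.2f}  recall={rec:.2f}  balanced={bal:.2f}")
```

Accuracy climbs from 0.50 to 0.99 as the imbalance grows even though no positive instance is ever detected, which is why minority recall or balanced accuracy is often reported alongside it.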
Graph Deconvolutional Networks
Graph Deconvolutional Networks - 2020
Graphs and networks are very common data structures for modelling complex systems that are composed of a number of nodes and topologies, such as social networks, citation networks, and biological protein-protein interaction networks. In recent years, machine learning has become an efficient technique for obtaining graph representations for downstream graph analysis tasks, including node classification, link prediction, and community detection. Unlike traditional graph analytical models, representation learning on graphs tries to learn low-dimensional embeddings by means of machine learning models that can be trained in supervised, unsupervised, or semi-supervised manners. Compared with traditional approaches that directly use input node attributes, these embeddings are much more informative and helpful for graph analysis. A number of models have been developed in this respect, which differ in how they measure the similarity of vertices in both the original space and the feature space. In order to learn more efficient node representations with better generalization properties, we propose a task-independent graph representation model, called the graph deconvolutional network (GDN), and a corresponding unsupervised learning algorithm in this paper. Unlike the graph convolutional network (GCN), which produces embeddings by convolving input attribute vectors with learned filters, the embeddings of the proposed GDN model are meant to be convolved with filters so as to reconstruct the input node attribute vectors as closely as possible. The embeddings and filters are alternately optimized in the learning procedure. The correctness of the proposed GDN model is verified by multiple tasks over several datasets. The experimental results show that the GDN model outperforms existing alternatives by a large margin.
Keywords: graph representation | representation learning | unsupervised learning | node embedding | machine learning
The digital surgeon: How big data, automation, and artificial intelligence will change surgical practice
The digital surgeon: How big data, automation, and artificial intelligence will change surgical practice - 2020
Exponential growth in computing power, data storage, and sensing technology has led to a world in which we can both capture and analyze incredible amounts of data. The evolution of machine learning has further advanced the ability of computers to develop insights from massive data sets that are beyond the capacity of human analysis. The convergence of computational power, data storage, connectivity, and Artificial Intelligence (AI) has led to health technologies that, to date, have focused on diagnostic areas such as radiology and pathology. The question remains how the digital revolution will translate in the realm of surgery. There are three main areas where the authors believe that AI could impact surgery in the near future: enhancement of training modalities, cognitive enhancement of the surgeon, and procedural automation. While the promise of Big Data, AI, and Automation is high, there have been unanticipated missteps in the use of such technologies that are worth considering as we evaluate how such technologies could/should be adopted in surgical practice. Surgeons must be prepared to adopt smarter training modalities, supervise the learning of machines that can enhance cognitive function, and ultimately oversee autonomous surgery without allowing for a decay in the surgeon’s operating skills.
Keywords: Future pediatric surgery | Automation and artificial intelligence in pediatric surgery