Dynamic occupant density models of commercial buildings for urban energy simulation
Dynamic occupancy density models of commercial buildings for urban energy simulation - 2020
The number of occupants and how it changes over time are key inputs for building and urban energy simulation. However, the commonly used simplification of a fixed occupancy schedule does not reflect the complicated reality, leading to significant errors in energy simulation. Therefore, dynamic occupant density models that describe the real-world situation more accurately should be developed. This paper presents a methodology to develop such a model for commercial buildings and expand it from the building level to the urban level. First, a total of 2275 commercial buildings in Nanjing, a major city in China, are identified and classified into three sub-categories using Points of Interest and logistic regression. Then field measurements are conducted to obtain the hourly occupant density for 12 sample commercial buildings. The building-level dynamic occupant density model is developed by fitting normal distribution functions to the measured data. Finally, two urban parameters, transportation accessibility and population level, are defined and used to expand the building-level occupant density model to the urban level. The dynamic urban-level occupant density model is verified for all three sub-categories of commercial buildings, and the overall results are acceptable.
Keywords: Big data | Commercial buildings | Urban-level | Dynamic occupant density models
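The curve-fitting step in the abstract above can be illustrated with a minimal sketch. It assumes a single bell-shaped daily profile and fits it by moment matching, which is only one simple way to fit a normal curve and is not necessarily the procedure used in the paper. The function names, the opening hours, and the synthetic densities are all hypothetical.

```python
import math

def fit_normal_profile(hours, densities):
    """Fit density(h) ~ peak * exp(-(h - mu)^2 / (2 * sigma^2)) by moment matching.

    The hourly densities are treated as weights on the hours, so mu and sigma
    are the weighted mean and standard deviation. Hourly spacing is assumed.
    """
    total = sum(densities)
    if total <= 0:
        raise ValueError("densities must contain a positive value")
    mu = sum(h * d for h, d in zip(hours, densities)) / total
    variance = sum(d * (h - mu) ** 2 for h, d in zip(hours, densities)) / total
    sigma = math.sqrt(variance)
    # Choose the peak so the area under the curve equals the observed total.
    peak = total / (sigma * math.sqrt(2.0 * math.pi))
    return mu, sigma, peak

def normal_profile(hour, mu, sigma, peak):
    """Evaluate the fitted bell curve at a given hour."""
    return peak * math.exp(-((hour - mu) ** 2) / (2.0 * sigma ** 2))

# Synthetic example (not measured data): a profile peaking at 15:00,
# sampled hourly over hypothetical opening hours 08:00 to 22:00.
hours = list(range(8, 23))
observed = [normal_profile(h, 15.0, 2.5, 0.30) for h in hours]

mu, sigma, peak = fit_normal_profile(hours, observed)
print(f"mu={mu:.2f} h, sigma={sigma:.2f} h, peak={peak:.3f}")
```

Because the observation window cuts off the tails of the curve, the recovered spread comes out slightly narrower than the one used to generate the data. A least-squares fit, or a sum of several normal curves, would be the natural next step for profiles with more than one daily peak.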
Impact factors of the real-world fuel consumption rate of light duty vehicles in China
Factors affecting the real-world fuel consumption rate of light duty vehicles in China - 2020
Measuring the real-world fuel consumption of light duty vehicles can be challenging because little actual data is collected. In this paper, we use big data retrieved from records of the real-world fuel consumption of different brands of vehicles in different areas of China (n = 106,809 samples from 201 brands of vehicles and 34 cities) to build a real-world fuel consumption rate (RFCR) model that estimates fuel consumption given the driving conditions and identifies the main factors that affect actual fuel consumption. We find that the average deviation between actual fuel consumption and the fitted results of the RFCR model is 4.22%, which does not significantly differ from zero, and that the fuel consumption calculated by the RFCR model tends to be 1.40 L/100 km (about 25%) higher than the officially reported data. Furthermore, we find that annual average temperature and altitude significantly influence the fuel consumption rate. The results indicate that there is a discrepancy between the theoretical fuel consumption released by authorities and that observed in the real world, and that some green behaviors (choosing light duty vehicles, reducing the use of air conditioning, and changing to a manual transmission) can reduce the energy consumption of vehicles.
Keywords: Real-world fuel consumption rate | Energy consumption | Private passenger vehicles | Big data | China
Analysis of substance use and its outcomes by machine learning I: Childhood evaluation of liability to substance use disorder
Analysis of substance use and its outcomes with machine learning I: Childhood evaluation of liability to substance use disorder - 2020
Background: Substance use disorder (SUD) exacts enormous societal costs in the United States, and it is important to detect high-risk youths for prevention. Machine learning (ML) is a method for finding patterns in data and making predictions from them. We hypothesized that ML can identify the health, psychological, psychiatric, and contextual features that predict SUD, and that the identified features predict which high-risk individuals develop SUD. Method: Male (N=494) and female (N=206) participants and their informant parents were administered a battery of questionnaires across five waves of assessment conducted at 10–12, 12–14, 16, 19, and 22 years of age. Characteristics most strongly associated with SUD were identified using the random forest (RF) algorithm from approximately 1000 variables measured at each assessment. Next, the complement of features was validated, and the best models were selected for predicting SUD using seven ML algorithms. Lastly, the area under the receiver operating characteristic curve (AUROC) was used to evaluate the accuracy of detecting individuals who develop SUD+/- up to thirty years of age. Results: Approximately thirty variables strongly predict SUD. The predictors shift from psychological dysregulation and poor health behavior in late childhood to non-normative socialization in mid to late adolescence. In 10–12-year-old youths, the features predict SUD+/- with 74% accuracy, increasing to 86% at 22 years of age. The RF algorithm detects individuals between 10 and 22 years of age who develop SUD better than the other ML algorithms. Conclusion: These findings inform the items required for inclusion in instruments to accurately identify high-risk youths and young adults requiring SUD prevention.
Keywords: Substance use disorder | Machine learning | Substance abuse prevention | Big data | Screening addiction risk
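The AUROC measure named in the abstract above can be computed directly from its definition: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The sketch below is a plain pairwise implementation; the labels and risk scores are invented for illustration and are not data from the study.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    labels: 1 for individuals who develop the outcome, 0 otherwise.
    scores: model risk scores, higher meaning higher predicted risk.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(positives) * len(negatives))

# Hypothetical labels and risk scores for six individuals.
example_labels = [1, 1, 0, 0, 1, 0]
example_scores = [0.9, 0.4, 0.35, 0.8, 0.7, 0.1]
print(auroc(example_labels, example_scores))
```

The pairwise loop costs time proportional to the number of positive-negative pairs, which is fine for a sketch; a rank-based formulation gives the same value more efficiently on large samples.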
A novel intelligent option price forecasting and trading system by multiple kernel adaptive filters
An option price forecasting approach and trading system with multiple kernel adaptive filters - 2020
Derivatives such as options are complex financial instruments. The risk in option trading creates demand for trading support systems that help investors control and hedge their risk. The nonlinearity and non-stationarity of option dynamics are the main challenges of option price forecasting. To address the problem, this study develops a multi-kernel adaptive filter (MKAF) for online option trading. MKAF is an improved version of the adaptive filter that employs multiple kernels to enhance the richness of the nonlinear feature representation. The MKAF is a fully adaptive online algorithm. Its strength is that the weights of the kernels are determined simultaneously and optimally during the filter coefficient updates, so the weights do not need to be designed separately. Therefore, MKAF is good at tracking nonstationary nonlinear option dynamics. Moreover, to reduce the computation time of updating the filter and to prevent over-adaptation, the number of kernels is restricted by coherence-based sparsification, which constructs a dictionary and uses a coherence threshold to restrict the dictionary size. This study compared the new method with traditional ones and found that the performance improvement is significant and robust. In particular, the cumulative trading profits are substantially increased.
Keywords: Artificial intelligence | Adaptive filter | Multiple Kernel Machine | Big data analysis | Data mining | Financial forecasting
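The coherence-based sparsification described in the abstract above can be sketched with a much simpler filter than MKAF. The example below assumes a single Gaussian kernel and a normalized LMS update, so it does not reproduce the paper's multi-kernel weighting. The class name, the parameter values, and the synthetic sine-wave stream are all hypothetical.

```python
import math

def gaussian_kernel(x, y, width):
    """Unit-norm Gaussian kernel between two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * width ** 2))

class SparseKernelNLMS:
    """Online kernel filter whose dictionary grows only for 'novel' inputs."""

    def __init__(self, step=0.5, width=0.5, coherence=0.5, eps=1e-6):
        self.step = step            # learning rate of the normalized update
        self.width = width          # Gaussian kernel width
        self.coherence = coherence  # threshold that limits dictionary growth
        self.eps = eps              # regularizer to avoid division by zero
        self.dictionary = []        # stored input vectors
        self.weights = []           # one coefficient per dictionary entry

    def _kernel_vector(self, x):
        return [gaussian_kernel(c, x, self.width) for c in self.dictionary]

    def predict(self, x):
        return sum(w * k for w, k in zip(self.weights, self._kernel_vector(x)))

    def update(self, x, target):
        """Process one sample and return the prediction error before the update."""
        x = tuple(x)
        h = self._kernel_vector(x)
        # Coherence test: add x only if it is not too similar to any stored entry.
        if all(k <= self.coherence for k in h):
            self.dictionary.append(x)
            self.weights.append(0.0)
            h.append(1.0)  # kernel of x with itself
        error = target - sum(w * k for w, k in zip(self.weights, h))
        scale = self.step * error / (self.eps + sum(k * k for k in h))
        self.weights = [w + scale * k for w, k in zip(self.weights, h)]
        return error

# Synthetic stream (not option data): learn y = sin(x) from sequential samples.
model = SparseKernelNLMS()
errors = []
for t in range(400):
    x = ((t * 0.37) % 6.0,)
    errors.append(abs(model.update(x, math.sin(x[0]))))

early = sum(errors[:50]) / 50
late = sum(errors[-50:]) / 50
print(len(model.dictionary), early, late)
```

Raising the coherence threshold admits more dictionary entries and a finer fit at a higher cost per update; lowering it keeps the filter smaller and cheaper. This is the trade-off between computation time and adaptation that the abstract attributes to the sparsification step.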
Understanding the impact of business analytics on innovation
Understanding the impact of business analytics on innovation - 2020
Advances in Business Analytics in the era of Big Data have provided unprecedented opportunities for organizations to innovate. With insights gained from Business Analytics, companies are able to develop new or improved products/services. However, few studies have investigated the mechanism through which Business Analytics contributes to a firm’s innovation success. This research aims to address this gap by theoretically and empirically investigating the relationship between Business Analytics and innovation. To achieve this aim, absorptive capacity theory is used as a theoretical lens to inform the development of a research model. Absorptive capacity refers to a firm’s ability to recognize the value of new, external information, assimilate it and apply it to commercial ends. The research model covers the use of Business Analytics, environmental scanning, data-driven culture, innovation (new product newness and meaningfulness), and competitive advantage. The research model is tested through a questionnaire survey of 218 UK businesses. The results suggest that Business Analytics directly improves environmental scanning, which in turn helps to enhance a company’s innovation. Business Analytics also directly enhances data-driven culture, which in turn impacts environmental scanning. Data-driven culture plays another important role by moderating the effect of environmental scanning on new product meaningfulness. The findings demonstrate the positive impact of Business Analytics on innovation and the pivotal roles of environmental scanning and data-driven culture. Organizations wishing to realize the potential of Business Analytics thus need changes in both their external and internal focus.
Keywords: Analytics | Innovation | Big Data | Data-driven culture | Absorptive capacity
An analytic infrastructure for harvesting big data to enhance supply chain performance
An analytic infrastructure for harvesting big data to enhance supply chain performance - 2020
Big data has already received a tremendous amount of attention from managers in every industry, policy and decision makers in governments, and researchers in many different areas. However, current big data analytics have conspicuous limitations, especially when dealing with information silos. In this paper, we synthesise existing research on big data analytics and propose an integrated infrastructure for breaking down the information silos in order to enhance supply chain performance. The analytic infrastructure effectively leverages rich big data sources (e.g. databases, social media, mobile and sensor data) and quantifies the related information using various big data analytics. The information generated can be used to identify a required competence set (a collection of skills and knowledge used for specific problem solving) and to provide roadmaps to firms and managers for generating actionable supply chain strategies, facilitating collaboration between departments, and making fact-based operational decisions. We showcase the usefulness of the analytic infrastructure by conducting a case study in a world-leading company that produces sports equipment. The results indicate that it enabled managers: (a) to integrate information silos in big data analytics to serve as inputs for new product ideas; (b) to capture and interrelate different competence sets to provide an integrated perspective of the firm’s operations capabilities; and (c) to generate a visual decision path that facilitated decision making regarding how to expand competence sets to support new product development.
Keywords: Decision support systems | Big data | Analytic infrastructure | Competence set | Deduction graph
Attacking and defending multiple valuable secrets in a big data world
Attacking and defending multiple valuable secrets in a big data world - 2020
This paper studies the attack-and-defence game between a web user and a whole set of players over this user’s ‘valuable secrets.’ The number and type of these valuable secrets are the user’s private information. Attempts to tap information as well as privacy protection are costly. The multiplicity of secrets is of strategic value for the holders of these secrets. Users with few secrets keep their secrets private with some probability, even though they do not protect them. Users with many secrets protect their secrets at a cost that is smaller than the value of the secrets protected. The analysis also accounts for multiple redundant information channels with cost asymmetries, relating the analysis to attack-and-defence games with a weakest link.
Keywords: Big-data | Privacy | Conflict | Valuable secrets | Attack-and-defence
Aggregation of inputs and outputs prior to Data Envelopment Analysis under big data
Aggregation of inputs and outputs prior to Data Envelopment Analysis under big data - 2020
The main goal of this paper is to explore possible solutions to a ‘big data’ problem related to the very large dimensions of input–output data. In particular, we focus on cases of a severe ‘curse of dimensionality’ problem that require dimension reduction prior to using Data Envelopment Analysis. To achieve this goal, we present some theoretical grounds and perform a simulation study, new to the literature, in which we explore price-based aggregation as a solution to the problem of very large dimensions.
Keywords: Data Envelopment Analysis | Productivity | Efficiency | Big data
A new approach for identifying the Kemeny median ranking
A new method for identifying the Kemeny median ranking - 2020
Condorcet consistent rules were originally developed for preference aggregation in the theory of social choice. Nowadays these rules are applied in a variety of fields such as discrete multi-criteria analysis, defence and security decision support, composite indicators, machine learning, artificial intelligence, queries in databases or internet multiple search engines, and theoretical computer science. The cycle issue, also known as Condorcet's paradox, is the most serious problem inherent in this type of rule. Solutions for dealing with the cycle issue properly already exist in the literature, the most important one being the identification of the median ranking, often called the Kemeny ranking. Unfortunately, its identification is an NP-hard problem. This article has three main objectives: (1) to clarify that the Kemeny median order has to be framed in the context of Condorcet consistent rules, which is important since in current practice even the Borda count is sometimes used as a proxy for the Kemeny ranking; (2) to present a new exact algorithm, which identifies the Kemeny median ranking while providing a searching time guarantee; and (3) to present a new heuristic algorithm that identifies the Kemeny median ranking with an optimal trade-off between convergence and approximation.
Keywords: Decision analysis | Combinatorial optimisation | Social choice | Multiple criteria | Artificial intelligence | Defence and security | Big data
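To make the Kemeny median ranking in the abstract above concrete, the sketch below finds it by exhaustive search: it scores every possible ranking by its total Kendall tau distance to the ballots and keeps the minimizers. This is a brute-force illustration that is feasible only for a handful of candidates. It is not the exact or heuristic algorithm proposed in the article, and the ballots are invented.

```python
from itertools import combinations, permutations

def kendall_tau_distance(ranking_a, ranking_b):
    """Number of candidate pairs that the two rankings order differently."""
    pos_a = {c: i for i, c in enumerate(ranking_a)}
    pos_b = {c: i for i, c in enumerate(ranking_b)}
    return sum(
        1
        for x, y in combinations(ranking_a, 2)
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) < 0
    )

def kemeny_rankings(ballots):
    """Return (minimum total distance, list of all rankings achieving it).

    Every ballot must be a complete ranking of the same candidates.
    The search visits all n! rankings, so it only suits small n.
    """
    best_cost, best = None, []
    for candidate_ranking in permutations(ballots[0]):
        cost = sum(kendall_tau_distance(candidate_ranking, b) for b in ballots)
        if best_cost is None or cost < best_cost:
            best_cost, best = cost, [candidate_ranking]
        elif cost == best_cost:
            best.append(candidate_ranking)
    return best_cost, best

# A Condorcet cycle: each candidate beats one rival and loses to another.
cycle_ballots = [("A", "B", "C"), ("B", "C", "A"), ("C", "A", "B")]
cycle_cost, cycle_best = kemeny_rankings(cycle_ballots)
print(cycle_cost, cycle_best)

# A profile with a clear majority view.
clear_ballots = [("A", "B", "C"), ("A", "B", "C"), ("A", "C", "B")]
clear_cost, clear_best = kemeny_rankings(clear_ballots)
print(clear_cost, clear_best)
```

In the cyclic profile the three rotations of the cycle tie for the minimum, which shows that the median ranking need not be unique. The factorial growth of the search space is the practical face of the NP-hardness mentioned in the abstract.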
Forecasting crude oil price with multilingual search engine data
Forecasting crude oil price with multilingual search engine data - 2020
In the big data era, search engine data (SED) has presented new opportunities for improving crude oil price prediction; however, existing research has been confined to single-language (mostly English) search keywords in SED collection. To address this language bias and capture worldwide investor attention, this study proposes a novel multilingual SED-driven forecasting methodology from a global perspective. The proposed methodology includes three main steps: (1) multilingual index construction, based on multilingual SED; (2) relationship investigation, between the multilingual index and crude oil price; and (3) oil price prediction, with the multilingual index as an informative predictor. Using the WTI spot price as the study sample, the empirical results indicate that SED has strong predictive power for crude oil price; moreover, multilingual SED performs statistically better than single-language SED in terms of prediction accuracy and model robustness.
Keywords: Big data | Multilingual search engine index | Crude oil price forecasting | Google Trends | Artificial intelligence