Managing complex engineering projects: What can we learn from the evolving digital footprint?
(2020)
The challenges of managing large, complex engineering projects, such as those involving the design of infrastructure, aerospace and industrial systems, are widely acknowledged. While there exists a mature set of project management tools and methods, many of today's projects overrun in terms of both time and cost. Existing literature attributes these overruns to factors such as unforeseen dependencies, a lack of understanding, late changes, poor communication, limited resource availability (including personnel), incomplete data, and aspects of culture and planning. Fundamental to overcoming these factors is the challenge of how management information relating to them can be provided, and done so in a cost-effective manner. Motivated by this challenge, recent research has demonstrated how management information can be automatically generated from the evolving digital footprint of an engineering project, which encompasses a broad range of data types and sources. In contrast to existing work that reports the generation, verification and application of methods for generating management information, this paper reviews all the reported methods to appraise the scope of management information that can be automatically generated from the digital footprint. In so doing, the paper presents a reference model for the generation of managerial information from the digital footprint, an appraisal of 27 methods, and a critical reflection on the scope and generalisability of data-driven project management methods. Key findings from the appraisal include the role of email in providing insights into potential issues, the role of computer models in automatically eliciting process and product dependencies, and the role of project documentation in assessing project norms.
The critical reflection also raises issues such as privacy, highlights the enabling technologies, and presents opportunities for new Business Intelligence tools that are based on real-time monitoring and analysis of digital footprints.
Keywords: Big Data | Project Management | Business Intelligence | Knowledge Workers
Big data analytics for financial market volatility forecasting based on support vector machines
(2020)
High-frequency data provides rich material and broad prospects for in-depth research into financial market behaviour, but the problems solved so far are far fewer than those still faced, and the research value of high-frequency data will be greatly reduced if these problems remain unsolved. Volatility is an important measure of market risk, and research on and forecasting of the volatility of high-frequency data is of great significance to investors, government regulators and capital markets. To this end, by modelling the jump volatility of high-frequency data, the short-term volatility of high-frequency data is predicted.
Keywords: Big data | Financial market | Volatility | Support vector machine
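The abstract above does not give implementation details, but the core idea of forecasting short-term realized volatility from high-frequency returns with a support vector machine can be sketched as follows. The data, the lag structure, and the SVR hyperparameters are all illustrative assumptions, not the paper's actual setup.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(0)
# Hypothetical 5-minute returns for 60 trading days (48 intervals per day).
returns = rng.normal(0, 0.001, size=(60, 48))
# Daily realized volatility: square root of the summed squared intraday returns.
rv = np.sqrt((returns ** 2).sum(axis=1))

# Lag features: predict a day's RV from the previous 5 days' RV values.
lags = 5
X = np.column_stack([rv[i:len(rv) - lags + i] for i in range(lags)])
y = rv[lags:]

# Support vector regression with an RBF kernel on standardized features.
model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=1.0, epsilon=0.0005))
model.fit(X[:-10], y[:-10])     # train on all but the last 10 days
pred = model.predict(X[-10:])   # one-step-ahead forecasts for the held-out days
print(pred.shape)
```

A real study would replace the synthetic returns with tick or intraday market data and tune `C`, `epsilon`, and the kernel by cross-validation.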
Text mining of Industry 4.0 job advertisements
(2020)
Since changes in job characteristics in areas such as Industry 4.0 are rapid, a fast tool for the analysis of job advertisements is needed. Current knowledge about the competencies required in Industry 4.0 is scarce. The goal of this paper is to develop a profile of Industry 4.0 job advertisements, using text mining on publicly available job advertisements, which are often used as a channel for collecting relevant information about the required knowledge and skills in rapidly changing industries. We searched a website that publishes job advertisements related to Industry 4.0, and performed text mining analysis on the data collected from those job advertisements. Analysis of the job advertisements revealed that most of them were for full-time entry, associate, and mid-senior level management positions and mainly came from the United States and Germany. Text mining analysis resulted in two groups of job profiles. The first group of job profiles was focused solely on knowledge related to Industry 4.0: cyber-physical systems and the Internet of things for robotized production, and smart production design and production control. The second group of job profiles was focused on more general knowledge areas, adapted to Industry 4.0: supply chain management, customer satisfaction, and enterprise software. Topic mining was conducted on the extracted phrases, generating various multidisciplinary job profiles. Higher educational institutions, human resources professionals, as well as experts who are already employed or aspire to be employed in Industry 4.0 organizations, would benefit from the results of our analysis.
Keywords: Human resource management | Text mining | Job profiles | Big data analytics | Industry 4.0 | Education | Smart factory
Big data analytics in health sector: Theoretical framework, techniques and prospects
(2020)
Clinicians, healthcare providers-suppliers, policy makers and patients are experiencing exciting opportunities in light of new information deriving from the analysis of big data sets, a capability that has emerged in the last decades. Due to the rapid increase of publications in the healthcare industry, we have conducted a structured review of healthcare big data analytics. With reference to resource-based view theory, we focus on how big data resources are utilised to create organizational value/capabilities, and through content analysis of the selected publications we discuss: the classification of big data types related to healthcare, the associated analysis techniques, the created value for stakeholders, the platforms and tools for handling big health data, and future aspects in the field. We present a number of pragmatic examples to show how the advances in healthcare were made possible. We believe that the findings of this review are stimulating and provide valuable information to practitioners, policy makers and researchers while presenting them with certain paths for future research.
Keywords: Big data analytics | Health-Medicine | Decision-making | Machine learning | Operations research (OR) techniques
Big Data Everywhere
(2020)
Big Data and machine-learning approaches to analytics are an important new frontier in laboratory medicine. Direct-to-consumer (DTC) testing raises specific challenges in applying these new tools of data analytics. Because DTC data are not centralized by default, there is a need for data repositories to aggregate these values to develop appropriate predictive models. The lack of a default linkage between DTC results and medical outcomes data also limits the ability to mine these data for predictive modeling of disease risk. Issues of standardization and harmonization, which are a significant concern across all of laboratory medicine, may be particularly difficult to correct in aggregated sets of DTC data.
KEYWORDS : Big Data | Laboratory medicine | Machine learning | Direct-to-consumer testing | DTC | Harmonization
A full-disk image standardization of the chromosphere observation at Huairou Solar Observing Station
(2020)
Observations of local features in the solar chromosphere began in 1992 at Huairou Solar Observing Station, while full-disk chromosphere observations have been carried out since 2000. To help researchers use the full-disk chromosphere observations, algorithms have been developed to standardize the full-disk images. The algorithms cover determination of the image center and size standardization, geometric correction, and intensity normalization. The solar limb of each image is determined from a histogram analysis of its intensity distribution. The center and radius are then calculated and the image is corrected for geometric distortions. Images are re-scaled to have a fixed radius of 500 pixels and centered within the 1024 × 1024 frame. Finally, large-scale variations in intensity, such as limb darkening, are removed using a median filter. This paper provides a detailed description of these algorithms, and a summary of the properties of these chromospheric full-disk observations to be used for further scientific investigations.
Keywords: Chromosphere | Data standardization | Physical parameters | Big data
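The standardization pipeline described above (limb detection, center/radius determination, re-scaling, limb-darkening removal) can be sketched on a synthetic disk image. The frame size, target radius, threshold rule, and filter width below are scaled-down illustrative assumptions; the actual pipeline works at 1024 × 1024 with a 500-pixel radius and its own histogram-based limb threshold.

```python
import numpy as np
from scipy import ndimage

# Toy full-disk frame standing in for a Huairou chromosphere image.
N = 256
yy, xx = np.mgrid[:N, :N]
r = np.hypot(xx - 140, yy - 120)                            # off-center disk
img = np.where(r < 80, 1.0 - 0.4 * (r / 80) ** 2, 0.02)     # with limb darkening

# 1) Limb detection: threshold between the sky and disk intensity peaks
#    (a simple midpoint stands in for the histogram-valley analysis).
thresh = 0.5 * (img.min() + img.max())
disk = img > thresh

# 2) Center from the centroid of the disk mask, radius from its area.
cy, cx = ndimage.center_of_mass(disk)
radius = np.sqrt(disk.sum() / np.pi)

# 3) Re-scale to a fixed radius and center the disk in the output frame.
target_radius, frame = 100, 256
scale = target_radius / radius
zoomed = ndimage.zoom(img, scale, order=1)
zy, zx = int(cy * scale), int(cx * scale)
half = frame // 2
crop = zoomed[zy - half:zy + half, zx - half:zx + half]  # disk assumed well inside

# 4) Remove large-scale intensity variation with a wide median filter.
flat = crop - ndimage.median_filter(crop, size=31)
print(crop.shape, round(radius))
```

The real pipeline additionally applies geometric-distortion correction between steps 2 and 3, which is omitted here.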
Guest satisfaction & dissatisfaction in luxury hotels: An application of big data
(2020)
In order to understand the pivotal attributes of luxury hotel service in Malaysia, this study analyses big data in the form of online reviews, as available on TripAdvisor. The content analysis, performed using word frequency analysis, revealed that the main themes of luxury hotel service quality include hotel-related attributes, room-related attributes, staff-related attributes, travel-related attributes, and possible outcomes. The critical incident technique has also been applied to examine the antecedents and outcomes of hotel guests' satisfaction and dissatisfaction. In this study, quality of rooms and interaction with employees have been identified as major drivers of customers' word of mouth and revisit intentions. This study contributes an empirical analysis of particular features of textual content, and discusses the concept of luxury service in developing countries, which has been largely neglected so far.
Keywords: Luxury hotel service | Online review | Service quality | Satisfaction | Dissatisfaction | Post-purchase behavior
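The word frequency analysis at the heart of the study above amounts to tokenizing the review texts, removing stop words, and counting terms. The three reviews below are invented stand-ins for the scraped TripAdvisor corpus.

```python
import re
from collections import Counter

# Hypothetical TripAdvisor-style reviews standing in for the real corpus.
reviews = [
    "The room was spacious and the staff were friendly and attentive.",
    "Great location but the room was noisy; staff handled our complaint well.",
    "Breakfast buffet was excellent, the pool area clean, staff helpful.",
]

# Minimal stop-word list; a real analysis would use a fuller lexicon.
stopwords = {"the", "was", "and", "were", "but", "our", "a"}
tokens = [
    w
    for review in reviews
    for w in re.findall(r"[a-z]+", review.lower())
    if w not in stopwords
]
freq = Counter(tokens)
print(freq.most_common(3))   # staff- and room-related terms surface as themes
```

On a full corpus, the highest-frequency terms would then be grouped into the hotel-, room-, staff-, and travel-related themes the paper reports.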
Big data and stream processing platforms for Industry 4.0: requirements mapping for a predictive maintenance use case
(2020)
Industry 4.0 is considered to be the fourth industrial revolution, introducing a new paradigm of digital, autonomous, and decentralized control for manufacturing systems. Two key objectives for Industry 4.0 applications are to guarantee maximum uptime throughout the production chain and to increase productivity while reducing production cost. As the data-driven economy evolves, enterprises have started to utilize big data techniques to achieve these objectives. Big data and IoT technologies are playing a pivotal role in building data-oriented applications such as predictive maintenance. In this paper, we use a systematic methodology to review the strengths and weaknesses of existing open-source technologies for big data and stream processing to establish their suitability for Industry 4.0 use cases. We identified a set of requirements for the two selected use cases of predictive maintenance in the areas of rail transportation and wind energy. We conducted a breadth-first mapping of predictive maintenance use-case requirements to the capabilities of big data streaming technologies, focusing on open-source tools. Based on our research, we propose some optimal combinations of open-source big data technologies for our selected use cases.
Keywords: Industry 4.0 | Big Data | Stream processing | Predictive maintenance | Railway | Wind turbines
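The paper above compares stream-processing platforms rather than giving an algorithm, but the kind of job such a platform would run for predictive maintenance can be sketched as a sliding-window anomaly detector over sensor readings. The function, the z-score rule, and the injected fault are all illustrative assumptions; a production deployment would run equivalent logic inside a framework such as Flink or Spark Streaming.

```python
import math
import random
from collections import deque

def stream_anomaly_alerts(readings, window=20, z_thresh=3.0):
    """Flag readings whose z-score against a sliding window of recent
    values exceeds a threshold -- a minimal stand-in for a streaming
    predictive maintenance job."""
    buf = deque(maxlen=window)
    alerts = []
    for i, x in enumerate(readings):
        if len(buf) == window:
            mean = sum(buf) / window
            var = sum((v - mean) ** 2 for v in buf) / window
            std = math.sqrt(var) or 1.0   # guard against a zero-variance window
            if abs(x - mean) / std > z_thresh:
                alerts.append(i)
        buf.append(x)
    return alerts

random.seed(1)
# Hypothetical vibration signal with one injected bearing-fault spike.
vibration = [random.gauss(0.5, 0.05) for _ in range(200)]
vibration[150] = 2.0
print(stream_anomaly_alerts(vibration))
```

The same per-window logic maps naturally onto the keyed, windowed operators that the surveyed streaming engines provide.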
A multi-scale method for forecasting oil price with multi-factor search engine data
(2020)
With the boom in big data, a promising idea has emerged: using search engine data to improve international oil price prediction, a hot topic in the fields of energy system modelling and analysis. Since different search engine data drive the oil price in different ways at different timescales, a multi-scale forecasting methodology is proposed that carefully explores the multi-scale relationship between the oil price and multi-factor search engine data. The proposed methodology involves three major steps: (1) multi-factor data processing, to collect informative search engine data, reduce dimensionality, and test predictive power via statistical analyses; (2) multi-scale analysis, to extract matched common modes at similar timescales from the oil price and multi-factor search engine data via multivariate empirical mode decomposition; and (3) oil price prediction, including individual prediction at each timescale and ensemble prediction across timescales via a typical forecasting technique. With the Brent oil price as a sample, the empirical results show that the novel methodology significantly outperforms its original form (without multi-factor search engine data and multi-scale analysis), semi-improved versions (with either multi-factor search engine data or multi-scale analysis), and similar counterparts (with other multi-scale analyses), in both level and directional predictions.
Keywords: Big data | Search engine data | Google trends | Multivariate empirical mode decomposition | Oil price forecasting
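The decompose-forecast-ensemble structure of steps (2) and (3) can be sketched as follows. Note the heavy simplifications: the price series is synthetic, a centred moving average stands in for multivariate empirical mode decomposition, no search-engine factors are included, and a least-squares autoregression stands in for the paper's forecasting technique.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic "oil price": slow trend + fast cycle + noise (hypothetical data).
t = np.arange(400)
price = 50 + 0.02 * t + 3 * np.sin(2 * np.pi * t / 20) + rng.normal(0, 0.3, 400)

# Stand-in for step (2): split the series into a slow and a fast mode with a
# centred moving average (the paper uses multivariate EMD to extract matched
# modes jointly from the price and the search engine factors).
k = 21
slow = np.convolve(price, np.ones(k) / k, mode="valid")
trimmed = price[k // 2 : k // 2 + len(slow)]   # align with the valid window
fast = trimmed - slow

def ar_forecast(series, lags=5):
    """Least-squares autoregressive one-step forecast for a single scale."""
    X = np.column_stack([series[i:len(series) - lags + i] for i in range(lags)])
    y = series[lags:]
    coef, *_ = np.linalg.lstsq(X[:-1], y[:-1], rcond=None)
    return X[-1] @ coef

# Step (3): predict each timescale separately, then ensemble by summation.
forecast = ar_forecast(slow) + ar_forecast(fast)
print(round(float(forecast), 2))
```

Summation is the natural ensemble here because the decomposition is additive: the scale-wise forecasts reconstruct a forecast of the original series.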
Pivot-based approximate k-NN similarity joins for big high-dimensional data
(2020)
Given an appropriate similarity model, the k-nearest neighbor similarity join represents a useful yet costly operator for data mining, data analysis and data exploration applications. The time to evaluate the operator depends on the size of the datasets, the data distribution and the dimensionality of the data representations. For vast volumes of high-dimensional data, only distributed and approximate approaches make the joins practically feasible. In this paper, we investigate and evaluate the performance of multiple MapReduce-based approximate k-NN similarity join approaches on two leading Big Data systems, Apache Hadoop and Apache Spark. Focusing on the metric space approach relying on reference dataset objects (pivots), this paper investigates distributed similarity join techniques with and without approximation guarantees and also proposes high-dimensional extensions to previously proposed algorithms. The paper describes the design guidelines, algorithmic details, and key theoretical underpinnings of the compared approaches and also presents the empirical performance evaluation, approximation precision, and scalability properties of the implemented algorithms. Moreover, the Spark source code of all these algorithms has been made publicly available. Key findings of the experimental analysis are that randomly initialized pivot-based methods perform well with big high-dimensional data and that, in general, the selection of the best algorithm depends on the desired levels of approximation guarantee, precision and execution time.
Keywords: Hadoop | Spark | MapReduce | k-NN | Approximate similarity join | High-dimensional data
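The pivot-based approximation idea can be illustrated in a single process: partition both datasets by nearest pivot, then join only within matching partitions, which is exactly why the result is approximate (a true neighbor can fall in a different cell). The datasets, dimensions, and pivot count are illustrative assumptions; the paper's algorithms run the same partition-then-join pattern as distributed MapReduce jobs.

```python
import numpy as np

rng = np.random.default_rng(42)
R = rng.normal(size=(200, 16))   # hypothetical query dataset
S = rng.normal(size=(500, 16))   # hypothetical target dataset
k, n_pivots = 3, 8

# Random pivot selection from S (randomly initialized pivots performed well
# in the paper's high-dimensional experiments).
pivots = S[rng.choice(len(S), n_pivots, replace=False)]

def nearest_pivot(X):
    """Index of the closest pivot for each row of X (Voronoi partitioning)."""
    d = np.linalg.norm(X[:, None, :] - pivots[None, :, :], axis=2)
    return d.argmin(axis=1)

# Partition both datasets by nearest pivot (the "map" phase), then perform
# an exact k-NN search only within each matching partition (the "reduce").
r_part, s_part = nearest_pivot(R), nearest_pivot(S)
join = {}
for p in range(n_pivots):
    r_idx = np.where(r_part == p)[0]
    s_idx = np.where(s_part == p)[0]
    if len(r_idx) == 0 or len(s_idx) == 0:
        continue
    d = np.linalg.norm(R[r_idx, None, :] - S[None, s_idx, :], axis=2)
    for row, ri in enumerate(r_idx):
        kk = min(k, len(s_idx))
        join[ri] = s_idx[np.argsort(d[row])[:kk]].tolist()

print(len(join), "query objects matched")
```

Replicating each object to several nearby pivot cells instead of one is the usual way to trade extra work for higher recall, and corresponds to the guarantee-bearing variants the paper compares.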