Download and view articles related to: free English data mining articles, 2015 :: Page 1

Search results - free English data mining articles, 2015

Number of articles found: 167
No. Title Type
1 Robust causal dependence mining in big data network and its application to traffic flow predictions
Persian title (translated): Mining robust causal dependence in big data networks and its application to traffic flow prediction - 2015
In this paper, we focus on a special problem in transportation studies that concerns the so-called "Big Data" challenge: how to build concise yet accurate traffic flow prediction models based on the massive data collected by different sensors? The size of the data, the hidden causal dependence and the complexity of traffic time series are some of the obstacles to making reliable forecasts at a reasonable cost, both time-wise and computation-wise. To better prepare the data for traffic modeling, we introduce a multiple-step strategy to process the raw "Big Data" into compact time series that are better suited for regression and causality analysis. First, we use Granger causality to define and determine the potential dependence among data, and produce a much more condensed set of time series that are also highly dependent. Next, we deploy a decomposition algorithm to separate the daily-similar trend and nonstationary burst components of the traffic flow time series yielded by the Granger test. The decomposition results are then treated by two rounds of Lasso regression: the standard Lasso method is first used to quickly filter out most of the irrelevant data, followed by a robust Lasso method to further remove the disturbance caused by burst components and recover the strongest dependence among the remaining data. Test results show that the proposed method significantly reduces the cost of building prediction models. Moreover, the obtained causal dependence graph reveals the relationship between the structure of road networks and the correlations among traffic time series. All these findings are useful for building better traffic flow prediction models. © 2015 Elsevier Ltd. All rights reserved.
Keywords: Big Data | Traffic flow prediction | Causal dependence | Lasso regression | Robust
English article
2 A hybrid data mining model of feature selection algorithms and ensemble learning classifiers for credit scoring
Persian title (translated): A hybrid data mining model of feature selection algorithms and ensemble learning classification for credit scoring - 2015
Data mining techniques have numerous applications in credit scoring of customers in the banking field. One of the most popular data mining techniques is classification. Previous research has demonstrated that using feature selection (FS) algorithms and ensemble classifiers can improve banks' performance in credit scoring problems. In this domain, the main issue is the simultaneous, hybrid use of several FS and ensemble learning classification algorithms, with respect to their parameter settings, in order to achieve higher performance in the proposed model. The present paper therefore develops a hybrid data mining model of feature selection and ensemble learning classification algorithms in three stages. The first stage, as expected, deals with data gathering and pre-processing. In the second stage, four FS algorithms are employed: principal component analysis (PCA), genetic algorithm (GA), information gain ratio, and the relief attribute evaluation function. Here, the parameter settings of the FS methods are based on the classification accuracy obtained with the support vector machine (SVM) classification algorithm. After choosing the appropriate model for each selected feature, they are applied to the base and ensemble classification algorithms. In this stage, the best FS algorithm with its parameter settings is identified for the modeling stage of the proposed model. In the third stage, the classification algorithms are applied to the dataset prepared by each FS algorithm. The results showed that in the second stage, PCA is the best FS algorithm. In the third stage, the classification results showed that the artificial neural network (ANN) adaptive boosting (AdaBoost) method has the highest classification accuracy. Ultimately, the paper verified and proposed the hybrid model as an effective and robust model for credit scoring. © 2015 Elsevier Ltd. All rights reserved.
Keywords: Credit scoring | Classification | Feature selection | Ensemble learning | Data mining
English article
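Of the four feature selection methods named in the abstract above, information gain ratio is simple enough to sketch in a few lines. The following is only an illustration for categorical features, not the paper's implementation; the function names and the toy applicant data are assumptions made here.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of categorical values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total) for n in Counter(values).values())


def gain_ratio(feature_values, labels):
    """Information gain of a categorical feature divided by its split information."""
    total = len(labels)
    # Group the class labels by the value the feature takes.
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Expected entropy of the labels after splitting on the feature.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    split_info = entropy(feature_values)
    # A constant feature has zero split information and carries no signal.
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy data: four applicants, two candidate features.
labels = ["good", "good", "bad", "bad"]
has_job = ["yes", "yes", "no", "no"]           # separates the classes perfectly
region = ["north", "south", "north", "south"]  # unrelated to the class

print(gain_ratio(has_job, labels))
print(gain_ratio(region, labels))
```

Features would then be ranked by this score and the top-ranked ones kept; how many to keep is one of the parameter settings the abstract says was tuned against SVM accuracy.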
3 DISEASES: Text mining and data integration of disease–gene associations
Persian title (translated): Text mining and integration of disease-gene association data - 2015
Text mining is a flexible technology that can be applied to numerous different tasks in biology and medicine. We present a system for extracting disease–gene associations from biomedical abstracts. The system consists of a highly efficient dictionary-based tagger for named entity recognition of human genes and diseases, which we combine with a scoring scheme that takes into account co-occurrences both within and between sentences. We show that this approach is able to extract half of all manually curated associations with a false positive rate of only 0.16%. Nonetheless, text mining should not stand alone, but be combined with other types of evidence. For this reason, we have developed the DISEASES resource, which integrates the results from text mining with manually curated disease–gene associations, cancer mutation data, and genome-wide association studies from existing databases. The DISEASES resource is accessible through a web interface at http://diseases.jensenlab.org/, where the text-mining software and all associations are also freely available for download. © 2014 The Authors. Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/3.0/).
Keywords: Text mining | Named entity recognition | Information extraction | Data integration | Web resource
English article
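The idea of a dictionary-based tagger combined with co-occurrence scoring within and between sentences, as described in the abstract above, can be sketched roughly as follows. This is not the DISEASES software or its scoring scheme: the two small dictionaries, the weights `w_sentence` and `w_abstract`, and the additive score are all assumptions chosen for illustration.

```python
import re
from collections import defaultdict
from itertools import product

# Hypothetical dictionaries mapping lowercase surface forms to canonical names.
GENES = {"brca1": "BRCA1", "tp53": "TP53"}
DISEASES = {
    "breast cancer": "breast cancer",
    "li-fraumeni syndrome": "Li-Fraumeni syndrome",
}


def tag(text, dictionary):
    """Return the canonical names whose surface form occurs as a whole term in text."""
    lowered = text.lower()
    return {
        name
        for form, name in dictionary.items()
        if re.search(r"(?<!\w)" + re.escape(form) + r"(?!\w)", lowered)
    }


def score_pairs(abstracts, w_sentence=1.0, w_abstract=0.25):
    """Accumulate a score per (gene, disease) pair over all abstracts.

    Each abstract in which the pair is mentioned adds w_abstract, and each
    sentence in which both are mentioned adds w_sentence (assumed weights).
    """
    scores = defaultdict(float)
    for abstract in abstracts:
        # Co-occurrence anywhere in the abstract, possibly across sentences.
        for pair in product(tag(abstract, GENES), tag(abstract, DISEASES)):
            scores[pair] += w_abstract
        # Co-occurrence within a single sentence (naive sentence splitting).
        for sentence in re.split(r"(?<=[.!?])\s+", abstract):
            for pair in product(tag(sentence, GENES), tag(sentence, DISEASES)):
                scores[pair] += w_sentence
    return dict(scores)


abstracts = [
    "BRCA1 mutations are frequent in breast cancer. TP53 was also sequenced.",
    "TP53 is mutated in Li-Fraumeni syndrome.",
]
for pair, score in sorted(score_pairs(abstracts).items()):
    print(pair, score)
```

Pairs mentioned in the same sentence end up with a higher score than pairs that only share an abstract, which is the distinction the abstract draws between the two kinds of co-occurrence.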
4 Mining the British Isles oak tree-ring data set: Part A: Rationale, data, software, and proof of concept
Persian title (translated): Mining the British Isles oak tree-ring data set: Part A: Rationale, data, software, and proof of concept - 2015
Article history: Received 25 February 2015; received in revised form 21 May 2015; accepted 21 May 2015; available online 11 June 2015.
From stuttering early beginnings, archaeological oak dendrochronology in the British Isles developed rapidly in the latter decades of the 20th century, to the present situation where dozens of new crossdated site chronologies are produced each year. Although unevenly spread in both space and time, the available data set is now so large (several thousand sites) that it has the potential to be mined for applications that were not envisaged when the data were originally collected. Here we compile available data into an oak database of archaeological and modern (living) sites, develop a software tool to visualise spatial patterns and correlations, and explore six potential data-mining applications (crossdating methodology, crossdating error detection, regional chronologies, pointer years, provenancing, past climate reconstruction). Results indicate variable data-mining potential, but with viable prospects in each case. © 2015 Elsevier GmbH. All rights reserved.
Keywords: Dendrochronology | Oak
English article
5 The opportunities of mining historical and collective data in drug discovery
Persian title (translated): Opportunities for mining historical and collective data in drug discovery - 2015
Vast amounts of bioactivity data have been generated for small molecules across public and corporate domains. Biological signatures, either derived from systematic profiling efforts or from existing historical assay data, have been successfully employed for small molecule mechanism-of-action elucidation, drug repositioning, hit expansion and screening subset design. This article reviews different types of biological descriptors and applications, and we demonstrate how biological data can outlive the original purpose or project for which it was generated. By comparing 150 HTS campaigns run at Novartis over the past decade on the basis of their active and inactive chemical matter, we highlight the opportunities and challenges associated with cross-project learning in drug discovery.
English article
6 A method of data mining for selection of site for wind turbines
Persian title (translated): Data mining methods for selecting sites for wind turbines - 2015
The paper proposes a data mining framework to help select a suitable site for wind turbine installation. After a thorough literature review, a list of indicators is prepared for analyzing a particular site, consisting of wind speed, different built-up areas, the distance of existing energy installations from the prospective site, cost-dependent factors, and ecological impacts. These dimensions increase the number of attributes in the dataset, which is then reduced using principal component analysis. The resulting components are regressed using multiple regression. These techniques are applied to an integrated database prepared by storing the data of 39 sites in Pakistan. The predictions of the model developed for wind energy sites have been found to be significantly accurate when compared with expert opinion and previous studies. © 2015 Elsevier Ltd. All rights reserved.
Keywords: Site selection | Data mining | Wind turbine | Principal component analysis | Regression analysis | Renewable energy
English article
7 A hybrid cost estimation framework based on feature-oriented data mining approach
Persian title (translated): A hybrid cost estimation framework based on a feature-oriented data mining approach - 2015
This paper presents an informatics framework that applies the feature-based engineering concept to cost estimation, supported by data mining algorithms. The purpose of this research work is to provide a practical procedure for more accurate cost estimation using the commonly available manufacturing process data associated with ERP systems. The proposed method combines linear regression and data-mining techniques, leverages the unique strengths of both, and creates a mechanism to discover cost features. The final estimation function takes the user's confidence level in each member technique into consideration, so that the method can be phased in gradually in practice as data mining capability is built up. A case study demonstrates the proposed framework and compares the results from empirical cost prediction and data mining. The case study results indicate that the combined method is flexible and promising for determining the costs of the example welding features. In the comparison between the empirical prediction and five different data mining algorithms, the ANN algorithm proves to be the most accurate for welding operations. © 2015 Elsevier Ltd. All rights reserved.
Keywords: Cost estimation | Feature modeling | Data mining | ERP | Welding feature
English article
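The abstract above says the final estimate weighs each member technique by the user's confidence in it, but it does not give the formula. One plausible reading, shown here purely as an assumption, is a confidence-weighted average; the function name and the numbers are hypothetical.

```python
def combined_estimate(estimates, confidences):
    """Confidence-weighted average of cost estimates from several techniques.

    estimates:   cost predicted by each member technique
    confidences: the user's non-negative confidence in each technique
    """
    if len(estimates) != len(confidences):
        raise ValueError("need exactly one confidence value per estimate")
    if any(c < 0 for c in confidences):
        raise ValueError("confidence values must be non-negative")
    total = sum(confidences)
    if total <= 0:
        raise ValueError("at least one confidence value must be positive")
    return sum(e * c for e, c in zip(estimates, confidences)) / total


# Hypothetical welding feature: empirical regression vs. a data mining model.
regression_cost = 100.0
data_mining_cost = 120.0

# Early on the user mostly trusts the regression estimate ...
print(combined_estimate([regression_cost, data_mining_cost], [0.75, 0.25]))
# ... and shifts weight to the data mining model as more data accumulates.
print(combined_estimate([regression_cost, data_mining_cost], [0.25, 0.75]))
```

Under this reading, raising the confidence in the data mining model step by step is what lets the method be phased in gradually.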
8 Product concept evaluation and selection using data mining and domain ontology in a crowdsourcing environment
Persian title (translated): Product concept evaluation and selection using data mining and domain ontology in a crowdsourcing environment - 2015
For product design and development, crowdsourcing shows huge potential for fostering creativity and has been regarded as one important approach to acquiring innovative concepts. Nevertheless, before the approach can be effectively implemented, the following challenges concerning crowdsourcing should be properly addressed: (1) a burdensome concept review process for dealing with a large number of crowd-sourced design concepts; (2) insufficient consideration of integrating design knowledge and principles into existing data processing methods/algorithms for crowdsourcing; and (3) lack of a quantitative decision support process to identify better concepts. To tackle these problems, a product concept evaluation and selection approach comprising three modules is proposed. These modules are, respectively: (1) a data mining module to extract meaningful information from online crowd-sourced concepts; (2) a concept re-construction module to organize word tokens into a unified frame using domain ontology and extended design knowledge; and (3) a decision support module to select better concepts in a simplified manner. A pilot study on future PC (personal computer) design was conducted to demonstrate the proposed approach. The results show that the proposed approach is promising and may help to improve the efficiency of concept review and evaluation, facilitate data processing using design knowledge, and enhance the reliability of concept selection decisions. © 2015 Elsevier Ltd. All rights reserved.
Keywords: Product concept evaluation and selection | Data mining | Domain ontology | Crowdsourcing
English article
9 Clinical Diabetes Research Using Data Mining: A Canadian Perspective
Persian title (translated): Clinical diabetes research using data mining: A Canadian perspective - 2015
With the advent of the digitization of large amounts of information and the computer power capable of analyzing this volume of information, data mining is increasingly being applied to medical research. Datasets created for administration of the healthcare system provide a wealth of information from different healthcare sectors, and Canadian provinces’ single-payer universal healthcare systems mean that data are more comprehensive and complete in this country than in many other jurisdictions. The increasing ability to also link clinical information, such as electronic medical records, laboratory test results and disease registries, has broadened the types of data available for analysis. Data-mining methods have been used in many different areas of diabetes clinical research, including classic epidemiology, effectiveness research, population health and health services research. Although methodologic challenges and privacy concerns remain important barriers to using these techniques, data mining remains a powerful tool for clinical research. © 2015 Canadian Diabetes Association
Keywords: Privacy-preserving data mining | Ensemble learning | Electronic health record | Boosting | Machine learning | Healthcare
English article
10 Evolutionary collective behavior decomposition model for time series data mining
Persian title (translated): Evolutionary collective behavior decomposition model for time series data mining - 2015
Article history: Received 8 March 2013; received in revised form 24 August 2014; accepted 22 September 2014; available online 7 October 2014.
In this research, we propose a novel framework referred to as collective game behavior decomposition, where complex collective behavior is assumed to be generated by the aggregation of several groups of agents following different strategies, and complexity emerges from the collaboration and competition of individuals. The strategy of an agent is modeled by certain simple game theory models with limited information. Genetic algorithms are used to obtain the optimal collective behavior decomposition based on historical data. The trained model can be used for collective behavior prediction. For modeling individual behavior, two simple games, the minority game and the mixed game, are investigated in experiments on real-world stock prices and foreign-exchange rates. Experimental results are presented to show the effectiveness of the newly proposed model. © 2014 Elsevier B.V. All rights reserved.
Keywords: Minority games | Mixed games | Collective behavior decomposition | Genetic algorithms | Evolutionary mixed games learning
English article
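For readers unfamiliar with the minority game mentioned in the abstract above: an odd number of agents each pick one of two sides, and those who end up on the less popular side win the round. The sketch below shows a single round only. It does not reproduce the paper's decomposition model or its genetic algorithm, and the function names and the one-point-per-win scoring are assumptions.

```python
def minority_side(choices):
    """Return the winning (minority) side, 0 or 1, for an odd number of agents."""
    if len(choices) % 2 == 0:
        raise ValueError("the minority game needs an odd number of agents")
    ones = sum(choices)
    zeros = len(choices) - ones
    return 1 if ones < zeros else 0


def play_round(choices, scores):
    """Give one point to every agent that chose the minority side."""
    winner = minority_side(choices)
    return [score + (1 if choice == winner else 0)
            for score, choice in zip(scores, choices)]


# Five hypothetical agents: two pick side 1, three pick side 0.
choices = [1, 0, 0, 1, 0]
scores = play_round(choices, [0, 0, 0, 0, 0])
print(minority_side(choices), scores)
```

In applications to price series, the two sides are commonly read as buying and selling, with each agent's strategy mapping the recent history of winning sides to its next choice.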