Download and view articles related to heterogeneous data :: Page 1
Search results - heterogeneous data

Number of articles found: 6
No. Title Type
1 Truth finding by reliability estimation on inconsistent entities for heterogeneous data sets
Year: 2020
An important task in big data integration is to derive accurate data records from noisy and conflicting values collected from multiple sources. Most existing truth finding methods assume that a source's reliability is consistent across the whole data set, ignoring the fact that different attributes, objects and object groups may have different reliabilities even with respect to the same source. These reliability differences are caused by differences in how hard attribute values are to obtain, non-uniform updates to objects and differences in group privileges. This paper addresses the problem of how to compute truths by effectively estimating the reliabilities of attributes, objects and object groups in a multi-source heterogeneous data environment. We first propose an optimization framework TFAR, its implementation and Lagrangian duality solution for Truth Finding by Attribute Reliability estimation. We then present a Bayesian probabilistic graphical model TFOR and an inference algorithm applying Collapsed Gibbs Sampling for Truth Finding by Object Reliability estimation. Finally we give an optimization framework TFGR and its implementation for Truth Finding by Group Reliability estimation. All these models lead to a more accurate estimation of the respective attribute, object and object group reliabilities, which in turn can achieve better accuracy in inferring the truths. Experimental results on both real data and synthetic data show that our methods perform better than the state-of-the-art truth discovery methods.
Keywords: Truth finding | Attribute reliability | Object reliability | Group reliability | Entity hardness | Probability graphical mod
English article
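To make the baseline assumption in the abstract above concrete, here is a minimal sketch of generic iterative weighted voting with a single reliability score per source. It is not the paper's TFAR, TFOR or TFGR model; the function name `weighted_vote_truths`, the claim format and the smoothing constants are assumptions made for illustration only.

```python
from collections import defaultdict


def weighted_vote_truths(claims, iterations=10):
    """Estimate truths from (source, object, value) claims.

    Illustrative baseline only: one reliability score per source,
    which is the assumption the abstract says most methods make.
    """
    sources = {s for s, _, _ in claims}
    reliability = {s: 1.0 for s in sources}  # start with equal trust
    truths = {}
    for _ in range(iterations):
        # Step 1: each claimed value collects the reliability of its sources.
        votes = defaultdict(lambda: defaultdict(float))
        for s, obj, val in claims:
            votes[obj][val] += reliability[s]
        # Pick the highest-scoring value; sorting makes ties deterministic.
        truths = {obj: max(sorted(vals), key=vals.get)
                  for obj, vals in votes.items()}
        # Step 2: re-estimate reliability as the smoothed fraction of a
        # source's claims that agree with the current truths.
        agree = defaultdict(int)
        total = defaultdict(int)
        for s, obj, val in claims:
            total[s] += 1
            agree[s] += int(truths[obj] == val)
        reliability = {s: (agree[s] + 1) / (total[s] + 2) for s in sources}
    return truths, reliability


# Hypothetical claims from three sources about three objects.
claims = [
    ("A", "o1", "x"), ("A", "o2", "y"), ("A", "o3", "z"),
    ("B", "o1", "x"), ("B", "o2", "y"), ("B", "o3", "w"),
    ("C", "o1", "q"), ("C", "o2", "r"), ("C", "o3", "w"),
]
truths, reliability = weighted_vote_truths(claims)
```

Because each source carries one score for all its claims, this sketch cannot express that a source may be reliable for some attributes or objects and unreliable for others, which is the gap the abstract describes.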
2 A framework for distributed data mining heterogeneous classifier
Year: 2019
Distributed Data Mining (DDM) has emerged as a major research area, driven by the tremendous growth of geographically distributed data and of available computing power. This paper proposes ENcryption, NORMalization, MApping (ENORMA), a privacy-preserving heterogeneous classifier framework for universal DDM. Three algorithms are proposed for maintaining data privacy, retrieval and integration in DDM: a privacy-preserving algorithm that protects data at both levels, a value normalization algorithm for data retrieval, and a mapping algorithm that maps the data to the schema at the global level for integration. Experiments on Electronic Health Records (EHRs), Job Recruitment Records (JRRs) and Agriculture Weather Forecast Records (AWFRs) datasets show improved results compared with conventional frameworks.
Keywords: Distributed data mining framework | Heterogeneous datasites | Privacy-preserving | Data normalization | Data integration
English article
3 iFusion: Towards efficient intelligence fusion for deep learning from real-time and heterogeneous data
Year: 2019
Deep learning has shown great strength in many fields and has allowed people to live more conveniently and intelligently. However, deep learning requires a considerable amount of uniform training data, which introduces difficulties in many application scenarios. On the one hand, in real-time systems, training data are constantly generated, but users cannot immediately obtain this vast amount of training data. On the other hand, training data from heterogeneous sources have different data formats. Therefore, existing deep learning frameworks are not able to train all data together. In this paper, we propose the iFusion framework, which achieves efficient intelligence fusion for deep learning from real-time data and heterogeneous data. For real-time data, we train only newly arrived data to obtain a new discrimination model and fuse the previously trained models to obtain the discrimination result. For heterogeneous data, different types of data are trained separately; then, we fuse the different discrimination models so that it is not necessary to consider heterogeneous data formats. We use a method based on Dempster-Shafer theory (DST) to fuse the discrimination models. We apply iFusion to the deep learning of medical image data, and the results of the experiments show the effectiveness of the proposed method.
Keywords: Information fusion | Real-time data | Heterogeneous data | Deep learning
English article
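The abstract above says the discrimination models are fused with a method based on Dempster-Shafer theory. As a rough illustration of the underlying idea, the sketch below implements the standard Dempster rule of combination for two mass functions. It is not iFusion's actual fusion procedure; the function name `combine_dempster`, the class labels and the mass values are hypothetical.

```python
from itertools import product


def combine_dempster(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function maps a frozenset of hypotheses to its mass;
    the masses of one function are assumed to sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            # Agreeing evidence supports the intersection of the two sets.
            combined[intersection] = (
                combined.get(intersection, 0.0) + mass_a * mass_b
            )
        else:
            # Disjoint sets contribute to the conflict mass K.
            conflict += mass_a * mass_b
    if 1.0 - conflict <= 1e-12:
        raise ValueError("sources are in total conflict; rule is undefined")
    # Normalize by 1 - K so the combined masses sum to 1 again.
    return {k: v / (1.0 - conflict) for k, v in combined.items()}


# Hypothetical outputs of two models trained on different data types.
BENIGN = frozenset({"benign"})
MALIGNANT = frozenset({"malignant"})
EITHER = BENIGN | MALIGNANT  # mass left on the whole frame (ignorance)

model_1 = {MALIGNANT: 0.6, BENIGN: 0.1, EITHER: 0.3}
model_2 = {MALIGNANT: 0.5, BENIGN: 0.2, EITHER: 0.3}
fused = combine_dempster(model_1, model_2)
```

Each model only needs to report masses over the same set of classes, so the input formats of the underlying data never have to be reconciled, which matches the motivation given in the abstract.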
4 Description and classification for facilitating interoperability of heterogeneous data/events/services in the Internet of Things
Year: 2017
The Internet of Things (IoT) refers to an infrastructure that integrates things over standard wired/wireless networks and allows them to exchange information with each other. The IoT is a very complex heterogeneous network, and enabling seamless integration of these things is a huge challenge. A publish/subscribe method of integration can be formulated to solve the problems of interconnecting billions of heterogeneous things. In our work, an IoT framework that uses an abstraction layer that decouples an application from the service calls and network interfaces is required to send and receive messages on a particular thing. This paper provides definitions and classifications for heterogeneous data/events/services according to the properties of the things in order to integrate them into a framework for description. Based on these definitions and classifications, heterogeneous data/events/services in the IoT were integrated via topic description through the Data Distribution Service (DDS) middleware standard for real-time publish/subscribe. This paper also concludes with general remarks and a discussion of future work.
Keywords: Internet of Things (IoT) | Data Distribution Service (DDS) | Topic | Description | Interoperability
English article
5 Traffic state and emission estimation for urban expressways based on heterogeneous data
Year: 2017
Urban expressways, as the backbone of a city’s transportation network, are critical for reducing traffic congestion and improving transportation efficiency of the whole network. The estimation of traffic states and emissions for urban expressways supports traveler information provision and system-wide traffic management. This paper aims to modify the extended generalized Treiber-Helbing filter (EGTF) to fuse GPS data (probe vehicles) and traditional traffic data (loop detectors), so as to obtain more accurate estimates of traffic states and emissions on urban expressways. The speed field is first reconstructed based on heterogeneous data, and then travel time and emissions are estimated using a virtual trajectory method and the VT-Micro model, respectively. The algorithm is applied to a real-world case study for an urban expressway in Beijing, China. After parameter tuning, the proposed algorithm is compared with existing algorithms from the literature. Numerical results show that data fusion using the proposed algorithm could make better use of heterogeneous data and increase the accuracy of travel time and emissions estimates.
Keywords: Urban expressway | Travel time | Data fusion | EGTF | Vehicular emission
English article
6 Beyond the hype: Big data concepts, methods, and analytics
Publication year: 2015 - Pages in English PDF file: 8 - Pages in Persian DOC file: 30
Size is the first, and at times the only, dimension that comes to mind when big data is mentioned. This paper attempts to offer a broader definition of big data that captures its other unique and defining characteristics. The rapid evolution and adoption of big data by industry has pushed the discussion into popular outlets, forcing the academic press to catch up. Academic journals in many disciplines, which stand to benefit from a discussion of big data, have yet to cover the topic fully. This paper presents a consolidated description of big data by integrating definitions from practitioners and academics. The paper's primary focus is on the analytic methods used for big data. A particular distinguishing feature of this paper is its focus on analytics related to unstructured data, which constitute 95% of big data. The paper highlights the need to develop appropriate and efficient analytical methods to leverage massive volumes of heterogeneous data in text, audio, and video formats. It also reinforces the need to devise new tools for predictive analytics for structured big data. The statistical methods in practice were designed to infer from sample data. The heterogeneity, noise, and massive size of structured big data call for developing computationally efficient algorithms that may avoid big data pitfalls, such as spurious correlation.
Keywords: Big data analytics | Big data definition | Unstructured data analytics | Predictive analytics
Translated article