Download and view articles related to data pipeline :: Page 1
Download the best ISI articles with Persian translation

Search results - data pipeline

Number of articles found: 3
No. Title Type
1 Development of a national-scale real-time Twitter data mining pipeline for social geodata on the potential impacts of flooding on communities
Year: 2019
Social media, particularly Twitter, is increasingly used to improve resilience during extreme weather events/emergency management situations, including floods: by communicating potential risks and their impacts, and informing agencies and responders. In this paper, we developed a prototype national-scale Twitter data mining pipeline for improved stakeholder situational awareness during flooding events across Great Britain, by retrieving relevant social geodata, grounded in environmental data sources (flood warnings and river levels). With potential users we identified and addressed three research questions to develop this application, whose components constitute a modular architecture for real-time dashboards. First, polling national flood warning and river level Web data sources to obtain at-risk locations. Secondly, real-time retrieval of geotagged tweets, proximate to at-risk areas. Thirdly, filtering flood-relevant tweets with natural language processing and machine learning libraries, using word embeddings of tweets. We demonstrated the national-scale social geodata pipeline using over 420,000 georeferenced tweets obtained between 20th and 29th June 2016.
Keywords: Flood management | Twitter | Volunteered geographic information | Natural language processing | Word embeddings | Social geodata
English article
2 A Privacy Weaving Pipeline for Open Big Data
Year: 2016
The power of big data gives us an unprecedented chance to understand, analyze, and recreate the world, while open data ensures that this power is shared and widely exploited. Open and big data have become emerging topics for researchers and governments. Thus, the related privacy issues have also become an urgent problem. In this work, we propose a conceptual framework of a privacy weaving pipeline dedicated to producing open and big data while preserving privacy. Within the processing pipeline, each step of the process flow considers privacy assurance when manipulating datasets. However, the complexity of the process flow is the same as that of a normal data pipeline. The experimental prototype confirms the feasibility of the framework design. We hope this work will facilitate the development of the open and big data industry.
Keywords: open data | big data | data pipeline | privacy breach
English article
3 Spark Versus Flink: Understanding Performance in Big Data Analytics Frameworks
Year: 2016
Big Data analytics has recently gained increasing popularity as a tool to process large amounts of data on-demand. Spark and Flink are two Apache-hosted data analytics frameworks that facilitate the development of multi-step data pipelines using directed acyclic graph patterns. Making the most out of these frameworks is challenging because efficient executions strongly rely on complex parameter configurations and on an in-depth understanding of the underlying architectural choices. Although extensive research has been devoted to improving and evaluating the performance of such analytics frameworks, most of it benchmarks the platforms against Hadoop as a baseline, a rather unfair comparison considering the fundamentally different design principles. This paper aims to bring some justice in this respect, by directly evaluating the performance of Spark and Flink. Our goal is to identify and explain the impact of the different architectural choices and the parameter configurations on the perceived end-to-end performance. To this end, we develop a methodology for correlating the parameter settings and the operators' execution plan with the resource usage. We use this methodology to dissect the performance of Spark and Flink with several representative batch and iterative workloads on up to 100 nodes. Our key finding is that neither of the two frameworks outperforms the other for all data types, sizes and job patterns. This paper performs a fine characterization of the cases when each framework is superior, and we highlight how this performance correlates to operators, to resource usage and to the specifics of the internal framework design.
Index Terms: Big Data | performance evaluation | Spark | Flink
English article