Download and view articles related to parallel computing :: Page 1
Search results - parallel computing

Number of articles found: 52
No. Title Type
1 A new parallel simulation method for massive crowd
Publication year: 2019 - English PDF pages: 5 - Persian DOC pages: 11
With the emergence of multi-core and even many-core processors, massively parallel computing has become an inexpensive and popular trend. It provides new tools and reliable solutions for fast simulation of large-scale crowd motion, and for the same problem at ultra-large scale. In parallel computing, the design of the parallel architecture and of the parallel algorithm is closely tied to the crowd motion simulation algorithm, because the main task has to be decomposed. This paper presents a parallel simulation algorithm for massive crowds. After the virtual scene is initialized, a block-based scene segmentation algorithm on the management node divides the simulated scene and assigns each block to a different computing node for processing. Each computing node then receives the information about the individuals it is responsible for. Based on this method, a prototype parallel simulation system was developed on the Sugon high-performance computing platform. Experimental results show that the parallel simulation algorithm improves scene-rendering efficiency and removes the limit on crowd size.
Keywords: Parallel computing | Massive crowd | Computing node | Scene segmentation
Translated article
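The block-based partitioning described in this abstract can be illustrated roughly as follows. This is a minimal sketch, not the paper's implementation: the uniform grid, the round-robin mapping of blocks to nodes, and the names `block_of` and `assign_agents` are all assumptions made for illustration.

```python
from collections import defaultdict


def block_of(x, y, scene_w, scene_h, nx, ny):
    """Return the (column, row) of the grid block that contains point (x, y)."""
    col = min(int(x / scene_w * nx), nx - 1)  # clamp points on the right edge
    row = min(int(y / scene_h * ny), ny - 1)  # clamp points on the top edge
    return col, row


def assign_agents(agents, scene_w, scene_h, nx, ny, n_nodes):
    """Group agent ids by the compute node that owns their block.

    Blocks are mapped to nodes round-robin. That mapping is an assumption;
    the abstract does not say how blocks are assigned.
    """
    per_node = defaultdict(list)
    for agent_id, (x, y) in agents.items():
        col, row = block_of(x, y, scene_w, scene_h, nx, ny)
        node = (row * nx + col) % n_nodes
        per_node[node].append(agent_id)
    return dict(per_node)


# Four hypothetical agents in a 10 x 10 scene split into 2 x 2 blocks.
agents = {0: (1.0, 1.0), 1: (9.0, 1.0), 2: (1.0, 9.0), 3: (9.5, 9.5)}
print(assign_agents(agents, scene_w=10.0, scene_h=10.0, nx=2, ny=2, n_nodes=4))
```

In this toy layout each agent falls in a different block, so each of the four nodes receives exactly one agent.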
2 High-Performance Correlation and Mapping Engine for rapid generating brain connectivity networks from big fMRI data
Publication year: 2018
Brain connectivity networks help physicians better understand the neurological effects of certain diseases and offer patients improved treatment options. Seed-based Correlation Analysis (SCA) of Functional Magnetic Resonance Imaging (fMRI) data has been used to create individual brain connectivity networks. However, an outstanding issue is the long processing time needed to generate full brain connectivity maps. With close to a million individual voxels in a typical fMRI dataset, the number of calculations involved in a voxel-by-voxel SCA becomes very high. With the emergence of dynamic time-varying functional connectivity analysis, population-based studies, and studies relying on real-time neurological feedback, the need for rapid processing methods becomes even more critical. This work aims to develop a new method that produces high-resolution brain connectivity maps rapidly. The new method accelerates correlation processing by using an architecture that includes clustered FPGAs and an efficient memory pipeline, termed the High-Performance Correlation and Mapping Engine (HPCME). The method has been tested with datasets from the Human Connectome Project. The preliminary results show that HPCME with four FPGAs can improve the SCA processing speed by a factor of 27 or more over that of a PC workstation with a multicore CPU.
Keywords: Brain Functional Connectivity | fMRI | Seed-based Correlation Analysis | FPGA-based Parallel Computing | Human Connectome Project
English article
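To make the cost of seed-based correlation concrete, the following is a minimal pure-Python sketch of the core computation: the Pearson correlation between one seed time series and every voxel time series. It assumes plain lists of samples; the names `pearson` and `seed_correlation_map` are illustrative and say nothing about how HPCME is implemented.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # convention chosen here: constant series get 0
    return cov / math.sqrt(var_a * var_b)


def seed_correlation_map(seed, voxels):
    """Correlate one seed time series with each voxel time series."""
    return [pearson(seed, v) for v in voxels]


# Toy data: three voxels, four time points each.
seed = [1.0, 2.0, 3.0, 4.0]
voxels = [[2.0, 4.0, 6.0, 8.0], [4.0, 3.0, 2.0, 1.0], [5.0, 5.0, 5.0, 5.0]]
print(seed_correlation_map(seed, voxels))
```

A full voxel-by-voxel map repeats this with every voxel acting as the seed, so with close to a million voxels the number of pairwise correlations grows to roughly the square of that count.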
3 Peak operation of hydropower system with parallel technique and progressive optimality algorithm
Publication year: 2018
With the rapid economic growth in recent years, the peak operation of hydropower system (POHS) is becoming one of the most important optimization problems in power systems. However, the rapid expansion of system scale, refined management and operational constraints have greatly increased the optimization difficulty of POHS. As a result, it is of great importance to develop effective methods that can ensure the computational efficiency of POHS. The progressive optimality algorithm (POA) is a commonly used technique for solving hydropower operation problems, but its execution time still grows sharply with the increasing number of hydropower plants, making it difficult to satisfy the efficiency requirement of POHS. To address this problem, a novel efficient method called parallel progressive optimality algorithm (PPOA) is presented in this paper. In PPOA, the complex problem is first divided into several two-stage optimization subproblems, and then the classical Fork/Join framework is used to compute the subproblems in parallel, significantly improving execution efficiency. Simulations of a real-world hydropower system demonstrate that, compared with the standard POA, PPOA can use abundant multi-core resources to reduce execution time while keeping the quality of the solution, providing a new alternative for solving the complex hydropower peak operation problem.
Keywords: Hydropower reservoirs | Peak operation | Progressive optimality algorithm | Fork/Join framework | Parallel computing | Curse of dimensionality
English article
4 Scalable Distributed Semantic Network for knowledge management in cyber physical system
Publication year: 2018
The remarkable growth of emerging technologies and computing paradigms in cyberspace and cyber physical systems generates a huge mass of data sources. These different autonomous and heterogeneous data sources can contain complementary or semantically equivalent information stored in different formats that vary from structured, to semi-structured, to unstructured. These heterogeneities influence data semantics and meaning. Therefore, knowledge management has become more and more difficult and sometimes fruitless. In this paper, we propose a new scalable model, named Distributed Semantic Network (DSN), for heterogeneous data representation that can extract more semantic information from different data sources. We use the prior knowledge of WordNet and Wikipedia to scale out DSN horizontally and vertically. Furthermore, we propose a MapReduce-based framework to construct the knowledge base more effectively in Parallel and Distributed Computing (PDC). The experimental results show that DSN can better model the semantic information in the text. It can extract a larger amount of information from the text with higher precision, achieving a 34% increase in quantity and a 15% improvement in precision over the best-performing alternative method on the same datasets. On the three datasets, our proposed PDC framework shortens the processing time by a factor of 5.8–11.5.
Keywords: Parallel and distributed computing | Knowledge management | Distributed semantic network | MapReduce framework | Cyber physical system
English article
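The MapReduce pattern mentioned in this abstract can be sketched in a few lines. The example below is only an illustration of the map, shuffle, and reduce phases on a toy task (counting term co-occurrences as candidate network edges); the task itself and the names `map_phase`, `shuffle`, and `reduce_phase` are assumptions, not the DSN construction procedure.

```python
from collections import defaultdict
from itertools import combinations


def map_phase(document):
    """Emit ((term_a, term_b), 1) for each unordered pair of distinct terms."""
    terms = sorted(set(document.lower().split()))
    return [((a, b), 1) for a, b in combinations(terms, 2)]


def shuffle(mapped):
    """Group emitted values by key, as a MapReduce runtime would."""
    groups = defaultdict(list)
    for key, value in mapped:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Sum the counts for each term pair."""
    return {key: sum(values) for key, values in groups.items()}


# Two hypothetical documents; each could be mapped on a different worker.
docs = ["sensor reports temperature", "sensor reports pressure"]
mapped = [pair for doc in docs for pair in map_phase(doc)]
edges = reduce_phase(shuffle(mapped))
print(edges)
```

Because each document is mapped independently and each key is reduced independently, both phases can be spread across workers, which is the property a distributed framework exploits.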
5 Efficient spatial co-location pattern mining on multiple GPUs
Publication year: 2018
In this paper, we investigate Co-location Pattern Mining (CPM) from big spatial datasets. CPM consists in searching for types of objects that are frequently located together in a spatial neighborhood. Knowledge about such patterns is very important in fields like biology, environmental sciences, epidemiology, etc. However, CPM is computationally challenging, mainly due to the large number of pattern instances hidden in spatial data. In this work, we propose a new solution that can utilize the power of multiple GPUs to increase the performance of CPM. The proposed solution is also capable of coping with GPU memory limits by dividing the work into multiple packages and compressing internal data structures. Experiments performed on large synthetic and real-world datasets show that we can achieve order-of-magnitude speedups in comparison to an efficient multithreaded CPU implementation. Our solution can greatly improve the performance of data analysis, using widely available and energy-efficient graphics cards. As a result, CPM in large datasets is more viable for university researchers as well as smaller companies and organizations.
Keywords: GPGPU | Spatial co-location | Parallel computing | Compression | Data mining | Co-location pattern mining
English article
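As a rough illustration of what co-location mining computes, the sketch below evaluates the commonly used participation index for a single two-feature pattern by brute force. It assumes two distinct feature types, Euclidean distance, and a fixed neighborhood radius; the function names are illustrative, and the paper's multi-GPU algorithm is not reproduced here.

```python
import math


def neighbor_pairs(points, feat_a, feat_b, radius):
    """Index pairs (i, j): points[i] has feat_a, points[j] has feat_b, within radius."""
    pairs = []
    for i, (fa, xa, ya) in enumerate(points):
        if fa != feat_a:
            continue
        for j, (fb, xb, yb) in enumerate(points):
            if fb == feat_b and math.hypot(xa - xb, ya - yb) <= radius:
                pairs.append((i, j))
    return pairs


def participation_index(points, feat_a, feat_b, radius):
    """Minimum, over both features, of the fraction of instances in some pair."""
    pairs = neighbor_pairs(points, feat_a, feat_b, radius)
    count_a = sum(1 for f, _, _ in points if f == feat_a)
    count_b = sum(1 for f, _, _ in points if f == feat_b)
    if count_a == 0 or count_b == 0:
        return 0.0
    ratio_a = len({i for i, _ in pairs}) / count_a
    ratio_b = len({j for _, j in pairs}) / count_b
    return min(ratio_a, ratio_b)


# Toy dataset of (feature, x, y): one A sits next to both Bs, one A is isolated.
points = [("A", 0.0, 0.0), ("A", 10.0, 10.0), ("B", 0.0, 1.0), ("B", 1.0, 0.0)]
print(participation_index(points, "A", "B", radius=1.5))
```

The nested loop over all point pairs is the part that becomes expensive on large datasets, which is why the work is a natural candidate for GPU parallelization.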
6 On the role of latent variable models in the era of big data
Publication year: 2018
We discuss how latent variable models are useful to deal with the complexities of big data from different perspectives: simplification of data structure; flexible representation of dependence between variables; reduction of selection bias. Problems involved in parameter estimation are also discussed.
Keywords: Bayesian inference | Complex data | Maximum likelihood estimation | Parallel computing | Selection bias
English article
7 Parallel multiphase field simulations with OpenPhase
Publication year: 2017
The open-source software project OpenPhase allows the three-dimensional simulation of microstructural evolution using the multiphase field method. The core modules of OpenPhase and their implementation as well as their parallelization for a distributed-memory setting are presented. Especially communication and load-balancing strategies are discussed. Synchronization points are avoided by an increased halo-size, i.e. additional layers of ghost cells, which allow multiple stencil operations without data exchange. Load balancing is considered via graph-partitioning and sub-domain decomposition. Results are presented for performance benchmarks as well as for a variety of applications, e.g. grain growth in polycrystalline materials, including a large number of phase fields as well as Mg–Al alloy solidification.
Keywords: Material science | Phase field | Parallel computing | Load-balancing
English article
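The idea that a wider halo permits several stencil operations without data exchange can be checked on a small example. The sketch below assumes a 1D periodic grid and a three-point averaging stencil, neither of which comes from OpenPhase; `step_global` and `local_steps` are illustrative names.

```python
def step_global(u):
    """One three-point averaging step on the whole periodic array."""
    n = len(u)
    return [(u[(i - 1) % n] + u[i] + u[(i + 1) % n]) / 3 for i in range(n)]


def local_steps(u, lo, hi, k):
    """Advance the cells [lo, hi) by k steps using only a halo of width k.

    The subdomain copies k ghost cells on each side once, then performs k
    stencil steps with no further communication. Each step invalidates one
    ghost layer per side, so exactly the owned cells remain after k steps.
    """
    n = len(u)
    local = [u[i % n] for i in range(lo - k, hi + k)]
    for _ in range(k):
        local = [
            (local[j - 1] + local[j] + local[j + 1]) / 3
            for j in range(1, len(local) - 1)
        ]
    return local


u0 = [float(i * i % 7) for i in range(12)]
k = 3

reference = u0
for _ in range(k):
    reference = step_global(reference)

# Two subdomains, each working independently for k steps.
combined = local_steps(u0, 0, 6, k) + local_steps(u0, 6, 12, k)
print(all(abs(a - b) < 1e-12 for a, b in zip(reference, combined)))
```

The trade-off is redundant computation in the ghost layers in exchange for fewer synchronization points.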
8 A derivation and scalable implementation of the synchronous parallel kinetic Monte Carlo method for simulating long-time dynamics
Publication year: 2017
Kinetic Monte Carlo (KMC) simulations are used to study long-time dynamics of a wide variety of systems. Unfortunately, the conventional KMC algorithm is not scalable to larger systems, since its time scale is inversely proportional to the simulated system size. A promising approach to resolving this issue is the synchronous parallel KMC (SPKMC) algorithm, which makes the time scale size-independent. This paper introduces a formal derivation of the SPKMC algorithm based on local transition-state and time dependent Hartree approximations, as well as its scalable parallel implementation based on a dual linked list cell method. The resulting algorithm has achieved a weak-scaling parallel efficiency of 0.935 on 1024 Intel Xeon processors for simulating biological electron transfer dynamics in a 4.2 billion-heme system, as well as decent strong-scaling parallel efficiency. The parallel code has been used to simulate a lattice of cytochrome complexes on a bacterial-membrane nanowire, and it is broadly applicable to other problems such as computational synthesis of new materials.
Keywords: Kinetic Monte Carlo simulation | Divide-and-conquer algorithm | Parallel computing
English article
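The size dependence noted in this abstract follows from the standard rejection-free KMC time increment. As a brief worked relation, assume $N$ sites that each contribute events at the same rate $r$ (a simplification introduced here, not taken from the paper) and a uniform random number $u \in (0, 1]$:

```latex
\Delta t = -\frac{\ln u}{R}, \qquad R = \sum_{i=1}^{N} r_i = N r
\quad\Longrightarrow\quad
\langle \Delta t \rangle = \frac{1}{R} = \frac{1}{N r}.
```

Each executed event therefore advances the simulated clock by an amount that shrinks as $1/N$, so reaching a fixed physical time requires a number of events proportional to the system size.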
9 Parallelization methods for efficient simulation of high dimensional population balance models of granulation
Publication year: 2017
In order to solve high-resolution PBMs to simulate real systems with high accuracy and speed, a comprehensive and robust parallelization framework is needed. In this work, parallelization using just Message Passing Interface (MPI) and a more advanced method using a hybrid MPI + OpenMP (Open MultiProcessing) technique have been applied to simulate high-resolution PBMs on the computing clusters SOEHPC and Stampede. We study the speed-up and the scale-up of these parallelization techniques for different system sizes and different computer architectures to come up with one of the fastest ways to solve a PBM to date. Parallel PBMs ran approximately 50–60 times faster than serial PBMs when using 128 cores. In this work it is found that hybrid MPI + OpenMP methods which account for socket affinities led to the fastest PBM compute times and used about 80% less memory than a purely MPI approach.
Keywords: MPI | OpenMP | Parallel computing | Population balance model | Granulation | Pharmaceutical process design
English article
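The reported figures can be put in context with the usual definition of parallel efficiency, speedup divided by core count. The short calculation below applies that definition to the numbers quoted in the abstract; the helper name `parallel_efficiency` is illustrative.

```python
def parallel_efficiency(speedup, cores):
    """Fraction of ideal linear speedup that was actually achieved."""
    return speedup / cores


# Speedups of 50x and 60x on 128 cores, as quoted in the abstract.
for speedup in (50, 60):
    eff = parallel_efficiency(speedup, 128)
    print(f"speedup {speedup}x on 128 cores -> efficiency {eff:.1%}")
```

By this measure a 50–60 times speedup on 128 cores corresponds to roughly 39% to 47% of ideal linear scaling.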
10 A parallel structure exploiting nonlinear programming algorithm for multiperiod dynamic optimization
Publication year: 2017
This article develops a sequential quadratic programming (SQP) algorithm that utilizes a parallel interior point method (IPM) for the QP subproblems. Our approach is able to efficiently decompose and solve large-scale multiperiod nonlinear programming (NLP) formulations with embedded dynamic model representations, through the use of an explicit Schur-complement decomposition within the IPM. The algorithm implementation makes use of a computing environment that uses the parallel distributed computing message passing interface (MPI) and specialized vector-matrix class representations, as implemented in the third-party software package, OOPS. The proposed approach is assessed, with a focus on computational speedup, using several benchmark examples involving applications of parameter estimation and design under uncertainty which utilize static and dynamic models. Results indicate significant improvements in the NLP solution speedup when moving from a serial full-space direct factorization approach, to a serial Schur-complement decomposition, to a parallelized Schur-complement decomposition for the primal-dual linear system solution within the IPM.
Keywords: Multiperiod dynamic optimization | Multiple-shooting | Sequential quadratic programming | Interior-point methods | Parallel computing
English article
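The Schur-complement decomposition named in this abstract can be summarized with the generic block-bordered linear system below. The block structure and the symbols $K_i$, $B_i$, $\Delta z_i$, $r_i$ are assumptions chosen for illustration (one diagonal block per period $i = 1, \dots, N$, coupled only through shared variables with index 0); the paper's exact formulation may differ.

```latex
\begin{bmatrix}
K_1 &        &     & B_1 \\
    & \ddots &     & \vdots \\
    &        & K_N & B_N \\
B_1^{\top} & \cdots & B_N^{\top} & K_0
\end{bmatrix}
\begin{bmatrix} \Delta z_1 \\ \vdots \\ \Delta z_N \\ \Delta z_0 \end{bmatrix}
=
\begin{bmatrix} r_1 \\ \vdots \\ r_N \\ r_0 \end{bmatrix}

S = K_0 - \sum_{i=1}^{N} B_i^{\top} K_i^{-1} B_i, \qquad
S \, \Delta z_0 = r_0 - \sum_{i=1}^{N} B_i^{\top} K_i^{-1} r_i, \qquad
\Delta z_i = K_i^{-1} \left( r_i - B_i \, \Delta z_0 \right).
```

Each term in the two sums, and each back-substitution for $\Delta z_i$, involves only one period's block, so these can be computed on separate processes; only the smaller system in $S$ couples the periods.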