Article title:
Towards a real-time processing framework based on improved distributed recurrent neural network variants with fastText for social big data analytics
ScienceDirect - Elsevier - Information Processing and Management, 57 (2020) 102122. doi:10.1016/j.ipm.2019.102122
Badr Ait Hammou, Ayoub Ait Lahcen, Salma Mouline
Big data generated by social media is a valuable source of information and offers an
excellent opportunity to mine useful insights. In particular, user-generated content such as
reviews, recommendations, and users' behavior data is useful for supporting the marketing
activities of many companies. Knowing what users say about the products they bought or
the services they used, as expressed in social media reviews, is a key factor in decision-making.
Sentiment analysis is one of the fundamental tasks in Natural Language Processing.
Deep learning for sentiment analysis has achieved great success and has allowed several
firms to analyze and extract relevant information from their textual data. However, as the volume of
data grows, a model that runs in a traditional environment cannot remain effective, which underlines the
importance of efficient distributed deep learning models for social big data analytics. Besides, it
is known that social media analysis is a complex process, which involves a set of complex tasks.
Therefore, it is important to address the challenges and issues of social big data analytics and
enhance the performance of deep learning techniques in terms of classification accuracy to obtain better decisions.
In this paper, we propose an approach for sentiment analysis that combines
fastText with recurrent neural network variants to represent textual data efficiently. It then
uses the new representations to perform the classification task. Its main objective is to enhance
the performance of well-known Recurrent Neural Network (RNN) variants in terms of
classification accuracy and to handle large-scale data. In addition, we propose a distributed intelligent
system for real-time social big data analytics. It is designed to ingest, store, process,
index, and visualize huge amounts of information in real time. The proposed system adopts
distributed machine learning with our proposed method to enhance decision-making processes.
Extensive experiments conducted on two benchmark data sets demonstrate that our
proposal for sentiment analysis outperforms well-known distributed recurrent neural network
variants (i.e., Long Short-Term Memory (LSTM), Bidirectional Long Short-Term Memory
(BiLSTM), and Gated Recurrent Unit (GRU)). Specifically, we tested the efficiency of our approach
with each of the three deep learning models, and the results show that it
improves the performance of all three. The current work can provide
several benefits for researchers and practitioners who want to collect, handle, analyze, and visualize
several sources of information in real time. It can also contribute to a better understanding
of public opinion and user behavior through our proposed system, which uses the improved
variants of the most powerful distributed deep learning and machine learning algorithms.
Furthermore, it can increase the classification accuracy of several existing works based on
RNN models for sentiment analysis.
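
The abstract does not say how fastText representations are built before they reach the RNN variants. As a rough illustration only, the sketch below shows the general fastText idea of composing a word vector from hashed character n-grams, then encoding a sentence as the sequence of vectors an LSTM, BiLSTM, or GRU could consume. The names `char_ngrams`, `fnv1a`, and `SubwordEmbedder`, the bucket count, the dimension, and the random (untrained) vectors are all assumptions made for this example; this is not the authors' implementation and not the fastText library API.

```python
import random


def char_ngrams(word, n_min=3, n_max=6):
    """Return the character n-grams of a word wrapped in '<' and '>' markers."""
    wrapped = f"<{word}>"
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(wrapped) - n + 1):
            grams.append(wrapped[i:i + n])
    return grams


def fnv1a(text):
    """32-bit FNV-1a hash, used to map an n-gram to a bucket deterministically."""
    h = 2166136261
    for byte in text.encode("utf-8"):
        h ^= byte
        h = (h * 16777619) & 0xFFFFFFFF
    return h


class SubwordEmbedder:
    """Toy subword embedder: a word vector is the mean of its n-gram bucket vectors."""

    def __init__(self, dim=8, buckets=1000, seed=0):
        rng = random.Random(seed)
        self.dim = dim
        self.buckets = buckets
        # Assumption: random, untrained bucket vectors. A real model learns them.
        self.table = [
            [rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(buckets)
        ]

    def word_vector(self, word):
        # The whole wrapped word is added as one extra unit alongside its n-grams.
        grams = char_ngrams(word) + [f"<{word}>"]
        vec = [0.0] * self.dim
        for gram in grams:
            row = self.table[fnv1a(gram) % self.buckets]
            for k in range(self.dim):
                vec[k] += row[k]
        return [v / len(grams) for v in vec]

    def encode(self, sentence):
        """Return one vector per token: the input sequence for an RNN variant."""
        return [self.word_vector(tok) for tok in sentence.lower().split()]
```

Because vectors are built from subword units, a word that never appeared in training (a typo or a rare hashtag, both common in social media text) still receives a representation, which is one plausible reason to pair fastText with RNN classifiers on this kind of data.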
Keywords: Big data | FastText | Recurrent neural networks | LSTM | BiLSTM | GRU | Natural language processing | Sentiment analysis | Social big data analytics