English title of the paper:
Visual content-based web page categorization with deep transfer learning and metric learning
Persian translation of the paper title:
طبقه بندی صفحه وب مبتنی بر محتوای تصویری با یادگیری انتقال عمیق و یادگیری متریک
ScienceDirect - Elsevier - Neurocomputing, 338 (2019) 418-431, doi:10.1016/j.neucom.2018.08.086
Daniel López-Sánchez, Angélica González Arrieta, Juan M. Corchado
The growing amounts of online multimedia content challenge the current search, recommendation and information retrieval systems. Information in the form of visual elements is highly valuable in a range of web mining tasks. However, the mining of these resources is a difficult task due to the complexity and variability of images, and the cost of collecting big enough datasets to successfully train accurate deep learning models. This paper proposes a novel framework for the categorization of web pages on the basis of their visual content. This is achieved by exploring the joint application of a transfer learning strategy and metric learning techniques to build a Deep Convolutional Neural Network (DCNN) for feature extraction, even when training data is scarce. The obtained experimental results evidence that the proposed approach outperforms the state-of-the-art handcrafted image descriptors and achieves a high categorization accuracy. In addition, we address the problem of over-time learning, so the proposed framework can learn to identify new web page categories as new labeled images are provided at test time. As a result, prior knowledge of the complete set of possible web categories is not necessary in the initial training phase.
Keywords: Web page categorization | Metric learning | Transfer learning | Deep learning
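
The abstract combines two ideas that a small example may make more concrete: metric learning, which shapes a feature space so that images of the same category lie close together, and over-time learning, where new categories can be registered after the initial training phase. The Python sketch below is only an illustration under stated assumptions, not the authors' implementation. The triplet loss and the nearest-centroid rule are common choices swapped in here because the abstract does not name the loss or the classifier used. The toy 2-D vectors stand in for DCNN features, and the category labels and all function and class names are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss (an assumed metric learning objective).

    It is zero once the same-category example is closer to the anchor
    than the other-category example by at least `margin`.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


class NearestCentroidCategorizer:
    """Assigns an embedding to the category with the closest mean embedding.

    New categories can be added at any time, because each category only
    needs a running sum and a count of its embeddings.
    """

    def __init__(self):
        self._sums = {}
        self._counts = {}

    def add_example(self, embedding, label):
        # Create the running statistics the first time a label is seen.
        if label not in self._sums:
            self._sums[label] = [0.0] * len(embedding)
            self._counts[label] = 0
        self._sums[label] = [s + x for s, x in zip(self._sums[label], embedding)]
        self._counts[label] += 1

    def centroid(self, label):
        n = self._counts[label]
        return [s / n for s in self._sums[label]]

    def predict(self, embedding):
        # Pick the label whose centroid is nearest to the query embedding.
        return min(self._sums, key=lambda lab: euclidean(embedding, self.centroid(lab)))


if __name__ == "__main__":
    clf = NearestCentroidCategorizer()
    # Toy 2-D vectors standing in for DCNN features (hypothetical data).
    clf.add_example([0.0, 0.0], "news")
    clf.add_example([0.2, 0.0], "news")
    clf.add_example([5.0, 5.0], "shopping")
    print(clf.predict([0.1, 0.1]))    # news

    # A category unseen during initial training is registered later.
    clf.add_example([-5.0, 5.0], "sports")
    print(clf.predict([-4.5, 4.8]))   # sports
```

Because the categorizer keeps only per-category statistics, registering a new category does not require retraining the feature extractor, which is one plausible way to support the over-time learning setting described in the abstract.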