Remote sensing image captioning via Variational Autoencoder and Reinforcement Learning (2020)
Image captioning, i.e., generating a natural-language semantic description of a given image, is an essential task for machines to understand image content. Remote sensing image captioning is a subfield of this task. Most current remote sensing image captioning models suffer from overfitting and fail to utilize the semantic information in images. To this end, we propose a Variational Autoencoder and Reinforcement Learning based Two-stage Multi-task Learning Model (VRTMM) for the remote sensing image captioning task. In the first stage, we fine-tune the CNN jointly with the Variational Autoencoder. In the second stage, the Transformer generates the text description using both spatial and semantic features. Reinforcement Learning is then applied to enhance the quality of the generated sentences. Our model surpasses the previous state-of-the-art results by a large margin on all seven scores on the Remote Sensing Image Caption Dataset. The experimental results indicate that our model is effective for remote sensing image captioning and achieves a new state-of-the-art result.
Keywords: Transformer | Variational Autoencoder | Transfer learning | Remote sensing image captioning | Self-attention mechanisms | Convolutional neural network | Reinforcement learning
DECAF: Deep Case-based Policy Inference for knowledge transfer in Reinforcement Learning (2020)
The ability to solve increasingly complex problems using Reinforcement Learning (RL) has prompted researchers to take a greater interest in systematic approaches to retaining and reusing knowledge across a variety of tasks. Case-based Reasoning (CBR) is a general methodology that provides a framework for knowledge transfer but has been underrepresented in the RL literature so far. We formulate a terminology for the CBR framework targeted towards RL researchers, with the goal of facilitating communication between the respective research communities. Based on this framework, we propose the Deep Case-based Policy Inference (DECAF) algorithm, which accelerates learning by building a library of cases and reusing them, when training a new policy, if they are similar to the new task. DECAF guides the training by dynamically selecting and blending policies according to their usefulness for the current target task, reusing previously learned policies for more effective exploration while still enabling adaptation to the particularities of the new task. We present an empirical evaluation in the Atari game playing domain that shows the benefits of our algorithm with regard to sample efficiency, robustness against negative transfer, and performance increase when compared to state-of-the-art methods.
Keywords: Deep Reinforcement Learning | Case-based Reasoning | Transfer Learning | Knowledge discovery | Knowledge management | Neural networks
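The blending idea in the DECAF abstract can be illustrated with a minimal sketch. The abstract does not say how usefulness is measured or how the blend is computed, so the softmax weighting, the `temperature` parameter, and the example scores below are all assumptions made for illustration, not the authors' algorithm.

```python
import math


def softmax(scores, temperature=1.0):
    """Turn usefulness scores into non-negative weights that sum to 1."""
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp((s - top) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def blend_policies(action_probs, usefulness, temperature=1.0):
    """Mix per-policy action distributions with usefulness-based weights.

    action_probs: one action distribution per policy (each sums to 1).
    usefulness:   one hypothetical score per policy, e.g. a recent return.
    """
    weights = softmax(usefulness, temperature)
    n_actions = len(action_probs[0])
    return [
        sum(w * probs[a] for w, probs in zip(weights, action_probs))
        for a in range(n_actions)
    ]


# Two library policies plus the new policy being trained (illustrative values).
library = [[0.7, 0.2, 0.1], [0.1, 0.1, 0.8]]
new_policy = [1 / 3, 1 / 3, 1 / 3]
blended = blend_policies(library + [new_policy], usefulness=[2.0, 0.0, 1.0])
```

Because the first library policy has the highest assumed usefulness, the blended distribution leans towards its preferred action, while the new policy still contributes and can take over as its own score grows.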
AI-PLAX: AI-based placental assessment and examination using photos (2020)
Post-delivery analysis of the placenta is useful for evaluating health risks of both the mother and baby. In the U.S., however, only about 20% of placentas are assessed by pathology exams, and placental data is often missed in pregnancy research because of the additional time, cost, and expertise needed. A computer-based tool that can be used in any delivery setting at the time of birth to provide an immediate and comprehensive placental assessment would have the potential not only to improve health care, but also to radically improve medical knowledge. In this paper, we tackle the problem of automatic placental assessment and examination using photos. More concretely, we first address morphological characterization, which includes the tasks of placental image segmentation, umbilical cord insertion point localization, and maternal/fetal side classification. We also tackle clinically meaningful feature analysis of placentas, which comprises detection of retained placenta (i.e., incomplete placenta), umbilical cord knot, meconium, abruption, chorioamnionitis, and hypercoiled cord, and categorization of umbilical cord insertion type. We curated a dataset consisting of approximately 1300 placenta images taken at Northwestern Memorial Hospital, with hand-labeled pixel-level segmentation maps, cord insertion points, and other information extracted from the associated pathology reports. We developed the AI-based Placental Assessment and Examination system (AI-PLAX), which is a novel two-stage photograph-based pipeline for fully automated analysis. In the first stage, we use three encoder-decoder convolutional neural networks with a shared encoder to address morphological characterization tasks by employing a transfer-learning training strategy. In the second stage, we employ distinct sub-models to solve different feature analysis tasks by using both the photograph and the output of the first stage.
We evaluated the effectiveness of our pipeline by using the curated dataset as well as the pathology reports in the medical record. Through extensive experiments, we demonstrate that our system produces accurate morphological characterization and achieves very promising performance on the aforementioned feature analysis tasks, all of which may have clinical impact and contribute to future pregnancy research. This work is the first to provide comprehensive, automated, computer-based placental analysis and will serve as a launchpad for potentially multiple future innovations.
Keywords: Deep learning | Transfer learning | Placenta | Photo image analysis | Pathology
Statistical investigations of transfer learning-based methodology for short-term building energy predictions (2020)
The wide availability of massive building operational data has motivated the development of advanced data-driven methods for building energy predictions. Existing data-driven prediction methods are typically customized for individual buildings, and their performance is highly influenced by the amount and quality of the training data. In practice, buildings may only possess limited measurements due to the lack of advanced monitoring systems or data accumulation time. As a result, existing data-driven approaches may not offer sufficient value for practical applications. A novel solution can be developed based on transfer learning, which utilizes the knowledge learnt from well-measured buildings to facilitate prediction tasks in other buildings. However, the potential of transfer learning-based methods for building energy predictions has not been systematically examined. To address this research gap, a transfer learning-based methodology is proposed for 24-h ahead building energy demand predictions. Experiments have been designed to investigate the potential of transfer learning in different scenarios with different implementation strategies. Statistical assessments have been performed to validate the value of transfer learning in short-term building energy predictions. Compared with standalone models, the transfer learning-based methodology could reduce prediction errors by approximately 15% to 78%. The research outcomes are useful for developing advanced transfer learning-based methods for typical tasks in building energy management. The insights obtained can help the building industry to fully realize the value of existing building data resources and advanced data analytics.
Keywords: Building energy predictions | Transfer learning | Deep learning | Data-driven models | Smart building energy management
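To make the reported 15% to 78% error reduction concrete, the sketch below computes a relative reduction between a standalone model and a transfer learning-based model. The abstract does not name the error metric, so the function accepts any positive error value, and the numbers used are hypothetical rather than results from the paper.

```python
def relative_error_reduction(standalone_error, transfer_error):
    """Fraction of the standalone model's error removed by the transfer model."""
    if standalone_error <= 0:
        raise ValueError("standalone_error must be positive")
    return (standalone_error - transfer_error) / standalone_error


# Hypothetical error values chosen only to reproduce the two ends of the range.
low_end = relative_error_reduction(20.0, 17.0)
high_end = relative_error_reduction(20.0, 4.4)
```

A value of 0 means no improvement over the standalone model, and a negative value would indicate that transfer made the predictions worse.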
Leveraging Google Earth Engine (GEE) and machine learning algorithms to incorporate in situ measurement from different times for rangelands monitoring (2020)
Mapping and monitoring of indicators of soil cover, vegetation structure, and various native and non-native species is a critical aspect of rangeland management. With advances in satellite imagery as well as cloud storage and computing, the capability now exists to conduct planetary-scale analysis, including mapping of rangeland indicators. Combined with recent investments in the collection of large amounts of in situ data in the western U.S., new approaches using machine learning can enable prediction of surface conditions at times and places when no in situ data are available. However, little analysis has yet been done on how the temporal relevancy of training data influences model performance. Here, we have leveraged the Google Earth Engine (GEE) platform and a machine learning algorithm (Random Forest, after comparison with other candidates) to identify the potential impact of different sampling times (across months and years) on estimation of rangeland indicators from the Bureau of Land Management's (BLM) Assessment, Inventory, and Monitoring (AIM) and Landscape Monitoring Framework (LMF) programs. Our results indicate that temporally relevant training data improves predictions, though the training data need not be from the exact same month and year for a prediction to be temporally relevant. Moreover, inclusion of training data from the time when predictions are desired leads to lower prediction error, but the addition of training data from other times does not contribute to overall model error. Using all of the available training data can lead to biases, toward the mean, for times when indicator values are especially high or low. However, for mapping purposes, limiting training data to just the time when predictions are desired can lead to poor predictions of values outside the spatial range of the training data for that period.
We conclude that the best Random Forest prediction maps will use training data from all possible times with the understanding that estimates at the extremes will be biased.
Keywords: Google earth engine | Big data | Machine learning | Domain adaptation | Transfer learning | Feature selection | Rangeland monitoring
Object Memorability Prediction using Deep Learning: Location and Size Bias (2019)
Object memorability prediction is the task of estimating the probability that a human recognises the recurrence of an object after a single view. Initial research on object memorability showed that it is possible to predict object memorability scores from the intrinsic features of an object. Although existing works have proposed some features for the object memorability prediction task, the influence of the Spatial-location and Spatial-size of an object on its memorability has not yet been explored. In this work, the importance of these two characteristics in object memorability prediction is investigated and demonstrated by building a baseline model. Further, a deep learning model is devised for automatic feature learning on these two object characteristics. Experimental results highlight that the Spatial-location and Spatial-size of an object play a significant role in object memorability prediction and that the proposed models outperform the existing methods.
Keywords: Object Memorability | Deep Learning | Transfer Learning
Research on image steganography analysis based on deep learning (2019)
Although steganalysis has developed rapidly in recent years, it still faces many difficulties and challenges. Based on the theory of deep learning and image-based general steganalysis, this paper studies in depth the prominent and difficult problem of steganalysis feature expression, and tries to establish a new steganalysis paradigm from the idea of feature learning. The main contributions of this paper are as follows: 1. An innovative steganalysis paradigm based on deep learning is proposed. Based on the representative deep learning method CNN, the model is designed and adjusted according to the characteristics of steganalysis, which makes the proposed model more effective in capturing statistical characteristics such as neighborhood correlation. 2. A steganalysis feature learning method based on global information constraints is proposed. Building on previous research on CNN-based steganalysis methods, this work focuses on the importance of global information in steganalysis feature expression. 3. A feature learning method for low embedding rate steganalysis is proposed. 4. A general steganalysis method for multi-class steganography is proposed. The ultimate goal of general steganalysis is to construct steganalysis detectors that do not need to distinguish specific types of steganography algorithms.
Keywords: Steganalysis | Steganography | Feature learning | Deep learning | Convolutional neural network | Transfer learning | Multitask learning
Improving air quality prediction accuracy at larger temporal resolutions using deep learning and transfer learning techniques (2019)
As air pollution becomes more and more severe, air quality prediction has become an important approach for air pollution management and prevention. In recent years, a number of methods have been proposed to predict air quality, such as deterministic methods, statistical methods, and machine learning methods. However, these methods have some limitations. Deterministic methods require expensive computations and specific knowledge for parameter identification, while the forecasting performance of statistical methods is limited due to the linear assumption and the multicollinearity problem. Most machine learning methods, on the other hand, cannot capture time series patterns or learn from the long-term dependencies of air pollutant concentrations. Furthermore, there is a lack of methods that can achieve high prediction accuracy for air quality forecasting at larger temporal resolutions, such as daily, weekly, or even monthly. This paper therefore proposes a deep learning-based method, the transferred bi-directional long short-term memory (TL-BLSTM) model, for air quality prediction. The methodology framework utilizes the bi-directional LSTM model to learn from the long-term dependencies of PM2.5, and applies transfer learning to transfer the knowledge learned from smaller temporal resolutions to larger temporal resolutions. A case study is conducted in Guangdong, China to test the proposed methodology framework. The performance of the framework is compared with other common machine learning algorithms, and the results show that the proposed TL-BLSTM model has smaller errors, especially for larger temporal resolutions.
Keywords: Air quality prediction | Large temporal resolution | Deep learning | Long short-term memory | Transfer learning
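The relationship between smaller and larger temporal resolutions can be sketched as follows. The abstract does not describe the preprocessing, so block averaging, the sliding-window sample construction, the `lookback` lengths, and the synthetic hourly series are assumptions for illustration; no LSTM is trained here.

```python
def aggregate_mean(series, factor):
    """Average consecutive non-overlapping blocks of `factor` values.

    A trailing incomplete block is dropped.
    """
    n_blocks = len(series) // factor
    return [
        sum(series[i * factor:(i + 1) * factor]) / factor
        for i in range(n_blocks)
    ]


def sliding_windows(series, lookback):
    """Build (inputs, target) pairs: `lookback` past values predict the next one."""
    return [
        (series[i:i + lookback], series[i + lookback])
        for i in range(len(series) - lookback)
    ]


# Synthetic hourly readings for five days (not real PM2.5 measurements).
hourly = [float(h % 24) for h in range(24 * 5)]
daily = aggregate_mean(hourly, 24)

hourly_samples = sliding_windows(hourly, lookback=6)
daily_samples = sliding_windows(daily, lookback=3)
```

The same five-day record yields far fewer training pairs at the daily resolution than at the hourly one, which is one reason a model pretrained on the finer resolution is an attractive starting point for the coarser one.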
Machine learning phase transition: An iterative proposal (2019)
We propose an iterative method to estimate critical points for statistical models from configurations by combining machine-learning tools. First, phase scenarios and preliminary phase boundaries are obtained by dimensionality-reduction techniques. This step not only provides labelled samples for the subsequent step but is also necessary for applying the method to novel statistical models. Second, using these samples as the training set, neural networks are employed to assign labels to the samples between the phase boundaries in an iterative manner. Newly labelled samples are added to the training set used in subsequent training, and the phase boundaries are updated as well. The average of the phase boundaries is expected to converge to the critical temperature in this proposal. In concrete examples, we implement this proposal to estimate the critical temperatures for two q-state Potts models with continuous and first-order phase transitions. Linear and manifold dimensionality-reduction techniques are employed in the first step. Both a convolutional neural network and a bidirectional recurrent neural network with long short-term memory units perform well for the two Potts models in the second step. The convergence behaviors of the estimates reflect the types of phase transitions. The results indicate that our proposal may be used to explore phase transitions for new general statistical models.
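The second, iterative step described above can be sketched on a toy problem. Everything below is an assumption made for illustration: each sample is reduced to a single synthetic feature, the initial boundaries are supplied by hand instead of coming from dimensionality reduction, and a nearest-class-mean rule stands in for the neural networks used in the paper.

```python
import math


def order_feature(temp, t_c=0.98, width=0.05):
    """Synthetic one-number feature: close to 1 below t_c, close to 0 above."""
    return 1.0 / (1.0 + math.exp((temp - t_c) / width))


def mean(values):
    return sum(values) / len(values)


def refine_boundaries(temps, features, lo, hi):
    """Shrink the unlabelled gap between two phase boundaries.

    temps is sorted ascending. Samples 0..lo are labelled as the
    low-temperature phase and samples hi..end as the high-temperature phase.
    """
    while hi - lo > 1:
        # "Train" on the currently labelled samples (class means only).
        c_low = mean(features[:lo + 1])
        c_high = mean(features[hi:])
        moved = False

        # Try to label the sample just above the low boundary.
        cand = features[lo + 1]
        if abs(cand - c_low) < abs(cand - c_high):
            lo += 1
            moved = True

        # Try to label the sample just below the high boundary.
        if hi - lo > 1:
            cand = features[hi - 1]
            if abs(cand - c_high) <= abs(cand - c_low):
                hi -= 1
                moved = True

        if not moved:  # neither boundary could advance
            break
    return temps[lo], temps[hi]


temps = [0.5 + 0.05 * i for i in range(21)]
features = [order_feature(t) for t in temps]
t_low, t_high = refine_boundaries(temps, features, lo=4, hi=16)
estimate = (t_low + t_high) / 2  # average of the two boundaries
```

In this toy setting the two boundaries close in on the temperature where the synthetic feature crosses over, so their average lands near the assumed `t_c`. With real configurations the class-mean rule would be replaced by a trained network and the feature by the raw spin configuration.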
Quality and content analysis of fundus images using deep learning (2019)
Automatic retinal image analysis has remained an important topic of research in the last ten years. Various algorithms and methods have been developed for analysing retinal images. The majority of these methods use public retinal image databases for performance evaluation without first examining the retinal image quality. Therefore, the performance metrics reported by these methods are inconsistent. In this article, we propose a deep learning-based approach to assess the quality of input retinal images. The method begins with a deep learning-based classification that identifies the image quality in terms of sharpness, illumination and homogeneity, followed by an unsupervised second stage that evaluates the field definition and content in the image. Using the inter-database cross-validation technique, our proposed method achieved overall sensitivity, specificity, positive predictive value, negative predictive value and accuracy of above 90% when tested on 7007 images collected from seven different public databases, including our own database, the UoA-DR database. Therefore, our proposed method is generalised and robust, making it more suitable than alternative methods for adoption in clinical practice.
Keywords: Retinal image quality analysis | Fundus images | Deep learning | Transfer learning
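The five metrics reported in the fundus image abstract all derive from the four cells of a binary confusion matrix. The sketch below uses the standard definitions; the counts are hypothetical and are not taken from the paper.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Hypothetical counts for 200 images, for illustration only.
metrics = binary_metrics(tp=90, fp=5, tn=95, fn=10)
all_at_least_90 = all(value >= 0.90 for value in metrics.values())
```

Reporting all five together matters because accuracy alone can look high on an imbalanced set even when sensitivity or a predictive value is poor.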