On the relevance of the metadata used in the semantic segmentation of indoor image spaces (2021)
The study of artificial learning processes in computer vision has mainly focused on achieving a fixed output target rather than on identifying the underlying processes, as a means to develop solutions capable of performing as well as or better than the human brain. This work reviews well-known segmentation efforts in computer vision; however, our primary focus is the quantitative evaluation of the amount of contextual information provided to the neural network, in particular the information used to mimic the tacit knowledge a human can draw on, such as a sense of unambiguous order and the ability to refine an estimate by complementing already learned information. Our results show that, after a set of pre- and post-processing methods applied to both the training data and the neural network architecture, the predictions were drastically closer to the expected output than in the cases where no contextual additions were provided. Our results provide evidence that learning systems rely strongly on contextual information for the identification task.
Keywords: Deep learning | U-net | Semantic segmentation | Metadata preprocessing | Fully convolutional network | Indoor scenes
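One common way to expose such contextual metadata to a segmentation network is to concatenate it to the image as extra input channels. The sketch below is a minimal NumPy illustration of that idea, not the paper's actual preprocessing; the function name and the choice of metadata maps are hypothetical.

```python
import numpy as np

def add_context_channels(image, metadata_maps):
    """Stack per-pixel contextual maps onto an RGB image as extra channels.

    image: (H, W, 3) array; metadata_maps: list of (H, W) maps, e.g. a
    room-layout prior or a coarse depth estimate, normalised to [0, 1].
    """
    chans = [image] + [m[..., None] for m in metadata_maps]
    return np.concatenate(chans, axis=-1)

# a 4x4 RGB image plus two contextual maps -> a 5-channel network input
img = np.zeros((4, 4, 3))
ctx = [np.ones((4, 4)), np.zeros((4, 4))]
x = add_context_channels(img, ctx)   # shape (4, 4, 5)
```

The network's first convolution then simply takes 3 + len(metadata_maps) input channels instead of 3.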
Defect detection and quantification in electroluminescence images of solar PV modules using U-net semantic segmentation (2021)
Electroluminescence (EL) images enable detection of defects in solar photovoltaic (PV) modules that are otherwise invisible to the naked eye, much the same way an X-ray enables a doctor to detect cracks and fractures in bones. The prevalence of multiple defects, e.g. micro-cracks, inactive regions, gridline defects, and material defects, in a PV module can be quantified with an EL image. Modern deep learning techniques for computer vision can be applied to extract the useful information contained in the images on entire batches of PV modules. Defect detection and quantification in EL images can improve the efficiency and reliability of PV modules, both at the factory by identifying potential process issues and at the PV plant by identifying and reducing the number of faulty modules installed. In this work, we train and test a semantic segmentation model based on the U-Net architecture for EL image analysis of PV modules made from mono-crystalline and multi-crystalline silicon wafer-based solar cells. This work focuses on developing and testing a deep learning method for computer vision that is independent of the equipment used to generate the EL images, of the wafer-based module design, and of the image quality.
Keywords: Electroluminescence | EL | PV | U-net | Semantic segmentation | Machine learning
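Once a semantic segmentation model has produced a per-pixel label map, defect quantification reduces to measuring the area fraction of each defect class. A minimal sketch of that step (the class IDs and function name are hypothetical, not the paper's code):

```python
import numpy as np

def defect_fractions(mask, class_ids):
    """Fraction of the image area occupied by each defect class.

    mask: (H, W) integer label map from the segmentation model.
    class_ids: iterable of defect labels, e.g. crack=1, inactive=2.
    """
    total = mask.size
    return {c: float((mask == c).sum()) / total for c in class_ids}

# toy 2x2 label map: 0 = background, 1 = crack, 2 = inactive region
m = np.array([[0, 1],
              [2, 2]])
print(defect_fractions(m, [1, 2]))   # {1: 0.25, 2: 0.5}
```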
Deep learning-based single image face depth data enhancement (2021)
Face recognition can benefit from the utilization of depth data captured using low-cost cameras, in particular for presentation attack detection purposes. Depth video output from these capture devices can however contain defects such as holes or general depth inaccuracies. This work proposes a deep learning face depth enhancement method in this context of facial biometrics, which adds a security aspect to the topic. U-Net-like architectures are utilized, and the networks are compared against hand-crafted enhancer types, as well as a similar depth enhancer network from related work trained for an adjacent application scenario. All tested enhancer types exclusively use depth data as input, which differs from methods that enhance depth based on additional input data such as visible light color images. Synthetic face depth ground truth images and degraded forms thereof are created with help of PRNet, to train multiple deep learning enhancer models with different network sizes and training configurations. Evaluations are carried out on the synthetic data, on Kinect v1 images from the KinectFaceDB, and on in-house RealSense D435 images. These evaluations include an assessment of the falsification for occluded face depth input, which is relevant to biometric security. The proposed deep learning enhancers yield noticeably better results than the tested preexisting enhancers, without overly falsifying depth data when non-face input is provided, and are shown to reduce the error of a simple landmark-based PAD method.
Keywords: 3D face depth | Deep learning | Image enhancement | Face depth synthesis | Face recognition | Presentation attack detection
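The hand-crafted enhancers the networks are compared against typically fill holes by propagating valid neighbouring depth values. A minimal sketch of such a baseline hole-filler, assuming invalid pixels are encoded as 0 (the function name and iteration scheme are illustrative, not from the paper):

```python
import numpy as np

def fill_depth_holes(depth, invalid=0.0, max_iter=50):
    """Iteratively fill invalid depth pixels with the mean of valid 4-neighbours."""
    d = depth.astype(float).copy()
    for _ in range(max_iter):
        holes = d == invalid
        if not holes.any():
            break
        p = np.pad(d, 1, mode="edge")
        # up, down, left, right neighbours of every pixel
        neigh = np.stack([p[:-2, 1:-1], p[2:, 1:-1], p[1:-1, :-2], p[1:-1, 2:]])
        valid = neigh != invalid
        counts = valid.sum(axis=0)
        sums = (neigh * valid).sum(axis=0)
        fillable = holes & (counts > 0)
        d[fillable] = sums[fillable] / counts[fillable]
    return d

# a flat depth plane with one dropped pixel is restored exactly
dep = np.ones((5, 5))
dep[2, 2] = 0.0
filled = fill_depth_holes(dep)   # filled[2, 2] == 1.0
```

A learned enhancer replaces this local averaging with a network that can also correct general depth inaccuracies, not just holes.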
AIS-Based Vessel Trajectory Reconstruction with U-Net Convolutional Networks (2020)
Vessel trajectory data provided by the Automatic Identification System (AIS) is important and useful in maritime data analysis, navigational safety, and maritime risk assessment. However, raw trajectory data contains noise, missing data, and other errors which can lead to wrong conclusions. It is therefore essential to develop a vessel trajectory reconstruction method, which enhances the applicability of vessel trajectories and improves navigation safety. In recent years there have been many studies on vessel trajectory reconstruction, but the performance of these methods degrades on curved trajectories with high loss rates. In this paper, we propose a novel trajectory reconstruction method based on U-net. Benefiting from the U-net architecture, the method makes extensive use of historical trajectories and exploits the network's rich skip connections, which copy low-level features to the corresponding high-level features. Consequently, the method is robust to trajectories with different sampling rates, missing points, and noisy data. The proposed method is tested against cubic spline interpolation; the results show that it achieves higher accuracy, especially when the trajectories are curved and have a high loss rate.
Keywords: Trajectory reconstruction | U-net | Machine learning | AIS data | Traffic safety
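The cubic spline baseline used for comparison can be sketched in a few lines: fit one spline per coordinate over the timestamps that survived, then evaluate at the missing timestamps. This is an illustration of the baseline only (the function name is hypothetical), not of the U-net method itself.

```python
import numpy as np
from scipy.interpolate import CubicSpline

def reconstruct_track(t, lat, lon, t_query):
    """Interpolate a (lat, lon) track at t_query from samples at t."""
    return CubicSpline(t, lat)(t_query), CubicSpline(t, lon)(t_query)

# a smooth synthetic track sampled at 21 points, queried between samples
t = np.linspace(0.0, 10.0, 21)
lat, lon = np.sin(t), np.cos(t)
tq = np.linspace(0.5, 9.5, 19)
la, lo = reconstruct_track(t, lat, lon, tq)
```

On smooth tracks this baseline is accurate, but it degrades on sharply curved trajectories with long gaps, which is exactly the regime where the paper reports the U-net method winning.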
AI Illustrator: Art Illustration Generation Based on Generative Adversarial Network (2020)
In recent years, people's pursuit of art has been on the rise, and people want computers to be able to create artistic paintings from descriptions. In this paper, we propose a novel project, Painting Creator, which uses deep learning to enable the computer to generate artistic illustrations from a short piece of text. Our scheme includes two models: an image generation model and a style transfer model. For the image generation model, inspired by the application of stacked generative adversarial networks to text-to-image generation, we propose an improved model, IStackGAN. We added a classifier to the original model, together with an image structure loss and a feature extraction loss, to improve the performance of the generator. The generator network can obtain additional hidden information from the classification information to produce better pictures; the image structure loss forces the generator to restore the real image, and the feature extraction loss verifies whether the generator network has extracted the features of the real image set. For the style transfer model, we improved the generator of the original cycle generative adversarial network, using residual blocks to improve the stability and performance of the U-net generator. To further improve the generator, we also added a cycle-consistency loss with MS-SSIM. The experimental results show that our model improves significantly on the original papers: the generated pictures are more vivid in detail, and the pictures after style transfer are more artistic.
Keywords: Image generation | Style transfer
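A cycle-consistency loss blended with a structural similarity term can be sketched as follows. This uses a single-window (global-statistics) SSIM as a stand-in for the multiscale MS-SSIM in the paper; the blending weight and function names are illustrative assumptions.

```python
import numpy as np

def ssim_global(a, b, c1=0.01**2, c2=0.03**2):
    """Single-window SSIM over whole images -- a simplification of MS-SSIM."""
    mu_a, mu_b = a.mean(), b.mean()
    cov = ((a - mu_a) * (b - mu_b)).mean()
    num = (2 * mu_a * mu_b + c1) * (2 * cov + c2)
    den = (mu_a**2 + mu_b**2 + c1) * (a.var() + b.var() + c2)
    return num / den

def cycle_loss(x, x_rec, lam=0.84):
    """Blend a structural (1 - SSIM) term with an L1 cycle term."""
    return lam * (1 - ssim_global(x, x_rec)) + (1 - lam) * np.abs(x - x_rec).mean()

x = np.linspace(0.0, 1.0, 16).reshape(4, 4)
print(cycle_loss(x, x))        # 0.0: a perfect reconstruction costs nothing
print(cycle_loss(x, x + 0.1))  # positive for an imperfect reconstruction
```

The L1 term penalises per-pixel error while the SSIM term rewards preserved local structure, which is the motivation for mixing them.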
Automatic detection, localization and segmentation of nano-particles with deep learning in microscopy images (2019)
With the growing amount of high-resolution microscopy images, automatic nano-particle detection, shape analysis, and size determination have gained importance for providing quantitative support in the evaluation of materials. In this paper, we present a new deep learning method for detecting nano-particles and determining their shapes and sizes simultaneously. The proposed method employs a multiple-output convolutional neural network (MO-CNN) with two outputs: a detection output that gives the locations of the particles, and a segmentation output that provides the boundaries of the nano-particles. The final particle sizes are determined with a modified Hough algorithm that runs on the segmentation output. The proposed method is tested and evaluated on a dataset containing 17 TEM images of Fe3O4 and silica-coated nano-particles. We also compared these results with the U-Net algorithm, a popular deep learning method. The experiments showed that the proposed method achieves 98.23% accuracy for detection and 96.59% accuracy for segmentation of nano-particles.
Keywords: Nano-particle | Deep learning | Object detection | MO-CNN | Hough transform
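The size-determination step fits circles to the segmentation output. As a much simpler stand-in for the modified Hough algorithm (which is more robust to touching or partially visible particles), the equivalent-circle radius of a binary particle mask can be computed directly from its area:

```python
import numpy as np

def particle_radius(mask):
    """Equivalent-circle radius (in pixels) of a binary particle mask."""
    area = mask.astype(bool).sum()
    return float(np.sqrt(area / np.pi))

# synthetic disk of radius 10 px on a 41x41 grid
yy, xx = np.mgrid[:41, :41]
disk = (yy - 20) ** 2 + (xx - 20) ** 2 <= 100
print(particle_radius(disk))   # close to 10
```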
Learning to detect lymphocytes in immunohistochemistry with deep learning (2019)
The immune system is of critical importance in the development of cancer; evasion of destruction by the immune system is one of the emerging hallmarks of cancer. We have built a dataset of 171,166 manually annotated CD3+ and CD8+ cells, which we used to train deep learning algorithms for automatic detection of lymphocytes in histopathology images to better quantify immune response. Moreover, we investigate the effectiveness of four deep learning based methods when different subcompartments of the whole-slide image are considered: normal tissue areas, areas with immune cell clusters, and areas containing artifacts. We compared the proposed methods in breast, colon and prostate cancer tissue slides collected from nine different medical centers. Finally, we report the results of an observer study on lymphocyte quantification, which involved four pathologists from different medical centers, and compare their performance with the automatic detection. The results give insights on the applicability of the proposed methods for clinical use. U-Net obtained the highest performance with an F1-score of 0.78 and the highest agreement with manual evaluation (κ = 0.72), whereas the average pathologist agreement with the reference standard was κ = 0.64. The test set and the automatic evaluation procedure are publicly available at lyon19.grand-challenge.org.
Keywords: Deep learning | Immune cell detection | Computational pathology | Immunohistochemistry
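The two reported metrics are standard and easy to compute. A minimal NumPy sketch of the F1-score (from detection counts) and Cohen's kappa (for agreement between two raters):

```python
import numpy as np

def f1_score(tp, fp, fn):
    """F1 from true-positive, false-positive and false-negative counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def cohens_kappa(y1, y2):
    """Cohen's kappa between two label sequences."""
    y1, y2 = np.asarray(y1), np.asarray(y2)
    labels = np.union1d(y1, y2)
    po = (y1 == y2).mean()                                  # observed agreement
    pe = sum((y1 == c).mean() * (y2 == c).mean() for c in labels)  # chance agreement
    return (po - pe) / (1 - pe)

print(f1_score(8, 2, 2))                  # 0.8
print(cohens_kappa([0, 0, 1, 1], [0, 0, 1, 1]))  # 1.0: perfect agreement
```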
Identification and Quantification of Cardiovascular Structures From CCTA (2019)
OBJECTIVES This study designed and evaluated an end-to-end deep learning solution for cardiac segmentation and quantification. BACKGROUND Segmentation of cardiac structures from coronary computed tomography angiography (CCTA) images is laborious. We designed an end-to-end deep-learning solution. METHODS Scans were obtained from multicenter registries of 166 patients who underwent clinically indicated CCTA. Left ventricular volume (LVV) and right ventricular volume (RVV), left atrial volume (LAV) and right atrial volume (RAV), and left ventricular myocardial mass (LVM) were manually annotated as ground truth. A U-Net-inspired deep-learning model was trained, validated, and tested in a 70:20:10 split. RESULTS Mean age was 61.1 ± 8.4 years, and 49% were women. A combined overall median Dice score of 0.9246 (interquartile range [IQR]: 0.8870 to 0.9475) was achieved. The median Dice scores for LVV, RVV, LAV, RAV, and LVM were 0.938 (IQR: 0.887 to 0.958), 0.927 (IQR: 0.916 to 0.946), 0.934 (IQR: 0.899 to 0.950), 0.915 (IQR: 0.890 to 0.920), and 0.920 (IQR: 0.811 to 0.944), respectively. Model prediction correlated and agreed well with manual annotation for LVV (r = 0.98), RVV (r = 0.97), LAV (r = 0.78), RAV (r = 0.97), and LVM (r = 0.94) (p < 0.05 for all). Mean difference and limits of agreement for LVV, RVV, LAV, RAV, and LVM were 1.20 ml (95% CI: −7.12 to 9.51), −0.78 ml (95% CI: −10.08 to 8.52), −3.75 ml (95% CI: −21.53 to 14.03), 0.97 ml (95% CI: −6.14 to 8.09), and 6.41 g (95% CI: −8.71 to 21.52), respectively. CONCLUSIONS A deep-learning model rapidly segmented and quantified cardiac structures. This was done with high accuracy at the pixel level, with good agreement with manual annotation, facilitating its expansion into areas of research and clinical import. (J Am Coll Cardiol Img 2019;-:-–-)
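The Dice score used to evaluate the segmentations measures the overlap between a predicted and a ground-truth mask. A minimal NumPy sketch:

```python
import numpy as np

def dice(pred, gt):
    """Dice coefficient between two binary masks: 2|A∩B| / (|A| + |B|)."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    return 2.0 * inter / (pred.sum() + gt.sum())

a = np.array([1, 1, 0, 0])
b = np.array([1, 0, 1, 0])
print(dice(a, a))   # 1.0: identical masks
print(dice(a, b))   # 0.5: one overlapping pixel out of two per mask
```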
Comparison of deep learning-based and patch-based methods for pseudo-CT generation in MRI-based prostate dose planning (2019)
Purpose: Deep learning methods (DLMs) have recently been proposed to generate pseudo-CTs (pCTs) for MRI-based dose planning. This study evaluates and compares DLMs (U-Net and a generative adversarial network (GAN)) using various loss functions (L2, single-scale perceptual loss (PL), multiscale PL, weighted multiscale PL), and a patch-based method (PBM). Materials and Methods: Thirty-nine patients received VMAT for prostate cancer (78 Gy). T2-weighted MRIs were acquired in addition to the planning CTs. The pCTs were generated from the MRIs using seven configurations: four GANs (L2, single-scale PL, multiscale PL, weighted multiscale PL), two U-Nets (L2 and single-scale PL), and the PBM. The imaging endpoints were the mean absolute error (MAE) and mean error (ME), in Hounsfield units (HU), between the reference CT (CTref) and the pCT. Dose uncertainties were quantified as the mean absolute differences between the DVHs calculated from the CTref and from the pCT obtained by each method. 3D gamma indexes were also analyzed. Results: Considering the image uncertainties in the whole pelvis, GAN L2 and U-Net L2 showed the lowest MAE (≤34.4 HU). The MEs did not differ from 0 (p≤0.05). The PBM gave the highest uncertainties. Very few DVH points differed between the GAN L2 or U-Net L2 DVHs and the CTref DVHs (p≤0.05). Their dose uncertainties were ≤0.6% for the prostate PTV V95%, ≤0.5% for the rectum V70Gy, and ≤0.1% for the bladder V50Gy. The PBM, U-Net PL and GAN PL presented the highest systematic dose uncertainties. The gamma pass rates were >99% for all DLMs. The mean time to generate one pCT was 15 s for the DLMs and 62 min for the PBM. Conclusion: Generating pCTs for MRI dose planning with DLMs and the PBM provided low dose uncertainties. In particular, GAN L2 and U-Net L2 gave the lowest dose uncertainties together with a low computation time.
Keywords: pseudo-CT generation | MRI-only radiotherapy | deep learning | dose calculation | prostate cancer
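The MAE and ME endpoints between a pseudo-CT and the reference CT are straightforward voxel-wise statistics. A minimal sketch, with an optional body mask to restrict the comparison to the pelvis (the function name and mask argument are illustrative):

```python
import numpy as np

def hu_errors(pct, ct_ref, body_mask=None):
    """Mean absolute error and mean error, in HU, between pCT and CTref."""
    diff = pct.astype(float) - ct_ref.astype(float)
    if body_mask is not None:
        diff = diff[body_mask]          # evaluate inside the body only
    return np.abs(diff).mean(), diff.mean()

# toy example: +10 HU and -10 HU errors cancel in ME but not in MAE
pct = np.array([10.0, -10.0])
ref = np.zeros(2)
print(hu_errors(pct, ref))   # (10.0, 0.0)
```

This also illustrates why both endpoints are reported: ME reveals a systematic HU bias, while MAE captures the overall magnitude of the error.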