Weighted boxes fusion: Ensembling boxes from different object detection models
Persian title (translated): Weighted boxes fusion: ensembling boxes from different object detection models (2021)
Object detection is a crucial task in computer vision systems, with a wide range of applications in autonomous driving, medical imaging, retail, security, face recognition, robotics, and others. Nowadays, neural network-based models are used to localize and classify instances of objects of particular classes. When real-time inference is not required, ensembles of models help to achieve better results. In this work, we present a novel method for fusing predictions from different object detection models: weighted boxes fusion. Our algorithm utilizes the confidence scores of all proposed bounding boxes to construct averaged boxes. This method significantly improves the quality of the fused predicted rectangles for an ensemble. We tested the method on several datasets and evaluated it in the context of the Open Images and COCO Object Detection challenges, achieving top results in these challenges. The 3D version of boxes fusion was successfully applied by the winning teams of the Waymo Open Dataset and Lyft 3D Object Detection for Autonomous Vehicles challenges. The source code is publicly available on GitHub (Solovyev, 2019). © 2021 Published by Elsevier B.V.
Keywords: Object detection | Computer vision | Deep learning
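The confidence-weighted averaging described in the abstract above can be illustrated with a minimal Python sketch. It is a simplified illustration and not the authors' published implementation: the function names, the greedy clustering loop, the IoU threshold of 0.55, and the choice of the mean score as the fused confidence are all assumptions made for this example.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def fuse_cluster(boxes, scores):
    """Average box coordinates weighted by confidence (scores must not sum to 0)."""
    total = sum(scores)
    fused_box = tuple(
        sum(s * b[i] for b, s in zip(boxes, scores)) / total for i in range(4)
    )
    # Assumption: the fused confidence is the mean of the member scores.
    return fused_box, total / len(scores)


def fuse_predictions(boxes, scores, iou_thr=0.55):
    """Greedily group overlapping boxes and replace each group by its weighted average."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    clusters = []  # lists of indices into `boxes`
    fused = []     # (fused_box, fused_score) for each cluster
    for i in order:
        for k, (fused_box, _) in enumerate(fused):
            if iou(fused_box, boxes[i]) >= iou_thr:
                clusters[k].append(i)
                fused[k] = fuse_cluster(
                    [boxes[j] for j in clusters[k]],
                    [scores[j] for j in clusters[k]],
                )
                break
        else:
            # No existing cluster overlaps enough: start a new one.
            clusters.append([i])
            fused.append((tuple(boxes[i]), scores[i]))
    return fused


# Hypothetical predictions pooled from two models for a single class.
example_boxes = [(0, 0, 10, 10), (2, 0, 12, 10), (50, 50, 60, 60)]
example_scores = [0.9, 0.3, 0.8]
result = fuse_predictions(example_boxes, example_scores)
```

Unlike non-maximum suppression, which discards all but the highest-scoring box in a group, this kind of fusion lets every overlapping box contribute to the final coordinates in proportion to its confidence.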
A deep learning approach to the screening of malaria infection: Automated and rapid cell counting, object detection and instance segmentation using Mask R-CNN
Persian title (translated): A deep learning approach to the screening of malaria infection: automated and rapid cell counting, object detection and instance segmentation using Mask R-CNN (2021)
Accurate and early diagnosis is critical to proper malaria treatment and hence death prevention. Several computer vision technologies have emerged in recent years as alternatives to traditional microscopy and rapid diagnostic tests. In this work, we used a deep learning model called Mask R-CNN that is trained on uninfected and Plasmodium falciparum-infected red blood cells. Our predictive model produced reports at a rate 15 times faster than manual counting without compromising on accuracy. Another unique feature of our model is its ability to generate segmentation masks on top of bounding box classifications for immediate visualization, making it superior to existing models. Furthermore, with greater standardization, it holds much potential to reduce errors arising from manual counting and save a significant amount of human resources, time, and cost.
Keywords: Malaria diagnosis | Mask R-CNN | Computer vision | Image analysis
Deep learning computer vision for the separation of Cast- and Wrought-Aluminum scrap
Persian title (translated): Deep learning computer vision for the separation of cast and wrought aluminum scrap (2021)
As a consequence of electrification and the increased adoption of lightweight structures in the automotive industry, global demand for wrought aluminum (Al) is expected to rise while demand for cast Al will stagnate. Since cast alloys can only be converted to wrought alloys by energy-intensive processes, the most promising strategy to avoid the emergence of excess cast Al alloy scrap is to sort cast from wrought alloys. To date, the separation of complex mixes of non-ferrous metals often relies on sink-float techniques, X-ray fluorescence (XRF) based sorting, or both. The presented research therefore develops an efficient method to classify cast and wrought (C&W) alloys in a real-time system with a conveyor belt using transfer learning methods, such as fine-tuning and feature extraction. Five CNNs are evaluated to classify C&W alloys using colour and depth images and transfer learning methods. In addition, the early fusion and late fusion of colour and depth images of C&W Al are investigated. For early fusion, the depth data is added as an extra input channel to the first convolution layer of the CNN, and for late fusion, the images are fed into two separate subnetworks with the same architecture, where the parameters of the fully-connected layers are concatenated in both subnetworks. Our approach shows that the late fusion DenseNet CNN achieves the best performance, with up to 98% accuracy.
Keywords: Artificial intelligence | Automatic sorting | Scrap recycling | Cast and wrought Aluminum | Deep learning computer vision | Object detection and recognition
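The difference between the early and late fusion strategies mentioned in the abstract above can be shown at the level of data shapes. The sketch below is only an illustration under stated assumptions: it uses plain Python lists instead of a deep learning framework, the function names are hypothetical, and late fusion is interpreted here as joining the feature vectors produced by the two subnetworks, which may differ from the exact concatenation the authors perform.

```python
def early_fusion(colour, depth):
    """Append depth as a fourth channel: H x W x 3 colour + H x W depth -> H x W x 4."""
    return [
        [list(pixel) + [d] for pixel, d in zip(colour_row, depth_row)]
        for colour_row, depth_row in zip(colour, depth)
    ]


def late_fusion(colour_features, depth_features):
    """Join the feature vectors of two separate subnetworks before classification."""
    return list(colour_features) + list(depth_features)


# Hypothetical 1 x 2 image: two RGB pixels and their depth readings.
colour_image = [[(1, 2, 3), (4, 5, 6)]]
depth_image = [[0.5, 0.7]]
fused_input = early_fusion(colour_image, depth_image)

# Hypothetical feature vectors from a colour branch and a depth branch.
fused_features = late_fusion([0.1, 0.2], [0.3])
```

In early fusion a single network sees both modalities from its first layer onward, whereas in late fusion each modality is processed independently and the two are only combined near the classifier.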
An automated deep learning based anomaly detection in pedestrian walkways for vulnerable road users safety
Persian title (translated): An automated deep learning based anomaly detection in pedestrian walkways for vulnerable road users' safety (2021)
Anomaly detection in pedestrian walkways is an important research topic, commonly used to improve the safety of pedestrians. Due to the wide utilization of video surveillance systems and the increased quantity of captured videos, the traditional manual examination and labeling of abnormal events is a tiresome task. An automated surveillance system that detects anomalies has therefore become an essential topic among computer vision researchers. Presently, the development of deep learning (DL) models has gained significant interest in different computer vision processes, namely object classification and object detection, and these applications depend on supervised learning that requires labels. Therefore, this paper develops an automated deep learning based anomaly detection technique in pedestrian walkways (DLADT-PW) for vulnerable road users' safety. The goal of the DLADT-PW model is to detect and classify the various anomalies that exist in pedestrian walkways, such as cars, skating, and jeeps. The DLADT-PW model involves preprocessing as the primary step, which is applied to remove noise and raise the quality of the image. In addition, a mask region convolutional neural network (Mask-RCNN) with a densely connected network (DenseNet) model is employed for the detection process. To verify the anomaly detection performance of the DLADT-PW technique, an extensive set of simulations was performed and the outcomes were investigated under distinct aspects. The obtained experimental values confirmed the superior characteristics of the DLADT-PW technique by achieving maximum detection accuracy.
Keywords: Anomaly detection | Pedestrian walkways | Deep learning | Safety | Mask RCNN
Artificial intelligence quality inspection of steel bars installation by integrating mask R-CNN and stereo vision
Persian title (translated): Artificial intelligence quality inspection of steel bars installation by integrating Mask R-CNN and stereo vision (2021)
Contractors should conduct strict quality inspection of the steel bars used in concrete structures and need to automate the process of quality inspection. The objective of this study is to develop an Artificial Intelligence Quality Inspection Model (AI-QIM) that can execute quality inspection on steel bars at the construction site. The proposed AI-QIM is built on the Mask Region-based Convolutional Neural Network (Mask R-CNN) technique, which can perform instance segmentation of steel bars. This object detection technique is integrated with a stereo vision camera to generate information on steel bar installation. A contractor can use the proposed AI-QIM to estimate the quantity, spacing, diameter, and length of steel bars during quality inspection. A sample case study indicated that the AI-QIM yielded a maximum relative error of 3% when measuring steel bar spacing and a maximum relative error of 8% when measuring steel bar lengths within a range of 1–2 m from a stereo camera.
Keywords: Steel bar | Quality inspection | Artificial intelligence | Convolutional Neural Network (CNN) | Mask R-CNN | Stereo vision | Object detection | Object mask | Instance segmentation
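The relative error figures quoted in the abstract above follow the usual definition, the absolute difference between a measured and a reference value divided by the reference value. A minimal sketch follows; the function name and the example measurements are hypothetical and are not taken from the study.

```python
def relative_error(measured, reference):
    """Return |measured - reference| / |reference| as a fraction."""
    if reference == 0:
        raise ValueError("reference must be non-zero")
    return abs(measured - reference) / abs(reference)


# Hypothetical spacing measurement: 20.6 cm estimated against 20.0 cm on site.
spacing_error = relative_error(20.6, 20.0)
```

Multiplying the returned fraction by 100 gives the percentage form used in the abstract.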
Computer vision approaches for detecting missing barricades
Persian title (translated): Computer vision approaches for detecting missing barricades (2021)
The installation of barricades effectively prevents falls from height (FFH) on construction sites. Common approaches for detecting missing barricades (e.g., manual inspection of the site or three-dimensional models) are not practical due to two inherent challenges: (1) these approaches are labor-intensive and time-consuming; and (2) FFH hazards are dynamic and change as construction work progresses. To address these challenges, two computer vision-based detection approaches, the Masks Comparison Approach (MCA) and the Missing Object Detection Approach (MODA), are developed in this study to automatically detect missing barricades. The performance of the proposed approaches and their benefits and implementation challenges were evaluated through a case study. The results demonstrate that MODA achieves better performance and has several implementation advantages over MCA. The average precision and average recall for MODA were 57.9% and 73.6%, respectively. These two approaches can help site managers take action promptly to reduce the risks of FFH accidents.
Keywords: Falls from height | Safety | Computer vision | Unsafe behavior | Deep learning
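For readers unfamiliar with the two metrics reported in the abstract above, the sketch below computes precision and recall from detection counts. It shows only the basic definitions; how the study averages them across images or thresholds is not stated in the abstract, and the function name and counts used here are hypothetical.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    return precision, recall


# Hypothetical counts: 6 correct detections, 4 false alarms, 2 missed barricades.
precision, recall = precision_recall(6, 4, 2)
```

A recall higher than precision, as reported for MODA, means the detector misses relatively few missing barricades but raises a larger share of false alarms.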
Computer vision detection of foreign objects in coal processing using attention CNN
Persian title (translated): Computer vision detection of foreign objects in coal processing using attention CNN (2021)
Foreign objects in coal seriously affect the efficiency and safety of clean coal production. Currently, the removal of foreign objects in coal preparation plants mainly depends on manual picking, which has the disadvantages of high labor intensity and low efficiency. Therefore, there is an urgent need for rapid detection and removal of foreign objects. However, due to the interference of the background and surrounding objects, accurate detection of foreign objects is a challenge. In this study, a convolutional neural network (CNN) with attention modules was designed to accurately segment foreign objects from a complex background in real time. The proposed network consists of an encoder and a decoder, and the attention mechanism was introduced into the decoder to capture rich semantic information. The visualization results showed that the attention modules could focus on the features of the salient region and inhibit the irrelevant background, which significantly improved the accuracy of the detection. The results showed that the proposed model correctly recognized 97% of the foreign objects in the 1871 sets of test images. The mean intersection over union (MIOU) of the optimal model was 91.24%, and the inference speed was greater than 15 fps, which satisfied the real-time requirement.
Keywords: Foreign object detection | Model uncertainties | Attention mechanisms | Visualization
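The mean intersection over union (MIOU) metric reported in the abstract above can be computed from a predicted and a ground-truth label map as in the following sketch. This is one common way of defining the metric and an assumption about how it is evaluated: classes absent from both maps are skipped, and the function name and toy label maps are hypothetical.

```python
def mean_iou(pred, target, num_classes):
    """Average the per-class IoU over all classes present in either label map."""
    ious = []
    for c in range(num_classes):
        inter = union = 0
        for pred_row, target_row in zip(pred, target):
            for p, t in zip(pred_row, target_row):
                in_pred, in_target = p == c, t == c
                inter += in_pred and in_target
                union += in_pred or in_target
        if union:  # skip classes that appear in neither map
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Hypothetical 2 x 2 label maps: 0 = background, 1 = foreign object.
predicted_mask = [[0, 1], [1, 1]]
true_mask = [[0, 1], [0, 1]]
score = mean_iou(predicted_mask, true_mask, num_classes=2)
```

Here the background class has an IoU of 1/2 and the foreign-object class 2/3, so the mean is 7/12.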
A comprehensive survey on computer vision based concepts, methodologies, analysis and applications for automatic gun/knife detection
Persian title (translated): A comprehensive survey on computer vision based concepts, methodologies, analysis and applications for automatic gun/knife detection (2021)
The ability to detect a gun, whether held in the hand or carried on another part of the body, is a typical human skill. The same problem presents an imperative task for computer vision systems. Automatic, observer-independent detection of a gun held in the hand or carried elsewhere on the body, whether visible or concealed, provides enhanced security in vulnerable places and can initiate appropriate action there. Compared with general automatic object detection systems, automatic gun detection has seen very few successful attempts. In this paper, we present an extensive survey on automatic gun detection and define a taxonomy for this particular detection system. We also describe the inherent difficulties related to this problem. In this survey of published papers, we examine the different approaches used in state-of-the-art attempts and compare their performance. Finally, the paper concludes by pointing to possible research gaps in related fields.
Keywords: Weapon detection | Survey | Security and surveillance | Multiple object detection | Visual and concealed
An object detection approach for detecting damages in heritage sites using 3-D point clouds and 2-D visual data
Persian title (translated): An object detection approach for detecting damages in heritage sites using 3-D point clouds and 2-D visual data (2021)
We propose a novel pipeline for structural damage detection on surfaces of complex heritage structures using visual 2D and 3D data. We use deep learning and computer vision to detect damages in images of heritage sites, and subsequently localize the detected damage on corresponding 3D models. This enables intuitive visualization, giving a concrete idea about the extent of damage in 3D space. To train deep learning models for damage detection, we curate a labeled database consisting of images of Ayutthaya – Wat Phra Si Sanphet Temple (situated in Thailand), essentially converting the damage detection problem into an object detection task. We consider the two most common kinds of damage occurring in heritage structures, namely Crack and Spalling. Models trained using this database are experimentally observed to be robust, as they can detect damages among intricate architectural designs and backgrounds. Post-training, we test the model's domain transferability by detecting damages on unseen rendered images from 3D models of the UNESCO World Heritage Site – Hampi (situated in India). We also present a comparison of the performance of different configurations of Faster-RCNN as the damage detection model over heritage structure data and demonstrate the obtained results. © 2021 Elsevier Masson SAS. All rights reserved.
Keywords: Deep learning | Object detection | Computer vision | 3D point clouds | Image rendering | Structural health monitoring
Advances in deep space exploration via simulators & deep learning
Persian title (translated): Advances in deep space exploration via simulators and deep learning (2021)
The NASA Starlight and Breakthrough Starshot programs conceptualize fast interstellar travel via small relativistic spacecraft that are propelled by directed energy. This process is radically different from traditional space travel and trades large and slow spacecraft for small, fast, inexpensive, and fragile ones. The main goal of these wafer satellites is to gather useful images during their deep space journey. We introduce and solve some of the main problems that accompany this concept. First, we need an object detection system that can detect planets that we have never seen before, some containing features that we may not even know exist in the universe. Second, once we have images of exoplanets, we need a way to take these images and rank them by importance. Equipment fails and data rates are slow; thus, we need a method to ensure that the most important images to humankind are the ones that are prioritized for data transfer. Finally, the energy on board is minimal and must be conserved and used sparingly. No exoplanet images should be missed, but using energy erroneously would be detrimental. We introduce simulator-based methods that leverage artificial intelligence, mostly in the form of computer vision, in order to solve all three of these issues. Our results confirm that simulators provide an extremely rich training environment that surpasses that of real images, and can be used to train models on features that have yet to be observed by humans. We also show that the immersive and adaptable environment provided by the simulator, combined with deep learning, lets us navigate and save energy in an otherwise implausible way.
Keywords: Computer vision | Simulator | Deep learning | Space | Universe | Exoplanet | Object detection | Novelty detection