AVILNet: A New Pliable Network with a Novel Metric for Small-Object Segmentation and Detection in Infrared Images

Ikhwan Song; Sungho Kim

doi:10.3390/rs13040555

Remote Sensing ◽

10.3390/rs13040555 ◽

2021 ◽

Vol 13 (4) ◽

pp. 555

Author(s):

Ikhwan Song ◽

Sungho Kim

Keyword(s):

Object Segmentation ◽

State Of The Art ◽

Threshold Value ◽

Small Object ◽

Infrared Images ◽

Average Precision ◽

Trade Off ◽

Good Evaluation ◽

Single Dataset ◽

Evaluation Metric

Infrared small-object segmentation (ISOS) has a persistent trade-off problem: which comes first, recall or precision? Striking a fine balance between them is of vital importance for obtaining the best performance in real applications such as surveillance, tracking, and the many fields related to infrared search and track. The F1-score may be a good evaluation metric for this problem. However, since the F1-score depends on a specific threshold value, it cannot reflect the user’s requirements across different application environments. Therefore, several metrics are commonly used together. We introduce F-area, a novel metric for a panoptic evaluation of average precision (AP) and F1-score. It simultaneously considers performance in real applications and the potential capability of a model. Furthermore, we propose a new network, called the Amorphous Variable Inter-located Network (AVILNet), which has a pliable structure based on GridNet and is also an ensemble network consisting of a main network and its sub-network. Compared with the state-of-the-art ISOS methods, our model achieved an AP of 51.69%, an F1-score of 63.03%, and an F-area of 32.58% on the International Conference on Computer Vision 2019 ISOS Single dataset using one generator. With dual generators, it achieved an AP of 53.6%, an F1-score of 60.99%, and an F-area of 32.69%, beating the existing best record (AP, 51.42%; F1-score, 57.04%; and F-area, 29.33%).
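The threshold dependence of the F1-score, and how the quoted F-area figures relate to AP and F1, can be illustrated with a minimal Python sketch. The abstract does not give a formula for F-area. The check below only shows that the three reported F-area values are numerically consistent with the product AP × F1, which is an inference from the quoted numbers rather than a definition taken from the text. The operating points in `hypothetical_points` are made up for illustration.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def product_percent(ap: float, f1: float) -> float:
    """Product of two percentages, expressed again as a percentage."""
    return ap * f1 / 100.0


# Hypothetical (threshold -> (precision, recall)) operating points of one model.
# They only illustrate that F1 changes with the chosen threshold.
hypothetical_points = {0.3: (0.40, 0.80), 0.5: (0.60, 0.65), 0.7: (0.85, 0.35)}
for threshold, (p, r) in hypothetical_points.items():
    print(f"threshold={threshold}: F1={f1_score(p, r):.3f}")

# (AP, F1, F-area) in percent, exactly as quoted in the abstract.
reported = {
    "AVILNet, one generator": (51.69, 63.03, 32.58),
    "AVILNet, dual generators": (53.6, 60.99, 32.69),
    "previous best": (51.42, 57.04, 29.33),
}
for name, (ap, f1, f_area) in reported.items():
    print(f"{name}: AP x F1 = {product_percent(ap, f1):.2f}%, reported F-area = {f_area:.2f}%")
```

Because AP summarizes the whole precision-recall curve while F1 is read at one threshold, a product of the two rewards models that are strong on both counts, which matches the abstract's description of F-area as a joint evaluation.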


Accurate Instance-Based Segmentation for Boundary Detection in Robot Grasping Application

Applied Sciences ◽

10.3390/app11094248 ◽

2021 ◽

Vol 11 (9) ◽

pp. 4248

Author(s):

Hong Hai Hoang ◽

Bao Long Tran

Keyword(s):

Object Segmentation ◽

State Of The Art ◽

Rapid Development ◽

Spatial Relationship ◽

Learning Technologies ◽

Average Precision ◽

Novel Approach ◽

3D Camera ◽

Robot Grasping ◽

Instance Segmentation

With the rapid development of cameras and deep learning technologies, computer vision tasks such as object detection, object segmentation and object tracking are being widely applied in many fields of life. For robot grasping tasks, object segmentation aims to classify and localize objects, which helps robots pick objects accurately. The state-of-the-art instance segmentation framework, Mask Region-Convolution Neural Network (Mask R-CNN), does not always segment accurately at the edges or borders of objects. An approach using a 3D camera, by contrast, can easily extract entire (foreground) objects, but classifying them can be difficult or computationally expensive. We propose a novel approach in which we combine Mask R-CNN with 3D algorithms by adding a 3D processing branch for instance segmentation. The outputs of the two branches are used together to classify the pixels at object edges by exploiting the spatial relationship between the edge region and the mask region. We analyze the effectiveness of the method by testing it on harsh cases of object positions, for example objects that are close together, overlapping, or obscuring each other, to focus on edge and border segmentation. Our proposed method is about 4 to 7% higher and more stable in IoU (intersection over union). This yields an mAP (mean Average Precision) of 46%, which is more accurate than its counterpart. The feasibility experiment shows that our method could notably advance research on grasping robots.
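The IoU figure quoted above measures how well a predicted mask overlaps the ground-truth mask. A minimal sketch of that computation for binary masks follows. The nested-list mask representation and the function name `mask_iou` are assumptions made for illustration, not details from the paper.

```python
def mask_iou(pred, target):
    """Intersection over union of two equally sized binary masks (nested lists of 0/1)."""
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                intersection += 1
            if p or t:
                union += 1
    # Convention chosen here: two empty masks count as a perfect match.
    return intersection / union if union else 1.0


# Two small hypothetical masks that disagree on two edge pixels.
predicted = [[1, 1, 0],
             [0, 1, 0]]
ground_truth = [[1, 0, 0],
                [0, 1, 1]]
print(mask_iou(predicted, ground_truth))
```

Pixels at object borders are where the two masks in this toy example disagree, which is the kind of error the edge-focused evaluation in the abstract is meant to expose.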


Miss Detection vs. False Alarm: Adversarial Learning for Small Object Segmentation in Infrared Images

2019 IEEE/CVF International Conference on Computer Vision (ICCV) ◽

10.1109/iccv.2019.00860 ◽

2019 ◽

Author(s):

Huan Wang ◽

Luping Zhou ◽

Lei Wang

Keyword(s):

False Alarm ◽

Object Segmentation ◽

Small Object ◽

Infrared Images ◽

Adversarial Learning


Object Detection Based on Center Point Proposals

Electronics ◽

10.3390/electronics9122075 ◽

2020 ◽

Vol 9 (12) ◽

pp. 2075

Author(s):

Hao Chen ◽

Hong Zheng

Keyword(s):

Object Detection ◽

Detection Method ◽

State Of The Art ◽

Input Image ◽

Small Object ◽

Average Precision ◽

Shallow Layer ◽

Center Point ◽

Speed And Accuracy

Anchor-based detectors are widely adopted in object detection. To improve the accuracy of object detection, multiple anchor boxes are densely placed on the input image, yet most of them are invalid. Although anchor-free methods can reduce the number of useless anchor boxes, the invalid ones still occupy a high proportion. On this basis, this paper proposes an object-detection method based on center point proposals to reduce the number of useless anchor boxes while improving the quality of anchor boxes, balancing the proportion of positive and negative samples. By introducing the differentiation module in the shallow layer, the new method can alleviate the problem of missed detections caused by overlapping center points. When trained and tested on the COCO (Common Objects in Context) dataset, this algorithm records an increase of about 2% in AP_S (average precision for small objects), reaching 27.8%. The detector designed in this study outperforms most of the state-of-the-art real-time detectors in the speed and accuracy trade-off, achieving an AP of 43.2 in 137 ms.


UBAR

ACM Transactions on Embedded Computing Systems ◽

10.1145/3441644 ◽

2021 ◽

Vol 20 (3) ◽

pp. 1-25

Author(s):

Elham Shamsa ◽

Alma Pröbstl ◽

Nima TaheriNejad ◽

Anil Kanduri ◽

Samarjit Chakraborty ◽

...

Keyword(s):

Resource Management ◽

Quality Of Experience ◽

State Of The Art ◽

State Of Charge ◽

User Preference ◽

Management Approach ◽

High Quality ◽

Trade Off ◽

Run Time

Smartphone users require high Battery Cycle Life (BCL) and high Quality of Experience (QoE) during their usage. These two objectives can be conflicting based on the user preference at run-time. Finding the best trade-off between QoE and BCL requires an intelligent resource management approach that considers and learns user preference at run-time. Current approaches focus on one of these two objectives and neglect the other, limiting their efficiency in meeting users’ needs. In this article, we present UBAR, User- and Battery-aware Resource management, which considers dynamic workload, user preference, and user plug-in/out pattern at run-time to provide a suitable trade-off between BCL and QoE. UBAR personalizes this trade-off by learning the user’s habits and using that to satisfy QoE, while considering battery temperature and State of Charge (SOC) pattern to maximize BCL. The evaluation results show that UBAR achieves 10% to 40% improvement compared to the existing state-of-the-art approaches.


Performance vs. hardware requirements in state-of-the-art automatic speech recognition

EURASIP Journal on Audio Speech and Music Processing ◽

10.1186/s13636-021-00217-4 ◽

2021 ◽

Vol 2021 (1) ◽

Author(s):

Alexandru-Lucian Georgescu ◽

Alessandro Pappalardo ◽

Horia Cucu ◽

Michaela Blott

Keyword(s):

Speech Recognition ◽

Automatic Speech Recognition ◽

State Of The Art ◽

Decision Makers ◽

Computing Power ◽

Trade Off ◽

Speech Features ◽

Commercial Applications ◽

Guided Tour ◽

Embedded Applications

The last decade brought significant advances in automatic speech recognition (ASR) thanks to the evolution of deep learning methods. ASR systems evolved from pipeline-based systems, which modeled hand-crafted speech features with probabilistic frameworks and generated phone posteriors, to end-to-end (E2E) systems, which translate the raw waveform directly into words using one deep neural network (DNN). The transcription accuracy greatly increased, leading to ASR technology being integrated into many commercial applications. However, few of the existing ASR technologies are suitable for integration in embedded applications, due to their hard constraints on computing power and memory usage. This overview paper serves as a guided tour through the recent literature on speech recognition and compares the most popular ASR implementations. The comparison emphasizes the trade-off between ASR performance and hardware requirements, to help decision makers choose the system that best fits their embedded application. To the best of our knowledge, this is the first study to provide this kind of trade-off analysis for state-of-the-art ASR systems.


PCAN—Part-Based Context Attention Network for Thermal Power Plant Detection in Remote Sensing Imagery

Remote Sensing ◽

10.3390/rs13071243 ◽

2021 ◽

Vol 13 (7) ◽

pp. 1243

Author(s):

Wenxin Yin ◽

Wenhui Diao ◽

Peijin Wang ◽

Xin Gao ◽

Ya Li ◽

...

Keyword(s):

Remote Sensing ◽

Power Plants ◽

State Of The Art ◽

Thermal Power ◽

Image Interpretation ◽

Remote Sensing Image ◽

Thermal Power Plants ◽

Average Precision ◽

Deep Convolutional Neural Networks ◽

Multi Scale

The detection of Thermal Power Plants (TPPs) is a meaningful task for remote sensing image interpretation. It is a challenging task because TPPs, as facility objects, are composed of various distinctive and irregular components. In this paper, we propose a novel end-to-end detection framework for TPPs based on deep convolutional neural networks. Specifically, based on the RetinaNet one-stage detector, a context attention multi-scale feature extraction network is proposed to fuse global spatial attention and strengthen the ability to represent irregular objects. In addition, we design a part-based attention module to adapt to TPPs containing distinctive components. Experiments show that the proposed method outperforms the state-of-the-art methods and achieves 68.15% mean average precision.


A trade-off between classical and quantum circuit size for an attack against CSIDH

Journal of Mathematical Cryptology ◽

10.1515/jmc-2020-0070 ◽

2020 ◽

Vol 15 (1) ◽

pp. 4-17

Author(s):

Jean-François Biasse ◽

Xavier Bonnetain ◽

Benjamin Pring ◽

André Schrottenloher ◽

William Youmans

Keyword(s):

Endomorphism Ring ◽

State Of The Art ◽

Quantum Circuit ◽

List Type ◽

Hard Problem ◽

Trade Off ◽

Classical State ◽

Security Parameter ◽

The Cost ◽

Quadratic Order

We propose a heuristic algorithm to solve the underlying hard problem of the CSIDH cryptosystem (and other isogeny-based cryptosystems using elliptic curves with endomorphism ring isomorphic to an imaginary quadratic order 𝒪). Let Δ = Disc(𝒪) (in CSIDH, Δ = −4p for p the security parameter). For 0 < α < 1/2, our algorithm requires: (1) a classical circuit of size $2^{\tilde{O}\left(\log(|\Delta|)^{1-\alpha}\right)}$; (2) a quantum circuit of size $2^{\tilde{O}\left(\log(|\Delta|)^{\alpha}\right)}$; and (3) polynomial classical and quantum memory. Essentially, we propose to reduce the size of the quantum circuit below the state-of-the-art complexity $2^{\tilde{O}\left(\log(|\Delta|)^{1/2}\right)}$ at the cost of increasing the classical circuit size required. The required classical circuit remains subexponential, which is a superpolynomial improvement over the classical state-of-the-art exponential solutions to these problems. Our method requires polynomial memory, both classical and quantum.
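The trade-off can be read off by comparing the exponents of the quoted circuit sizes. The block below only restates the bounds given in the abstract. The chain of inequalities is meant informally and asymptotically, since each term hides factors inside $\tilde{O}$.

```latex
% For 0 < \alpha < 1/2 the exponents satisfy \alpha < 1/2 < 1 - \alpha, so informally
\[
\underbrace{2^{\tilde{O}\left(\log(|\Delta|)^{\alpha}\right)}}_{\text{quantum circuit}}
\;\le\;
\underbrace{2^{\tilde{O}\left(\log(|\Delta|)^{1/2}\right)}}_{\text{state-of-the-art quantum}}
\;\le\;
\underbrace{2^{\tilde{O}\left(\log(|\Delta|)^{1-\alpha}\right)}}_{\text{classical circuit}} .
\]
% The two exponents always sum to one:
\[
\alpha + (1 - \alpha) = 1 ,
\]
% so lowering the quantum exponent by some amount raises the classical exponent by the same amount,
% and both sizes meet at 2^{\tilde{O}(\log(|\Delta|)^{1/2})} in the limit \alpha \to 1/2.
```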


Malaria parasite detection in thick blood smear microscopic images using modified YOLOV3 and YOLOV4 models

BMC Bioinformatics ◽

10.1186/s12859-021-04036-4 ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

Fetulhak Abdurahman ◽

Kinde Anlay Fante ◽

Mohammed Aliy

Keyword(s):

Object Detection ◽

Malaria Parasite ◽

Blood Smear ◽

Clustering Algorithm ◽

State Of The Art ◽

Detection Accuracy ◽

Small Object ◽

Thick Blood Smear ◽

Malaria Parasites ◽

Microscopic Images

Background: Manual microscopic examination of Leishman/Giemsa stained thin and thick blood smears is still the “gold standard” for malaria diagnosis. One of the drawbacks of this method is that its accuracy, consistency, and diagnosis speed depend on microscopists’ diagnostic and technical skills. It is difficult to find highly skilled microscopists in remote areas of developing countries. To alleviate this problem, in this paper, we investigate state-of-the-art one-stage and two-stage object detection algorithms for automated malaria parasite screening from microscopic images of thick blood slides.
Results: YOLOV3 and YOLOV4 models, which are state-of-the-art object detectors in accuracy and speed, are not optimized for detecting small objects such as malaria parasites in microscopic images. We modify these models by increasing the feature scale and adding more detection layers to enhance their capability of detecting small objects without notably decreasing detection speed. We propose one modified YOLOV4 model, called YOLOV4-MOD, and two modified YOLOV3 models, called YOLOV3-MOD1 and YOLOV3-MOD2. In addition, new anchor box sizes are generated using the K-means clustering algorithm to exploit the potential of these models in small object detection. The performance of the modified YOLOV3 and YOLOV4 models was evaluated on a publicly available malaria dataset. These models achieved state-of-the-art accuracy, exceeding the performance of their original versions, Faster R-CNN, and SSD in terms of mean average precision (mAP), recall, precision, F1 score, and average IOU. YOLOV4-MOD achieved the best detection accuracy among all the models, with a mAP of 96.32%. YOLOV3-MOD2 and YOLOV3-MOD1 achieved mAPs of 96.14% and 95.46%, respectively.
Conclusions: The experimental results of this study demonstrate that the performance of the modified YOLOV3 and YOLOV4 models is highly promising for detecting malaria parasites in images captured by a smartphone camera over the microscope eyepiece. The proposed system is suitable for deployment in low-resource settings.
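The anchor-generation step mentioned in the results can be sketched as K-means over the (width, height) pairs of the labelled boxes. The abstract does not state which distance measure was used. The sketch below assumes the IoU-based assignment that is common in the YOLO family, and the names `wh_iou` and `kmeans_anchors`, the seed, and the sample boxes are all hypothetical.

```python
import random


def wh_iou(box, anchor):
    """IoU of two boxes given as (width, height), aligned at the same corner."""
    intersection = min(box[0], anchor[0]) * min(box[1], anchor[1])
    union = box[0] * box[1] + anchor[0] * anchor[1] - intersection
    return intersection / union


def kmeans_anchors(boxes, k, iters=50, seed=0):
    """Cluster (width, height) pairs into k anchors, assigning by highest IoU."""
    rng = random.Random(seed)
    anchors = rng.sample(boxes, k)  # initialise from k distinct training boxes
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for box in boxes:
            best = max(range(k), key=lambda i: wh_iou(box, anchors[i]))
            clusters[best].append(box)
        new_anchors = []
        for i, members in enumerate(clusters):
            if members:
                mean_w = sum(b[0] for b in members) / len(members)
                mean_h = sum(b[1] for b in members) / len(members)
                new_anchors.append((mean_w, mean_h))
            else:
                new_anchors.append(anchors[i])  # keep an anchor that attracted no boxes
        if new_anchors == anchors:
            break  # assignments have stabilised
        anchors = new_anchors
    return sorted(anchors, key=lambda a: a[0] * a[1])


# Hypothetical box sizes in pixels: one group of small objects, one of large ones.
sample_boxes = [(4, 4), (5, 5), (4, 5), (5, 4), (40, 40), (42, 38), (38, 42), (40, 42)]
print(kmeans_anchors(sample_boxes, k=2))
```

Fitting anchors to the dataset's own box sizes matters most when objects are much smaller than the default anchors, which is the situation the abstract describes for malaria parasites.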


Evolutionary Machine Learning for Multi-Objective Class Solutions in Medical Deformable Image Registration

Algorithms ◽

10.3390/a12050099 ◽

2019 ◽

Vol 12 (5) ◽

pp. 99 ◽

Cited By ~ 2

Author(s):

Kleopatra Pirpinia ◽

Peter A. N. Bosman ◽

Jan-Jakob Sonke ◽

Marcel van Herk ◽

Tanja Alderliesten

Keyword(s):

Machine Learning ◽

Image Registration ◽

State Of The Art ◽

Deformable Image Registration ◽

Optimization Approach ◽

High Quality ◽

Trade Off ◽

Multi Objective ◽

Current State ◽

Image Artefacts

Current state-of-the-art medical deformable image registration (DIR) methods optimize a weighted sum of key objectives of interest. Having a pre-determined weight combination that leads to high-quality results for any instance of a specific DIR problem (i.e., a class solution) would facilitate clinical application of DIR. However, such a combination can vary widely for each instance and is currently often manually determined. A multi-objective optimization approach for DIR removes the need for manual tuning, providing a set of high-quality trade-off solutions. Here, we investigate machine learning for a multi-objective class solution, i.e., not a single weight combination, but a set thereof, that, when used on any instance of a specific DIR problem, approximates such a set of trade-off solutions. To this end, we employed a multi-objective evolutionary algorithm to learn sets of weight combinations for three breast DIR problems of increasing difficulty: 10 prone-prone cases, 4 prone-supine cases with limited deformations and 6 prone-supine cases with larger deformations and image artefacts. Clinically-acceptable results were obtained for the first two problems. Therefore, for DIR problems with limited deformations, a multi-objective class solution can be machine learned and used to compute straightforwardly multiple high-quality DIR outcomes, potentially leading to more efficient use of DIR in clinical practice.
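The difference between a single weighted-sum optimum and a set of trade-off solutions can be illustrated with a short Python sketch. The two objectives, the candidate values, and the function names are hypothetical, and both objectives are assumed to be minimized. This is not the evolutionary algorithm used in the article, only a non-dominated filter on a handful of points.

```python
def dominates(a, b):
    """True if a is at least as good as b in every objective and strictly better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points):
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def weighted_sum_best(points, weights):
    """Single solution that minimizes one fixed weighted sum of the objectives."""
    return min(points, key=lambda p: sum(w * x for w, x in zip(weights, p)))


# Hypothetical (objective 1, objective 2) values of five candidate registrations.
candidates = [(1, 5), (2, 3), (3, 4), (4, 1), (5, 5)]

print(pareto_front(candidates))                   # the whole set of trade-offs
print(weighted_sum_best(candidates, (0.5, 0.5)))  # one pick for one weighting
print(weighted_sum_best(candidates, (0.9, 0.1)))  # a different pick for another
```

Each fixed weight combination selects only one point, and which point it selects changes with the weights, whereas the non-dominated set presents all of the trade-offs at once without manual tuning.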


Towards a knowledge-based probabilistic and context-aware social recommender system

Journal of Information Science ◽

10.1177/0165551517698787 ◽

2017 ◽

Vol 44 (4) ◽

pp. 464-490 ◽

Cited By ~ 14

Author(s):

Luis Omar Colombo-Mendoza ◽

Rafael Valencia-García ◽

Alejandro Rodríguez-González ◽

Ricardo Colomo-Palacios ◽

Giner Alor-Hernández

Keyword(s):

Latent Dirichlet Allocation ◽

Evaluation Method ◽

Context Aware ◽

Average Precision ◽

Case Scenario ◽

Similarity Metric ◽

Knowledge Based ◽

Context Modelling ◽

Evaluation Metric ◽

Partial Implementation

In this article, we propose (1) a knowledge-based probabilistic collaborative filtering (CF) recommendation approach using both an ontology-based semantic similarity metric and a latent Dirichlet allocation (LDA) model-based recommendation technique and (2) a context-aware software architecture and system with the objective of validating the recommendation approach in the eating domain (foodservice places). The ontology on which the similarity metric is based is additionally leveraged to model and reason about users’ contexts; the proposed LDA model also guides the users’ context modelling to some extent. An evaluation method in the form of a comparative analysis based on traditional information retrieval (IR) metrics and a reference ranking-based evaluation metric (correctly ranked places) is presented towards the end of this article to reliably assess the efficacy and effectiveness of our recommendation approach, along with its utility from the user’s perspective. Our recommendation approach achieves higher average precision and recall values (8% and 7.40%, respectively) in the best-case scenario when compared with a CF approach that employs a baseline similarity metric. In addition, when compared with a partial implementation that does not consider users’ preferences for topics, the comprehensive implementation of our recommendation approach achieves higher average values of correctly ranked places (2.5 of 5 versus 1.5 of 5).
