Development of a Wearable Camera and AI Algorithm for Medication Behavior Recognition

Hwiwon Lee; Sekyoung Youm

doi:10.3390/s21113594

Sensors ◽ 10.3390/s21113594 ◽ 2021 ◽ Vol 21 (11) ◽ pp. 3594

Author(s): Hwiwon Lee ◽ Sekyoung Youm

Keyword(s): Object Detection ◽ Action Recognition ◽ Health Professionals ◽ Medication Compliance ◽ Image Data ◽ Image Sensor ◽ Training Dataset ◽ Behavior Recognition ◽ Recognition Model ◽ Detection Model

As many as 40% to 50% of patients do not adhere to long-term medications for managing chronic conditions such as diabetes or hypertension. From the perspective of health professionals, the limited opportunity for medication monitoring is a major problem. Prompt medication error reports can enable health professionals to intervene immediately with patients. They can also enable clinical researchers to modify experiments easily and predict health levels based on medication compliance. This study proposes a method in which videos of patients taking medications are recorded using a camera image sensor integrated into a wearable device. The collected data are used as a training dataset for recent convolutional neural network (CNN) techniques. As the artificial intelligence (AI) algorithm for analyzing medication behavior, we constructed an object detection model (Model 1) using the faster region-based CNN technique and a second model (Model 2) that uses the combined feature values to perform action recognition. In total, 50,000 images were collected from 89 participants, and labeling was performed on different data categories to train the algorithm. The combination of the object detection model (Model 1) and the action recognition model (Model 2) was newly developed and achieved an accuracy of 92.7%, which is high for medication behavior recognition. This study is expected to enable providers to intervene rapidly with patients through rapid reporting of drug errors.
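
The two-model design described above can be illustrated with a minimal sketch. The abstract does not say which features Model 2 consumes, so everything here is an assumption: the class names (`pill`, `hand`, `mouth`), the `Detection` type, the distance-based feature, and the threshold rule standing in for the trained action recognition model are all hypothetical.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class Detection:
    """One box from a (hypothetical) object detector playing the role of Model 1."""
    label: str
    x1: float
    y1: float
    x2: float
    y2: float

    def center(self):
        return ((self.x1 + self.x2) / 2, (self.y1 + self.y2) / 2)


def frame_features(detections, classes=("pill", "hand", "mouth")):
    """Combine per-frame detections into one flat feature vector.

    Assumed layout: a presence flag per class, followed by the
    pill-to-mouth centre distance (-1.0 if either object is missing).
    """
    by_label = {d.label: d for d in detections}
    presence = [1.0 if c in by_label else 0.0 for c in classes]
    if "pill" in by_label and "mouth" in by_label:
        (px, py), (mx, my) = by_label["pill"].center(), by_label["mouth"].center()
        distance = hypot(px - mx, py - my)
    else:
        distance = -1.0
    return presence + [distance]


def is_taking_medication(frames, near=50.0):
    """Stand-in for Model 2: flags a clip if the pill ever comes near the mouth.

    A real system would feed the feature vectors to a trained classifier;
    the `near` threshold (in pixels) is an illustrative assumption.
    """
    features = [frame_features(f) for f in frames]
    return any(0.0 <= f[-1] <= near for f in features)
```

In this sketch the detector output is reduced to a small vector per frame, which keeps the second stage independent of the image resolution and of the detector architecture.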

Obtaining Urban Waterlogging Depths from Video Images Using Synthetic Image Data

Remote Sensing ◽ 10.3390/rs12061014 ◽ 2020 ◽ Vol 12 (6) ◽ pp. 1014

Author(s): Jingchao Jiang ◽ Cheng-Zhi Qin ◽ Juan Yu ◽ Changxiu Cheng ◽ Junzhi Liu ◽ ...

Keyword(s): Object Detection ◽ Data Augmentation ◽ Open Data ◽ Image Data ◽ Training Data ◽ Synthetic Image ◽ Detection Model ◽ Video Images ◽ Image Dataset ◽ Water Surfaces

Reference objects in video images can be used to indicate urban waterlogging depths. The detection of reference objects is the key step to obtain waterlogging depths from video images. Object detection models with convolutional neural networks (CNNs) have been utilized to detect reference objects. These models require a large number of labeled images as the training data to ensure the applicability at a city scale. However, it is hard to collect a sufficient number of urban flooding images containing valuable reference objects, and manually labeling images is time-consuming and expensive. To solve the problem, we present a method to synthesize image data as the training data. Firstly, original images containing reference objects and original images with water surfaces are collected from open data sources, and reference objects and water surfaces are cropped from these original images. Secondly, the reference objects and water surfaces are further enriched via data augmentation techniques to ensure the diversity. Finally, the enriched reference objects and water surfaces are combined to generate a synthetic image dataset with annotations. The synthetic image dataset is further used for training an object detection model with CNN. The waterlogging depths are calculated based on the reference objects detected by the trained model. A real video dataset and an artificial image dataset are used to evaluate the effectiveness of the proposed method. The results show that the detection model trained using the synthetic image dataset can effectively detect reference objects from images, and it can achieve acceptable accuracies of waterlogging depths based on the detected reference objects. The proposed method has the potential to monitor waterlogging depths at a city scale.
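
The crop, augment, and combine steps above can be sketched as follows. This is only an illustration under stated assumptions: images are plain nested lists of pixel values, the only augmentation is a horizontal flip, and the function names (`paste_object`, `hflip`, `synthesize`) are hypothetical rather than taken from the paper.

```python
import random


def paste_object(background, obj, top, left):
    """Return a copy of `background` with `obj` pasted at (top, left),
    plus its bounding-box annotation (x_min, y_min, x_max, y_max)."""
    h, w = len(obj), len(obj[0])
    if top < 0 or left < 0 or top + h > len(background) or left + w > len(background[0]):
        raise ValueError("object does not fit inside the background")
    image = [row[:] for row in background]  # leave the original untouched
    for r in range(h):
        image[top + r][left:left + w] = obj[r]
    return image, (left, top, left + w, top + h)


def hflip(img):
    """Horizontal flip, one simple augmentation."""
    return [row[::-1] for row in img]


def synthesize(backgrounds, objects, n, seed=0):
    """Generate `n` annotated synthetic images from cropped pieces.

    Assumes every object crop is no larger than every background crop.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        bg = rng.choice(backgrounds)
        obj = rng.choice(objects)
        if rng.random() < 0.5:
            obj = hflip(obj)
        top = rng.randint(0, len(bg) - len(obj))
        left = rng.randint(0, len(bg[0]) - len(obj[0]))
        samples.append(paste_object(bg, obj, top, left))
    return samples
```

Because the paste position is chosen by the generator, the bounding box comes for free with each image, which is the property that removes the manual labeling cost.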

Deep Learning based Fruit Freshness Classification and Detection with CMOS Image sensors and Edge processors

Electronic Imaging ◽ 10.2352/issn.2470-1173.2020.12.fais-172 ◽ 2020 ◽ Vol 2020 (12) ◽ pp. 172-1-172-7 ◽ Cited By ~ 1

Author(s): Tejaswini Ananthanarayana ◽ Raymond Ptucha ◽ Sean C. Kelly

Keyword(s): Deep Learning ◽ Object Detection ◽ Image Classification ◽ High Speed ◽ Image Sensor ◽ Vital Role ◽ Image Sensors ◽ Classification Model ◽ Detection Model ◽ Cmos Image Sensors

CMOS image sensors play a vital role in the exponentially growing field of Artificial Intelligence (AI). Applications like image classification, object detection and tracking are just some of the many problems now solved with the help of AI, and specifically deep learning. In this work, we target image classification to discern between six categories of fruits: fresh/rotten apples, fresh/rotten oranges, and fresh/rotten bananas. Using images captured from high-speed CMOS sensors along with lightweight CNN architectures, we show the results on various edge platforms. Specifically, we show results using ON Semiconductor’s global-shutter based, 12 MP, 90 frames per second image sensor (XGS-12), and ON Semiconductor’s 13 MP AR1335 image sensor feeding into MobileNetV2, implemented on NVIDIA Jetson platforms. In addition to using the data captured with these sensors, we utilize an open-source fruits dataset to increase the number of training images. For image classification, we train our model on approximately 30,000 RGB images from the six categories of fruits. The model achieves an accuracy of 97% on edge platforms using ON Semiconductor’s 13 MP camera with AR1335 sensor. In addition to the image classification model, work is currently in progress to improve the accuracy of object detection using SSD and SSDLite with MobileNetV2 as the feature extractor. In this paper, we show preliminary results on the object detection model for the same six categories of fruits.

Gradually Applying Weakly Supervised and Active Learning for Mass Detection in Breast Ultrasound Images

Applied Sciences ◽ 10.3390/app10134519 ◽ 2020 ◽ Vol 10 (13) ◽ pp. 4519

Author(s): JooYeol Yun ◽ JungWoo Oh ◽ IlDong Yun

Keyword(s): Active Learning ◽ Object Detection ◽ Image Data ◽ Breast Ultrasound ◽ Ultrasound Images ◽ Mass Detection ◽ Additional Increase ◽ Detection Model ◽ Point Increase ◽ Weakly Supervised

We propose a method for effectively utilizing weakly annotated image data in object detection tasks on breast ultrasound images. Given a problem setting where a small, strongly annotated dataset and a large, weakly annotated dataset with no bounding box information are available, training an object detection model becomes a non-trivial problem. We suggest a controlled weight for handling the effect of weakly annotated images in a two-stage object detection model. We also present a subsequent active learning scheme for safely assigning weakly annotated images a strong annotation using the trained model. Experimental results showed a 24 percentage point increase in the correct localization (CorLoc) measure, which is the ratio of correctly localized and classified images, when the properly controlled weight was assigned. Performing active learning after a model is trained showed an additional increase in CorLoc. We also tested the proposed method on the Stanford Dogs dataset to check that it applies to general cases in which strong annotations are insufficient, and it yielded similar results. The presented method showed that higher performance is achievable with less annotation effort.
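
The CorLoc measure mentioned above can be sketched as follows. The abstract defines it only as the ratio of correctly localized and classified images; the IoU threshold of 0.5, the one-prediction-per-image setup, and the function names are assumptions made for this illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def corloc(predictions, ground_truths, threshold=0.5):
    """Fraction of images whose top prediction has the right class and
    overlaps the ground-truth box with IoU >= threshold.

    Each item is a (label, box) pair; a prediction of None counts as a miss.
    """
    if not ground_truths:
        return 0.0
    hits = 0
    for pred, (gt_label, gt_box) in zip(predictions, ground_truths):
        if pred is None:
            continue
        label, box = pred
        if label == gt_label and iou(box, gt_box) >= threshold:
            hits += 1
    return hits / len(ground_truths)
```

With this definition, a 24 percentage point increase means an absolute change in the ratio, for example from a hypothetical 0.40 to 0.64, rather than a 24% relative change.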

Object Detection Model, Image Data and Results from the “When Computers Dream of Charcoal: Using Deep Learning, Open Tools and Open Data to Identify Relict Charcoal Hearths in and Around State Game Lands in Pennsylvania” Paper

Journal of Open Archaeology Data ◽ 10.5334/joad.81 ◽ 2021 ◽ Vol 9

Author(s): Jeff Blackadar ◽ Benjamin Carter ◽ Weston Conner

Keyword(s): Deep Learning ◽ Object Detection ◽ Open Data ◽ Image Data ◽ Detection Model ◽ Model Image ◽ Charcoal Hearths

Tree species detection and identification from UAV imagery to support tropical forest monitoring

10.5194/egusphere-egu2020-17759 ◽ 2020

Author(s): Loïc Dutrieux ◽ Radhouene Azzabi ◽ Sébastien Bauwens ◽ Ulrich Gaël Bouka Dipelet ◽ Olivier Chenoz ◽ ...

Keyword(s): Object Detection ◽ Tropical Forests ◽ Tree Species ◽ Training Data ◽ Aerial Imagery ◽ Forest Monitoring ◽ Human Operator ◽ Training Dataset ◽ Detection Model ◽ Synthetic Images

As part of a project aiming to support FSC-certified logging concessions in their tasks of forest inventory and management, we collected aerial imagery over 9000 ha of tropical forests in Northern Congo using long-range Unmanned Aerial Vehicles (UAVs). Once processed into orthomosaics, the aerial imagery is used in combination with reference training samples to train a deep learning object detection model (FasterRCNN) capable of detecting and predicting tree species. The remoteness and diversity of these forests make both data acquisition and generation of a training dataset challenging. Unlike natural images containing common objects like cars, bicycles, cats and dogs, there is no easy way to create a training dataset of tree species from overhead imagery of tropical forests. The first reason is that a human operator cannot as easily recognize and label objects. The second reason is that the polymorphism of tree species, phenological variations and uncertainty associated with visual recognition make the exhaustive labeling of all instances of each class very difficult. Such exhaustive labeling is required to successfully train any object detection model. To overcome these challenges, we built an interactive and ergonomic interface that allows a human operator to work in a spatial context, guided by the approximate geographic location of already inventoried trees. We solved the issue of non-exhaustive instance labeling by building synthetic images, which allows full control of the training data. In addition to these specific developments related to training data generation, we will present details of the UAV missions, modelling results on synthetic images, and finally preliminary results of model transfer to aerial imagery.

NPU RGB+D Dataset and a Feature-Enhanced LSTM-DGCN Method for Action Recognition of Basketball Players

Applied Sciences ◽ 10.3390/app11104426 ◽ 2021 ◽ Vol 11 (10) ◽ pp. 4426

Author(s): Chunyan Ma ◽ Ji Fan ◽ Jinghao Yao ◽ Tao Zhang

Keyword(s): Action Recognition ◽ Large Scale ◽ Short Term Memory ◽ Evaluation Criteria ◽ Image Data ◽ Basketball Player ◽ Basketball Players ◽ Convolutional Network ◽ Atomic Actions ◽ New Feature

Computer vision-based action recognition of basketball players in basketball training and competition has gradually become a research hotspot. However, owing to the complex technical action, diverse background, and limb occlusion, it remains a challenging task without effective solutions or public dataset benchmarks. In this study, we defined 32 kinds of atomic actions covering most of the complex actions for basketball players and built the dataset NPU RGB+D (a large scale dataset of basketball action recognition with RGB image data and Depth data captured in Northwestern Polytechnical University) for 12 kinds of actions of 10 professional basketball players with 2169 RGB+D videos and 75 thousand frames, including RGB frame sequences, depth maps, and skeleton coordinates. Through extracting the spatial features of the distances and angles between the joint points of basketball players, we created a new feature-enhanced skeleton-based method called LSTM-DGCN for basketball player action recognition based on the deep graph convolutional network (DGCN) and long short-term memory (LSTM) methods. Many advanced action recognition methods were evaluated on our dataset and compared with our proposed method. The experimental results show that the NPU RGB+D dataset is very competitive with the current action recognition algorithms and that our LSTM-DGCN outperforms the state-of-the-art action recognition methods in various evaluation criteria on our dataset. Our action classifications and this NPU RGB+D dataset are valuable for basketball player action recognition techniques. The feature-enhanced LSTM-DGCN has a more accurate action recognition effect, which improves the motion expression ability of the skeleton data.
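
The distance and angle features between joint points mentioned above can be sketched as follows. The joint names, the chosen pairs and triples, and the function names are hypothetical; the abstract does not list which joints or combinations the authors use.

```python
from math import acos, degrees, dist


def joint_angle(a, b, c):
    """Angle at joint `b` (in degrees) between segments b->a and b->c."""
    v1 = [x - y for x, y in zip(a, b)]
    v2 = [x - y for x, y in zip(c, b)]
    n1, n2 = dist(a, b), dist(c, b)
    if n1 == 0 or n2 == 0:
        return 0.0  # degenerate case: coincident joints
    cos_t = sum(x * y for x, y in zip(v1, v2)) / (n1 * n2)
    # Clamp to guard against floating-point drift outside [-1, 1].
    return degrees(acos(max(-1.0, min(1.0, cos_t))))


def skeleton_features(joints, pairs, triples):
    """Build a feature vector of joint distances followed by joint angles.

    `joints` maps a joint name to its 3D coordinate; `pairs` and `triples`
    name the joints whose distance or angle should be extracted.
    """
    distances = [dist(joints[i], joints[j]) for i, j in pairs]
    angles = [joint_angle(joints[i], joints[j], joints[k]) for i, j, k in triples]
    return distances + angles
```

Features of this kind are invariant to where the player stands in the frame, which is one plausible reason to add them to raw skeleton coordinates.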

U-Infuse: Democratization of Customizable Deep Learning for Object Detection

Sensors ◽ 10.3390/s21082611 ◽ 2021 ◽ Vol 21 (8) ◽ pp. 2611

Author(s): Andrew Shepley ◽ Greg Falzon ◽ Christopher Lawson ◽ Paul Meek ◽ Paul Kwan

Keyword(s): Deep Learning ◽ Intellectual Property ◽ Object Detection ◽ Image Data ◽ Learning Technologies ◽ Training Data ◽ Learning Models ◽ Ecological Data ◽ Single Class ◽ Large Numbers

Image data is one of the primary sources of ecological data used in biodiversity conservation and management worldwide. However, classifying and interpreting large numbers of images is time and resource expensive, particularly in the context of camera trapping. Deep learning models have been used to achieve this task but are often not suited to specific applications due to their inability to generalise to new environments and inconsistent performance. Models need to be developed for specific species cohorts and environments, but the technical skills required to achieve this are a key barrier to the accessibility of this technology to ecologists. Thus, there is a strong need to democratize access to deep learning technologies by providing an easy-to-use software application allowing non-technical users to train custom object detectors. U-Infuse addresses this issue by providing ecologists with the ability to train customised models using publicly available images and/or their own images without specific technical expertise. Auto-annotation and annotation editing functionalities minimize the constraints of manually annotating and pre-processing large numbers of images. U-Infuse is a free and open-source software solution that supports both multiclass and single class training and object detection, allowing ecologists to access deep learning technologies usually only available to computer scientists, on their own device, customised for their application, without sharing intellectual property or sensitive data. It provides ecological practitioners with the ability to (i) easily achieve object detection within a user-friendly GUI, generating a species distribution report, and other useful statistics, (ii) custom train deep learning models using publicly available and custom training data, (iii) achieve supervised auto-annotation of images for further training, with the benefit of editing annotations to ensure quality datasets. 
Broad adoption of U-Infuse by ecological practitioners will improve ecological image analysis and processing by allowing significantly more image data to be processed with minimal expenditure of time and resources, particularly for camera trap images. Ease of training and use of transfer learning means domain-specific models can be trained rapidly, and frequently updated without the need for computer science expertise, or data sharing, protecting intellectual property and privacy.

Data Augmentation Methods Applying Grayscale Images for Convolutional Neural Networks in Machine Vision

Applied Sciences ◽ 10.3390/app11156721 ◽ 2021 ◽ Vol 11 (15) ◽ pp. 6721

Author(s): Jinyeong Wang ◽ Sanghwan Lee

Keyword(s): Neural Networks ◽ Machine Vision ◽ Object Detection ◽ Image Classification ◽ Convolutional Neural Networks ◽ Data Augmentation ◽ Image Data ◽ Manufacturing Productivity ◽ Smart Factories ◽ Grayscale Images

Demand for machine vision is rising as smart factories use automated surface inspection to increase manufacturing productivity. Recently, convolutional neural networks (CNNs) have demonstrated outstanding performance and solved many problems in the field of computer vision. As a result, many machine vision systems adopt CNNs for surface defect inspection. In this study, we developed an effective data augmentation method for grayscale images in CNN-based machine vision with mono cameras. Our method can be applied to grayscale industrial images, and we demonstrated outstanding performance in image classification and object detection tasks. The main contributions of this study are as follows: (1) We propose a data augmentation method that can be performed when training CNNs with industrial images taken with mono cameras. (2) We demonstrate that image classification or object detection performance is better when training with industrial image data augmented by the proposed method. With the proposed method, many machine vision problems involving mono cameras can be solved effectively using CNNs.

Automatic Roadway Features Detection with Oriented Object Detection

Applied Sciences ◽ 10.3390/app11083531 ◽ 2021 ◽ Vol 11 (8) ◽ pp. 3531

Author(s): Hesham M. Eraqi ◽ Karim Soliman ◽ Dalia Said ◽ Omar R. Elezaby ◽ Mohamed N. Moustafa ◽ ...

Keyword(s): Object Detection ◽ Safety Evaluation ◽ Autonomous Driving ◽ Detection Accuracy ◽ The Road ◽ Detection Model ◽ Detection Approach ◽ Roadway Safety ◽ Safety Features ◽ Oriented Object

Extensive research efforts have been devoted to identifying and improving roadway features that impact safety. Maintaining roadway safety features relies on costly manual operations of regular road surveying and data analysis. This paper introduces an automatic roadway safety features detection approach, which harnesses the potential of artificial intelligence (AI) computer vision to make the process more efficient and less costly. Given a front-facing camera and a global positioning system (GPS) sensor, the proposed system automatically evaluates ten roadway safety features. The system is composed of an oriented (or rotated) object detection model, which solves an orientation encoding discontinuity problem to improve detection accuracy, and a rule-based roadway safety evaluation module. To train and validate the proposed model, a fully annotated dataset for roadway safety features extraction was collected covering 473 km of roads. The baseline results of the proposed method are encouraging when compared with state-of-the-art models. Different oriented object detection strategies are presented and discussed, and the developed model improved the mean average precision (mAP) by 16.9% compared with the literature. The roadway safety feature average prediction accuracy is 84.39% and ranges between 63.12% and 91.11%. The introduced model can pervasively enable/disable autonomous driving (AD) based on safety features of the road, and empower connected vehicles (CV) to send and receive estimated safety features, alerting drivers about black spots or relatively less-safe segments or roads.
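
The abstract does not describe how the orientation encoding discontinuity is resolved. The sketch below only illustrates the problem and one commonly used workaround, which is to encode a box angle θ (with period π) as the pair (cos 2θ, sin 2θ). This is an assumption for illustration and not necessarily the authors' approach; all function names are hypothetical.

```python
from math import atan2, cos, hypot, pi, sin


def encode(theta):
    """Map an orientation in radians to a continuous 2D code."""
    return (cos(2 * theta), sin(2 * theta))


def decode(vec):
    """Recover an angle in [0, pi) from its (cos 2θ, sin 2θ) code."""
    return (atan2(vec[1], vec[0]) / 2) % pi


def raw_gap(t1, t2):
    """Naive regression error when the angle itself is the target."""
    return abs(t1 - t2)


def encoded_gap(t1, t2):
    """Euclidean distance between the two encoded orientations."""
    (c1, s1), (c2, s2) = encode(t1), encode(t2)
    return hypot(c1 - c2, s1 - s2)
```

Two boxes rotated by 0.01 and π − 0.01 radians look almost identical, yet `raw_gap` reports a difference of about 3.12 while `encoded_gap` stays small, so a regression loss on the encoded form does not penalize a nearly correct prediction at the wrap-around point.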

Augmenting Crop Detection for Precision Agriculture with Deep Visual Transfer Learning—A Case Study of Bale Detection

Remote Sensing ◽ 10.3390/rs13010023 ◽ 2020 ◽ Vol 13 (1) ◽ pp. 23

Author(s): Wei Zhao ◽ William Yamada ◽ Tianxin Li ◽ Matthew Digman ◽ Troy Runge

Keyword(s): Object Detection ◽ Transfer Learning ◽ Precision Agriculture ◽ Crop Production ◽ Domain Adaptation ◽ Training Data ◽ Detection Accuracy ◽ Detection Model ◽ Agriculture Products

In recent years, precision agriculture has been researched as a promising means to increase crop production with fewer inputs and meet the growing demand for agricultural products. Computer vision-based crop detection with unmanned aerial vehicle (UAV)-acquired images is a critical tool for precision agriculture. However, object detection using deep learning algorithms relies on a significant amount of manually prelabeled training data as ground truth. Detecting field objects, such as bales, is especially difficult because of (1) long-period image acquisitions under different illumination conditions and seasons; (2) limited existing prelabeled data; and (3) few pretrained models and research as references. This work increases the bale detection accuracy based on limited data collection and labeling by building an innovative algorithm pipeline. First, an object detection model is trained using 243 images captured under good illumination conditions in fall from the crop lands. In addition, domain adaptation (DA), a kind of transfer learning, is applied to synthesize training data under diverse environmental conditions with automatic labels. Finally, the object detection model is optimized with the synthesized datasets. The case study shows that the proposed method improves the bale detection performance, including the recall, mean average precision (mAP), and F measure (F1 score), from averages of 0.59, 0.7, and 0.7 (the object detection) to averages of 0.93, 0.94, and 0.89 (the object detection + DA), respectively. This approach could be easily scaled to many other crop field objects and will significantly contribute to precision agriculture.
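
For readers unfamiliar with the reported metrics, the following minimal sketch shows how recall and the F1 score are computed from detection counts. The counts in the usage note are hypothetical and are not taken from the paper; mAP is not covered because it requires ranked detections across confidence thresholds.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1 from detection counts.

    tp: correctly detected objects, fp: spurious detections,
    fn: missed objects. Empty denominators yield 0.0.
    """
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

For example, with a hypothetical 8 correct detections, 2 false alarms, and 2 missed bales, `precision_recall_f1(8, 2, 2)` gives a precision, recall, and F1 of 0.8 each.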
