scholarly journals Development of a Wearable Camera and AI Algorithm for Medication Behavior Recognition

Sensors ◽  
2021 ◽  
Vol 21 (11) ◽  
pp. 3594
Author(s):  
Hwiwon Lee ◽  
Sekyoung Youm

As many as 40% to 50% of patients do not adhere to long-term medications for managing chronic conditions, such as diabetes or hypertension. Limited opportunity for medication monitoring is a major problem from the perspective of health professionals. The availability of prompt medication error reports can enable health professionals to provide immediate interventions for patients. Furthermore, it can enable clinical researchers to modify experiments easily and predict health levels based on medication compliance. This study proposes a method in which videos of patients taking medications are recorded using a camera image sensor integrated into a wearable device. The collected data are used as a training dataset based on applying the latest convolutional neural network (CNN) technique. As for an artificial intelligence (AI) algorithm to analyze the medication behavior, we constructed an object detection model (Model 1) using the faster region-based CNN technique and a second model that uses the combined feature values to perform action recognition (Model 2). Moreover, 50,000 image data were collected from 89 participants, and labeling was performed on different data categories to train the algorithm. The experimental combination of the object detection model (Model 1) and action recognition model (Model 2) was newly developed, and the accuracy was 92.7%, which is significantly high for medication behavior recognition. This study is expected to enable rapid intervention for providers seeking to treat patients through rapid reporting of drug errors.

2020 ◽  
Vol 12 (6) ◽  
pp. 1014
Author(s):  
Jingchao Jiang ◽  
Cheng-Zhi Qin ◽  
Juan Yu ◽  
Changxiu Cheng ◽  
Junzhi Liu ◽  
...  

Reference objects in video images can be used to indicate urban waterlogging depths. The detection of reference objects is the key step to obtain waterlogging depths from video images. Object detection models with convolutional neural networks (CNNs) have been utilized to detect reference objects. These models require a large number of labeled images as the training data to ensure the applicability at a city scale. However, it is hard to collect a sufficient number of urban flooding images containing valuable reference objects, and manually labeling images is time-consuming and expensive. To solve the problem, we present a method to synthesize image data as the training data. Firstly, original images containing reference objects and original images with water surfaces are collected from open data sources, and reference objects and water surfaces are cropped from these original images. Secondly, the reference objects and water surfaces are further enriched via data augmentation techniques to ensure the diversity. Finally, the enriched reference objects and water surfaces are combined to generate a synthetic image dataset with annotations. The synthetic image dataset is further used for training an object detection model with CNN. The waterlogging depths are calculated based on the reference objects detected by the trained model. A real video dataset and an artificial image dataset are used to evaluate the effectiveness of the proposed method. The results show that the detection model trained using the synthetic image dataset can effectively detect reference objects from images, and it can achieve acceptable accuracies of waterlogging depths based on the detected reference objects. The proposed method has the potential to monitor waterlogging depths at a city scale.


2020 ◽  
Vol 2020 (12) ◽  
pp. 172-1-172-7 ◽  
Author(s):  
Tejaswini Ananthanarayana ◽  
Raymond Ptucha ◽  
Sean C. Kelly

CMOS Image sensors play a vital role in the exponentially growing field of Artificial Intelligence (AI). Applications like image classification, object detection and tracking are just some of the many problems now solved with the help of AI, and specifically deep learning. In this work, we target image classification to discern between six categories of fruits — fresh/ rotten apples, fresh/ rotten oranges, fresh/ rotten bananas. Using images captured from high speed CMOS sensors along with lightweight CNN architectures, we show the results on various edge platforms. Specifically, we show results using ON Semiconductor’s global-shutter based, 12MP, 90 frame per second image sensor (XGS-12), and ON Semiconductor’s 13 MP AR1335 image sensor feeding into MobileNetV2, implemented on NVIDIA Jetson platforms. In addition to using the data captured with these sensors, we utilize an open-source fruits dataset to increase the number of training images. For image classification, we train our model on approximately 30,000 RGB images from the six categories of fruits. The model achieves an accuracy of 97% on edge platforms using ON Semiconductor’s 13 MP camera with AR1335 sensor. In addition to the image classification model, work is currently in progress to improve the accuracy of object detection using SSD and SSDLite with MobileNetV2 as the feature extractor. In this paper, we show preliminary results on the object detection model for the same six categories of fruits.


2020 ◽  
Vol 10 (13) ◽  
pp. 4519
Author(s):  
JooYeol Yun ◽  
JungWoo Oh ◽  
IlDong Yun

We propose a method for effectively utilizing weakly annotated image data in an object detection tasks of breast ultrasound images. Given the problem setting where a small, strongly annotated dataset and a large, weakly annotated dataset with no bounding box information are available, training an object detection model becomes a non-trivial problem. We suggest a controlled weight for handling the effect of weakly annotated images in a two stage object detection model. We also present a subsequent active learning scheme for safely assigning weakly annotated images a strong annotation using the trained model. Experimental results showed a 24% point increase in correct localization (CorLoc) measure, which is the ratio of correctly localized and classified images, by assigning the properly controlled weight. Performing active learning after a model is trained showed an additional increase in CorLoc. We tested the proposed method on the Stanford Dog datasets to assure that it can be applied to general cases, where strong annotations are insufficient to obtain resembling results. The presented method showed that higher performance is achievable with lesser annotation effort.


2020 ◽  
Author(s):  
Loïc Dutrieux ◽  
Radhouene Azzabi ◽  
Sébastien Bauwens ◽  
Ulrich Gaël Bouka Dipelet ◽  
Olivier Chenoz ◽  
...  

<p>As part of a project aiming to support FSC certified logging concessions in their tasks of forest inventory and management, we collected aerial imagery over 9000 ha of tropical forests in Northern Congo using long range Unmanned Aerial Vehicles (UAVs). Once processed into orthomosaics, the aerial imagery is used in combination with reference training samples to train a deep learning object detection model (FasterRCNN) capable of detecting and predicting tree species. The remoteness and diversity of these forests make both data acquisition and generation of a training dataset challenging. Unlike natural images containing common objects like cars, bicycles, cats and dogs, there is no easy way to create a training dataset of tree species from overhead imagery of tropical forests. The first reason is that a human operator cannot as easily recognize and label objects. The second reason is that the polymorphism of tree species, phenological variations and uncertainty associated with visual recognition makes the exhaustive labeling of all instances of each class very difficult. Such exhaustive labeling is required to successfully train any object detection model. To overcome these challenges we built an interactive and ergonomic interface that allows a human operator to work in a spatial context, being guided by the approximate geographic location of already inventoried trees. We solved the issue of non-exhaustive instance labeling by building synthetic images, hence allowing full control of the training data. In addition to these specific developments related to training data generation, we will present details of the UAV missions, modelling results on synthetic images, and finally preliminary results of model transfer to aerial imagery.</p>


2021 ◽  
Vol 11 (10) ◽  
pp. 4426
Author(s):  
Chunyan Ma ◽  
Ji Fan ◽  
Jinghao Yao ◽  
Tao Zhang

Computer vision-based action recognition of basketball players in basketball training and competition has gradually become a research hotspot. However, owing to the complex technical action, diverse background, and limb occlusion, it remains a challenging task without effective solutions or public dataset benchmarks. In this study, we defined 32 kinds of atomic actions covering most of the complex actions for basketball players and built the dataset NPU RGB+D (a large scale dataset of basketball action recognition with RGB image data and Depth data captured in Northwestern Polytechnical University) for 12 kinds of actions of 10 professional basketball players with 2169 RGB+D videos and 75 thousand frames, including RGB frame sequences, depth maps, and skeleton coordinates. Through extracting the spatial features of the distances and angles between the joint points of basketball players, we created a new feature-enhanced skeleton-based method called LSTM-DGCN for basketball player action recognition based on the deep graph convolutional network (DGCN) and long short-term memory (LSTM) methods. Many advanced action recognition methods were evaluated on our dataset and compared with our proposed method. The experimental results show that the NPU RGB+D dataset is very competitive with the current action recognition algorithms and that our LSTM-DGCN outperforms the state-of-the-art action recognition methods in various evaluation criteria on our dataset. Our action classifications and this NPU RGB+D dataset are valuable for basketball player action recognition techniques. The feature-enhanced LSTM-DGCN has a more accurate action recognition effect, which improves the motion expression ability of the skeleton data.


Sensors ◽  
2021 ◽  
Vol 21 (8) ◽  
pp. 2611
Author(s):  
Andrew Shepley ◽  
Greg Falzon ◽  
Christopher Lawson ◽  
Paul Meek ◽  
Paul Kwan

Image data is one of the primary sources of ecological data used in biodiversity conservation and management worldwide. However, classifying and interpreting large numbers of images is time and resource expensive, particularly in the context of camera trapping. Deep learning models have been used to achieve this task but are often not suited to specific applications due to their inability to generalise to new environments and inconsistent performance. Models need to be developed for specific species cohorts and environments, but the technical skills required to achieve this are a key barrier to the accessibility of this technology to ecologists. Thus, there is a strong need to democratize access to deep learning technologies by providing an easy-to-use software application allowing non-technical users to train custom object detectors. U-Infuse addresses this issue by providing ecologists with the ability to train customised models using publicly available images and/or their own images without specific technical expertise. Auto-annotation and annotation editing functionalities minimize the constraints of manually annotating and pre-processing large numbers of images. U-Infuse is a free and open-source software solution that supports both multiclass and single class training and object detection, allowing ecologists to access deep learning technologies usually only available to computer scientists, on their own device, customised for their application, without sharing intellectual property or sensitive data. It provides ecological practitioners with the ability to (i) easily achieve object detection within a user-friendly GUI, generating a species distribution report, and other useful statistics, (ii) custom train deep learning models using publicly available and custom training data, (iii) achieve supervised auto-annotation of images for further training, with the benefit of editing annotations to ensure quality datasets. Broad adoption of U-Infuse by ecological practitioners will improve ecological image analysis and processing by allowing significantly more image data to be processed with minimal expenditure of time and resources, particularly for camera trap images. Ease of training and use of transfer learning means domain-specific models can be trained rapidly, and frequently updated without the need for computer science expertise, or data sharing, protecting intellectual property and privacy.


2021 ◽  
Vol 11 (15) ◽  
pp. 6721
Author(s):  
Jinyeong Wang ◽  
Sanghwan Lee

In increasing manufacturing productivity with automated surface inspection in smart factories, the demand for machine vision is rising. Recently, convolutional neural networks (CNNs) have demonstrated outstanding performance and solved many problems in the field of computer vision. With that, many machine vision systems adopt CNNs to surface defect inspection. In this study, we developed an effective data augmentation method for grayscale images in CNN-based machine vision with mono cameras. Our method can apply to grayscale industrial images, and we demonstrated outstanding performance in the image classification and the object detection tasks. The main contributions of this study are as follows: (1) We propose a data augmentation method that can be performed when training CNNs with industrial images taken with mono cameras. (2) We demonstrate that image classification or object detection performance is better when training with the industrial image data augmented by the proposed method. Through the proposed method, many machine-vision-related problems using mono cameras can be effectively solved by using CNNs.


2021 ◽  
Vol 11 (8) ◽  
pp. 3531
Author(s):  
Hesham M. Eraqi ◽  
Karim Soliman ◽  
Dalia Said ◽  
Omar R. Elezaby ◽  
Mohamed N. Moustafa ◽  
...  

Extensive research efforts have been devoted to identify and improve roadway features that impact safety. Maintaining roadway safety features relies on costly manual operations of regular road surveying and data analysis. This paper introduces an automatic roadway safety features detection approach, which harnesses the potential of artificial intelligence (AI) computer vision to make the process more efficient and less costly. Given a front-facing camera and a global positioning system (GPS) sensor, the proposed system automatically evaluates ten roadway safety features. The system is composed of an oriented (or rotated) object detection model, which solves an orientation encoding discontinuity problem to improve detection accuracy, and a rule-based roadway safety evaluation module. To train and validate the proposed model, a fully-annotated dataset for roadway safety features extraction was collected covering 473 km of roads. The proposed method baseline results are found encouraging when compared to the state-of-the-art models. Different oriented object detection strategies are presented and discussed, and the developed model resulted in improving the mean average precision (mAP) by 16.9% when compared with the literature. The roadway safety feature average prediction accuracy is 84.39% and ranges between 91.11% and 63.12%. The introduced model can pervasively enable/disable autonomous driving (AD) based on safety features of the road; and empower connected vehicles (CV) to send and receive estimated safety features, alerting drivers about black spots or relatively less-safe segments or roads.


2020 ◽  
Vol 13 (1) ◽  
pp. 23
Author(s):  
Wei Zhao ◽  
William Yamada ◽  
Tianxin Li ◽  
Matthew Digman ◽  
Troy Runge

In recent years, precision agriculture has been researched to increase crop production with less inputs, as a promising means to meet the growing demand of agriculture products. Computer vision-based crop detection with unmanned aerial vehicle (UAV)-acquired images is a critical tool for precision agriculture. However, object detection using deep learning algorithms rely on a significant amount of manually prelabeled training datasets as ground truths. Field object detection, such as bales, is especially difficult because of (1) long-period image acquisitions under different illumination conditions and seasons; (2) limited existing prelabeled data; and (3) few pretrained models and research as references. This work increases the bale detection accuracy based on limited data collection and labeling, by building an innovative algorithms pipeline. First, an object detection model is trained using 243 images captured with good illimitation conditions in fall from the crop lands. In addition, domain adaptation (DA), a kind of transfer learning, is applied for synthesizing the training data under diverse environmental conditions with automatic labels. Finally, the object detection model is optimized with the synthesized datasets. The case study shows the proposed method improves the bale detecting performance, including the recall, mean average precision (mAP), and F measure (F1 score), from averages of 0.59, 0.7, and 0.7 (the object detection) to averages of 0.93, 0.94, and 0.89 (the object detection + DA), respectively. This approach could be easily scaled to many other crop field objects and will significantly contribute to precision agriculture.


Sign in / Sign up

Export Citation Format

Share Document