Astronaut Visual Tracking of Flying Assistant Robot in Space Station Based on Deep Learning and Probabilistic Model

International Journal of Aerospace Engineering ◽

10.1155/2018/6357185 ◽

2018 ◽

Vol 2018 ◽

pp. 1-17 ◽

Cited By ~ 5

Author(s):

Rui Zhang ◽

Zhaokui Wang ◽

Yulin Zhang

Keyword(s):

Feature Extraction ◽

Deep Learning ◽

Visual Tracking ◽

Probabilistic Model ◽

Color Image ◽

Space Station ◽

Depth Image ◽

Single Shot ◽

Occlusion Detection ◽

Robust Tracking

Real-time visual tracking of astronauts is the most important prerequisite for a flying assistant robot to follow and assist the astronaut it serves in the space station. This paper proposes an astronaut visual tracking algorithm based on deep learning and a probabilistic model. An improved SSD (Single Shot Multibox Detector) network, fine-tuned with its feature extraction layers initialized from a ready-made model, is proposed for robust astronaut detection in color images. By associating the detection results with the synchronized depth image measured by an RGB-D camera, a probabilistic model ensures accurate and continuous tracking of the specific astronaut being served. The algorithm runs at 10 fps on a Jetson TX2 and was extensively validated on several datasets that cover most instances of astronaut activities. The experimental results indicate that the proposed algorithm achieves not only robust tracking of the specified person in diverse postures or clothing but also effective occlusion detection that avoids tracking the wrong target.
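
The association of color-image detections with a synchronized depth image can be illustrated with a small sketch. It is not the paper's probabilistic model: it swaps in a simple depth-gating heuristic, and the function names, the `(x0, y0, x1, y1)` box layout, and the `max_jump` threshold are all assumptions made for illustration.

```python
from statistics import median


def box_depth(depth, box):
    """Median of the valid (non-zero) depth readings inside a bounding box."""
    x0, y0, x1, y1 = box  # assumed layout: pixel corners, x1/y1 exclusive
    values = [
        depth[y][x]
        for y in range(y0, y1)
        for x in range(x0, x1)
        if depth[y][x] > 0  # zero is treated as "no measurement"
    ]
    return median(values) if values else None


def pick_target(depth, boxes, last_depth, max_jump=0.5):
    """Index of the box whose depth is closest to the last tracked depth.

    Returns None when every candidate differs by more than max_jump,
    which this sketch treats as a possible occlusion (hypothetical rule).
    """
    best_index, best_gap = None, max_jump
    for i, box in enumerate(boxes):
        d = box_depth(depth, box)
        if d is None:
            continue
        gap = abs(d - last_depth)
        if gap <= best_gap:
            best_index, best_gap = i, gap
    return best_index


# Toy depth map in metres: one person at 2.0 m (left), another at 3.5 m (right).
depth_map = [[2.0, 2.0, 2.0, 3.5, 3.5, 3.5] for _ in range(4)]
depth_map[0][0] = 0.0  # a missing reading
detections = [(0, 0, 3, 4), (3, 0, 6, 4)]
target = pick_target(depth_map, detections, last_depth=3.4)
```

Using the median rather than the mean keeps a few missing or background pixels inside the box from shifting the depth estimate.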


Manipulator grabbing position detection with information fusion of color image and depth image using deep learning

Journal of Ambient Intelligence and Humanized Computing ◽

10.1007/s12652-020-02843-w ◽

2021 ◽

Author(s):

Du Jiang ◽

Gongfa Li ◽

Ying Sun ◽

Jiabing Hu ◽

Juntong Yun ◽

...

Keyword(s):

Deep Learning ◽

Information Fusion ◽

Color Image ◽

Depth Image ◽

Position Detection


Depth Image–Based Deep Learning of Grasp Planning for Textureless Planar-Faced Objects in Vision-Guided Robotic Bin-Picking

Sensors ◽

10.3390/s20030706 ◽

2020 ◽

Vol 20 (3) ◽

pp. 706 ◽

Cited By ~ 4

Author(s):

Ping Jiang ◽

Yoshiyuki Ishihara ◽

Nobukatsu Sugiyama ◽

Junji Oaki ◽

Seiji Tokura ◽

...

Keyword(s):

Feature Extraction ◽

Degrees Of Freedom ◽

Color Image ◽

Texture Features ◽

Depth Image ◽

Feature Descriptor ◽

Grasp Planning ◽

Depth Images ◽

Bin Picking ◽

Picking System

Bin-picking of small parcels and other textureless planar-faced objects is a common task at warehouses. A general color image–based vision-guided robot picking system requires feature extraction and goal image preparation of various objects. However, feature extraction for goal image matching is difficult for textureless objects. Further, prior preparation of huge numbers of goal images is impractical at a warehouse. In this paper, we propose a novel depth image–based vision-guided robot bin-picking system for textureless planar-faced objects. Our method uses a deep convolutional neural network (DCNN) model that is trained on 15,000 annotated depth images synthetically generated in a physics simulator to directly predict grasp points without object segmentation. Unlike previous studies that predicted grasp points for a robot suction hand with only one vacuum cup, our DCNN also predicts optimal grasp patterns for a hand with two vacuum cups (left cup on, right cup on, or both cups on). Further, we propose a surface feature descriptor to extract surface features (center position and normal) and refine the predicted grasp point position, removing the need for texture features for vision-guided robot control and sim-to-real modification for DCNN model training. Experimental results demonstrate the efficiency of our system, namely that a robot with 7 degrees of freedom can pick randomly posed textureless boxes in a cluttered environment with a 97.5% success rate at speeds exceeding 1000 pieces per hour.
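
The surface features named in the abstract (center position and normal) can be illustrated for a planar patch of 3D points. This is a minimal sketch under stated assumptions, not the paper's surface feature descriptor: the function name is hypothetical, the points are assumed to be already lifted from the depth image into 3D, and the normal is taken from the first three points only.

```python
import math


def surface_center_and_normal(points):
    """Centroid of the points and unit normal of the plane through the first three."""
    n = len(points)
    center = tuple(sum(p[i] for p in points) / n for i in range(3))
    a, b, c = points[0], points[1], points[2]
    u = tuple(b[i] - a[i] for i in range(3))
    v = tuple(c[i] - a[i] for i in range(3))
    # Cross product u x v is perpendicular to both edge vectors.
    normal = (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )
    length = math.sqrt(sum(x * x for x in normal))
    if length == 0:
        raise ValueError("first three points are collinear")
    return center, tuple(x / length for x in normal)


# A flat box top lying 2 units from the camera along z.
patch = [(0, 0, 2), (1, 0, 2), (0, 1, 2), (1, 1, 2)]
center, normal = surface_center_and_normal(patch)
```

On noisy depth data a least-squares plane fit over all points would be more robust than three points; the cross-product form is used here only because it is short.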


Smart teaching mode based on particle swarm image recognition and human-computer interaction deep learning

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-189048 ◽

2020 ◽

Vol 39 (4) ◽

pp. 5699-5711

Author(s):

Shirong Long ◽

Xuekong Zhao

Keyword(s):

Feature Extraction ◽

Particle Swarm Optimization ◽

Deep Learning ◽

Real Time ◽

Image Recognition ◽

Particle Swarm ◽

Learning Technology ◽

Search Performance ◽

Swarm Optimization ◽

Teaching Mode

The smart teaching mode overcomes the shortcomings of traditional online and offline teaching, but real-time feature extraction for teachers and students remains deficient. This study therefore uses particle swarm image recognition and deep learning to process intelligent classroom video teaching images, extracting classroom task features in real time and sending them to the teacher. To overcome the premature convergence of the standard particle swarm optimization (PSO) algorithm, an improved strategy based on multiple particle swarm optimization algorithms is proposed. To mitigate premature convergence in the PSO search, the paper combines PSO with useful attributes of other algorithms to increase particle diversity, enhance the particles' global search ability, and achieve effective feature extraction. The research indicates that the proposed method has certain practical effects and can provide a theoretical reference for subsequent related research.
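
For reference, the standard global-best particle swarm optimization that the abstract takes as its baseline can be sketched as follows. This is the textbook algorithm, not the improved multi-swarm strategy the paper proposes; the function name, the parameter defaults (inertia 0.7, acceleration coefficients 1.5), and the sphere test function are assumptions chosen for illustration.

```python
import random


def pso_minimize(objective, dim, bounds, particles=30, iterations=200,
                 inertia=0.7, cognitive=1.5, social=1.5, seed=0):
    """Minimize objective over a box with standard global-best PSO."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(particles)]
    vel = [[0.0] * dim for _ in range(particles)]
    best_pos = [p[:] for p in pos]              # each particle's best position
    best_val = [objective(p) for p in pos]
    g = min(range(particles), key=lambda i: best_val[i])
    global_pos, global_val = best_pos[g][:], best_val[g]

    for _ in range(iterations):
        for i in range(particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity: inertia + pull to personal best + pull to global best.
                vel[i][d] = (inertia * vel[i][d]
                             + cognitive * r1 * (best_pos[i][d] - pos[i][d])
                             + social * r2 * (global_pos[d] - pos[i][d]))
                # Clamp the new position to the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < global_val:
                    global_val, global_pos = val, pos[i][:]
    return global_pos, global_val


def sphere(x):
    """Simple convex test function with its minimum of 0 at the origin."""
    return sum(v * v for v in x)


solution, value = pso_minimize(sphere, dim=2, bounds=(-5.0, 5.0))
```

Because every particle is pulled toward the same global best, the swarm can collapse onto a local optimum on multimodal functions, which is the premature convergence the abstract refers to.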


Barcode Detection and Classification using SSD (Single Shot Multibox Detector) Deep Learning Algorithm

SSRN Electronic Journal ◽

10.2139/ssrn.3568499 ◽

2020 ◽

Author(s):

Akshata Kolekar ◽

Vipul Dalal

Keyword(s):

Deep Learning ◽

Learning Algorithm ◽

Single Shot ◽

Deep Learning Algorithm


Multi Disease-Prediction Framework Using Hybrid Deep Learning: An Optimal Prediction Model (Preprint)

10.2196/preprints.22865 ◽

2020 ◽

Author(s):

Anusha Ampavathi ◽

Vijaya Saradhi T

Keyword(s):

Feature Extraction ◽

Big Data ◽

Deep Learning ◽

Weight Function ◽

Optimization Algorithm ◽

Large Scale ◽

Heuristic Algorithms ◽

Disease Prediction ◽

Health Care Decisions ◽

Proposed Model

Big data and its approaches are generally helpful to the healthcare and biomedical sectors for predicting disease. For trivial symptoms, it is difficult to meet a doctor at the hospital at any time. Thus, big data provides essential information about diseases on the basis of the patient’s symptoms. For many medical organizations, disease prediction is important for making the best feasible health care decisions. In contrast, the conventional medical care model takes structured input, which requires more accurate and consistent prediction. This paper develops a multi-disease prediction framework using an improvised deep learning concept. Datasets pertaining to diabetes, hepatitis, lung cancer, liver tumor, heart disease, Parkinson’s disease, and Alzheimer’s disease are gathered from the benchmark UCI repository for the experiments. The proposed model involves three phases: (a) data normalization, (b) weighted normalized feature extraction, and (c) prediction. Initially, the dataset is normalized to bring each attribute's range to a common level. Next, weighted feature extraction is performed, in which each attribute value is multiplied by a weight function to produce large-scale deviation. The weight function is optimized using a combination of two meta-heuristic algorithms, termed the Jaya Algorithm-based Multi-Verse Optimization algorithm (JA-MVO). The optimally extracted features are fed to hybrid deep learning algorithms, namely a Deep Belief Network (DBN) and a Recurrent Neural Network (RNN). As a modification to the hybrid deep learning architecture, the weights of both the DBN and the RNN are optimized using the same hybrid optimization algorithm. Finally, a comparative evaluation of the proposed prediction against existing models confirms its effectiveness across various performance measures.
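
The first two phases, normalization followed by multiplying each attribute by a weight, can be sketched in a few lines. This is an illustration under assumptions: min-max scaling is one common choice and the abstract does not say which normalization is used, the function names are hypothetical, and the weights below are fixed made-up values, whereas the paper optimizes them with JA-MVO.

```python
def min_max_normalize(rows):
    """Scale every attribute (column) to the range [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0  # constant column maps to 0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def apply_weights(rows, weights):
    """Multiply each attribute value by the weight of its column."""
    return [[v * w for v, w in zip(row, weights)] for row in rows]


# Three records with two attributes on very different scales (made-up data).
records = [[1.0, 10.0], [3.0, 30.0], [5.0, 20.0]]
normalized = min_max_normalize(records)
weighted = apply_weights(normalized, weights=[2.0, 0.5])  # hypothetical weights
```

Normalizing first means the weights act on comparable ranges, so a large weight reflects a deliberate emphasis rather than an attribute's original unit.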


Design of Desktop Audiovisual Entertainment System with Deep Learning and Haptic Sensations

Symmetry ◽

10.3390/sym12101718 ◽

2020 ◽

Vol 12 (10) ◽

pp. 1718

Author(s):

Chien-Hsing Chou ◽

Yu-Sheng Su ◽

Che-Ju Hsu ◽

Kong-Chang Lee ◽

Ping-Hsuan Han

Keyword(s):

Deep Learning ◽

Object Detection ◽

User Experience ◽

Recognition System ◽

Scene Recognition ◽

Single Shot ◽

Auditory Signals ◽

Hot Weather ◽

Viewing Experience ◽

At Home

In this study, we designed a four-dimensional (4D) audiovisual entertainment system called Sense. This system comprises a scene recognition system and hardware modules that provide haptic sensations for users when they watch movies and animations at home. In the scene recognition system, we used Google Cloud Vision to detect common scene elements in a video, such as fire, explosions, wind, and rain, and further determine whether the scene depicts hot weather, rain, or snow. Additionally, for animated videos, we applied deep learning with a single shot multibox detector to detect whether the animated video contained scenes of fire-related objects. The hardware module was designed to provide six types of haptic sensations set as line-symmetry to provide a better user experience. After the system considers the results of object detection via the scene recognition system, the system generates corresponding haptic sensations. The system integrates deep learning, auditory signals, and haptic sensations to provide an enhanced viewing experience.


Deep Learning for Transient Image Reconstruction from ToF Data

Sensors ◽

10.3390/s21061962 ◽

2021 ◽

Vol 21 (6) ◽

pp. 1962

Author(s):

Enrico Buratto ◽

Adriano Simonetto ◽

Gianluca Agresti ◽

Henrik Schäfer ◽

Pietro Zanuttigh

Keyword(s):

Deep Learning ◽

State Of The Art ◽

Light Response ◽

Real Data ◽

Depth Image ◽

Learning Approach ◽

Multiple Reflections ◽

Noisy Input ◽

Novel Approach ◽

Incoming Light

In this work, we propose a novel approach for correcting multi-path interference (MPI) in Time-of-Flight (ToF) cameras by estimating the direct and global components of the incoming light. MPI is an error source linked to the multiple reflections of light inside a scene; each sensor pixel receives information coming from different light paths which generally leads to an overestimation of the depth. We introduce a novel deep learning approach, which estimates the structure of the time-dependent scene impulse response and from it recovers a depth image with a reduced amount of MPI. The model consists of two main blocks: a predictive model that learns a compact encoded representation of the backscattering vector from the noisy input data and a fixed backscattering model which translates the encoded representation into the high dimensional light response. Experimental results on real data show the effectiveness of the proposed approach, which reaches state-of-the-art performances.
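
Why multi-path interference tends to inflate depth can be seen in a small phasor sketch. It assumes a continuous-wave ToF camera with a single modulation frequency, which the abstract does not state, and it is not the paper's deep learning method; the function names, the 20 MHz frequency, and the path amplitudes are illustrative assumptions.

```python
import cmath
import math

C = 299_792_458.0  # speed of light in m/s


def phase_for_distance(distance, freq):
    """Phase shift of the modulation after a round trip to `distance`."""
    return 4 * math.pi * freq * distance / C


def depth_from_phase(phase, freq):
    """Invert the phase shift back to a distance."""
    return C * phase / (4 * math.pi * freq)


def measured_depth(paths, freq):
    """Depth a pixel reports when several light paths are summed.

    paths: list of (amplitude, distance), where distance is half the
    total path length travelled by that component.
    """
    total = sum(a * cmath.exp(1j * phase_for_distance(d, freq)) for a, d in paths)
    return depth_from_phase(cmath.phase(total) % (2 * math.pi), freq)


FREQ = 20e6  # assumed modulation frequency
direct_only = measured_depth([(1.0, 1.0)], FREQ)
with_mpi = measured_depth([(1.0, 1.0), (0.3, 1.5)], FREQ)
```

The indirect component travels farther than the direct one, so its phasor rotates the sum toward a larger phase and `with_mpi` lands between the two path distances, above the true 1.0 m.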


A super-resolution method of combined color image with depth map based on deep learning

Proceedings of the 2020 International Conference on Cyberspace Innovation of Advanced Technologies ◽

10.1145/3444370.3444539 ◽

2020 ◽

Author(s):

Wei Zhang ◽

Chi-fu Yang ◽

Feng Jiang ◽

Xian-zhong Gao ◽

Kai Yang

Keyword(s):

Deep Learning ◽

Color Image ◽

Super Resolution ◽

Depth Map ◽

Resolution Method


Deep learning and evolutionary intelligence with fusion-based feature extraction for detection of COVID-19 from chest X-ray images

Multimedia Systems ◽

10.1007/s00530-021-00800-x ◽

2021 ◽

Author(s):

K. Shankar ◽

Eswaran Perumal ◽

Prayag Tiwari ◽

Mohammad Shorfuzzaman ◽

Deepak Gupta

Keyword(s):

Feature Extraction ◽

Deep Learning ◽

X Ray ◽

Chest X Ray ◽

Evolutionary Intelligence


Auto-Colorization of Historical Images Using Deep Convolutional Neural Networks

Mathematics ◽

10.3390/math8122258 ◽

2020 ◽

Vol 8 (12) ◽

pp. 2258

Author(s):

Madhab Raj Joshi ◽

Lewis Nkenyereye ◽

Gyanendra Prasad Joshi ◽

S. M. Riazul Islam ◽

Mohammad Abdullah-Al-Wadud ◽

...

Keyword(s):

Neural Network ◽

Deep Learning ◽

User Study ◽

Mean Squared Error ◽

Color Image ◽

Machine Learning Techniques ◽

Global Features ◽

Black And White ◽

Historical Images ◽

Learning Techniques

Enhancing cultural heritage such as historical images is crucial to safeguarding the diversity of cultures. Automated colorization of black and white images has been the subject of extensive research in computer vision and machine learning. Our research addresses the problem of generating plausible colored photographs from ancient, historical black and white images of Nepal using deep learning techniques without direct human intervention. Motivated by the recent success of deep learning in image processing, a feed-forward deep Convolutional Neural Network (CNN) in combination with Inception-ResnetV2 is trained on sets of sample images using back-propagation to recognize patterns in RGB and grayscale values. The trained neural network is then used to predict the two chroma channels, a* and b*, given the grayscale L channel of the test images. The CNN vividly colorizes images with the help of a fusion layer that accounts for local as well as global features. Two objective functions, namely Mean Squared Error (MSE) and Peak Signal-to-Noise Ratio (PSNR), are employed for objective quality assessment between the estimated color image and its ground truth. The model is trained on a dataset we created, consisting of 1.2 K old and ancient photographs of Nepal, each with a resolution of 256 × 256. The loss (MSE), PSNR, and accuracy of the model are found to be 6.08%, 34.65 dB, and 75.23%, respectively. Beyond the training results, the public acceptance, or subjective validation, of the generated images is assessed through a user study, in which the model shows 41.71% naturalness in the evaluation of colorization results.
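
The two objective measures named in the abstract have standard definitions, sketched below for a single-channel image. The function names, the nested-list image layout, and the 8-bit peak value of 255 are assumptions; the abstract does not state how the paper scales its pixel values, so these numbers are not comparable to the reported figures.

```python
import math


def mse(estimate, truth):
    """Mean squared error between two equally sized single-channel images."""
    flat_e = [v for row in estimate for v in row]
    flat_t = [v for row in truth for v in row]
    if len(flat_e) != len(flat_t):
        raise ValueError("images must have the same number of pixels")
    return sum((e - t) ** 2 for e, t in zip(flat_e, flat_t)) / len(flat_e)


def psnr(estimate, truth, max_value=255.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(max_value^2 / MSE)."""
    error = mse(estimate, truth)
    if error == 0:
        return math.inf  # identical images
    return 10 * math.log10(max_value ** 2 / error)


# Made-up 2 x 2 images that differ in a single pixel by 5 grey levels.
truth_img = [[0, 255], [128, 64]]
estimate_img = [[0, 250], [128, 64]]
error = mse(estimate_img, truth_img)
quality = psnr(estimate_img, truth_img)
```

PSNR is a logarithmic rescaling of MSE against the peak pixel value, so the two always move in opposite directions: a lower MSE gives a higher PSNR.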
