scholarly journals Astronaut Visual Tracking of Flying Assistant Robot in Space Station Based on Deep Learning and Probabilistic Model

2018 ◽  
Vol 2018 ◽  
pp. 1-17 ◽  
Author(s):  
Rui Zhang ◽  
Zhaokui Wang ◽  
Yulin Zhang

Real-time astronaut visual tracking is the most important prerequisite for flying assistant robot to follow and assist the served astronaut in the space station. In this paper, an astronaut visual tracking algorithm which is based on deep learning and probabilistic model is proposed. Fine-tuned with feature extraction layers’ parameters being initialized by ready-made model, an improved SSD (Single Shot Multibox Detector) network was proposed for robust astronaut detection in color image. Associating the detection results with synchronized depth image measured by RGB-D camera, a probabilistic model is presented to ensure accurate and consecutive tracking of the certain served astronaut. The algorithm runs 10 fps at Jetson TX2, and it was extensively validated by several datasets which contain most instances of astronaut activities. The experimental results indicate that our proposed algorithm achieves not only robust tracking of the specified person with diverse postures or dressings but also effective occlusion detection for avoiding mistaken tracking.

Sensors ◽  
2020 ◽  
Vol 20 (3) ◽  
pp. 706 ◽  
Author(s):  
Ping Jiang ◽  
Yoshiyuki Ishihara ◽  
Nobukatsu Sugiyama ◽  
Junji Oaki ◽  
Seiji Tokura ◽  
...  

Bin-picking of small parcels and other textureless planar-faced objects is a common task at warehouses. A general color image–based vision-guided robot picking system requires feature extraction and goal image preparation of various objects. However, feature extraction for goal image matching is difficult for textureless objects. Further, prior preparation of huge numbers of goal images is impractical at a warehouse. In this paper, we propose a novel depth image–based vision-guided robot bin-picking system for textureless planar-faced objects. Our method uses a deep convolutional neural network (DCNN) model that is trained on 15,000 annotated depth images synthetically generated in a physics simulator to directly predict grasp points without object segmentation. Unlike previous studies that predicted grasp points for a robot suction hand with only one vacuum cup, our DCNN also predicts optimal grasp patterns for a hand with two vacuum cups (left cup on, right cup on, or both cups on). Further, we propose a surface feature descriptor to extract surface features (center position and normal) and refine the predicted grasp point position, removing the need for texture features for vision-guided robot control and sim-to-real modification for DCNN model training. Experimental results demonstrate the efficiency of our system, namely that a robot with 7 degrees of freedom can pick randomly posed textureless boxes in a cluttered environment with a 97.5% success rate at speeds exceeding 1000 pieces per hour.


2020 ◽  
Vol 39 (4) ◽  
pp. 5699-5711
Author(s):  
Shirong Long ◽  
Xuekong Zhao

The smart teaching mode overcomes the shortcomings of traditional teaching online and offline, but there are certain deficiencies in the real-time feature extraction of teachers and students. In view of this, this study uses the particle swarm image recognition and deep learning technology to process the intelligent classroom video teaching image and extracts the classroom task features in real time and sends them to the teacher. In order to overcome the shortcomings of the premature convergence of the standard particle swarm optimization algorithm, an improved strategy for multiple particle swarm optimization algorithms is proposed. In order to improve the premature problem in the search performance algorithm of PSO algorithm, this paper combines the algorithm with the useful attributes of other algorithms to improve the particle diversity in the algorithm, enhance the global search ability of the particle, and achieve effective feature extraction. The research indicates that the method proposed in this paper has certain practical effects and can provide theoretical reference for subsequent related research.


2020 ◽  
Author(s):  
Anusha Ampavathi ◽  
Vijaya Saradhi T

UNSTRUCTURED Big data and its approaches are generally helpful for healthcare and biomedical sectors for predicting the disease. For trivial symptoms, the difficulty is to meet the doctors at any time in the hospital. Thus, big data provides essential data regarding the diseases on the basis of the patient’s symptoms. For several medical organizations, disease prediction is important for making the best feasible health care decisions. Conversely, the conventional medical care model offers input as structured that requires more accurate and consistent prediction. This paper is planned to develop the multi-disease prediction using the improvised deep learning concept. Here, the different datasets pertain to “Diabetes, Hepatitis, lung cancer, liver tumor, heart disease, Parkinson’s disease, and Alzheimer’s disease”, from the benchmark UCI repository is gathered for conducting the experiment. The proposed model involves three phases (a) Data normalization (b) Weighted normalized feature extraction, and (c) prediction. Initially, the dataset is normalized in order to make the attribute's range at a certain level. Further, weighted feature extraction is performed, in which a weight function is multiplied with each attribute value for making large scale deviation. Here, the weight function is optimized using the combination of two meta-heuristic algorithms termed as Jaya Algorithm-based Multi-Verse Optimization algorithm (JA-MVO). The optimally extracted features are subjected to the hybrid deep learning algorithms like “Deep Belief Network (DBN) and Recurrent Neural Network (RNN)”. As a modification to hybrid deep learning architecture, the weight of both DBN and RNN is optimized using the same hybrid optimization algorithm. Further, the comparative evaluation of the proposed prediction over the existing models certifies its effectiveness through various performance measures.


Symmetry ◽  
2020 ◽  
Vol 12 (10) ◽  
pp. 1718
Author(s):  
Chien-Hsing Chou ◽  
Yu-Sheng Su ◽  
Che-Ju Hsu ◽  
Kong-Chang Lee ◽  
Ping-Hsuan Han

In this study, we designed a four-dimensional (4D) audiovisual entertainment system called Sense. This system comprises a scene recognition system and hardware modules that provide haptic sensations for users when they watch movies and animations at home. In the scene recognition system, we used Google Cloud Vision to detect common scene elements in a video, such as fire, explosions, wind, and rain, and further determine whether the scene depicts hot weather, rain, or snow. Additionally, for animated videos, we applied deep learning with a single shot multibox detector to detect whether the animated video contained scenes of fire-related objects. The hardware module was designed to provide six types of haptic sensations set as line-symmetry to provide a better user experience. After the system considers the results of object detection via the scene recognition system, the system generates corresponding haptic sensations. The system integrates deep learning, auditory signals, and haptic sensations to provide an enhanced viewing experience.


Sensors ◽  
2021 ◽  
Vol 21 (6) ◽  
pp. 1962
Author(s):  
Enrico Buratto ◽  
Adriano Simonetto ◽  
Gianluca Agresti ◽  
Henrik Schäfer ◽  
Pietro Zanuttigh

In this work, we propose a novel approach for correcting multi-path interference (MPI) in Time-of-Flight (ToF) cameras by estimating the direct and global components of the incoming light. MPI is an error source linked to the multiple reflections of light inside a scene; each sensor pixel receives information coming from different light paths which generally leads to an overestimation of the depth. We introduce a novel deep learning approach, which estimates the structure of the time-dependent scene impulse response and from it recovers a depth image with a reduced amount of MPI. The model consists of two main blocks: a predictive model that learns a compact encoded representation of the backscattering vector from the noisy input data and a fixed backscattering model which translates the encoded representation into the high dimensional light response. Experimental results on real data show the effectiveness of the proposed approach, which reaches state-of-the-art performances.


Mathematics ◽  
2020 ◽  
Vol 8 (12) ◽  
pp. 2258
Author(s):  
Madhab Raj Joshi ◽  
Lewis Nkenyereye ◽  
Gyanendra Prasad Joshi ◽  
S. M. Riazul Islam ◽  
Mohammad Abdullah-Al-Wadud ◽  
...  

Enhancement of Cultural Heritage such as historical images is very crucial to safeguard the diversity of cultures. Automated colorization of black and white images has been subject to extensive research through computer vision and machine learning techniques. Our research addresses the problem of generating a plausible colored photograph of ancient, historically black, and white images of Nepal using deep learning techniques without direct human intervention. Motivated by the recent success of deep learning techniques in image processing, a feed-forward, deep Convolutional Neural Network (CNN) in combination with Inception- ResnetV2 is being trained by sets of sample images using back-propagation to recognize the pattern in RGB and grayscale values. The trained neural network is then used to predict two a* and b* chroma channels given grayscale, L channel of test images. CNN vividly colorizes images with the help of the fusion layer accounting for local features as well as global features. Two objective functions, namely, Mean Squared Error (MSE) and Peak Signal-to-Noise Ratio (PSNR), are employed for objective quality assessment between the estimated color image and its ground truth. The model is trained on the dataset created by ourselves with 1.2 K historical images comprised of old and ancient photographs of Nepal, each having 256 × 256 resolution. The loss i.e., MSE, PSNR, and accuracy of the model are found to be 6.08%, 34.65 dB, and 75.23%, respectively. Other than presenting the training results, the public acceptance or subjective validation of the generated images is assessed by means of a user study where the model shows 41.71% of naturalness while evaluating colorization results.


Sign in / Sign up

Export Citation Format

Share Document