Hand Depth Image Denoising and Superresolution via Noise-Aware Dictionaries

Deep Learning Based Object Recognition Using Physically-Realistic Synthetic Depth Scenes

Machine Learning and Knowledge Extraction ◽

10.3390/make1030051 ◽

2019 ◽

Vol 1 (3) ◽

pp. 883-903 ◽

Cited By ~ 1

Author(s):

Daulet Baimukashev ◽

Alikhan Zhilisbayev ◽

Askat Kuzdeuov ◽

Artemiy Oleinikov ◽

Denis Fadeyev ◽

...

Keyword(s):

Neural Network ◽

Deep Learning ◽

Object Recognition ◽

Data Collection ◽

Object Detection ◽

Depth Image ◽

Cluttered Environment ◽

Depth Data ◽

Depth Images ◽

Image Dataset

Recognizing objects and estimating their poses have a wide range of applications in robotics. For instance, to grasp objects, robots need the position and orientation of objects in 3D. The task becomes challenging in a cluttered environment with different types of objects. A popular approach to this problem is to use a deep neural network for object recognition. However, deep learning-based object detection in cluttered environments requires a substantial amount of data, and collecting these data requires time and extensive human labor for manual labeling. In this study, our objective was the development and validation of a deep object recognition framework using a synthetic depth image dataset. We synthetically generated a depth image dataset of 22 objects randomly placed in a 0.5 m × 0.5 m × 0.1 m box, and automatically labeled all objects with an occlusion rate below 70%. The Faster Region-based Convolutional Neural Network (Faster R-CNN) architecture was adopted for training using a dataset of 800,000 synthetic depth images, and its performance was tested on a real-world depth image dataset consisting of 2000 samples. The deep object recognizer achieves 40.96% detection accuracy on the real depth images and 93.5% on the synthetic depth images. Training the deep learning model with noise-added synthetic images improves the recognition accuracy for real images to 46.3%. The object detection framework can be trained on synthetically generated depth data and then employed for object recognition on real depth data in a cluttered environment. Synthetic depth data-based deep object detection has the potential to substantially decrease the time and human effort required for extensive data collection and labeling.
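The automatic labeling rule in the abstract above (label only objects with an occlusion rate below 70%) can be illustrated with a minimal sketch. The abstract does not say how the occlusion rate is computed; the pixel-count definition below, and the names `occlusion_rate`, `select_labelable`, and the example scene, are assumptions made purely for illustration.

```python
def occlusion_rate(visible_pixels, full_pixels):
    """Fraction of an object's silhouette hidden by other objects (0.0 to 1.0).

    Assumed definition: pixels of the unoccluded silhouette that are not
    visible in the rendered scene, divided by the unoccluded silhouette size.
    """
    if full_pixels <= 0:
        raise ValueError("full_pixels must be positive")
    return (full_pixels - visible_pixels) / full_pixels


def select_labelable(objects, max_occlusion=0.70):
    """Return the names of objects whose occlusion rate is below the threshold."""
    return [
        name
        for name, (visible, full) in objects.items()
        if occlusion_rate(visible, full) < max_occlusion
    ]


# Hypothetical scene: name -> (visible pixel count, unoccluded pixel count)
scene = {"mug": (900, 1000), "bolt": (200, 1000), "box": (450, 1000)}
labelable = select_labelable(scene)
print(labelable)
```

Here the hypothetical "bolt" is 80% occluded and is therefore left unlabeled, while the other two objects are kept.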

Denoising of image using bilateral filtering in multiresolution

APTIKOM Journal on Computer Science and Information Technologies ◽

10.34306/csit.v3i1.76 ◽

2020 ◽

Vol 3 (1) ◽

pp. 6-12

Author(s):

Alaa Abid Muslam Abid Ali ◽

Mohammed Iqbal Dohan ◽

Saif Khalid Musluh

Keyword(s):

Image Denoising ◽

Bilateral Filtering ◽

Spatial Averaging ◽

Transformation Techniques ◽

Filter Parameter ◽

Filtering Technique ◽

Bilateral Filters ◽

Parameter Values ◽

New Framework ◽

Selection Of

Bilateral filtering is an efficient, resource-conserving image processing method. It smooths an image through non-linear spatial averaging without smoothing edges. The technique depends heavily on its filter parameters: even a slight change in parameter values can change the output drastically. This paper makes two contributions. The first, for image denoising applications, is a study of the optimal selection of bilateral filter parameters. The second is an extension of the bilateral filter: bilateral filtering is applied to the low-frequency sub-band, also known as the approximation sub-band, which is obtained through a wavelet transform. The result is a new image denoising framework that combines multiresolution bilateral filtering with wavelet transformation techniques. This combination is effective at removing noise from an image.
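The idea of applying a bilateral filter to the wavelet approximation sub-band can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: it works on a 1D signal of even length rather than an image, uses a single-level Haar transform, leaves the detail sub-band untouched, and the parameter names and default values (`radius`, `sigma_s`, `sigma_r`) are chosen only for the example.

```python
import math


def haar_forward(x):
    """Single-level Haar transform: returns (approximation, detail) sub-bands."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Invert haar_forward, rebuilding the full-resolution signal."""
    s = math.sqrt(2.0)
    x = []
    for a, d in zip(approx, detail):
        x.extend([(a + d) / s, (a - d) / s])
    return x


def bilateral_1d(x, radius, sigma_s, sigma_r):
    """Bilateral filter: weights fall off with distance AND with value difference."""
    out = []
    for i, center in enumerate(x):
        num = den = 0.0
        for j in range(max(0, i - radius), min(len(x), i + radius + 1)):
            spatial = ((i - j) ** 2) / (2.0 * sigma_s ** 2)
            value = ((x[j] - center) ** 2) / (2.0 * sigma_r ** 2)
            w = math.exp(-spatial - value)
            num += w * x[j]
            den += w
        out.append(num / den)
    return out


def multires_bilateral(x, radius=2, sigma_s=1.5, sigma_r=1.0):
    """Filter only the approximation sub-band, then reconstruct (assumed scheme)."""
    approx, detail = haar_forward(x)
    smoothed = bilateral_1d(approx, radius, sigma_s, sigma_r)
    return haar_inverse(smoothed, detail)


noisy_step = [0.1, -0.1, 0.2, 0.0, 10.1, 9.9, 10.0, 10.2]
print([round(v, 3) for v in multires_bilateral(noisy_step)])
```

Because the value-difference term makes the weights across the large step almost zero, the step edge survives while small fluctuations on either side are averaged. The strong dependence on `sigma_s` and `sigma_r` is the parameter sensitivity the abstract refers to.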

Denoising of image using bilateral filtering in multiresolution

APTIKOM Journal on Computer Science and Information Technologies ◽

10.11591/aptikom.j.csit.80 ◽

2018 ◽

Vol 3 (1) ◽

pp. 6-12

Author(s):

Alaa Abid Muslam Abid Ali ◽

Mohammed Iqbal Dohan ◽

Saif Khalid Musluh

Keyword(s):

Image Denoising ◽

Bilateral Filtering ◽

Spatial Averaging ◽

Transformation Techniques ◽

Filter Parameter ◽

Filtering Technique ◽

Bilateral Filters ◽

Parameter Values ◽

New Framework ◽

Selection Of

Bilateral filtering is an efficient, resource-conserving image processing method. It smooths an image through non-linear spatial averaging without smoothing edges. The technique depends heavily on its filter parameters: even a slight change in parameter values can change the output drastically. This paper makes two contributions. The first, for image denoising applications, is a study of the optimal selection of bilateral filter parameters. The second is an extension of the bilateral filter: bilateral filtering is applied to the low-frequency sub-band, also known as the approximation sub-band, which is obtained through a wavelet transform. The result is a new image denoising framework that combines multiresolution bilateral filtering with wavelet transformation techniques. This combination is effective at removing noise from an image.

Iranian kinect face database (IKFDB): a color-depth based face database collected by kinect v.2 sensor

SN Applied Sciences ◽

10.1007/s42452-020-03999-y ◽

2021 ◽

Vol 3 (1) ◽

Author(s):

Seyed Muhammad Hossein Mousavi ◽

S. Younes Mirinezhad

Keyword(s):

Neural Network ◽

Facial Expression ◽

Facial Expression Recognition ◽

Depth Image ◽

Sensor Technology ◽

Support Vector ◽

Expression Recognition ◽

Face Database ◽

Depth Data ◽

Color Depth

This study presents a new color-depth based face database gathered from Iranian subjects of different genders and age ranges. Suitable databases make it possible to validate and assess available methods in different research fields. This database has applications in fields such as face recognition, age estimation, Facial Expression Recognition and Facial Micro Expressions Recognition. Depending on their size and resolution, image databases are mostly large. Color images usually consist of three channels, namely Red, Green and Blue. In the last decade, however, another image type has emerged, named "depth image". Depth images are used to calculate the range and distance between objects and the sensor. Depending on the depth sensor technology, range data can be acquired in different ways. Kinect sensor version 2 is capable of acquiring color and depth data simultaneously. Facial expression recognition is an important field in image processing, with uses ranging from animation to psychology. Currently, few color-depth (RGB-D) facial micro expressions recognition databases exist. Adding depth data to color data increases the accuracy of the final recognition. Due to the shortage of color-depth based facial expression databases and some weaknesses in the available ones, a new and almost perfect RGB-D face database is presented in this paper, covering the Middle-Eastern face type. In the validation section, the database is compared with some well-known benchmark face databases. For evaluation, Histogram of Oriented Gradients features are extracted, and classification algorithms such as Support Vector Machine, Multi-Layer Neural Network and a deep learning method, the Convolutional Neural Network, are employed. The results are promising.

RobotP: A Benchmark Dataset for 6D Object Pose Estimation

Sensors ◽

10.3390/s21041299 ◽

2021 ◽

Vol 21 (4) ◽

pp. 1299

Author(s):

Honglin Yuan ◽

Tim Hoogenkamp ◽

Remco C. Veltkamp

Keyword(s):

Pose Estimation ◽

Ground Truth ◽

3D Models ◽

Depth Image ◽

Great Success ◽

Estimation Algorithms ◽

Depth Images ◽

Object Pose Estimation ◽

Image Pairs ◽

Bounding Boxes

Deep learning has achieved great success on robotic vision tasks. However, when compared with other vision-based tasks, it is difficult to collect a representative and sufficiently large training set for six-dimensional (6D) object pose estimation, due to the inherent difficulty of data collection. In this paper, we propose the RobotP dataset consisting of commonly used objects for benchmarking in 6D object pose estimation. To create the dataset, we apply a 3D reconstruction pipeline to produce high-quality depth images, ground truth poses, and 3D models for well-selected objects. Subsequently, based on the generated data, we produce object segmentation masks and two-dimensional (2D) bounding boxes automatically. To further enrich the data, we synthesize a large number of photo-realistic color-and-depth image pairs with ground truth 6D poses. Our dataset is freely distributed to research groups by the Shape Retrieval Challenge benchmark on 6D pose estimation. Based on our benchmark, different learning-based approaches are trained and tested on the unified dataset. The evaluation results indicate that there is considerable room for improvement in 6D object pose estimation, particularly for objects with dark colors, and that photo-realistic images help increase the performance of pose estimation algorithms.
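The abstract above mentions deriving 2D bounding boxes automatically from object segmentation masks. One common way to do this, shown here only as an assumed illustration and not as the RobotP pipeline itself, is to take the extreme rows and columns of the nonzero mask pixels. The function name `mask_to_bbox` and the coordinate convention are hypothetical.

```python
def mask_to_bbox(mask):
    """Return (x_min, y_min, x_max, y_max) of the nonzero pixels, inclusive.

    `mask` is a list of rows, each a list of 0/1 values.
    Returns None when the mask contains no object pixels.
    """
    rows = [y for y, row in enumerate(mask) if any(row)]
    if not rows:
        return None
    width = len(mask[0])
    cols = [x for x in range(width) if any(row[x] for row in mask)]
    return (cols[0], rows[0], cols[-1], rows[-1])


# Hypothetical 4 x 5 binary segmentation mask
mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
print(mask_to_bbox(mask))
```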

HRDepthNet: Depth Image-Based Marker-Less Tracking of Body Joints

Sensors ◽

10.3390/s21041356 ◽

2021 ◽

Vol 21 (4) ◽

pp. 1356

Author(s):

Linda Christin Büker ◽

Finnja Zuber ◽

Andreas Hein ◽

Sebastian Fudickar

Keyword(s):

Color Images ◽

Depth Image ◽

Accuracy Evaluation ◽

Timed Up And Go ◽

Position Errors ◽

Depth Images ◽

Upper And Lower Extremities ◽

Rgb Images ◽

Human Joints ◽

Body Joints

While approaches for detecting joint positions in color images, such as HRNet and OpenPose, are available, corresponding approaches for depth images have received limited attention, even though depth images have several advantages over color images, such as robustness to light variation and invariance to color and texture. Correspondingly, we introduce High-Resolution Depth Net (HRDepthNet), a machine learning driven approach to detect human joints (body, head, and upper and lower extremities) purely in depth images. HRDepthNet retrains the original HRNet for depth images. For this purpose, a dataset was created containing depth (and RGB) images recorded with subjects performing the timed up and go test, an established geriatric assessment. Annotations were made manually on the RGB images. Training and evaluation were conducted with this dataset. For accuracy evaluation, the detection of body joints was assessed via COCO's evaluation metrics, which indicated that the resulting depth image-based model achieved better results than the HRNet trained and applied on the corresponding RGB images. An additional evaluation of the position errors showed a median deviation of 1.619 cm (x-axis), 2.342 cm (y-axis) and 2.4 cm (z-axis).

Image Denoising Using Sparse Representation and Principal Component Analysis

International Journal of Image and Graphics ◽

10.1142/s0219467822500334 ◽

2021 ◽

pp. 2250033

Author(s):

Maryam Abedini ◽

Horriyeh Haddad ◽

Marzieh Faridi Masouleh ◽

Asadollah Shahbahrami

Keyword(s):

Principal Component Analysis ◽

Sparse Representation ◽

Image Denoising ◽

Matching Pursuit ◽

Signal To Noise Ratio ◽

Principal Component ◽

Structural Similarity ◽

Component Analysis ◽

Block Matching ◽

Discrete Wavelet

This study proposes an image denoising algorithm based on sparse representation and Principal Component Analysis (PCA). The proposed algorithm includes the following steps. First, the noisy image is divided into overlapped [Formula: see text] blocks. Second, the discrete cosine transform is applied as a dictionary for the sparse representation of the vectors created by the overlapped blocks. To calculate the sparse vector, the orthogonal matching pursuit algorithm is used. Then, the dictionary is updated by means of the PCA algorithm to achieve the sparsest representation of vectors. Since the signal energy, unlike the noise energy, is concentrated on a small dataset by transforming into the PCA domain, the signal and noise can be well distinguished. The proposed algorithm was implemented in a MATLAB environment and its performance was evaluated on some standard grayscale images under different levels of standard deviations of white Gaussian noise by means of peak signal-to-noise ratio, structural similarity indexes, and visual effects. The experimental results demonstrate that the proposed denoising algorithm achieves significant improvement compared to dual-tree complex discrete wavelet transform and K-singular value decomposition image denoising methods. It also obtains competitive results with the block-matching and 3D filtering method, which is the current state-of-the-art for image denoising.
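The sparse coding step described above (a DCT dictionary with orthogonal matching pursuit) can be sketched on a 1D signal. This is a minimal illustration under stated assumptions, not the authors' MATLAB implementation: the dictionary is a complete orthonormal DCT-II basis, for which the least-squares step of orthogonal matching pursuit reduces to a single inner product; the block extraction and the PCA dictionary update are omitted; and the function names are hypothetical.

```python
import math


def dct_dictionary(n):
    """Return the n orthonormal DCT-II atoms, each a list of length n."""
    atoms = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        atoms.append(
            [scale * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n)]
        )
    return atoms


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def omp_orthonormal(signal, atoms, sparsity):
    """Greedy pursuit for an ORTHONORMAL dictionary.

    Each iteration selects the unused atom most correlated with the residual.
    With orthonormal atoms the optimal coefficient is just the inner product,
    so no explicit least-squares solve is needed.
    Returns ({atom_index: coefficient}, residual).
    """
    residual = list(signal)
    coeffs = {}
    for _ in range(sparsity):
        k = max(
            (j for j in range(len(atoms)) if j not in coeffs),
            key=lambda j: abs(dot(residual, atoms[j])),
        )
        c = dot(residual, atoms[k])
        coeffs[k] = c
        residual = [r - c * a for r, a in zip(residual, atoms[k])]
    return coeffs, residual


atoms = dct_dictionary(8)
# Synthetic signal built from exactly two atoms
signal = [3.0 * a - 2.0 * b for a, b in zip(atoms[1], atoms[5])]
coeffs, residual = omp_orthonormal(signal, atoms, sparsity=2)
print(sorted(coeffs))
```

For an overcomplete or learned dictionary, such as one updated by PCA, the coefficients of all selected atoms would have to be re-fitted jointly at every iteration.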

Image Denoising Algorithm via Doubly Bilateral Filtering

2009 International Conference on Information Engineering and Computer Science ◽

10.1109/iciecs.2009.5363149 ◽

2009 ◽

Cited By ~ 3

Author(s):

Zuo-feng Zhou ◽

Jian-zhong Cao ◽

Hao Wang ◽

Wei-hua Liu

Keyword(s):

Image Denoising ◽

Bilateral Filtering

RFaNet: Receptive Field-Aware Network with Finger Attention for Fingerspelling Recognition Using a Depth Sensor

Mathematics ◽

10.3390/math9212815 ◽

2021 ◽

Vol 9 (21) ◽

pp. 2815

Author(s):

Shih-Hung Yang ◽

Yao-Mao Cheng ◽

Jyun-We Huang ◽

Yon-Ping Chen

Keyword(s):

Receptive Field ◽

American Sign Language ◽

Multiple Scales ◽

Receptive Fields ◽

Depth Image ◽

Small Scale ◽

Feature Maps ◽

Depth Sensor ◽

Depth Images ◽

Effective Transfer

Automatic fingerspelling recognition tackles the communication barrier between deaf and hearing individuals. However, the accuracy of fingerspelling recognition is reduced by high intra-class variability and low inter-class variability. In the existing methods, regular convolutional kernels, which have limited receptive fields (RFs) and often cannot detect subtle discriminative details, are applied to learn features. In this study, we propose a receptive field-aware network with finger attention (RFaNet) that highlights the finger regions and builds inter-finger relations. To highlight the discriminative details of these fingers, RFaNet reweights the low-level features of the hand depth image with those of the non-forearm image and improves finger localization, even when the wrist is occluded. RFaNet captures neighboring and inter-region dependencies between fingers in high-level features. An atrous convolution procedure enlarges the RFs at multiple scales and a non-local operation computes the interactions between multi-scale feature maps, thereby facilitating the building of inter-finger relations. Thus, the representation of a sign is invariant to viewpoint changes, which are primarily responsible for intra-class variability. On an American Sign Language fingerspelling dataset, RFaNet achieved 1.77% higher classification accuracy than state-of-the-art methods. RFaNet achieved effective transfer learning when the number of labeled depth images was insufficient. The fingerspelling representation of a depth image can be effectively transferred from large- to small-scale datasets via highlighting the finger regions and building inter-finger relations, thereby reducing the requirement for expensive fingerspelling annotations.
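The way atrous (dilated) convolution enlarges the receptive field without adding kernel weights can be shown with a short 1D sketch. This illustrates the general operation only and is not the RFaNet architecture; the function names and the example kernel are assumptions.

```python
def receptive_field(kernel_size, dilation):
    """Number of input samples spanned by one output of a dilated kernel."""
    return (kernel_size - 1) * dilation + 1


def atrous_conv1d(signal, kernel, dilation=1):
    """'Valid' 1D dilated convolution (cross-correlation form, no padding).

    Kernel taps are spaced `dilation` samples apart, so the same number of
    weights covers a wider stretch of the input.
    """
    span = (len(kernel) - 1) * dilation
    return [
        sum(w * signal[i + j * dilation] for j, w in enumerate(kernel))
        for i in range(len(signal) - span)
    ]


signal = list(range(10))
kernel = [1, 1, 1]
for d in (1, 2, 4):
    print(d, receptive_field(len(kernel), d), atrous_conv1d(signal, kernel, d))
```

Running the same three-tap kernel at several dilation rates yields features at multiple scales, which is the property the abstract attributes to the atrous convolution procedure.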

Multispectral Image Denoising Based on Non-local Means and Bilateral Filtering

Communications in Computer and Information Science - Internet Multimedia Computing and Service ◽

10.1007/978-981-10-8530-7_37 ◽

2018 ◽

pp. 384-391

Author(s):

Xueyan Zhen ◽

Ning He ◽

Xin Sun ◽

Yuqing Zhang

Keyword(s):

Image Denoising ◽

Multispectral Image ◽

Bilateral Filtering ◽

Local Means ◽

Non Local
