Application of deep learning and computer vision frameworks for solving video context prediction problem

PyConvU-Net: a lightweight and multiscale network for biomedical image segmentation

BMC Bioinformatics ◽

10.1186/s12859-020-03943-2 ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

Changyong Li ◽

Yongxian Fan ◽

Xiaodong Cai

Keyword(s):

Image Segmentation ◽

Deep Learning ◽

State Of The Art ◽

Experimental Results ◽

Actual Situation ◽

Controlled Experiments ◽

Biomedical Image ◽

Segmentation Methods ◽

Art Performance

Abstract Background With the development of deep learning (DL), more and more methods based on deep learning are proposed and achieve state-of-the-art performance in biomedical image segmentation. However, these methods are usually complex and require the support of powerful computing resources. According to the actual situation, it is impractical that we use huge computing resources in clinical situations. Thus, it is significant to develop accurate DL based biomedical image segmentation methods which depend on resources-constraint computing. Results A lightweight and multiscale network called PyConvU-Net is proposed to potentially work with low-resources computing. Through strictly controlled experiments, PyConvU-Net predictions have a good performance on three biomedical image segmentation tasks with the fewest parameters. Conclusions Our experimental results preliminarily demonstrate the potential of proposed PyConvU-Net in biomedical image segmentation with resources-constraint computing.

Download Full-text

Random Forest with Adaptive Local Template for Pedestrian Detection

Mathematical Problems in Engineering ◽

10.1155/2015/767423 ◽

2015 ◽

Vol 2015 ◽

pp. 1-11 ◽

Cited By ~ 2

Author(s):

Tao Xiang ◽

Tao Li ◽

Mao Ye ◽

Zijian Liu

Keyword(s):

Computer Vision ◽

Random Forest ◽

Classification Accuracy ◽

Template Matching ◽

Detection Method ◽

State Of The Art ◽

Pedestrian Detection ◽

Sliding Window ◽

Experimental Results ◽

Training Samples

Pedestrian detection with large intraclass variations is still a challenging task in computer vision. In this paper, we propose a novel pedestrian detection method based on Random Forest. Firstly, we generate a few local templates with different sizes and different locations in positive exemplars. Then, the Random Forest is built whose splitting functions are optimized by maximizing class purity of matching the local templates to the training samples, respectively. To improve the classification accuracy, we adopt a boosting-like algorithm to update the weights of the training samples in a layer-wise fashion. During detection, the trained Random Forest will vote the category when a sliding window is input. Our contributions are the splitting functions based on local template matching with adaptive size and location and iteratively weight updating method. We evaluate the proposed method on 2 well-known challenging datasets: TUD pedestrians and INRIA pedestrians. The experimental results demonstrate that our method achieves state-of-the-art or competitive performance.

Download Full-text

Particle Size Estimation in Mixed Commercial Waste Images Using Deep Learning

10.36227/techrxiv.14762043.v1 ◽

2021 ◽

Author(s):

Phongsathorn Kittiworapanya ◽

Kitsuchart Pasupa ◽

Peter Auer

Keyword(s):

Computer Vision ◽

Particle Size ◽

Deep Learning ◽

Waste Management ◽

State Of The Art ◽

Learning Algorithms ◽

Input Image ◽

Size Estimation ◽

Waste Particles ◽

Set Up

<div>We assessed several state-of-the-art deep learning algorithms and computer vision techniques for estimating the particle size of mixed commercial waste from images. In waste management, the first step is often coarse shredding, using the particle size to set up the shredder machine. The difficulty is separating the waste particles in an image, which can not be performed well. This work focused on estimating size by using the texture from the input image, captured at a fixed height from the camera lens to the ground. We found that EfficientNet achieved the best performance of 0.72 on F1-Score and 75.89% on accuracy.<br></div>

Download Full-text

A Feasibility Test on Preventing PRMDs Based on Deep Learning

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.330110005 ◽

2019 ◽

Vol 33 ◽

pp. 10005-10006

Author(s):

So-Hyun Park ◽

Sun-Young Ihm ◽

Aziz Nasridinov ◽

Young-Ho Park

Keyword(s):

Deep Learning ◽

Musculoskeletal Disorders ◽

Classification Accuracy ◽

State Of The Art ◽

Learning Algorithms ◽

Experimental Results ◽

Video Classification ◽

Learning Paradigm ◽

Feasibility Test ◽

Piano Playing

This study proposes a method to reduce the playing-related musculoskeletal disorders (PRMDs) that often occur among pianists. Specifically, we propose a feasibility test that evaluates several state-of-the-art deep learning algorithms to prevent injuries of pianist. For this, we propose (1) a C3P dataset including various piano playing postures and show (2) the application of four learning algorithms, which demonstrated their superiority in video classification, to the proposed C3P datasets. To our knowledge, this is the first study that attempted to apply the deep learning paradigm to reduce the PRMDs in pianist. The experimental results demonstrated that the classification accuracy is 80% on average, indicating that the proposed hypothesis about the effectiveness of the deep learning algorithms to prevent injuries of pianist is true.

Download Full-text

Document-level Relation Extraction as Semantic Segmentation

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2021/551 ◽

2021 ◽

Author(s):

Ningyu Zhang ◽

Xiang Chen ◽

Xin Xie ◽

Shumin Deng ◽

Chuanqi Tan ◽

...

Keyword(s):

Computer Vision ◽

State Of The Art ◽

Relation Extraction ◽

Semantic Segmentation ◽

Experimental Results ◽

Context Information ◽

Global Information ◽

Benchmark Datasets ◽

Segmentation Task ◽

Document Level

Document-level relation extraction aims to extract relations among multiple entity pairs from a document. Previously proposed graph-based or transformer-based models utilize the entities independently, regardless of global information among relational triples. This paper approaches the problem by predicting an entity-level relation matrix to capture local and global information, parallel to the semantic segmentation task in computer vision. Herein, we propose a Document U-shaped Network for document-level relation extraction. Specifically, we leverage an encoder module to capture the context information of entities and a U-shaped segmentation module over the image-style feature map to capture global interdependency among triples. Experimental results show that our approach can obtain state-of-the-art performance on three benchmark datasets DocRED, CDR, and GDA.

Download Full-text

Sanitizing hidden activations for improving adversarial robustness of convolutional neural networks

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-210371 ◽

2021 ◽

pp. 1-11

Author(s):

Tianshi Mu ◽

Kequan Lin ◽

Huabing Zhang ◽

Jian Wang

Keyword(s):

Neural Networks ◽

Deep Learning ◽

Convolutional Neural Networks ◽

State Of The Art ◽

Black Box ◽

Experimental Results ◽

Amplification Effect ◽

Wide Range ◽

Adversarial Examples

Deep learning is gaining significant traction in a wide range of areas. Whereas, recent studies have demonstrated that deep learning exhibits the fatal weakness on adversarial examples. Due to the black-box nature and un-transparency problem of deep learning, it is difficult to explain the reason for the existence of adversarial examples and also hard to defend against them. This study focuses on improving the adversarial robustness of convolutional neural networks. We first explore how adversarial examples behave inside the network through visualization. We find that adversarial examples produce perturbations in hidden activations, which forms an amplification effect to fool the network. Motivated by this observation, we propose an approach, termed as sanitizing hidden activations, to help the network correctly recognize adversarial examples by eliminating or reducing the perturbations in hidden activations. To demonstrate the effectiveness of our approach, we conduct experiments on three widely used datasets: MNIST, CIFAR-10 and ImageNet, and also compare with state-of-the-art defense techniques. The experimental results show that our sanitizing approach is more generalized to defend against different kinds of attacks and can effectively improve the adversarial robustness of convolutional neural networks.

Download Full-text

Global and Local Attention-Based Free-Form Image Inpainting

Sensors ◽

10.3390/s20113204 ◽

2020 ◽

Vol 20 (11) ◽

pp. 3204

Author(s):

S. M. Nadim Uddin ◽

Yong Ju Jung

Keyword(s):

Neural Network ◽

Deep Learning ◽

Convolutional Neural Network ◽

State Of The Art ◽

Image Inpainting ◽

Experimental Results ◽

Free Form ◽

Local Similarity ◽

Global And Local ◽

Similarity Information

Deep-learning-based image inpainting methods have shown significant promise in both rectangular and irregular holes. However, the inpainting of irregular holes presents numerous challenges owing to uncertainties in their shapes and locations. When depending solely on convolutional neural network (CNN) or adversarial supervision, plausible inpainting results cannot be guaranteed because irregular holes need attention-based guidance for retrieving information for content generation. In this paper, we propose two new attention mechanisms, namely a mask pruning-based global attention module and a global and local attention module to obtain global dependency information and the local similarity information among the features for refined results. The proposed method is evaluated using state-of-the-art methods, and the experimental results show that our method outperforms the existing methods in both quantitative and qualitative measures.

Download Full-text

Building a Real-Time 2D Lidar Using Deep Learning

Journal of Robotics ◽

10.1155/2021/6652828 ◽

2021 ◽

Vol 2021 ◽

pp. 1-7

Author(s):

Nadim Arubai ◽

Omar Hamdoun ◽

Assef Jafar

Keyword(s):

Deep Learning ◽

Real Time ◽

Obstacle Avoidance ◽

Tilt Angle ◽

State Of The Art ◽

The State ◽

Prediction Problem ◽

Learning Methods ◽

Depth Prediction ◽

Avoidance Problem

Applying deep learning methods, this paper addresses depth prediction problem resulting from single monocular images. A vector of distances is predicted instead of a whole image matrix. A vector-only prediction decreases training overhead and prediction periods and requires less resources (memory, CPU). We propose a module which is more time efficient than the state-of-the-art modules ResNet, VGG, FCRN, and DORN. We enhanced the network results by training it on depth vectors from other levels (we get a new level by changing the Lidar tilt angle). The predicted results give a vector of distances around the robot, which is sufficient for the obstacle avoidance problem and many other applications.

Download Full-text

An interpretable framework for investigating the neighborhood effect in POI recommendation

PLoS ONE ◽

10.1371/journal.pone.0255685 ◽

2021 ◽

Vol 16 (8) ◽

pp. e0255685

Author(s):

Guangchao Yuan ◽

Munindar P. Singh ◽

Pradeep K. Murukannaiah

Keyword(s):

Deep Learning ◽

Matrix Factorization ◽

State Of The Art ◽

Experimental Results ◽

Neighborhood Effect ◽

Data Set ◽

Point Of Interest ◽

Poi Recommendation ◽

Geographical Characteristics

Geographical characteristics have been proven to be effective in improving the quality of point-of-interest (POI) recommendation. However, existing works on POI recommendation focus on cost (time or money) of travel for a user. An important geographical aspect that has not been studied adequately is the neighborhood effect, which captures a user’s POI visiting behavior based on the user’s preference not only to a POI, but also to the POI’s neighborhood. To provide an interpretable framework to fully study the neighborhood effect, first, we develop different sets of insightful features, representing different aspects of neighborhood effect. We employ a Yelp data set to evaluate how different aspects of the neighborhood effect affect a user’s POI visiting behavior. Second, we propose a deep learning–based recommendation framework that exploits the neighborhood effect. Experimental results show that our approach is more effective than two state-of-the-art matrix factorization–based POI recommendation techniques.

Download Full-text

On Determining Suitable Embedded Devices for Deep Learning Models

10.3233/faia210147 ◽

2021 ◽

Author(s):

Daniel Padilla ◽

Hatem A. Rashwan ◽

Domènec Savi Puig

Keyword(s):

Computer Vision ◽

Deep Learning ◽

Embedded Systems ◽

State Of The Art ◽

Image Data ◽

Important Change ◽

Learning Models ◽

Embedded Devices ◽

And Performance ◽

High Level

Deep learning (DL) networks have proven to be crucial in commercial solutions with computer vision challenges due to their abilities to extract high-level abstractions of the image data and their capabilities of being easily adapted to many applications. As a result, DL methodologies had become a de facto standard for computer vision problems yielding many new kinds of research, approaches and applications. Recently, the commercial sector is also driving to use of embedded systems to be able to execute DL models, which has caused an important change on the DL panorama and the embedded systems themselves. Consequently, in this paper, we attempt to study the state of the art of embedded systems, such as GPUs, FPGAs and Mobile SoCs, that are able to use DL techniques, to modernize the stakeholders with the new systems available in the market. Besides, we aim at helping them to determine which of these systems can be beneficial and suitable for their applications in terms of upgradeability, price, deployment and performance.

Download Full-text