A Survey on Deep Learning in Image Polarity Detection: Balancing Generalization Performances and Computational Costs

Edoardo Ragusa; Erik Cambria; Rodolfo Zunino; Paolo Gastaldo

doi:10.3390/electronics8070783

A Survey on Deep Learning in Image Polarity Detection: Balancing Generalization Performances and Computational Costs

Electronics ◽

10.3390/electronics8070783 ◽

2019 ◽

Vol 8 (7) ◽

pp. 783 ◽

Cited By ~ 8

Author(s):

Edoardo Ragusa ◽

Erik Cambria ◽

Rodolfo Zunino ◽

Paolo Gastaldo

Keyword(s):

Computational Cost ◽

Fine Tuning ◽

Deep Convolutional Neural Networks ◽

Computational Costs ◽

Advantages And Disadvantages ◽

Complex Information ◽

Large Sets ◽

Learning Techniques ◽

Feature Extractor ◽

Overall Performance

Deep convolutional neural networks (CNNs) provide an effective tool to extract complex information from images. In the area of image polarity detection, CNNs are customarily utilized in combination with transfer learning techniques to tackle a major problem: the unavailability of large sets of labeled data. Thus, polarity predictors in general exploit a pre-trained CNN as the feature extractor that in turn feeds a classification unit. While the latter unit is trained from scratch, the pre-trained CNN is subject to fine-tuning. As a result, the specific CNN architecture employed as the feature extractor strongly affects the overall performance of the model. This paper analyses state-of-the-art literature on image polarity detection and identifies the most reliable CNN architectures. Moreover, the paper provides an experimental protocol that should allow assessing the role played by the baseline architecture in the polarity detection task. Performance is evaluated in terms of both generalization abilities and computational complexity. The latter attribute becomes critical as polarity predictors, in the era of social networks, might need to be updated within hours or even minutes. In this regard, the paper gives practical hints on the advantages and disadvantages of the examined architectures both in terms of generalization and computational cost.

Download Full-text

Effective Filter Pruning Method Using Additional Downsampled Image for Global Pooling Applied CNN

International Journal of Pattern Recognition and Artificial Intelligence ◽

10.1142/s021800142154029x ◽

2021 ◽

pp. 2154029

Author(s):

Hyunseok Kim ◽

Bunyodbek Ibrokhimov ◽

Sanggil Kang

Keyword(s):

Evaluation Method ◽

Computational Cost ◽

Information Loss ◽

Deep Convolutional Neural Networks ◽

Embedded Devices ◽

Computational Costs ◽

Pruning Method ◽

Original Dataset ◽

High Resolution Images ◽

Deep Cnn

Deep Convolutional Neural Networks (CNNs) show remarkable performance in many areas. However, most of the applications require huge computational costs and massive memory, which are hard to obtain in devices with a relatively weak performance like embedded devices. To reduce the computational cost, and meantime, to preserve the performance of the trained deep CNN, we propose a new filter pruning method using an additional dataset derived by downsampling the original dataset. Our method takes advantage of the fact that information in high-resolution images is lost in the downsampling process. Each trained convolutional filter reacts differently to this information loss. Based on this, the importance of the filter is evaluated by comparing the gradient obtained from two different resolution images. We validate the superiority of our filter evaluation method using a VGG-16 model trained on CIFAR-10 and CUB-200-2011 datasets. The pruned network with our method shows an average of 2.66% higher accuracy in the latter dataset, compared to existing pruning methods when about 75% of the parameters are removed.

Download Full-text

Automatic Pharyngeal Phase Recognition in Untrimmed Videofluoroscopic Swallowing Study Using Transfer Learning with Deep Convolutional Neural Networks

Diagnostics ◽

10.3390/diagnostics11020300 ◽

2021 ◽

Vol 11 (2) ◽

pp. 300

Author(s):

Ki-Sun Lee ◽

Eunyoung Lee ◽

Bareun Choi ◽

Sung-Bom Pyun

Keyword(s):

Transfer Learning ◽

Fine Tuning ◽

Deep Convolutional Neural Networks ◽

Image Frame ◽

Pharyngeal Phase ◽

Single Frame ◽

Learning Techniques ◽

Learning Technique ◽

Deep Cnn ◽

Phase Recognition

Background: Video fluoroscopic swallowing study (VFSS) is considered as the gold standard diagnostic tool for evaluating dysphagia. However, it is time consuming and labor intensive for the clinician to manually search the recorded long video image frame by frame to identify the instantaneous swallowing abnormality in VFSS images. Therefore, this study aims to present a deep leaning-based approach using transfer learning with a convolutional neural network (CNN) that automatically annotates pharyngeal phase frames in untrimmed VFSS videos such that frames need not be searched manually. Methods: To determine whether the image frame in the VFSS video is in the pharyngeal phase, a single-frame baseline architecture based the deep CNN framework is used and a transfer learning technique with fine-tuning is applied. Results: Compared with all experimental CNN models, that fine-tuned with two blocks of the VGG-16 (VGG16-FT5) model achieved the highest performance in terms of recognizing the frame of pharyngeal phase, that is, the accuracy of 93.20 (±1.25)%, sensitivity of 84.57 (±5.19)%, specificity of 94.36 (±1.21)%, AUC of 0.8947 (±0.0269) and Kappa of 0.7093 (±0.0488). Conclusions: Using appropriate and fine-tuning techniques and explainable deep learning techniques such as grad CAM, this study shows that the proposed single-frame-baseline-architecture-based deep CNN framework can yield high performances in the full automation of VFSS video analysis.

Download Full-text

EVALUATING COMPUTATIONAL COSTS WHILE HANDLING DATA AND CONTROL PARALLELISM

Parallel Processing Letters ◽

10.1142/s0129626408003296 ◽

2008 ◽

Vol 18 (01) ◽

pp. 165-174 ◽

Cited By ~ 1

Author(s):

SONIA CAMPA

Keyword(s):

Computational Cost ◽

Data Access ◽

Parallel Application ◽

Semantic Framework ◽

Computational Costs ◽

Overall Performance ◽

And Control ◽

Rewriting Rules

The aim of this work is to introduce a computational costs system associated to a semantic framework for orthogonal data and control parallelism handling. In such a framework a parallel application is described by a semantic expression involving in an orthogonal manner both data access and control parallelism abstractions. The evaluation of such an expression is driven by a set of rewriting rules each of which is combined with a computational cost. We present how to proceed in the evaluation of the final cost of the application as well as how such information together with the semantic framework capabilities can be exploited to increase the overall performance.

Download Full-text

Deep-Feature-Based Approach to Marine Debris Classification

Applied Sciences ◽

10.3390/app11125644 ◽

2021 ◽

Vol 11 (12) ◽

pp. 5644

Author(s):

Ivana Marin ◽

Saša Mladenović ◽

Sven Gotovac ◽

Goran Zaharija

Keyword(s):

Marine Pollution ◽

Classification Performance ◽

Fine Tuning ◽

Marine Debris ◽

Support Vector ◽

Deep Convolutional Neural Networks ◽

Deep Feature ◽

Feature Extractor ◽

Feature Based ◽

The Given

The global community has recognized an increasing amount of pollutants entering oceans and other water bodies as a severe environmental, economic, and social issue. In addition to prevention, one of the key measures in addressing marine pollution is the cleanup of debris already present in marine environments. Deployment of machine learning (ML) and deep learning (DL) techniques can automate marine waste removal, making the cleanup process more efficient. This study examines the performance of six well-known deep convolutional neural networks (CNNs), namely VGG19, InceptionV3, ResNet50, Inception-ResNetV2, DenseNet121, and MobileNetV2, utilized as feature extractors according to three different extraction schemes for the identification and classification of underwater marine debris. We compare the performance of a neural network (NN) classifier trained on top of deep CNN feature extractors when the feature extractor is (1) fixed; (2) fine-tuned on the given task; (3) fixed during the first phase of training and fine-tuned afterward. In general, fine-tuning resulted in better-performing models but is much more computationally expensive. The overall best NN performance showed the fine-tuned Inception-ResNetV2 feature extractor with an accuracy of 91.40% and F1-score 92.08%, followed by fine-tuned InceptionV3 extractor. Furthermore, we analyze conventional ML classifiers’ performance when trained on features extracted with deep CNNs. Finally, we show that replacing NN with a conventional ML classifier, such as support vector machine (SVM) or logistic regression (LR), can further enhance the classification performance on new data.

Download Full-text

A Survey of Deep Convolutional Neural Networks Applied for Prediction of Plant Leaf Diseases

Sensors ◽

10.3390/s21144749 ◽

2021 ◽

Vol 21 (14) ◽

pp. 4749

Author(s):

Vijaypal Singh Dhaka ◽

Sangeeta Vaibhav Meena ◽

Geeta Rani ◽

Deepak Sinwar ◽

Kavita Kavita ◽

...

Keyword(s):

Neural Networks ◽

Deep Learning ◽

Convolutional Neural Networks ◽

Image Recognition ◽

Optimization Techniques ◽

Plant Diseases ◽

Deep Convolutional Neural Networks ◽

Plant Leaf ◽

Advantages And Disadvantages ◽

Learning Techniques

In the modern era, deep learning techniques have emerged as powerful tools in image recognition. Convolutional Neural Networks, one of the deep learning tools, have attained an impressive outcome in this area. Applications such as identifying objects, faces, bones, handwritten digits, and traffic signs signify the importance of Convolutional Neural Networks in the real world. The effectiveness of Convolutional Neural Networks in image recognition motivates the researchers to extend its applications in the field of agriculture for recognition of plant species, yield management, weed detection, soil, and water management, fruit counting, diseases, and pest detection, evaluating the nutrient status of plants, and much more. The availability of voluminous research works in applying deep learning models in agriculture leads to difficulty in selecting a suitable model according to the type of dataset and experimental environment. In this manuscript, the authors present a survey of the existing literature in applying deep Convolutional Neural Networks to predict plant diseases from leaf images. This manuscript presents an exemplary comparison of the pre-processing techniques, Convolutional Neural Network models, frameworks, and optimization techniques applied to detect and classify plant diseases using leaf images as a data set. This manuscript also presents a survey of the datasets and performance metrics used to evaluate the efficacy of models. The manuscript highlights the advantages and disadvantages of different techniques and models proposed in the existing literature. This survey will ease the task of researchers working in the field of applying deep learning techniques for the identification and classification of plant leaf diseases.

Download Full-text

TLBO-FLN: Teaching-Learning Based Optimization of Functional Link Neural Networks for Stock Closing Price Prediction

International Journal of Sensors Wireless Communications and Control ◽

10.2174/2210327909666191202113015 ◽

2020 ◽

Vol 10 (4) ◽

pp. 522-532 ◽

Cited By ~ 1

Author(s):

Sarat Chandra Nayak ◽

Subhranginee Das ◽

Mohammad Dilsad Ansari

Keyword(s):

Neural Networks ◽

Computational Cost ◽

Optimization Techniques ◽

Fine Tuning ◽

Functional Link ◽

Price Prediction ◽

Closing Price ◽

Teaching Learning Based Optimization ◽

Artificial Neural ◽

Teaching Learning

Background and Objective: Stock closing price prediction is enormously complicated. Artificial Neural Networks (ANN) are excellent approximation algorithms applied to this area. Several nature-inspired evolutionary optimization techniques are proposed and used in the literature to search the optimum parameters of ANN based forecasting models. However, most of them need fine-tuning of several control parameters as well as algorithm specific parameters to achieve optimal performance. Improper tuning of such parameters either leads toward additional computational cost or local optima. Methods: Teaching Learning Based Optimization (TLBO) is a newly proposed algorithm which does not necessitate any parameters specific to it. The intrinsic capability of Functional Link Artificial Neural Network (FLANN) to recognize the multifaceted nonlinear relationship present in the historical stock data made it popular and got wide applications in the stock market prediction. This article presents a hybrid model termed as Teaching Learning Based Optimization of Functional Neural Networks (TLBO-FLN) by combining the advantages of both TLBO and FLANN. Results and Conclusion: The model is evaluated by predicting the short, medium, and long-term closing prices of four emerging stock markets. The performance of the TLBO-FLN model is measured through Mean Absolute Percentage of Error (MAPE), Average Relative Variance (ARV), and coefficient of determination (R2); compared with that of few other state-of-the-art models similarly trained and found superior.

Download Full-text

Performance of Adaptive Unstructured Mesh Modelling in Idealized Advection Cases over Steep Terrains

Atmosphere ◽

10.3390/atmos9110444 ◽

2018 ◽

Vol 9 (11) ◽

pp. 444 ◽

Cited By ~ 2

Author(s):

Jinxi Li ◽

Jie Zheng ◽

Jiang Zhu ◽

Fangxin Fang ◽

Christopher. Pain ◽

...

Keyword(s):

Unstructured Mesh ◽

Computational Cost ◽

Adaptive Mesh ◽

Galerkin Finite Element Method ◽

Adaptive Meshes ◽

Galerkin Finite Element ◽

Computational Costs ◽

Cut Cell ◽

Discontinuous Galerkin Finite Element ◽

Terrain Following

Advection errors are common in basic terrain-following (TF) coordinates. Numerous methods, including the hybrid TF coordinate and smoothing vertical layers, have been proposed to reduce the advection errors. Advection errors are affected by the directions of velocity fields and the complexity of the terrain. In this study, an unstructured adaptive mesh together with the discontinuous Galerkin finite element method is employed to reduce advection errors over steep terrains. To test the capability of adaptive meshes, five two-dimensional (2D) idealized tests are conducted. Then, the results of adaptive meshes are compared with those of cut-cell and TF meshes. The results show that using adaptive meshes reduces the advection errors by one to two orders of magnitude compared to the cut-cell and TF meshes regardless of variations in velocity directions or terrain complexity. Furthermore, adaptive meshes can reduce the advection errors when the tracer moves tangentially along the terrain surface and allows the terrain to be represented without incurring in severe dispersion. Finally, the computational cost is analyzed. To achieve a given tagging criterion level, the adaptive mesh requires fewer nodes, smaller minimum mesh sizes, less runtime and lower proportion between the node numbers used for resolving the tracer and each wavelength than cut-cell and TF meshes, thus reducing the computational costs.

Download Full-text

Deep Learning Techniques for Grape Plant Species Identification in Natural Images

Sensors ◽

10.3390/s19224850 ◽

2019 ◽

Vol 19 (22) ◽

pp. 4850 ◽

Cited By ~ 6

Author(s):

Carlos S. Pereira ◽

Raul Morais ◽

Manuel J. C. S. Reis

Keyword(s):

Transfer Learning ◽

Climatic Conditions ◽

Fine Tuning ◽

Variety Identification ◽

Test Accuracy ◽

Accuracy Score ◽

Learning Techniques ◽

Four Corners ◽

Integrated Software ◽

Grape Varieties

Frequently, the vineyards in the Douro Region present multiple grape varieties per parcel and even per row. An automatic algorithm for grape variety identification as an integrated software component was proposed that can be applied, for example, to a robotic harvesting system. However, some issues and constraints in its development were highlighted, namely, the images captured in natural environment, low volume of images, high similarity of the images among different grape varieties, leaf senescence, and significant changes on the grapevine leaf and bunch images in the harvest seasons, mainly due to adverse climatic conditions, diseases, and the presence of pesticides. In this paper, the performance of the transfer learning and fine-tuning techniques based on AlexNet architecture were evaluated when applied to the identification of grape varieties. Two natural vineyard image datasets were captured in different geographical locations and harvest seasons. To generate different datasets for training and classification, some image processing methods, including a proposed four-corners-in-one image warping algorithm, were used. The experimental results, obtained from the application of an AlexNet-based transfer learning scheme and trained on the image dataset pre-processed through the four-corners-in-one method, achieved a test accuracy score of 77.30%. Applying this classifier model, an accuracy of 89.75% on the popular Flavia leaf dataset was reached. The results obtained by the proposed approach are promising and encouraging in helping Douro wine growers in the automatic task of identifying grape varieties.

Download Full-text

Multirate Processing with Selective Subbands and Machine Learning for Efficient Arrhythmia Classification

Sensors ◽

10.3390/s21041511 ◽

2021 ◽

Vol 21 (4) ◽

pp. 1511

Author(s):

Saeed Mian Qaisar ◽

Alaeddine Mihoub ◽

Moez Krichen ◽

Humaira Nisar

Keyword(s):

Machine Learning ◽

Signal Reconstruction ◽

Computational Cost ◽

Machine Learning Techniques ◽

Features Selection ◽

Frequency Content ◽

Fixed Rate ◽

Ecg Signals ◽

Learning Techniques ◽

Multirate Processing

The usage of wearable gadgets is growing in the cloud-based health monitoring systems. The signal compression, computational and power efficiencies play an imperative part in this scenario. In this context, we propose an efficient method for the diagnosis of cardiovascular diseases based on electrocardiogram (ECG) signals. The method combines multirate processing, wavelet decomposition and frequency content-based subband coefficient selection and machine learning techniques. Multirate processing and features selection is used to reduce the amount of information processed thus reducing the computational complexity of the proposed system relative to the equivalent fixed-rate solutions. Frequency content-dependent subband coefficient selection enhances the compression gain and reduces the transmission activity and computational cost of the post cloud-based classification. We have used MIT-BIH dataset for our experiments. To avoid overfitting and biasness, the performance of considered classifiers is studied by using five-fold cross validation (5CV) and a novel proposed partial blind protocol. The designed method achieves more than 12-fold computational gain while assuring an appropriate signal reconstruction. The compression gain is 13 times compared to fixed-rate counterparts and the highest classification accuracies are 97.06% and 92.08% for the 5CV and partial blind cases, respectively. Results suggest the feasibility of detecting cardiac arrhythmias using the proposed approach.

Download Full-text

Interpretable deep learning for the remote characterisation of ambulation in multiple sclerosis using smartphones

Scientific Reports ◽

10.1038/s41598-021-92776-x ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Andrew P. Creagh ◽

Florian Lipsmeier ◽

Michael Lindemann ◽

Maarten De Vos

Keyword(s):

Multiple Sclerosis ◽

Deep Learning ◽

Inertial Sensor ◽

Heterogeneous Data ◽

Fine Tuning ◽

Sensor Data ◽

Support Vector ◽

Deep Convolutional Neural Networks ◽

Healthcare Applications ◽

Feature Based

AbstractThe emergence of digital technologies such as smartphones in healthcare applications have demonstrated the possibility of developing rich, continuous, and objective measures of multiple sclerosis (MS) disability that can be administered remotely and out-of-clinic. Deep Convolutional Neural Networks (DCNN) may capture a richer representation of healthy and MS-related ambulatory characteristics from the raw smartphone-based inertial sensor data than standard feature-based methodologies. To overcome the typical limitations associated with remotely generated health data, such as low subject numbers, sparsity, and heterogeneous data, a transfer learning (TL) model from similar large open-source datasets was proposed. Our TL framework leveraged the ambulatory information learned on human activity recognition (HAR) tasks collected from wearable smartphone sensor data. It was demonstrated that fine-tuning TL DCNN HAR models towards MS disease recognition tasks outperformed previous Support Vector Machine (SVM) feature-based methods, as well as DCNN models trained end-to-end, by upwards of 8–15%. A lack of transparency of “black-box” deep networks remains one of the largest stumbling blocks to the wider acceptance of deep learning for clinical applications. Ensuing work therefore aimed to visualise DCNN decisions attributed by relevance heatmaps using Layer-Wise Relevance Propagation (LRP). Through the LRP framework, the patterns captured from smartphone-based inertial sensor data that were reflective of those who are healthy versus people with MS (PwMS) could begin to be established and understood. Interpretations suggested that cadence-based measures, gait speed, and ambulation-related signal perturbations were distinct characteristics that distinguished MS disability from healthy participants. Robust and interpretable outcomes, generated from high-frequency out-of-clinic assessments, could greatly augment the current in-clinic assessment picture for PwMS, to inform better disease management techniques, and enable the development of better therapeutic interventions.

Download Full-text