Internet of things image recognition system based on deep learning

Author(s):  
Jing Li ◽  
Xinfang Li ◽  
Yuwen Ning

At present, many exciting results have been achieved in the application of deep learning to image recognition. However, many problems remain to be overcome before deep learning can be used in practical applications such as image retrieval, image annotation, and image-text conversion. This paper studies the structure of deep learning, improves the commonly used training algorithms, and proposes two new neural network models for different application scenarios. It uses a Support Vector Machine (SVM) as the main classifier for Internet of Things image recognition and uses the paper's own database to train both the SVM and a CNN. The effectiveness of the two for image recognition is then tested, and the trained classifiers are used for image recognition. The results show that, in the labeled data set, the rank-1 accuracy of CNN is 85.77%, which is higher than 90.28% of the SVM method. In the detection data, CNN's rank-1 accuracy is 83.11%, which also exceeds SVM's 80.22%. SVM+CNN has a rank-1 accuracy of 84.69% on the detection data set. This shows that deep learning can map the feature representation of the image and the feature representation of the word to the same space, making the calculation of the similarity and correlation between the image and the text easier and more straightforward.
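The rank-1 accuracy quoted above can be illustrated with a minimal Python sketch. It assumes rank-1 accuracy means the fraction of images whose top-scoring class matches the true label; the function name, class names, and scores are hypothetical and are not taken from the paper.

```python
def rank1_accuracy(score_rows, true_labels):
    """Fraction of samples whose highest-scoring class equals the true label."""
    if not true_labels or len(score_rows) != len(true_labels):
        raise ValueError("need equally sized, non-empty scores and labels")
    hits = 0
    for scores, label in zip(score_rows, true_labels):
        # scores maps class name -> classifier score; take the top-ranked class.
        predicted = max(scores, key=scores.get)
        hits += predicted == label
    return hits / len(true_labels)


# Hypothetical classifier scores for three images (illustrative only).
example_scores = [
    {"cat": 0.7, "dog": 0.2, "car": 0.1},
    {"cat": 0.1, "dog": 0.3, "car": 0.6},
    {"cat": 0.5, "dog": 0.4, "car": 0.1},
]
example_labels = ["cat", "car", "dog"]
example_rank1 = rank1_accuracy(example_scores, example_labels)
```

Here the first two images are ranked correctly and the third is not, so the sketch yields two hits out of three.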

Electronics ◽  
2021 ◽  
Vol 10 (13) ◽  
pp. 1514
Author(s):  
Seung-Ho Lim ◽  
WoonSik William Suh ◽  
Jin-Young Kim ◽  
Sang-Young Cho

Optimizing hardware processors and systems to perform deep learning operations such as Convolutional Neural Networks (CNN) on resource-limited embedded devices is an active area of recent research. To run an optimized deep neural network model using the limited computational units and memory of an embedded device, it is necessary to quickly apply various configurations of hardware modules to various deep neural network models and find the optimal combination. The Electronic System Level (ESL) simulator based on SystemC is very useful for rapid hardware modeling and verification. In this paper, we designed and implemented a Deep Learning Accelerator (DLA) that performs Deep Neural Network (DNN) operations on a RISC-V Virtual Platform implemented in SystemC, in order to enable rapid and diverse analysis of deep learning operations in embedded devices based on the RISC-V processor, a recently emerging embedded processor. The developed RISC-V based DLA prototype can analyze the hardware requirements according to the CNN data set through the configuration of the CNN DLA architecture, and because RISC-V compiled software can run on the platform, it can execute a real neural network model such as Darknet. We ran the Darknet CNN model on the developed DLA prototype and confirmed that computational overhead and inference errors can be analyzed by examining the DLA architecture for various data sets.


Author(s):  
Shaoqiang Wang ◽  
Shudong Wang ◽  
Song Zhang ◽  
Yifan Wang

This work aims to detect dynamic EEG signals automatically in order to reduce the time cost of epilepsy diagnosis. In the recognition of electroencephalogram (EEG) signals of epilepsy, traditional machine learning and statistical methods require manual feature labeling and engineering in order to show excellent results on a single data set. Moreover, manually selected features may carry bias and cannot guarantee validity and extensibility on real-world data. In practical applications, deep learning methods can release people from feature engineering to a certain extent. As long as the focus is on improving data quality and quantity, the model can learn automatically and achieve better results. In addition, deep learning methods can extract many features that are difficult for humans to perceive, thereby making the algorithm more robust. Based on the design idea of the ResNeXt deep neural network, this paper designs a Time-ResNeXt network structure suitable for time-series EEG epilepsy detection. The accuracy of Time-ResNeXt in the detection of EEG epilepsy can reach 91.50%. The Time-ResNeXt network structure achieves very strong performance on the benchmark data set (the Bern-Barcelona data set) and has great potential for improving clinical practice.


2021 ◽  
Vol 2113 (1) ◽  
pp. 012045
Author(s):  
Chunlei Zhou ◽  
Xiangzhou Chen ◽  
Wenli Liu ◽  
Tianyu Dong ◽  
Huang Yun

As the number of traction substations increases year by year, manual inspections are gradually being replaced by unattended inspections. Target detection algorithms based on deep learning are increasingly used in intelligent inspections of power equipment. In practical applications, however, because the targets to be detected are small, the accuracy of the deep learning model decreases when the shooting angle is inclined and the lighting conditions are poor. This is because the algorithm's robustness is low, and the detection ability of the model is seriously affected when the angle or illumination differs greatly from that of the training samples. To address this, the feature fusion part of the YOLOv3 algorithm, the choice of loss function, and the size of the anchor boxes are improved, and the improved ASFF fusion method is used to classify various images of power equipment. Actual measurements and repeated experiments show that the proposed method can be effectively applied to image recognition of various power equipment, improves robustness, and greatly improves the image recognition efficiency for power equipment.


GEOMATICA ◽  
2021 ◽  
pp. 1-23
Author(s):  
Roholah Yazdan ◽  
Masood Varshosaz ◽  
Saied Pirasteh ◽  
Fabio Remondino

Automatic detection and recognition of traffic signs from images is an important topic in many applications. First, we segmented the images using a classification algorithm to delineate the areas where the signs are more likely to be found. In this regard, shadows, objects having similar colours, and extreme illumination changes can significantly affect the segmentation results. We propose a new shape-based algorithm to improve the accuracy of the segmentation. The algorithm works by incorporating the sign geometry to filter out the wrong pixels from the classification results. We performed several tests to compare the performance of our algorithm against that of popular techniques such as Support Vector Machine (SVM), K-Means, and K-Nearest Neighbours. In these tests, to overcome the unwanted illumination effects, the images are transformed into the following colour spaces: Hue, Saturation, and Intensity (HSI); YUV; normalized red green blue; and Gaussian. Among the traditional techniques used in this study, the best results were obtained with SVM applied to the images transformed into the Gaussian colour space. The comparison results also suggested that by adding the geometric constraints proposed in this study, the quality of sign image segmentation is improved by 10%–25%. We also compared the SVM classifier enhanced by incorporating the geometry of signs with a U-shaped deep learning algorithm. Results suggested that the performance of the two techniques is very close. The deep learning results could perhaps be improved if a more comprehensive data set were provided.


Sensors ◽  
2020 ◽  
Vol 20 (23) ◽  
pp. 6762
Author(s):  
Jung Hyuk Lee ◽  
Geon Woo Lee ◽  
Guiyoung Bong ◽  
Hee Jeong Yoo ◽  
Hong Kook Kim

Autism spectrum disorder (ASD) is a developmental disorder with a life-span disability. While diagnostic instruments have been developed and qualified based on the accuracy of the discrimination of children with ASD from typical development (TD) children, the stability of such procedures can be disrupted by limitations pertaining to time expenses and the subjectivity of clinicians. Consequently, automated diagnostic methods have been developed for acquiring objective measures of autism, and in various fields of research, vocal characteristics have not only been reported as distinctive characteristics by clinicians, but have also shown promising performance in several studies utilizing deep learning models based on the automated discrimination of children with ASD from children with TD. However, difficulties still exist in terms of the characteristics of the data, the complexity of the analysis, and the lack of arranged data caused by the low accessibility for diagnosis and the need to secure anonymity. In order to address these issues, we introduce a pre-trained feature extraction auto-encoder model and a joint optimization scheme, which can achieve robustness for widely distributed and unrefined data using a deep-learning-based method for the detection of autism that utilizes various models. By adopting this auto-encoder-based feature extraction and joint optimization in the extended version of the Geneva minimalistic acoustic parameter set (eGeMAPS) speech feature data set, we acquire improved performance in the detection of ASD in infants compared to the raw data set.


2021 ◽  
Author(s):  
Tomochika Fujisawa ◽  
Victor Noguerales ◽  
Emmanouil Meramveliotakis ◽  
Anna Papadopoulou ◽  
Alfried P Vogler

Complex bulk samples of invertebrates from biodiversity surveys present a great challenge for taxonomic identification, especially if obtained from unexplored ecosystems. High-throughput imaging combined with machine learning for rapid classification could overcome this bottleneck. Developing such procedures requires that taxonomic labels from an existing source data set are used for model training and prediction of an unknown target sample. Yet the feasibility of transfer learning for the classification of unknown samples remains to be tested. Here, we assess the efficiency of deep learning and domain transfer algorithms for family-level classification of below-ground bulk samples of Coleoptera from understudied forests of Cyprus. We trained neural network models with images from local surveys versus global databases of above-ground samples from tropical forests and evaluated how prediction accuracy was affected by: (a) the quality and resolution of images, (b) the size and complexity of the training set and (c) the transferability of identifications across very disparate source-target pairs that do not share any species or genera. Within-dataset classification accuracy reached 98% and depended on the number and quality of training images and on dataset complexity. The accuracy of between-datasets predictions was reduced to a maximum of 82% and depended greatly on the standardisation of the imaging procedure. When the source and target images were of similar quality and resolution, albeit from different faunas, the reduction of accuracy was minimal. Application of algorithms for domain adaptation significantly improved the prediction performance of models trained by non-standardised, low-quality images. Our findings demonstrate that existing databases can be used to train models and successfully classify images from unexplored biota, when the imaging conditions and classification algorithms are carefully considered. Also, our results provide guidelines for data acquisition and algorithmic development for high-throughput image-based biodiversity surveys.


2021 ◽  
Vol 38 (1) ◽  
pp. 1-11
Author(s):  
Hafzullah İş ◽  
Taner Tuncer

Detecting malicious account interaction in social networks is highly important for political, social and economic reasons. This paper analyzed the profile structure of social media users using their interaction data. A total of 10 parameters, including diameter, density, reciprocity, centrality and modularity, were used to comprehensively characterize the interactions of Twitter users. A new data set was then formed by visualizing the data obtained with these parameters. User profiles were classified using Convolutional Neural Network models with deep learning. Users were divided into active, passive and malicious classes. Success rates for the algorithms used in the classification were estimated based on the hyperparameters and application platforms. The best model had a success rate of 98.67%. The methodology demonstrated that Twitter user profiles can be classified successfully through user interaction-based parameters. It is expected that this paper will contribute to the published literature on behavioral analysis and the determination of malicious accounts in social networks.
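Two of the interaction parameters named above, density and reciprocity, can be sketched in a few lines of Python. The sketch assumes the common definitions for a directed graph without self-loops (density as edges divided by n(n-1), reciprocity as the share of edges that have a reverse edge); the paper does not state which definitions it uses, and the account names and edges below are hypothetical.

```python
def density(nodes, edges):
    """Directed-graph density: edges present divided by edges possible."""
    n = len(nodes)
    if n < 2:
        return 0.0
    return len(edges) / (n * (n - 1))


def reciprocity(edges):
    """Share of directed edges (u, v) for which (v, u) is also present."""
    if not edges:
        return 0.0
    mutual = sum(1 for (u, v) in edges if (v, u) in edges)
    return mutual / len(edges)


# Hypothetical interactions: an edge (u, v) means account u interacted with v.
example_nodes = {"a", "b", "c", "d"}
example_edges = {("a", "b"), ("b", "a"), ("a", "c"), ("c", "d")}
example_density = density(example_nodes, example_edges)
example_reciprocity = reciprocity(example_edges)
```

With four accounts and four edges, only the a-b pair is mutual, so half of the edges are reciprocated.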


Author(s):  
Taynan Ferreira ◽  
Francisco Paiva ◽  
Roberto Silva ◽  
Angel Paula ◽  
Anna Costa ◽  
...  

Sentiment analysis (SA) is growing in importance due to the enormous amount of opinionated textual data available today. Most research has investigated different models, feature representations and hyperparameters in SA classification tasks. However, few studies have evaluated the impact of these features on regression SA tasks. In this paper, we conduct such an assessment on a financial domain data set by investigating different feature representations and hyperparameters in two important models: Support Vector Regression (SVR) and Convolutional Neural Networks (CNN). We conclude by presenting the most relevant feature representations and hyperparameters and how they impact outcomes on a regression SA task.


2021 ◽  
Vol 11 (4) ◽  
pp. 1529
Author(s):  
Xiaohong Sun ◽  
Jinan Gu ◽  
Meimei Wang ◽  
Yanhua Meng ◽  
Huichao Shi

In the wheel hub industry, the quality control of the product surface determines the subsequent processing, and it can be realized through hub defect image recognition based on deep learning. Although existing deep learning methods have reached human-level performance, they rely on large-scale training sets and are completely unable to cope with situations in which no samples are available. Therefore, in this paper, a generalized zero-shot learning framework for hub defect image recognition was built. First, a reverse mapping strategy was adopted to reduce the hubness problem; then a domain adaptation measure was employed to alleviate the projection domain shift problem; and finally, a scaling calibration strategy was used to avoid the recognition preference for seen defects. The proposed model was validated using two data sets, VOC2007 and a self-built hub defect data set, and the results showed that the method performed better than current popular methods.
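The idea of calibrating against a preference for seen defects can be illustrated with a small Python sketch. This is not the paper's scaling calibration strategy, whose details the abstract does not give. It assumes non-negative compatibility scores and simply multiplies the scores of seen classes by a hypothetical factor gamma before picking the best class; the class names and values are invented.

```python
def calibrated_prediction(scores, seen_classes, gamma=0.8):
    """Pick the top class after down-scaling seen-class scores by gamma.

    Assumes non-negative scores; gamma = 1.0 disables the calibration.
    """
    if not 0.0 < gamma <= 1.0:
        raise ValueError("gamma must be in (0, 1]")
    adjusted = {
        label: score * gamma if label in seen_classes else score
        for label, score in scores.items()
    }
    return max(adjusted, key=adjusted.get)


# Hypothetical scores: "scratch" was seen in training, "pit" was not.
example_scores = {"scratch": 0.55, "pit": 0.50}
example_seen = {"scratch"}
uncalibrated = calibrated_prediction(example_scores, example_seen, gamma=1.0)
calibrated = calibrated_prediction(example_scores, example_seen, gamma=0.8)
```

Without calibration the seen class wins narrowly; with gamma = 0.8 its score drops to 0.44 and the unseen class is selected. In practice such a factor would have to be tuned on validation data.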


2020 ◽  
pp. bjophthalmol-2019-315600
Author(s):  
Yohei Hashimoto ◽  
Ryo Asaoka ◽  
Taichi Kiwaki ◽  
Hiroki Sugiura ◽  
Shotaro Asano ◽  
...  

Background/Aim: To train and validate the prediction performance of the deep learning (DL) model to predict visual field (VF) in central 10° from spectral domain optical coherence tomography (SD-OCT). Methods: This multicentre, cross-sectional study included paired Humphrey field analyser (HFA) 10-2 VF and SD-OCT measurements from 591 eyes of 347 patients with open-angle glaucoma (OAG) or normal subjects for the training data set. We trained a convolutional neural network (CNN) for predicting VF threshold (TH) sensitivity values from the thickness of the three macular layers: retinal nerve fibre layer, ganglion cell layer+inner plexiform layer and outer segment+retinal pigment epithelium. We implemented pattern-based regularisation on top of CNN to avoid overfitting. Using an external testing data set of 160 eyes of 131 patients with OAG, the prediction performance (absolute error (AE) and R² between predicted and actual TH values) was calculated for (1) mean TH in whole VF and (2) each TH of 68 points. For comparison, we trained support vector machine (SVM) and multiple linear regression (MLR). Results: AE of whole VF with CNN was 2.84±2.98 (mean±SD) dB, significantly smaller than those with SVM (5.65±5.12 dB) and MLR (6.96±5.38 dB) (all, p<0.001). Mean of point-wise mean AE with CNN was 5.47±3.05 dB, significantly smaller than those with SVM (7.96±4.63 dB) and MLR (11.71±4.15 dB) (all, p<0.001). R² with CNN was 0.74 for the mean TH of whole VF, and 0.44±0.24 for the overall 68 points. Conclusion: The DL model showed considerably accurate prediction of HFA 10-2 VF from SD-OCT.
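The two evaluation measures used above, absolute error and R2 between predicted and actual values, can be sketched as follows. The sketch assumes R2 is the coefficient of determination (1 minus residual sum of squares over total sum of squares); the study may instead use the squared correlation coefficient. The threshold values in dB are hypothetical and are not data from the study.

```python
def mean_absolute_error(actual, predicted):
    """Average of |actual - predicted| over all paired values."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("need equally sized, non-empty sequences")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("need equally sized, non-empty sequences")
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all actual values are equal")
    return 1 - ss_res / ss_tot


# Hypothetical threshold sensitivities in dB (illustrative only).
example_actual = [30.0, 25.0, 20.0, 10.0]
example_predicted = [28.0, 26.0, 18.0, 13.0]
example_mae = mean_absolute_error(example_actual, example_predicted)
example_r2 = r_squared(example_actual, example_predicted)
```

For these four points the absolute errors are 2, 1, 2 and 3 dB, giving a mean of 2 dB, and the residual and total sums of squares are 18 and 218.75.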

