Development of an Optimised Dataset for Training a Deep Neural Network

2021
Author(s):  
Callum Newman
Jon Petzing
Yee Mey Goh
Laura Justham

Artificial intelligence in computer vision has focused on improving test performance using techniques and architectures related to deep neural networks. However, improvements can also be achieved by carefully selecting the training dataset images. Environmental factors, such as light intensity, affect an image's appearance, and choosing optimal factor levels can improve the neural network's performance. However, little research is available into processes which help identify these optimal levels. This research presents a case study which uses a process for developing an optimised dataset for training an object detection neural network. Images are gathered under controlled conditions across multiple factors to construct various training datasets. Each dataset is used to train the same neural network, and the test performances are compared to identify the optimal factor levels. The opportunity to use synthetic images is also introduced, which has many advantages, including creating images when real-world images are unavailable and controlling factors more easily.
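The selection loop the abstract describes can be sketched as follows; the factor levels and test accuracies here are hypothetical placeholders, since in the real process each accuracy would come from training the same network on a dataset captured at that factor level.

```python
# Minimal sketch of the dataset-optimisation process: train the same network
# on datasets captured at different factor levels, then pick the level whose
# dataset gave the best test performance. All numbers are hypothetical.
def best_factor_level(results):
    """Return the factor level whose training dataset gave the highest test accuracy."""
    return max(results, key=results.get)

# Hypothetical light-intensity levels (lux) -> measured test accuracy.
light_intensity_results = {250: 0.81, 500: 0.88, 750: 0.93, 1000: 0.90}
print(best_factor_level(light_intensity_results))  # -> 750
```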

2021
Vol 2042 (1)
pp. 012002
Author(s):  
Roberto Castello
Alina Walch
Raphaël Attias
Riccardo Cadei
Shasha Jiang
...  

Abstract The integration of solar technology in the built environment is realized mainly through rooftop-installed panels. In this paper, we leverage state-of-the-art machine learning and computer vision techniques applied to overhead images to provide a geo-localization of the rooftop surfaces available for solar panel installation. We further exploit a 3D building database to associate these surfaces with the corresponding roof geometries by means of a geospatial post-processing approach. The stand-alone Convolutional Neural Network used to segment suitable rooftop areas reaches an intersection over union of 64% and an accuracy of 93%, while a post-processing step using the building database improves the rejection of false positives. The model is applied to a case study area in the canton of Geneva, and the results are compared with another recent method from the literature for deriving the realistic available area.
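The two segmentation metrics quoted above can be computed from pixel-level confusion counts; the counts below are hypothetical stand-ins, not values from the paper.

```python
# Sketch of the reported segmentation metrics: intersection over union for
# the positive (rooftop) class, and overall pixel accuracy.
def iou(tp, fp, fn):
    """Intersection over union: overlap of prediction and ground truth."""
    return tp / (tp + fp + fn)

def accuracy(tp, tn, total):
    """Fraction of all pixels classified correctly."""
    return (tp + tn) / total

tp, fp, fn, tn = 640, 180, 180, 9000  # hypothetical pixel counts
print(round(iou(tp, fp, fn), 2))                       # -> 0.64
print(round(accuracy(tp, tn, tp + fp + fn + tn), 2))   # -> 0.96
```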


2020
Vol 23 (6)
pp. 1172-1191
Author(s):  
Artem Aleksandrovich Elizarov
Evgenii Viktorovich Razinkov

Recently, reinforcement learning has become an actively developing area of machine learning. As a consequence, attempts are being made to apply reinforcement learning to computer vision problems, in particular to image classification; computer vision tasks are currently among the most pressing in artificial intelligence. The article proposes a method for image classification using a deep neural network trained with reinforcement learning. The idea of the developed method comes down to solving a contextual multi-armed bandit problem using various strategies for balancing exploration and exploitation together with reinforcement learning algorithms. Strategies such as ε-greedy, softmax, decay-softmax, and the UCB1 method are considered, along with reinforcement learning algorithms such as DQN, REINFORCE, and A2C. The influence of various parameters on the efficiency of the method is analyzed, and options for further development of the method are proposed.
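Two of the exploration strategies named above can be sketched for a generic multi-armed bandit; the arm values and pull counts are hypothetical, and the contextual part (conditioning on the image) is omitted for brevity.

```python
import math
import random

def epsilon_greedy(values, epsilon=0.1):
    """With probability epsilon pick a random arm (explore), else the best (exploit)."""
    if random.random() < epsilon:
        return random.randrange(len(values))
    return max(range(len(values)), key=lambda a: values[a])

def ucb1(values, counts, t):
    """UCB1: choose the arm maximizing mean value plus a confidence bonus
    that shrinks as the arm is pulled more often."""
    return max(range(len(values)),
               key=lambda a: values[a] + math.sqrt(2 * math.log(t) / counts[a]))

values = [0.2, 0.5, 0.4]   # estimated mean reward per arm
counts = [10, 3, 8]        # times each arm has been chosen
print(ucb1(values, counts, t=21))  # -> 1 (under-explored arm wins the bonus)
```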


2020
Vol 3 (1)
pp. 138-146
Author(s):  
Subash Pandey
Rabin Kumar Dhamala
Bikram Karki
Saroj Dahal
Rama Bastola

Automatically generating a natural language description of an image is a major challenge in the field of artificial intelligence. Generating image descriptions brings together two fields: Natural Language Processing and Computer Vision. There are two types of approaches, top-down and bottom-up. In this paper, we take a top-down approach that starts from the image and converts it into words. The image is passed to a Convolutional Neural Network (CNN) encoder, and its output is fed to a Recurrent Neural Network (RNN) decoder that generates meaningful captions. We generated image descriptions by passing real-time images from a smartphone camera as well as test images from the dataset. To evaluate model performance, we used the BLEU (Bilingual Evaluation Understudy) score, matching predicted words against the original captions.
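The BLEU-style matching used for evaluation can be sketched as a modified n-gram precision, which clips each predicted n-gram's count by its count in the reference; the captions below are hypothetical examples, and a full BLEU score would also combine several n-gram orders with a brevity penalty.

```python
from collections import Counter

def modified_precision(predicted, reference, n=1):
    """Fraction of predicted n-grams that appear in the reference,
    with each n-gram's credit clipped by its reference count."""
    pred_ngrams = Counter(tuple(predicted[i:i + n]) for i in range(len(predicted) - n + 1))
    ref_ngrams = Counter(tuple(reference[i:i + n]) for i in range(len(reference) - n + 1))
    clipped = sum(min(c, ref_ngrams[g]) for g, c in pred_ngrams.items())
    return clipped / max(sum(pred_ngrams.values()), 1)

predicted = "a dog runs on the the grass".split()
reference = "a dog is running on the grass".split()
print(round(modified_precision(predicted, reference), 2))  # -> 0.71
```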


Author(s):  
N.A. Yanishevskaya
I.P. Bolodurina

In the Russian Federation, the agro-industrial complex is one of the leading sectors of the economy, contributing 4.5% of domestic product. Russia owns 10% of all arable land in the world. According to data on sown areas by crop in 2020, most of Russia's agricultural area is occupied by wheat. The Russian Federation ranks third among the leading countries in the production of this grain crop and holds leading positions in its export. Brown (leaf) and linear (stem) rust is the most harmful disease of grain crops: it causes sparseness of wheat stands and leads to a sharp decrease in yield. Therefore, one of the main tasks of farmers is to protect the crop from disease. Such areas of artificial intelligence as computer vision, machine learning and deep learning can cope with this task; these technologies make it possible to solve applied problems of the agro-industrial complex through automated analysis of photographic materials. Aim. To consider the application of computer vision methods to the problem of classifying lesions of cultivated plants, using wheat as an example. Materials and methods. The CGIAR Computer Vision for Crop Disease dataset for the crop disease recognition task is taken from the open source Kaggle. An approach to recognizing lesions of cultivated plants is proposed using the well-known neural network models ResNet50, DenseNet169, VGG16 and EfficientNet-B0. The neural network models receive images of wheat as input and output the class of plant damage. To overcome overfitting, various regularization techniques are investigated. Results. Classification quality is reported using the F1-score metric, the harmonic mean of Precision and Recall. Conclusion.
As a result of the conducted research, it was found that the DenseNet model showed the best recognition accuracy, using a combination of transfer learning with Dropout and L2 regularization to overcome overfitting. This approach achieved a recognition accuracy of 91%.
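The F1-score the abstract reports on is the harmonic mean of precision and recall; a minimal sketch, with hypothetical precision and recall values:

```python
# F1-score: the harmonic mean of precision and recall, as used above.
def f1_score(precision, recall):
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Hypothetical values: the harmonic mean punishes imbalance between the two.
print(round(f1_score(0.8, 1.0), 3))  # -> 0.889 (below the arithmetic mean 0.9)
```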


Author(s):  
K. P. Moholkar, et al.

The ability of a computer system to understand its surroundings and to process information by thinking like a human being has always been a major focus in the field of Computer Science. One way to approach this artificial intelligence is Visual Question Answering. Visual Question Answering (VQA) is a trained system which can answer questions associated with a given image in natural language. VQA is a generalized system which can be used in any image-based scenario given adequate training on the relevant data. This is achieved with the help of neural networks, particularly the Convolutional Neural Network (CNN) and the Recurrent Neural Network (RNN). In this study, we compare different approaches to VQA, of which we explore a CNN-based model. With continued progress in computer vision and question answering systems, Visual Question Answering is becoming an essential system which can handle multiple scenarios with their respective data.
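A common VQA baseline along the lines of the CNN+RNN pipeline described above fuses an image feature vector with a question encoding and scores candidate answers; the sketch below uses elementwise-product fusion, and all vectors, answer weights, and answer labels are hypothetical stand-ins.

```python
# Minimal VQA-style sketch: fuse CNN image features with an RNN question
# encoding by elementwise product, then pick the highest-scoring answer.
def fuse_and_answer(image_feat, question_feat, answer_weights, answers):
    fused = [i * q for i, q in zip(image_feat, question_feat)]
    scores = [sum(f * w for f, w in zip(fused, ws)) for ws in answer_weights]
    return answers[max(range(len(scores)), key=scores.__getitem__)]

image_feat = [0.9, 0.1, 0.4]      # e.g. pooled CNN features (hypothetical)
question_feat = [1.0, 0.2, 0.0]   # e.g. final RNN hidden state (hypothetical)
answer_weights = [[1, 0, 0],      # scoring row per candidate answer
                  [0, 1, 0]]
print(fuse_and_answer(image_feat, question_feat, answer_weights, ["yes", "no"]))  # -> yes
```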


Author(s):  
Abaikesh Sharma

Human faces have a rich variety of characteristics, which makes facial expressions difficult to analyze. Automated real-time emotion recognition from facial expressions is a computer vision task, and such a system is an important and interesting interface between humans and computers. In this investigation, an environment is created which is capable of analyzing a person's emotions from real-time facial gestures with the help of a Deep Neural Network. It can detect the facial expression in any image, real or animated, after facial feature extraction (muscle position, eye expression and lip position). The system classifies images of human faces into seven discrete emotion categories using Convolutional Neural Networks (CNNs). This type of environment is important for social interaction.
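The final classification step of such a system can be sketched as a softmax over seven output logits mapped to discrete emotion labels; the label set and logit values below are hypothetical illustrations, not taken from the paper.

```python
import math

# Hypothetical seven-category label set for the sketch.
EMOTIONS = ["angry", "disgust", "fear", "happy", "sad", "surprise", "neutral"]

def softmax(logits):
    """Turn raw CNN output logits into a probability distribution."""
    exps = [math.exp(x - max(logits)) for x in logits]  # shift for stability
    total = sum(exps)
    return [e / total for e in exps]

def predict_emotion(logits):
    probs = softmax(logits)
    return EMOTIONS[max(range(len(probs)), key=probs.__getitem__)]

print(predict_emotion([0.1, -1.2, 0.3, 2.5, 0.0, 0.4, 1.1]))  # -> happy
```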


2020
Vol 96 (3s)
pp. 585-588
Author(s):  
С.Е. Фролова ◽  
Е.С. Янакова

Methods are proposed for building prototyping platforms for high-performance systems-on-chip for artificial intelligence tasks. The requirements for platforms of this class and the principles for adapting an SoC design for implementation in a prototype are described, as well as methods for debugging designs on the prototyping platform. Results are presented for computer vision algorithms using neural network technologies running on the FPGA prototype of the ELcore semantic cores.


2021
Vol 12 (1)
Author(s):  
Florian Stelzer
André Röhm
Raul Vicente
Ingo Fischer
Serhiy Yanchuk

Abstract Deep neural networks are among the most widely applied machine learning tools showing outstanding performance in a broad range of tasks. We present a method for folding a deep neural network of arbitrary size into a single neuron with multiple time-delayed feedback loops. This single-neuron deep neural network comprises only a single nonlinearity and appropriately adjusted modulations of the feedback signals. The network states emerge in time as a temporal unfolding of the neuron’s dynamics. By adjusting the feedback-modulation within the loops, we adapt the network’s connection weights. These connection weights are determined via a back-propagation algorithm, where both the delay-induced and local network connections must be taken into account. Our approach can fully represent standard Deep Neural Networks (DNN), encompasses sparse DNNs, and extends the DNN concept toward dynamical systems implementations. The new method, which we call Folded-in-time DNN (Fit-DNN), exhibits promising performance in a set of benchmark tasks.
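A highly simplified, discrete-time illustration of the folding idea: the hidden and output units of a small two-layer network are evaluated one after another by a single neuron (one shared nonlinearity), with earlier states made available again as "delayed" feedback. The continuous-time delay dynamics and feedback modulation of the actual Fit-DNN are not modelled here, and the weights are hypothetical.

```python
import math

f = math.tanh  # the single nonlinearity shared by every "time step"

def standard_forward(x, W1, w2):
    """Ordinary two-layer forward pass: all hidden units in parallel."""
    hidden = [f(sum(w * xi for w, xi in zip(row, x))) for row in W1]
    return f(sum(w * h for w, h in zip(w2, hidden)))

def folded_forward(x, W1, w2):
    """Same computation unfolded in time: one neuron evaluates each hidden
    node sequentially, then reads its own past states back to form the output."""
    states = []                      # past states, reachable via delayed feedback
    for row in W1:                   # time steps 1..N: one hidden node per step
        states.append(f(sum(w * xi for w, xi in zip(row, x))))
    return f(sum(w * s for w, s in zip(w2, states)))  # final step: output node

x = [0.5, -0.3]
W1 = [[0.2, 0.8], [-0.5, 0.1]]
w2 = [1.0, -0.7]
assert abs(standard_forward(x, W1, w2) - folded_forward(x, W1, w2)) < 1e-12
print("parallel and folded outputs match")
```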


2021
Vol 3 (1)
Author(s):  
Mohammed Aliy Mohammed
Fetulhak Abdurahman
Yodit Abebe Ayalew

Abstract Background Automating cytology-based cervical cancer screening could alleviate the shortage of skilled pathologists in developing countries. Up until now, computer vision experts have attempted numerous semi- and fully automated approaches to address the need, and leveraging the accuracy and reproducibility of deep neural networks has become common among them. In this regard, the purpose of this study is to classify single-cell Pap smear (cytology) images using pre-trained deep convolutional neural network (DCNN) image classifiers. We have fine-tuned the top ten pre-trained DCNN image classifiers and evaluated them using five-class single-cell Pap smear images from the SIPaKMeD dataset. The pre-trained DCNN image classifiers were selected from Keras Applications based on their top-1 accuracy. Results Our experiments demonstrated that, of the selected top-ten pre-trained DCNN image classifiers, DenseNet169 performed best, with an average accuracy, precision, recall, and F1-score of 0.990, 0.974, 0.974, and 0.974, respectively. Moreover, it surpassed the benchmark accuracy proposed by the creators of the dataset by 3.70%. Conclusions Even though DenseNet169 is small compared to the other pre-trained DCNN image classifiers tested, it is still not suitable for mobile or edge devices. Further experimentation with mobile or small-size DCNN image classifiers is required to extend the applicability of the models to real-world demands. In addition, since all experiments used the SIPaKMeD dataset, additional experiments will be needed with new datasets to establish the generalizability of the models.
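The averaged per-class metrics reported above can be computed from a multi-class confusion matrix by macro-averaging; the sketch below uses a hypothetical 3-class matrix rather than the five Pap smear classes.

```python
# Macro-averaged precision and recall from a confusion matrix
# (rows: true class, columns: predicted class). Matrix is hypothetical.
def macro_precision_recall(cm):
    n = len(cm)
    precisions, recalls = [], []
    for c in range(n):
        tp = cm[c][c]
        predicted = sum(cm[r][c] for r in range(n))  # column sum: predicted as c
        actual = sum(cm[c])                          # row sum: truly c
        precisions.append(tp / predicted if predicted else 0.0)
        recalls.append(tp / actual if actual else 0.0)
    return sum(precisions) / n, sum(recalls) / n

cm = [[8, 1, 1],
      [0, 9, 1],
      [1, 0, 9]]
p, r = macro_precision_recall(cm)
print(round(p, 3), round(r, 3))  # -> 0.869 0.867
```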


2021
Author(s):  
Daniil A. Boiko
Evgeniy O. Pentsak
Vera A. Cherepanova
Evgeniy G. Gordeev
Valentine P. Ananikov

Defectiveness of the carbon material surface is a key issue for many applications. SEM imaging of Pd nanoparticles was used to highlight “hidden” defects, and the images were analyzed by neural networks to solve order/disorder classification and defect segmentation tasks.

