A Study on the Detection of Cattle in UAV Images Using Deep Learning

Jayme Garcia Arnal Barbedo; Luciano Vieira Koenigkan; Thiago Teixeira Santos; Patrícia Menezes Santos

doi:10.3390/s19245436

A Study on the Detection of Cattle in UAV Images Using Deep Learning

Sensors ◽

10.3390/s19245436 ◽

2019 ◽

Vol 19 (24) ◽

pp. 5436 ◽

Cited By ~ 7

Author(s):

Jayme Garcia Arnal Barbedo ◽

Luciano Vieira Koenigkan ◽

Thiago Teixeira Santos ◽

Patrícia Menezes Santos

Keyword(s):

Deep Learning ◽

Bos Taurus ◽

Relevant Information ◽

Learning Technologies ◽

Aerial Images ◽

Ground Sample ◽

Bos Taurus Indicus ◽

Uav Images ◽

Production Areas ◽

Animal Detection

Unmanned aerial vehicles (UAVs) are being increasingly viewed as valuable tools to aid the management of farms. This kind of technology can be particularly useful in the context of extensive cattle farming, as production areas tend to be expansive and animals tend to be more loosely monitored. With the advent of deep learning, and convolutional neural networks (CNNs) in particular, extracting relevant information from aerial images has become more effective. Despite the technological advancements in drone, imaging and machine learning technologies, the application of UAVs for cattle monitoring is far from being thoroughly studied, with many research gaps still remaining. In this context, the objectives of this study were threefold: (1) to determine the highest possible accuracy that could be achieved in the detection of animals of the Canchim breed, which is visually similar to the Nelore breed (Bos taurus indicus); (2) to determine the ideal ground sample distance (GSD) for animal detection; (3) to determine the most accurate CNN architecture for this specific problem. The experiments involved 1853 images containing 8629 samples of animals, and 15 different CNN architectures were tested. A total of 900 models were trained (15 CNN architectures × 3 spacial resolutions × 2 datasets × 10-fold cross validation), allowing for a deep analysis of the several aspects that impact the detection of cattle using aerial images captured using UAVs. Results revealed that many CNN architectures are robust enough to reliably detect animals in aerial images even under far from ideal conditions, indicating the viability of using UAVs for cattle monitoring.

Download Full-text

A Study on the Detection of Cattle in UAV Images Using Deep Learning

10.20944/preprints201912.0089.v1 ◽

2019 ◽

Author(s):

Jayme Garcia Arnal Barbedo ◽

Luciano Vieira Koenigkan ◽

Thiago Teixeira Santos ◽

Patrícia Menezes Santos

Keyword(s):

Deep Learning ◽

Bos Taurus ◽

Relevant Information ◽

Learning Technologies ◽

Aerial Images ◽

Ground Sample ◽

Bos Taurus Indicus ◽

Uav Images ◽

Production Areas ◽

Animal Detection

Unmanned Aerial Vehicles (UAVs) are being increasingly viewed as valuable tools to aid the management of farms. This kind of technology can be particularly useful in the context of extensive cattle farming, as production areas tend to be expansive and animals tend to be more loosely monitored. With the advent of deep learning, and Convolutional Neural Networks (CNNs) in particular, extracting relevant information from aerial images has become more effective. Despite the technological advancements in drone, imaging and machine learning technologies, the application of UAVs for cattle monitoring is far from being thoroughly studied, with many research gaps still remaining. In this context, the objectives of this study were threefold: 1) to determine the highest possible accuracy that could be achieved in the detection of animals of the Canchim breed, which is visually similar to the Nelore breed (\textit{Bos taurus indicus}); 2) to determine the ideal Ground Sample Distance (GSD) for animal detection; 3) to determine the most accurate CNN architecture for this specific problem. The experiments involved 1,853 images containing 8,629 samples of animals, and 15 different CNN architectures were tested. A total of 900 models were trained (15 CNN architectures * 3 spacial resolutions * 2 datasets * 10-fold cross validation), allowing for a deep analysis of the several aspects that impact the detection of cattle using aerial images captured using UAVs. Results revealed that many CNN architectures are robust enough to reliably detect animals in aerial images even under far from ideal conditions, indicating the viability of using UAVs for cattle monitoring.

Download Full-text

DEEP LEARNING FOR MONOCULAR DEPTH ESTIMATION FROM UAV IMAGES

ISPRS Annals of Photogrammetry Remote Sensing and Spatial Information Sciences ◽

10.5194/isprs-annals-v-2-2020-451-2020 ◽

2020 ◽

Vol V-2-2020 ◽

pp. 451-458

Author(s):

L. Madhuanand ◽

F. Nex ◽

M. Y. Yang

Keyword(s):

Deep Learning ◽

Ground Level ◽

Depth Estimation ◽

Aerial Images ◽

Aerial Image ◽

Depth Information ◽

Single Image ◽

Monocular Depth ◽

Uav Images ◽

Image Depth

Abstract. Depth is an essential component for various scene understanding tasks and for reconstructing the 3D geometry of the scene. Estimating depth from stereo images requires multiple views of the same scene to be captured which is often not possible when exploring new environments with a UAV. To overcome this monocular depth estimation has been a topic of interest with the recent advancements in computer vision and deep learning techniques. This research has been widely focused on indoor scenes or outdoor scenes captured at ground level. Single image depth estimation from aerial images has been limited due to additional complexities arising from increased camera distance, wider area coverage with lots of occlusions. A new aerial image dataset is prepared specifically for this purpose combining Unmanned Aerial Vehicles (UAV) images covering different regions, features and point of views. The single image depth estimation is based on image reconstruction techniques which uses stereo images for learning to estimate depth from single images. Among the various available models for ground-level single image depth estimation, two models, 1) a Convolutional Neural Network (CNN) and 2) a Generative Adversarial model (GAN) are used to learn depth from aerial images from UAVs. These models generate pixel-wise disparity images which could be converted into depth information. The generated disparity maps from these models are evaluated for its internal quality using various error metrics. The results show higher disparity ranges with smoother images generated by CNN model and sharper images with lesser disparity range generated by GAN model. The produced disparity images are converted to depth information and compared with point clouds obtained using Pix4D. It is found that the CNN model performs better than GAN and produces depth similar to that of Pix4D. This comparison helps in streamlining the efforts to produce depth from a single aerial image.

Download Full-text

Self-Organizing Deep Learning (SO-UNet)—A Novel Framework to Classify Urban and Peri-Urban Forests

Sustainability ◽

10.3390/su13105548 ◽

2021 ◽

Vol 13 (10) ◽

pp. 5548

Author(s):

Mohamad M. Awad ◽

Marco Lauteri

Keyword(s):

Remote Sensing ◽

Deep Learning ◽

Spatial Resolution ◽

Forest Type ◽

Urban Forests ◽

Learning Technologies ◽

Aerial Images ◽

Hyper Spectral ◽

Type Classification ◽

Self Organizing

Forest-type classification is a very complex and difficult subject. The complexity increases with urban and peri-urban forests because of the variety of features that exist in remote sensing images. The success of forest management that includes forest preservation depends strongly on the accuracy of forest-type classification. Several classification methods are used to map urban and peri-urban forests and to identify healthy and non-healthy ones. Some of these methods have shown success in the classification of forests where others failed. The successful methods used specific remote sensing data technology, such as hyper-spectral and very high spatial resolution (VHR) images. However, both VHR and hyper-spectral sensors are very expensive, and hyper-spectral sensors are not widely available on satellite platforms, unlike multi-spectral sensors. Moreover, aerial images are limited in use, very expensive, and hard to arrange and manage. To solve the aforementioned problems, an advanced method, self-organizing–deep learning (SO-UNet), was created to classify forests in the urban and peri-urban environment using multi-spectral, multi-temporal, and medium spatial resolution Sentinel-2 images. SO-UNet is a combination of two different machine learning technologies: artificial neural network unsupervised self-organizing maps and deep learning UNet. Many experiments have been conducted, and the results showed that SO-UNet overwhelms UNet significantly. The experiments encompassed different settings for the parameters that control the algorithms.

Download Full-text

VddNet: Vine Disease Detection Network Based on Multispectral Images and Depth Map

Remote Sensing ◽

10.3390/rs12203305 ◽

2020 ◽

Vol 12 (20) ◽

pp. 3305

Author(s):

Mohamed Kerkech ◽

Adel Hafiane ◽

Raphael Canals

Keyword(s):

Deep Learning ◽

Precision Agriculture ◽

Depth Map ◽

Semantic Segmentation ◽

Learning Technologies ◽

Disease Detection ◽

Registration Method ◽

Multispectral Data ◽

3D Processing ◽

Uav Images

Vine pathologies generate several economic and environmental problems, causing serious difficulties for the viticultural activity. The early detection of vine disease can significantly improve the control of vine diseases and avoid spread of virus or fungi. Currently, remote sensing and artificial intelligence technologies are emerging in the field of precision agriculture. They offer interesting potential for crop disease management. However, despite the advances in these technologies, particularly deep learning technologies, many problems still present considerable challenges, such as semantic segmentation of images for disease mapping. In this paper, we present a new deep learning architecture called Vine Disease Detection Network (VddNet). It is based on three parallel auto-encoders integrating different information (i.e., visible, infrared and depth). Then, the decoder reconstructs and retrieves the features, and assigns a class to each output pixel. An orthophotos registration method is also proposed to align the three types of images and enable the processing by VddNet. The proposed architecture is assessed by comparing it with the most known architectures: SegNet, U-Net, DeepLabv3+ and PSPNet. The deep learning architectures were trained on multispectral data from an unmanned aerial vehicle (UAV) and depth map information extracted from 3D processing. The results of the proposed architecture show that the VddNet architecture achieves higher scores than the baseline methods. Moreover, this study demonstrates that the proposed method has many advantages compared to methods that directly use the UAV images.

Download Full-text

Deep-Learning-Based Automated Palm Tree Counting and Geolocation in Large Farms from Aerial Geotagged Images

Agronomy ◽

10.3390/agronomy11081458 ◽

2021 ◽

Vol 11 (8) ◽

pp. 1458

Author(s):

Adel Ammar ◽

Anis Koubaa ◽

Bilel Benjdira

Keyword(s):

Deep Learning ◽

Geographical Location ◽

Network Models ◽

Aerial Images ◽

Palm Tree ◽

Average Precision ◽

Neural Network Models ◽

Learning Framework ◽

Palm Trees ◽

Uav Images

In this paper, we propose an original deep learning framework for the automated counting and geolocation of palm trees from aerial images using convolutional neural networks. For this purpose, we collected aerial images from two different regions in Saudi Arabia, using two DJI drones, and we built a dataset of around 11,000 instances of palm trees. Then, we applied several recent convolutional neural network models (Faster R-CNN, YOLOv3, YOLOv4, and EfficientDet) to detect palms and other trees, and we conducted a complete comparative evaluation in terms of average precision and inference speed. YOLOv4 and EfficientDet-D5 yielded the best trade-off between accuracy and speed (up to 99% mean average precision and 7.4 FPS). Furthermore, using the geotagged metadata of aerial images, we used photogrammetry concepts and distance corrections to automatically detect the geographical location of detected palm trees. This geolocation technique was tested on two different types of drones (DJI Mavic Pro and Phantom 4 pro) and was assessed to provide an average geolocation accuracy that attains 1.6 m. This GPS tagging allows us to uniquely identify palm trees and count their number from a series of drone images, while correctly dealing with the issue of image overlapping. Moreover, this innovative combination between deep learning object detection and geolocalization can be generalized to any other objects in UAV images.

Download Full-text

Battery-Powered Wild Animal Detection Nodes with Deep Learning

IEICE Transactions on Communications ◽

10.1587/transcom.2020sep0004 ◽

2020 ◽

Vol E103.B (12) ◽

pp. 1394-1402

Author(s):

Hiroshi SAITO ◽

Tatsuki OTAKE ◽

Hayato KATO ◽

Masayuki TOKUTAKE ◽

Shogo SEMBA ◽

...

Keyword(s):

Deep Learning ◽

Wild Animal ◽

Animal Detection

Download Full-text

U-Infuse: Democratization of Customizable Deep Learning for Object Detection

Sensors ◽

10.3390/s21082611 ◽

2021 ◽

Vol 21 (8) ◽

pp. 2611

Author(s):

Andrew Shepley ◽

Greg Falzon ◽

Christopher Lawson ◽

Paul Meek ◽

Paul Kwan

Keyword(s):

Deep Learning ◽

Intellectual Property ◽

Object Detection ◽

Image Data ◽

Learning Technologies ◽

Training Data ◽

Learning Models ◽

Ecological Data ◽

Single Class ◽

Large Numbers

Image data is one of the primary sources of ecological data used in biodiversity conservation and management worldwide. However, classifying and interpreting large numbers of images is time and resource expensive, particularly in the context of camera trapping. Deep learning models have been used to achieve this task but are often not suited to specific applications due to their inability to generalise to new environments and inconsistent performance. Models need to be developed for specific species cohorts and environments, but the technical skills required to achieve this are a key barrier to the accessibility of this technology to ecologists. Thus, there is a strong need to democratize access to deep learning technologies by providing an easy-to-use software application allowing non-technical users to train custom object detectors. U-Infuse addresses this issue by providing ecologists with the ability to train customised models using publicly available images and/or their own images without specific technical expertise. Auto-annotation and annotation editing functionalities minimize the constraints of manually annotating and pre-processing large numbers of images. U-Infuse is a free and open-source software solution that supports both multiclass and single class training and object detection, allowing ecologists to access deep learning technologies usually only available to computer scientists, on their own device, customised for their application, without sharing intellectual property or sensitive data. It provides ecological practitioners with the ability to (i) easily achieve object detection within a user-friendly GUI, generating a species distribution report, and other useful statistics, (ii) custom train deep learning models using publicly available and custom training data, (iii) achieve supervised auto-annotation of images for further training, with the benefit of editing annotations to ensure quality datasets. Broad adoption of U-Infuse by ecological practitioners will improve ecological image analysis and processing by allowing significantly more image data to be processed with minimal expenditure of time and resources, particularly for camera trap images. Ease of training and use of transfer learning means domain-specific models can be trained rapidly, and frequently updated without the need for computer science expertise, or data sharing, protecting intellectual property and privacy.

Download Full-text

Identifying the Branch of Kiwifruit Based on Unmanned Aerial Vehicle (UAV) Images Using Deep Learning Method

Sensors ◽

10.3390/s21134442 ◽

2021 ◽

Vol 21 (13) ◽

pp. 4442

Author(s):

Zijie Niu ◽

Juntao Deng ◽

Xu Zhang ◽

Jun Zhang ◽

Shijia Pan ◽

...

Keyword(s):

Deep Learning ◽

Unmanned Aerial Vehicle ◽

Semantic Segmentation ◽

Dynamic Monitoring ◽

Support Vector ◽

Distribution Maps ◽

Time Operation ◽

Aerial Vehicle ◽

Uav Images ◽

Segmentation Image

It is important to obtain accurate information about kiwifruit vines to monitoring their physiological states and undertake precise orchard operations. However, because vines are small and cling to trellises, and have branches laying on the ground, numerous challenges exist in the acquisition of accurate data for kiwifruit vines. In this paper, a kiwifruit canopy distribution prediction model is proposed on the basis of low-altitude unmanned aerial vehicle (UAV) images and deep learning techniques. First, the location of the kiwifruit plants and vine distribution are extracted from high-precision images collected by UAV. The canopy gradient distribution maps with different noise reduction and distribution effects are generated by modifying the threshold and sampling size using the resampling normalization method. The results showed that the accuracies of the vine segmentation using PSPnet, support vector machine, and random forest classification were 71.2%, 85.8%, and 75.26%, respectively. However, the segmentation image obtained using depth semantic segmentation had a higher signal-to-noise ratio and was closer to the real situation. The average intersection over union of the deep semantic segmentation was more than or equal to 80% in distribution maps, whereas, in traditional machine learning, the average intersection was between 20% and 60%. This indicates the proposed model can quickly extract the vine distribution and plant position, and is thus able to perform dynamic monitoring of orchards to provide real-time operation guidance.

Download Full-text

Catfish density estimation by aerial images analysis and deep learning

Proceedings of the 34th ACM/SIGAPP Symposium on Applied Computing - SAC '19 ◽

10.1145/3297280.3297575 ◽

2019 ◽

Author(s):

Donatello Conte ◽

Pierre Gaucher ◽

Carlo Sansone

Keyword(s):

Deep Learning ◽

Density Estimation ◽

Aerial Images

Download Full-text

Orchard Mapping with Deep Learning Semantic Segmentation

Sensors ◽

10.3390/s21113813 ◽

2021 ◽

Vol 21 (11) ◽

pp. 3813

Author(s):

Athanasios Anagnostis ◽

Aristotelis C. Tagarakis ◽

Dimitrios Kateris ◽

Vasileios Moysiadis ◽

Claus Grøn Sørensen ◽

...

Keyword(s):

Neural Network ◽

Deep Learning ◽

Semantic Segmentation ◽

Automated Detection ◽

Aerial Images ◽

Training Dataset ◽

Field Boundary ◽

Different Seasons ◽

Detection And Localization ◽

Different Levels

This study aimed to propose an approach for orchard trees segmentation using aerial images based on a deep learning convolutional neural network variant, namely the U-net network. The purpose was the automated detection and localization of the canopy of orchard trees under various conditions (i.e., different seasons, different tree ages, different levels of weed coverage). The implemented dataset was composed of images from three different walnut orchards. The achieved variability of the dataset resulted in obtaining images that fell under seven different use cases. The best-trained model achieved 91%, 90%, and 87% accuracy for training, validation, and testing, respectively. The trained model was also tested on never-before-seen orthomosaic images or orchards based on two methods (oversampling and undersampling) in order to tackle issues with out-of-the-field boundary transparent pixels from the image. Even though the training dataset did not contain orthomosaic images, it achieved performance levels that reached up to 99%, demonstrating the robustness of the proposed approach.

Download Full-text