scholarly journals An Autonomous Fruit and Vegetable Harvester with a Low-Cost Gripper Using a 3D Sensor

Sensors ◽  
2019 ◽  
Vol 20 (1) ◽  
pp. 93 ◽  
Author(s):  
Tan Zhang ◽  
Zhenhai Huang ◽  
Weijie You ◽  
Jiatao Lin ◽  
Xiaolong Tang ◽  
...  

Reliable and robust systems to detect and harvest fruits and vegetables in unstructured environments are crucial for harvesting robots. In this paper, we propose an autonomous system that harvests most types of crops with peduncles. A geometric approach is first applied to obtain the cutting points of the peduncle based on the fruit bounding box, for which we have adapted the model of the state-of-the-art object detector named Mask Region-based Convolutional Neural Network (Mask R-CNN). We designed a novel gripper that simultaneously clamps and cuts the peduncles of crops without contacting the flesh. We have conducted experiments with a robotic manipulator to evaluate the effectiveness of the proposed harvesting system in being able to efficiently harvest most crops in real laboratory environments.

2021 ◽  
Vol 7 (2) ◽  
pp. 356-362
Author(s):  
Harry Coppock ◽  
Alex Gaskell ◽  
Panagiotis Tzirakis ◽  
Alice Baird ◽  
Lyn Jones ◽  
...  

BackgroundSince the emergence of COVID-19 in December 2019, multidisciplinary research teams have wrestled with how best to control the pandemic in light of its considerable physical, psychological and economic damage. Mass testing has been advocated as a potential remedy; however, mass testing using physical tests is a costly and hard-to-scale solution.MethodsThis study demonstrates the feasibility of an alternative form of COVID-19 detection, harnessing digital technology through the use of audio biomarkers and deep learning. Specifically, we show that a deep neural network based model can be trained to detect symptomatic and asymptomatic COVID-19 cases using breath and cough audio recordings.ResultsOur model, a custom convolutional neural network, demonstrates strong empirical performance on a data set consisting of 355 crowdsourced participants, achieving an area under the curve of the receiver operating characteristics of 0.846 on the task of COVID-19 classification.ConclusionThis study offers a proof of concept for diagnosing COVID-19 using cough and breath audio signals and motivates a comprehensive follow-up research study on a wider data sample, given the evident advantages of a low-cost, highly scalable digital COVID-19 diagnostic tool.


2017 ◽  
Vol 2017 ◽  
pp. 1-11 ◽  
Author(s):  
Siqi Tang ◽  
Zhisong Pan ◽  
Xingyu Zhou

This paper proposes an accurate crowd counting method based on convolutional neural network and low-rank and sparse structure. To this end, we firstly propose an effective deep-fusion convolutional neural network to promote the density map regression accuracy. Furthermore, we figure out that most of the existing CNN based crowd counting methods obtain overall counting by direct integral of estimated density map, which limits the accuracy of counting. Instead of direct integral, we adopt a regression method based on low-rank and sparse penalty to promote accuracy of the projection from density map to global counting. Experiments demonstrate the importance of such regression process on promoting the crowd counting performance. The proposed low-rank and sparse based deep-fusion convolutional neural network (LFCNN) outperforms existing crowd counting methods and achieves the state-of-the-art performance.


2021 ◽  
Vol 11 (15) ◽  
pp. 6845
Author(s):  
Abu Sayeed ◽  
Jungpil Shin ◽  
Md. Al Mehedi Hasan ◽  
Azmain Yakin Srizon ◽  
Md. Mehedi Hasan

As it is the seventh most-spoken language and fifth most-spoken native language in the world, the domain of Bengali handwritten character recognition has fascinated researchers for decades. Although other popular languages i.e., English, Chinese, Hindi, Spanish, etc. have received many contributions in the area of handwritten character recognition, Bengali has not received many noteworthy contributions in this domain because of the complex curvatures and similar writing fashions of Bengali characters. Previously, studies were conducted by using different approaches based on traditional learning, and deep learning. In this research, we proposed a low-cost novel convolutional neural network architecture for the recognition of Bengali characters with only 2.24 to 2.43 million parameters based on the number of output classes. We considered 8 different formations of CMATERdb datasets based on previous studies for the training phase. With experimental analysis, we showed that our proposed system outperformed previous works by a noteworthy margin for all 8 datasets. Moreover, we tested our trained models on other available Bengali characters datasets such as Ekush, BanglaLekha, and NumtaDB datasets. Our proposed architecture achieved 96–99% overall accuracies for these datasets as well. We believe our contributions will be beneficial for developing an automated high-performance recognition tool for Bengali handwritten characters.


2018 ◽  
Vol 4 (9) ◽  
pp. 107 ◽  
Author(s):  
Mohib Ullah ◽  
Ahmed Mohammed ◽  
Faouzi Alaya Cheikh

Articulation modeling, feature extraction, and classification are the important components of pedestrian segmentation. Usually, these components are modeled independently from each other and then combined in a sequential way. However, this approach is prone to poor segmentation if any individual component is weakly designed. To cope with this problem, we proposed a spatio-temporal convolutional neural network named PedNet which exploits temporal information for spatial segmentation. The backbone of the PedNet consists of an encoder–decoder network for downsampling and upsampling the feature maps, respectively. The input to the network is a set of three frames and the output is a binary mask of the segmented regions in the middle frame. Irrespective of classical deep models where the convolution layers are followed by a fully connected layer for classification, PedNet is a Fully Convolutional Network (FCN). It is trained end-to-end and the segmentation is achieved without the need of any pre- or post-processing. The main characteristic of PedNet is its unique design where it performs segmentation on a frame-by-frame basis but it uses the temporal information from the previous and the future frame for segmenting the pedestrian in the current frame. Moreover, to combine the low-level features with the high-level semantic information learned by the deeper layers, we used long-skip connections from the encoder to decoder network and concatenate the output of low-level layers with the higher level layers. This approach helps to get segmentation map with sharp boundaries. To show the potential benefits of temporal information, we also visualized different layers of the network. The visualization showed that the network learned different information from the consecutive frames and then combined the information optimally to segment the middle frame. We evaluated our approach on eight challenging datasets where humans are involved in different activities with severe articulation (football, road crossing, surveillance). The most common CamVid dataset which is used for calculating the performance of the segmentation algorithm is evaluated against seven state-of-the-art methods. The performance is shown on precision/recall, F 1 , F 2 , and mIoU. The qualitative and quantitative results show that PedNet achieves promising results against state-of-the-art methods with substantial improvement in terms of all the performance metrics.


Author(s):  
Gauri Jain ◽  
Manisha Sharma ◽  
Basant Agarwal

This article describes how spam detection in the social media text is becoming increasing important because of the exponential increase in the spam volume over the network. It is challenging, especially in case of text within the limited number of characters. Effective spam detection requires more number of efficient features to be learned. In the current article, the use of a deep learning technology known as a convolutional neural network (CNN) is proposed for spam detection with an added semantic layer on the top of it. The resultant model is known as a semantic convolutional neural network (SCNN). A semantic layer is composed of training the random word vectors with the help of Word2vec to get the semantically enriched word embedding. WordNet and ConceptNet are used to find the word similar to a given word, in case it is missing in the word2vec. The architecture is evaluated on two corpora: SMS Spam dataset (UCI repository) and Twitter dataset (Tweets scrapped from public live tweets). The authors' approach outperforms the-state-of-the-art results with 98.65% accuracy on SMS spam dataset and 94.40% accuracy on Twitter dataset.


Author(s):  
Hongguo Su ◽  
Mingyuan Zhang ◽  
Shengyuan Li ◽  
Xuefeng Zhao

In the last couple of years, advancements in the deep learning, especially in convolutional neural networks, proved to be a boon for the image classification and recognition tasks. One of the important practical applications of object detection and image classification can be for security enhancement. If dangerous objects or scenes can be identified automatically, then a lot of accidents can be prevented. For this purpose, in this paper we made use of state-of-the-art implementation of Faster Region-based Convolutional Neural Network (Faster R-CNN) based on the monitoring video of hoisting sites to train a model to detect the dangerous object and the worker. By extracting the locations of them, object-human interactions during hoisting, mainly for changes in their spatial location relationship, can be understood whereby estimating whether the scene is safe or dangerous. Experimental results showed that the pre-trained model achieved good performance with a high mean average precision of 97.66% on object detection and the proposed method fulfilled the goal of dangerous scenes recognition perfectly.


Inventions ◽  
2021 ◽  
Vol 6 (4) ◽  
pp. 70
Author(s):  
Elena Solovyeva ◽  
Ali Abdullah

In this paper, the structure of a separable convolutional neural network that consists of an embedding layer, separable convolutional layers, convolutional layer and global average pooling is represented for binary and multiclass text classifications. The advantage of the proposed structure is the absence of multiple fully connected layers, which is used to increase the classification accuracy but raises the computational cost. The combination of low-cost separable convolutional layers and a convolutional layer is proposed to gain high accuracy and, simultaneously, to reduce the complexity of neural classifiers. Advantages are demonstrated at binary and multiclass classifications of written texts by means of the proposed networks under the sigmoid and Softmax activation functions in convolutional layer. At binary and multiclass classifications, the accuracy obtained by separable convolutional neural networks is higher in comparison with some investigated types of recurrent neural networks and fully connected networks.


Author(s):  
Truong Quang Vinh ◽  
Dinh Viet Hai

Convolutional neural network (CNN) is one of the most promising algorithms that outweighs other traditional methods in terms of accuracy in classification tasks. However, several CNNs, such as VGG, demand a huge computation in convolutional layers. Many accelerators implemented on powerful FPGAs have been introduced to address the problems. In this paper, we present a VGG-based accelerator which is optimized for a low-cost FPGA. In order to optimize the FPGA resource of logic element and memory, we propose a dedicated input buffer that maximizes the data reuse. In addition, we design a low resource processing engine with the optimal number of Multiply Accumulate (MAC) units. In the experiments, we use VGG16 model for inference to evaluate the performance of our accelerator and achieve a throughput of 38.8[Formula: see text]GOPS at a clock speed of 150[Formula: see text]MHz on Intel Cyclone V SX SoC. The experimental results show that our design is better than previous works in terms of resource efficiency.


Author(s):  
Qi Xin ◽  
Shaohao Hu ◽  
Shuaiqi Liu ◽  
Ling Zhao ◽  
Shuihua Wang

As one of the important tools of epilepsy diagnosis, the electroencephalogram (EEG) is noninvasive and presents no traumatic injury to patients. It contains a lot of physiological and pathological information that is easy to obtain. The automatic classification of epileptic EEG is important in the diagnosis and therapeutic efficacy of epileptics. In this article, an explainable graph feature convolutional neural network named WTRPNet is proposed for epileptic EEG classification. Since WTRPNet is constructed by a recurrence plot in the wavelet domain, it can fully obtain the graph feature of the EEG signal, which is established by an explainable graph features extracted layer called WTRP block . The proposed method shows superior performance over state-of-the-art methods. Experimental results show that our algorithm has achieved an accuracy of 99.67% in classification of focal and nonfocal epileptic EEG, which proves the effectiveness of the classification and detection of epileptic EEG.


Sign in / Sign up

Export Citation Format

Share Document