A PCB Electronic Components Detection Network Design Based on Effective Receptive Field Size and Anchor Size Matching

2021 ◽  
Vol 2021 ◽  
pp. 1-19
Author(s):  
Jing Li ◽  
Weiye Li ◽  
Yingqian Chen ◽  
Jinan Gu

Vision-based recognition and positioning of electronic components on a PCB (printed circuit board) can improve the quality-inspection efficiency of electronic products in the manufacturing process. With improvements in design and production processes, the electronic components on a PCB have become small in size and similar in appearance, which poses challenges for visual object detection. This paper designs a real-time electronic component detection network by matching effective receptive field size with anchor size in YOLOv3. We make contributions in the following three aspects: (1) realizing the calculation and visualization of the effective receptive field size of different depth layers of a CNN (convolutional neural network) based on gradient backpropagation; (2) proposing a modular YOLOv3 composition strategy whose modules can be added and removed; and (3) designing a lightweight and efficient detection network through an effective receptive field size and anchor size matching algorithm. Compared with Faster R-CNN (faster regions with convolutional neural network features), SSD (single-shot multibox detector), and the original YOLOv3, our method not only achieves the highest detection mAP (mean average precision) on the PCB electronic component dataset, 95.03%, and the smallest parameter memory footprint, about 1/3 of the original YOLOv3 parameter count, but also the second-best performance in FLOPs (floating point operations).
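The receptive-field/anchor matching idea in the abstract above can be sketched in a few lines. This is a minimal illustration, not the paper's algorithm: the layer configurations, anchor sizes, and the `erf_ratio` factor (effective receptive fields are known to be smaller than theoretical ones) are all assumptions made for the example.

```python
# Sketch (assumed, not the authors' exact algorithm): theoretical receptive
# field (RF) growth through a conv stack, then greedy matching of each anchor
# size to the layer whose approximated effective RF is closest.

def receptive_fields(layers):
    """layers: list of (kernel, stride). Returns the RF size after each layer."""
    rf, jump = 1, 1
    sizes = []
    for k, s in layers:
        rf = rf + (k - 1) * jump   # RF grows by (kernel-1) * cumulative stride
        jump *= s                  # jump = product of strides so far
        sizes.append(rf)
    return sizes

def match_anchors(anchor_sizes, layer_rfs, erf_ratio=0.5):
    """Assign each anchor to the layer whose *effective* RF (approximated here
    as erf_ratio * theoretical RF) is nearest to the anchor size."""
    erfs = [erf_ratio * rf for rf in layer_rfs]
    return {a: min(range(len(erfs)), key=lambda i: abs(erfs[i] - a))
            for a in anchor_sizes}

# Illustrative backbone: 3x3 convs with occasional stride-2 downsampling.
layers = [(3, 1), (3, 2), (3, 1), (3, 2), (3, 1), (3, 2)]
rfs = receptive_fields(layers)
anchors = [8, 24, 60]
print(rfs)
print(match_anchors(anchors, rfs))
```

In this toy setup, small anchors land on shallow layers and larger anchors on deeper ones, which is the matching behaviour the abstract describes.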

Sensors ◽  
2020 ◽  
Vol 20 (8) ◽  
pp. 2393 ◽  
Author(s):  
Daniel Octavian Melinte ◽  
Luige Vladareanu

The interaction between humans and an NAO robot using deep convolutional neural networks (CNNs) is presented in this paper, based on an innovative end-to-end pipeline that applies two optimized CNNs, one for face recognition (FR) and another for facial expression recognition (FER), in order to obtain real-time inference speed for the entire process. Two different models are considered for FR: one known to be very accurate but with low inference speed (faster region-based convolutional neural network), and one less accurate but with high inference speed (single shot detector convolutional neural network). For emotion recognition, transfer learning and fine-tuning of three CNN models (VGG, Inception V3, and ResNet) were used. The overall results show that the single shot detector convolutional neural network (SSD CNN) and faster region-based convolutional neural network (Faster R-CNN) models for face detection share almost the same accuracy: 97.8% for Faster R-CNN on PASCAL visual object classes (PASCAL VOC) evaluation metrics and 97.42% for SSD Inception. In terms of FER, ResNet obtained the highest training accuracy (90.14%), while the visual geometry group (VGG) network reached 87% accuracy and Inception V3 reached 81%. The results show improvements of over 10% when using two serialized CNNs instead of the FER CNN alone, while a recent optimization method, rectified adaptive moment optimization (RAdam), led to better generalization and an accuracy improvement of 3-4% on each emotion recognition CNN.
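The serialized two-stage arrangement (face detection feeding emotion classification) can be sketched as below. Both stages are hypothetical stand-ins: `detect_faces` and `classify_emotion` are toy placeholders, not the SSD/Faster R-CNN and VGG/Inception/ResNet models used in the paper. The point is only the pipeline shape: detect, threshold, crop, then classify each crop rather than the whole frame.

```python
# Minimal sketch of a two-stage serialized pipeline: stage 1 returns face
# boxes, stage 2 classifies the emotion on each cropped face region.
# Both stage functions below are illustrative stand-ins.

def detect_faces(frame):
    """Stand-in detector: return [((x, y, w, h), score), ...] boxes."""
    h, w = len(frame), len(frame[0])
    # Pretend one face fills the centre quarter of the frame.
    return [((w // 4, h // 4, w // 2, h // 2), 0.97)]

def classify_emotion(crop):
    """Stand-in FER stage: label by mean intensity of the crop (toy rule)."""
    mean = sum(sum(row) for row in crop) / (len(crop) * len(crop[0]))
    return "happy" if mean > 0.5 else "neutral"

def pipeline(frame, det_threshold=0.9):
    results = []
    for (x, y, w, h), score in detect_faces(frame):
        if score < det_threshold:
            continue                                  # drop weak detections
        crop = [row[x:x + w] for row in frame[y:y + h]]  # cut out the face
        results.append(((x, y, w, h), classify_emotion(crop)))
    return results

frame = [[0.8] * 8 for _ in range(8)]   # bright 8x8 toy "image"
print(pipeline(frame))
```

Classifying only the cropped face, rather than the full frame, is what gives the serialized arrangement its accuracy advantage over running the FER CNN alone.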


Electronics ◽  
2021 ◽  
Vol 10 (14) ◽  
pp. 1737
Author(s):  
Wooseop Lee ◽  
Min-Hee Kang ◽  
Jaein Song ◽  
Keeyeon Hwang

As automated vehicles are considered one of the important trends in intelligent transportation systems, various research is being conducted to enhance their safety. In particular, the importance of technologies for the design of preventive automated driving systems, such as detection of surrounding objects and estimation of the distance between vehicles, has been growing. Object detection is mainly performed with cameras and LiDAR, but due to LiDAR's cost and limited recognition distance, there is an increasing need to improve camera-based recognition techniques, which are relatively convenient to commercialize. To improve the recognition capability of vehicle-mounted monocular cameras for the design of preventive automated driving systems, this study trained the convolutional neural network (CNN)-based faster regions with CNN (Faster R-CNN) and You Only Look Once (YOLO) V2 models, recognizing surrounding vehicles in black-box highway driving videos and estimating distances to surrounding vehicles with the model more suitable for automated driving systems. Moreover, we trained on the PASCAL visual object classes (VOC) dataset for model comparison. Faster R-CNN, with a mean average precision (mAP) of 76.4, showed accuracy similar to YOLO V2's mAP of 78.6, but its processing speed of 5 frames per second (FPS) was much slower than YOLO V2's 40 FPS. As a result, YOLO V2, which shows better performance in both accuracy and processing speed, was determined to be the more suitable model for automated driving systems and was further used to estimate the distance between vehicles. For distance estimation, we converted coordinate values through camera calibration and perspective transform, set the detection threshold to 0.7, and performed object detection and distance estimation, achieving more than 80% accuracy for near-distance vehicles.
This study is expected to help prevent accidents involving automated vehicles, and additional research is expected to provide various accident-prevention alternatives, such as calculating and securing appropriate safety distances depending on vehicle type.
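The calibration-plus-perspective-transform step for monocular distance estimation can be sketched with a ground-plane homography: known road points fix a 3x3 matrix mapping image pixels to metres, after which the bottom-centre of a detected vehicle box projects to a ground position. The calibration points below are hypothetical; the paper's actual calibration data and camera parameters are not given here.

```python
import numpy as np

def homography(img_pts, world_pts):
    """DLT: solve for the 3x3 H mapping image pixels to ground-plane metres
    from 4+ point correspondences (each pair gives two linear equations)."""
    A = []
    for (u, v), (X, Y) in zip(img_pts, world_pts):
        A.append([u, v, 1, 0, 0, 0, -X * u, -X * v, -X])
        A.append([0, 0, 0, u, v, 1, -Y * u, -Y * v, -Y])
    _, _, Vt = np.linalg.svd(np.array(A, float))
    return Vt[-1].reshape(3, 3)      # null-space vector = flattened H

def ground_distance(H, u, v):
    """Map a pixel (e.g. bottom-centre of a vehicle box) to metres away."""
    X, Y, W = H @ np.array([u, v, 1.0])
    return float(np.hypot(X / W, Y / W))   # dehomogenize, then Euclidean dist

# Hypothetical calibration: four road markings with known ground positions
# (lane edges at 5 m and 30 m ahead), as seen in a 1280-wide image.
img_pts = [(300, 700), (980, 700), (560, 420), (720, 420)]
world_pts = [(-1.8, 5.0), (1.8, 5.0), (-1.8, 30.0), (1.8, 30.0)]
H = homography(img_pts, world_pts)
print(round(ground_distance(H, 640, 700), 1))   # pixel midway between the 5 m marks
```

By symmetry of this toy calibration, a box bottom at pixel (640, 700) projects to the lane centre 5 m ahead; in practice the correspondences would come from a measured calibration target on the road.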


2005 ◽  
Vol 93 (6) ◽  
pp. 3537-3547 ◽  
Author(s):  
Chong Weng ◽  
Chun-I Yeh ◽  
Carl R. Stoelzel ◽  
Jose-Manuel Alonso

Each point in visual space is encoded at the level of the thalamus by a group of neighboring cells with overlapping receptive fields. Here we show that the receptive fields of these cells differ in size and response latency but not at random. We have found that in the cat lateral geniculate nucleus (LGN) the receptive field size and response latency of neighboring neurons are significantly correlated: the larger the receptive field, the faster the response to visual stimuli. This correlation is widespread in LGN. It is found in groups of cells belonging to the same type (e.g., Y cells), and of different types (i.e., X and Y), within a specific layer or across different layers. These results indicate that the inputs from the multiple geniculate afferents that converge onto a cortical cell (approximately 30) are likely to arrive in a sequence determined by the receptive field size of the geniculate afferents. Recent studies have shown that the peak of the spatial frequency tuning of a cortical cell shifts toward higher frequencies as the response progresses in time. Our results are consistent with the idea that these shifts in spatial frequency tuning arise from differences in the response time course of the thalamic inputs.


Feed-forward neural networks can be trained with a gradient-descent-based backpropagation algorithm, but such algorithms require more computation time. Extreme Learning Machines (ELMs) are time-efficient and less complicated than conventional gradient-based algorithms. In previous years, an SRAM-based convolutional neural network using a receptive-field approach was proposed. This neural network was used as an encoder for the ELM algorithm and was implemented on an FPGA, but it used an inaccurate 3-stage pipelined parallel adder and hence generated imprecise stimuli for the hidden-layer neurons. This paper presents a hardware-level implementation of a precise convolutional neural network for encoding in the ELM algorithm based on the receptive-field approach. In the third stage of the pipelined parallel adder, instead of approximating the output with one 2-input 15-bit adder, one 4-input 14-bit adder is used. In addition, a weighted pixel array block is added, which improves the accuracy of generating the 128 weighted pixels. The neural network was simulated using ModelSim-Altera 10.1d, synthesized using Quartus II 13.0 sp1, implemented on a Cyclone V FPGA, and used for pattern recognition applications. Although this design consumes slightly more hardware resources, it is more accurate than previously existing encoders.
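For readers unfamiliar with ELMs, the training scheme the hardware encoder feeds can be sketched in software: hidden-layer weights stay random and fixed, and only the output weights are solved in closed form via a pseudoinverse, which is why no gradient descent is needed. The toy XOR data and hidden width below are illustrative, not from the paper.

```python
import numpy as np

# Minimal ELM sketch (assumed illustration, not the paper's hardware design):
# random fixed hidden layer, closed-form least-squares output weights.
rng = np.random.default_rng(0)

def elm_train(X, T, hidden=20):
    W = rng.standard_normal((X.shape[1], hidden))   # fixed random input weights
    b = rng.standard_normal(hidden)                 # fixed random biases
    H = np.tanh(X @ W + b)                          # hidden-layer "stimuli"
    beta = np.linalg.pinv(H) @ T                    # closed-form output weights
    return W, b, beta

def elm_predict(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], float)
T = np.array([[0.0], [1.0], [1.0], [0.0]])          # XOR targets
W, b, beta = elm_train(X, T)
pred = elm_predict(X, W, b, beta)
print(np.round(pred.ravel(), 2))
```

The single pseudoinverse replaces the many iterations of backpropagation, which is the time-efficiency advantage the abstract refers to; the precision of the hardware encoder matters because `H`, the hidden-layer stimuli, is exactly what the paper's adder stage computes.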


1987 ◽  
Vol 510 (1 Olfaction and) ◽  
pp. 504-505
Author(s):  
CHARLOTTE M. MISTRETTA ◽  
TAKATOSHI NAGAI ◽  
ROBERT M. BRADLEY

2008 ◽  
Vol 25 (4) ◽  
pp. 419-427 ◽  
Author(s):  
Kazunori Yamamoto ◽  
Hiroshi Jouhou ◽  
Masanori Iwasaki ◽  
Akimichi Kaneko ◽  
Masahiro Yamada
