ReS2tAC—UAV-Borne Real-Time SGM Stereo Optimized for Embedded ARM and CUDA Devices

Sensors ◽

10.3390/s21113938 ◽

2021 ◽

Vol 21 (11) ◽

pp. 3938

Author(s):

Boitumelo Ruf ◽

Jonas Mohrs ◽

Martin Weinmann ◽

Stefan Hinz ◽

Jürgen Beyerer

Keyword(s):

Power Consumption ◽

Real Time ◽

High Performance ◽

Low Cost ◽

Qualitative Evaluation ◽

Image Resolution ◽

Massively Parallel ◽

Graphics Hardware ◽

Global Matching ◽

Stereo Processing

With the emergence of low-cost robotic systems, such as UAVs, the importance of embedded high-performance image processing has increased. For a long time, FPGAs were the only processing hardware capable of high-performance computing while at the same time preserving the low power consumption that is essential for embedded systems. However, the recently increasing availability of embedded GPU-based systems, such as the NVIDIA Jetson series, comprising an ARM CPU and an NVIDIA Tegra GPU, allows for massively parallel embedded computing on graphics hardware. With this in mind, we propose an approach for real-time embedded stereo processing on ARM and CUDA-enabled devices, which is based on the popular and widely used Semi-Global Matching algorithm. In this, we propose an optimization of the algorithm for embedded CUDA GPUs, by using massively parallel computing, as well as using the NEON intrinsics to optimize the algorithm for vectorized SIMD processing on embedded ARM CPUs. We have evaluated our approach with different configurations on two public stereo benchmark datasets to demonstrate that it can reach an error rate as low as 3.3%. Furthermore, our experiments show that the fastest configuration of our approach reaches up to 46 FPS on VGA image resolution. Finally, in a use-case specific qualitative evaluation, we have evaluated the power consumption of our approach and deployed it on the DJI Manifold 2-G attached to a DJI Matrix 210v2 RTK UAV, demonstrating its suitability for real-time stereo processing onboard a UAV.
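The abstract names Semi-Global Matching but does not spell out its aggregation step. As a rough illustration only, the standard SGM path-wise cost aggregation along a single left-to-right scanline can be sketched as follows. This is plain Python rather than the paper's CUDA or NEON code, and the function names, the penalty values `p1` and `p2`, and the tiny cost table are assumptions made for the example.

```python
def aggregate_path(cost, p1=1, p2=4):
    """Aggregate matching costs along one left-to-right path.

    cost[x][d] is the matching cost of pixel x at disparity d.
    p1 penalises a disparity change of one, p2 any larger change
    (both values are illustrative assumptions).
    """
    num_disp = len(cost[0])
    aggregated = [list(cost[0])]  # the first pixel has no predecessor
    for x in range(1, len(cost)):
        prev = aggregated[-1]
        prev_min = min(prev)
        row = []
        for d in range(num_disp):
            candidates = [prev[d], prev_min + p2]
            if d > 0:
                candidates.append(prev[d - 1] + p1)
            if d < num_disp - 1:
                candidates.append(prev[d + 1] + p1)
            # Subtracting prev_min keeps the values bounded along the path.
            row.append(cost[x][d] + min(candidates) - prev_min)
        aggregated.append(row)
    return aggregated


def winner_takes_all(aggregated):
    """Pick the disparity with the lowest aggregated cost for each pixel."""
    return [min(range(len(row)), key=row.__getitem__) for row in aggregated]


costs = [[0, 9, 9], [9, 0, 9]]
print(winner_takes_all(aggregate_path(costs)))
```

A full SGM implementation sums such aggregated costs over several path directions before selecting the disparity; each path is independent of the others, which is what makes the algorithm a natural fit for massively parallel and SIMD hardware.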

Constructing a Bioinformatics Platform with Web and Mobile Services Based on NVIDIA Jetson TK1

International Journal of Grid and High Performance Computing ◽

10.4018/ijghpc.2015100105 ◽

2015 ◽

Vol 7 (4) ◽

pp. 57-73 ◽

Cited By ~ 2

Author(s):

Chun-Yuan Lin ◽

Jin Ye ◽

Che-Lun Hung ◽

Chung-Hung Wang ◽

Min Su ◽

...

Keyword(s):

Power Consumption ◽

High Performance Computing ◽

Graphics Processing Units ◽

High Performance ◽

Low Cost ◽

Research Direction ◽

Mobile Services ◽

Performance Ratio ◽

The Cost ◽

Performance Computing

Current high-end graphics processing units (abbreviated as GPUs), such as the NVIDIA Tesla, Fermi and Kepler series cards, which contain up to thousands of cores per chip, are widely used in high performance computing. These GPU cards (called desktop GPUs) must be installed in personal computers or servers with desktop CPUs; moreover, the cost and power consumption of constructing a high performance computing platform with these desktop CPUs and GPUs are high. NVIDIA has released the Tegra K1 board, called Jetson TK1, which contains 4 ARM Cortex-A15 CPUs and 192 CUDA cores (Kepler GPU) and is an embedded board offering low cost, low power consumption and high applicability for embedded applications. NVIDIA Jetson TK1 has become a new research direction. Hence, in this paper, a bioinformatics platform was constructed based on NVIDIA Jetson TK1. The ClustalWtk and MCCtk tools for sequence alignment and compound comparison, respectively, were designed on this platform. Moreover, web and mobile services with user-friendly interfaces were also provided for these two tools. The experimental results showed that the cost-performance ratio of NVIDIA Jetson TK1 is higher than that of the Intel XEON E5-2650 CPU and the NVIDIA Tesla K20m GPU card.

Using AI at the Edge and Incremental Machine Learning to Process Onboard Instrument Data

10.5957/smc-2021-048 ◽

2021 ◽

Author(s):

Nicholas Parkyn

Keyword(s):

Machine Learning ◽

Neural Networks ◽

Real Time ◽

Data Storage ◽

High Performance ◽

Heterogeneous Computing ◽

Low Cost ◽

Machine Intelligence ◽

Edge Computing ◽

Continual Learning

Emerging technologies in heterogeneous computing, computing at the edge, machine learning and AI at the edge drive approaches and techniques for processing and analysing onboard instrument data in near real-time. The author has used edge computing and neural networks combined with high performance heterogeneous computing platforms to accelerate AI workloads. The heterogeneous computing hardware used is readily available and low cost, delivers impressive AI performance and can run multiple neural networks in parallel. Collecting, processing and learning from onboard instrument data in near real-time is not a trivial problem due to data volumes and the complexities of data filtering, data storage and continual learning. Little research has been done on continual machine learning, which aims at a higher level of machine intelligence by providing artificial agents with the ability to learn from a non-stationary and never-ending stream of data. The author has applied the concept of continual learning to build a system that continually learns from actual boat performance and refines predictions previously made using static VPP data. The neural networks used are initially trained using the output from traditional VPP software and continue to learn from actual data collected under real sailing conditions. The author will present the system design, the AI and edge computing techniques used, and the approaches he has researched for incremental training to realise continual learning.

Edge Computing Based IoT Architecture for Low Cost Air Pollution Monitoring Systems: A Comprehensive System Analysis, Design Considerations & Development

Sensors ◽

10.3390/s18093021 ◽

2018 ◽

Vol 18 (9) ◽

pp. 3021 ◽

Cited By ~ 26

Author(s):

Zeba Idrees ◽

Zhuo Zou ◽

Lirong Zheng

Keyword(s):

Air Quality ◽

Power Consumption ◽

Real Time ◽

Low Cost ◽

Edge Computing ◽

Quality Monitoring ◽

Quality Data ◽

Monitoring Systems ◽

Air Quality Monitoring ◽

Computing Device

With the swift growth of commerce and transportation in modern civilization, much attention has been paid to air quality monitoring; however, existing monitoring systems are unable to provide sufficient spatial and temporal resolution of the data with cost-efficient, real-time solutions. In this paper, we investigate the issues, infrastructure, computational complexity, and procedures of designing and implementing real-time air quality monitoring systems. To overcome the defects of existing monitoring systems and to decrease the overall cost, this paper devises a novel approach to implementing the air quality monitoring system, employing edge-computing-based Internet-of-Things (IoT). In the proposed method, sensors gather the air quality data in real time and transmit it to the edge computing device, which performs the necessary processing and analysis. The complete infrastructure and prototype for evaluation are developed on the Arduino board and the IBM Watson IoT platform. Our model is structured in such a way that it reduces the computational burden on the battery-powered sensing nodes (reduced to 70%) and balances it with the edge computing device, which has its own local database and can be powered directly because it is deployed indoors. Algorithms were employed to avoid temporary errors in the low-cost sensors and to manage cross-sensitivity problems. Automatic calibration is set up to ensure the accuracy of the sensors' reporting, achieving data accuracy of around 75–80% under different circumstances. In addition, a data transmission strategy is applied to minimize redundant network traffic and power consumption. Our model achieves a power consumption reduction of up to 23% at a significantly low cost. Experimental evaluations were performed under different scenarios to validate the system's effectiveness.

Low-Cost Receiver and Network Real-Time Kinematic Positioning for use in Connected and Autonomous Vehicles

Journal of Navigation ◽

10.1017/s037346331800111x ◽

2019 ◽

Vol 72 (04) ◽

pp. 917-930

Author(s):

Fang-Shii Ning ◽

Xiaolin Meng ◽

Yi-Ting Wang

Keyword(s):

Real Time ◽

Autonomous Vehicles ◽

High Performance ◽

Low Cost ◽

Satellite System ◽

High Accuracy ◽

Future Trend ◽

Intelligent Transport System ◽

Positioning System ◽

GNSS Receiver

Connected and Autonomous Vehicles (CAVs) have been researched extensively for solving traffic issues and for realising the concept of an intelligent transport system. A well-developed positioning system is critical for CAVs to achieve these aims. The system should provide high accuracy, mobility, continuity, flexibility and scalability. However, high-performance equipment is too expensive for the commercial use of CAVs; therefore, the use of a low-cost Global Navigation Satellite System (GNSS) receiver to achieve real-time, high-accuracy and ubiquitous positioning performance will be a future trend. This research used RTKLIB software to develop a low-cost GNSS receiver positioning system and assessed the developed positioning system according to the requirements of CAV applications. Kinematic tests were conducted to evaluate the positioning performance of the low-cost receiver in a CAV driving environment based on the accuracy requirements of CAVs. The results showed that the low-cost receiver satisfied the “Where in Lane” accuracy level (0·5 m) and achieved a similar positioning performance in rural, interurban, urban and motorway areas.

Fast and Reliable Mouse Picking Using Graphics Hardware

International Journal of Computer Games Technology ◽

10.1155/2009/730894 ◽

2009 ◽

Vol 2009 ◽

pp. 1-7 ◽

Cited By ~ 1

Author(s):

Hanli Zhao ◽

Xiaogang Jin ◽

Jianbing Shen ◽

Shufang Lu

Keyword(s):

Real Time ◽

Time Complexity ◽

High Performance ◽

Linear Time ◽

Graphics Hardware ◽

3D Graphics ◽

Novel Approach ◽

Geometry Shader ◽

Intersection Test ◽

Fast Responses

Mouse picking is the most commonly used intuitive operation for interacting with 3D scenes in a variety of 3D graphics applications. High performance for this operation is necessary in order to provide users with fast responses. This paper proposes a fast and reliable mouse picking algorithm using graphics hardware for 3D triangular scenes. Our approach uses a multi-layer rendering algorithm to perform the picking operation in linear time complexity. The object-space-based ray-triangle intersection test is implemented in a highly parallelized geometry shader. After applying the hardware-supported occlusion queries, only a small number of objects (or sub-objects) are rendered in subsequent layers, which improves picking efficiency. Experimental results demonstrate the high performance of our novel approach. Due to its simplicity, our algorithm can be easily integrated into existing real-time rendering systems.
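The abstract does not say which ray-triangle intersection test the geometry shader runs. As a hedged illustration of what an object-space test looks like, here is a minimal CPU sketch of the widely used Möller–Trumbore algorithm, swapped in as an assumption and not taken from the paper. The function name, the tuple-based vectors and the tolerance `eps` are illustrative choices.

```python
def ray_triangle_intersect(origin, direction, v0, v1, v2, eps=1e-9):
    """Return the distance t along the ray to the hit point, or None on a miss.

    All points and vectors are 3-tuples of floats (illustrative convention).
    """
    def sub(a, b):
        return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

    def dot(a, b):
        return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

    def cross(a, b):
        return (a[1] * b[2] - a[2] * b[1],
                a[2] * b[0] - a[0] * b[2],
                a[0] * b[1] - a[1] * b[0])

    edge1 = sub(v1, v0)
    edge2 = sub(v2, v0)
    pvec = cross(direction, edge2)
    det = dot(edge1, pvec)
    if abs(det) < eps:
        return None  # the ray is parallel to the triangle plane
    inv_det = 1.0 / det
    tvec = sub(origin, v0)
    u = dot(tvec, pvec) * inv_det  # first barycentric coordinate
    if u < 0.0 or u > 1.0:
        return None
    qvec = cross(tvec, edge1)
    v = dot(direction, qvec) * inv_det  # second barycentric coordinate
    if v < 0.0 or u + v > 1.0:
        return None
    t = dot(edge2, qvec) * inv_det
    return t if t > eps else None  # ignore hits behind the ray origin


triangle = ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
print(ray_triangle_intersect((0.25, 0.25, 1.0), (0.0, 0.0, -1.0), *triangle))
```

For picking, the ray would be cast from the camera through the mouse position and the triangle with the smallest positive `t` selected. Each triangle is tested independently of the others, which is what makes the test easy to parallelize.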

HTNURL: Design of a High-Performance Low-Cost Triple-Node Upset Self-Recoverable Latch

Electronics ◽

10.3390/electronics10202457 ◽

2021 ◽

Vol 10 (20) ◽

pp. 2457

Author(s):

Hui Xu ◽

Zehua Peng ◽

Huaguo Liang ◽

Zhengfeng Huang ◽

Cong Sun ◽

...

Keyword(s):

Power Consumption ◽

High Performance ◽

Feedback Loop ◽

Low Cost ◽

Low Power Consumption ◽

Clock Gating ◽

Single Node ◽

Area Overhead ◽

Power Delay Product ◽

Reduced Power Consumption

A high-performance and low power consumption triple-node upset self-recoverable latch (HTNURL) is proposed. It can effectively tolerate single-node upset (SNU), double-node upset (DNU), and triple-node upset (TNU). This latch uses the C-element to construct a feedback loop, which reduces the delay and power consumption by fast path and clock gating techniques. Compared with the TNU-recoverable latches, HTNURL has a lower delay, reduced power consumption, and full self-recoverability. The delay, power consumption, area overhead, and area-power-delay product (APDP) of the HTNURL are reduced by 33.87%, 63.34%, 21.13%, and 81.71% on average, respectively.

MEASUREMENT OF CO AND NO2 GAS CONCENTRATION'S BY MULTISENSOR MICROSYSTEM IN THE MODE OF PULSE HEATING

Devices and Methods of Measurements ◽

10.21122/2220-9506-2017-8-2-160-167 ◽

2017 ◽

Vol 8 (2) ◽

pp. 160-167

Author(s):

O. G. Reutskaya ◽

Y. M. Pleskachevsky

Keyword(s):

Power Consumption ◽

High Performance ◽

Pulse Heating ◽

High Reliability ◽

Low Cost ◽

Gas Analysis ◽

Sensor Response ◽

Point Of View ◽

Pulsed Heating ◽

Constant Heating

Semiconductor gas sensors are the most promising for mass use in gas analysis equipment due to their high reliability, easy operation and relatively low cost. The average power consumption in single-sensor mode is 250 to 600 W with constant heating and ≤ 20 W with pulsed heating. The aim of this work was to study the effectiveness of pulsed heating for multisensor microsystems consisting of two sensors on a substrate of nanostructured aluminum oxide, compared with the mode of constant heating. The compositions chosen for the sensitive layers were SnO2+Pt+Pd for the first sensor of the microsystem and In2O3+Al2O3+Pt for the second. The sensor response in the pulse heating mode was measured as follows. The power on each sensor of the microsystem was set to 1.3 mW. Then short-term heating (t_heat = 5 s) was performed at a power of 61 mW. The detected gases CO and NO2, at concentrations of 200 ppm and 4 ppm, respectively, were supplied to the microsystem after 15 minutes. The resistance values of each sensor were recorded. According to the results, the maximum sensitivity (sensor response), reached after 60 s, was 670 % for the sensor with the SnO2+Pt+Pd sensing layer when exposed to CO, and 380 % for the sensor with In2O3+Al2O3+Pt. The advantages of pulsed heating were established in terms of the mW-range power consumption of the multisensor microsystem and the high performance of sensors on substrates of nanostructured alumina.

Multiple Compact Camera Fluorescence Detector for Real-Time PCR Devices

Sensors ◽

10.3390/s21217013 ◽

2021 ◽

Vol 21 (21) ◽

pp. 7013

Author(s):

Seul-Bit-Na Koo ◽

Hyeon-Gyu Chi ◽

Jong-Dae Kim ◽

Yu-Seop Kim ◽

Ji-Sung Park ◽

...

Keyword(s):

Real Time ◽

Real Time PCR ◽

High Performance ◽

Signal To Noise Ratio ◽

Low Cost ◽

Optical Element ◽

DNA Amplification ◽

Biological Research ◽

Fluorescence Detector ◽

Fluorescence Excitation

The polymerase chain reaction is an important technique in biological research because it tests for diseases with a small amount of DNA. However, this process is time consuming and can lead to sample contamination. Recently, real-time PCR techniques have emerged which make it possible to monitor the amplification process for each cycle in real time. Existing camera-based systems that measure fluorescence after DNA amplification simultaneously process fluorescence excitation and emission for dozens of tubes. Therefore, there is a limit to the size, cost, and assembly of the optical element. In recent years, imaging devices for high-performance, open platforms have benefitted from significant innovations. In this paper, we propose a fluorescence detector for real-time PCR devices using an open platform camera. This system can reduce the cost, and can be miniaturized. To simplify the optical system, four low-cost, compact cameras were used. In addition, the field of view of the entire tube was minimized by dividing it into quadrants. An effective image processing method was used to compensate for the reduction in the signal-to-noise ratio. Using a reference fluorescence material, it was confirmed that the proposed system enables stable fluorescence detection according to the amount of DNA.

HILS for the Design of Three-Wheeled Mobile Platform Motion Surveillance System with a Use of Energy Performance Index

Solid State Phenomena ◽

10.4028/www.scientific.net/ssp.198.90 ◽

2013 ◽

Vol 198 ◽

pp. 90-95 ◽

Cited By ~ 3

Author(s):

Krzysztof J. Kaliński ◽

Cezary Buchholz

Keyword(s):

System Design ◽

Real Time ◽

Performance Index ◽

Surveillance System ◽

High Performance ◽

Low Cost ◽

Energy Performance ◽

Mobile Platform ◽

Test Configuration ◽

Control Command

The current tendency in mechatronic design requires a comprehensive development environment that makes it possible to prototype, design, simulate and integrate with dedicated hardware. The paper discusses the Hardware-In-the-Loop Simulation (HILS) mechatronic technique, used during the design of a surveillance system based on an energy performance index. The presented test configuration (physical controller and emulated virtual research object) allows the authors to verify, in LabVIEW, the responses of the mobile platform model to the optimal control commands (torques) generated by the Real Time controller. The defined energy performance index, supported by the correction velocities, controls the emulated platform while it moves along three different trajectories. The demonstrated test results are compared with the desired values obtained by numerically computing the kinematic and dynamic equations of the presented model. The authors' investigation of HILS influenced the final optimisation of the motion surveillance system design. Real-time requirements forced the authors to decrease the sampling time of the control command (signal generation frequency) and to establish a high-performance execution strategy for the on-line algorithm (algorithm execution performed both in the Real Time processor and in the FPGA, Field Programmable Gate Array). The performed simulations confirmed that HILS is a powerful technique that makes system design more efficient and less costly.

Design of Velocity Measuring System of Car Based on MCU

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.373-375.363 ◽

2013 ◽

Vol 373-375 ◽

pp. 363-366

Author(s):

Jing Sheng Yu ◽

Hong Qiang Sun

Keyword(s):

Mathematical Model ◽

Power Consumption ◽

Low Power ◽

Control Systems ◽

Basic Principle ◽

High Performance ◽

Low Cost ◽

Measuring System ◽

Low Power Consumption ◽

Velocity Measuring

This paper describes the basic principle of measuring the velocity parameters of a car in operation and establishes the related mathematical model. It designs an intelligent, integrated digital solution for the car's combination instrumentation based on the MC9S12DP256B. The system has the advantages of high performance, high precision, low cost, low power consumption, good stability, sensitive response and expandability. The system measures and displays the velocity parameters of the car online. It also has functions such as a safety alarm. The system reserves bus interfaces such as SCI and CAN, so it communicates easily with the other electronic engine control systems of the car.
