Real-Time Audio-Visual Analysis for Multiperson Videoconferencing

2013 ◽  
Vol 2013 ◽  
pp. 1-21 ◽  
Author(s):  
Petr Motlicek ◽  
Stefan Duffner ◽  
Danil Korchagin ◽  
Hervé Bourlard ◽  
Carl Scheffler ◽  
...  

We describe the design of a system consisting of several state-of-the-art real-time audio and video processing components enabling multimodal stream manipulation (e.g., automatic online editing for multiparty videoconferencing applications) in open, unconstrained environments. The underlying algorithms are designed to allow multiple people to enter, interact, and leave the observable scene with no constraints. They comprise continuous localisation of audio objects and its application to spatial audio object coding; detection and tracking of faces; estimation of head poses and visual focus of attention; detection and localisation of verbal and paralinguistic events; and the association and fusion of these different events. Taken together, they represent multimodal streams with audio objects and semantic video objects and provide semantic information for stream manipulation systems (such as a virtual director). Various experiments have been performed to evaluate the performance of the system. The results demonstrate the effectiveness of the proposed design and of the various algorithms, and the benefit of fusing different modalities in this scenario.

Sensors ◽  
2021 ◽  
Vol 21 (19) ◽  
pp. 6355
Author(s):  
Muhammad Sualeh ◽  
Gon-Woo Kim

The idea that SLAM (Simultaneous Localization and Mapping) is a solved problem rests on the static world assumption, even though autonomous systems are gaining environmental perception capabilities by exploiting advances in computer vision and data-driven approaches. Computational demands and time complexity remain the main impediments to the effective fusion of these paradigms. In this paper, a framework to solve the dynamic SLAM problem is proposed. The dynamic regions of the scene are handled by making use of Visual-LiDAR based MODT (Multiple Object Detection and Tracking). Furthermore, minimal computational demands and real-time performance are ensured. The framework is tested on the KITTI datasets and evaluated with the publicly available evaluation tools for a fair comparison with state-of-the-art SLAM algorithms. The results suggest that the proposed dynamic SLAM framework can perform in real time with budgeted computational resources. In addition, the fused MODT provides rich semantic information that can be readily integrated into SLAM.


Author(s):  
N. Jayanti ◽  

To achieve fully automatic surveillance of objects of a specific color, an intelligent real-time detection method based on video processing is proposed. The main aim of this paper is to identify colors in video and put that identification to practical use. The proposed algorithm detects a specific color and tracks it in a live video feed, which could eventually be used in applications such as surveillance cameras and the detection of forest fires. For color recognition, several stages (image subtraction, noise filtering, conversion to a binary image, and blob extraction) are used to recognize the specific color in the video feed. The corresponding pixels are then drawn on the GUI to show where the color has been.
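The four stages named in this abstract can be sketched in plain Python. This is only an illustration under stated assumptions, not the author's implementation: the choice of red as the target color, subtracting the grayscale intensity from the red channel, the 3x3 median filter, the 4-connected flood fill, and the `threshold` and `min_area` defaults are all hypothetical.

```python
from collections import deque
from statistics import median


def detect_color_blobs(frame, threshold=60, min_area=4):
    """frame: list of rows, each row a list of (r, g, b) tuples with 0-255 ints."""
    height, width = len(frame), len(frame[0])

    # 1. Image subtraction (assumed variant): red channel minus gray intensity.
    diff = [[max(0, r - (r + g + b) // 3) for (r, g, b) in row] for row in frame]

    # 2. Noise filtering: 3x3 median; border pixels use the neighbours they have.
    def local_median(y, x):
        return median(diff[j][i]
                      for j in range(max(0, y - 1), min(height, y + 2))
                      for i in range(max(0, x - 1), min(width, x + 2)))

    smooth = [[local_median(y, x) for x in range(width)] for y in range(height)]

    # 3. Binary image: keep pixels whose filtered response reaches the threshold.
    binary = [[value >= threshold for value in row] for row in smooth]

    # 4. Blob extraction: 4-connected components via breadth-first search.
    seen = [[False] * width for _ in range(height)]
    blobs = []
    for y in range(height):
        for x in range(width):
            if not binary[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue, pixels = deque([(y, x)]), []
            while queue:
                cy, cx = queue.popleft()
                pixels.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and binary[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(pixels) >= min_area:  # discard tiny leftover specks
                centroid = (sum(p[0] for p in pixels) / len(pixels),
                            sum(p[1] for p in pixels) / len(pixels))
                blobs.append({"area": len(pixels), "centroid": centroid})
    return blobs
```

Appending each frame's blob centroid to a list would give the trail of positions that the abstract describes drawing on the GUI.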


Author(s):  
Mae L. Seto

A naval ship's acoustic signature is known after a ranging but changes the longer the ship is in service away from a range. The Ship Signatures Management System (SSMS) gives a ship an organic real-time capability to predict its own signature, along with enough information to mitigate signature issues. SSMS provides the Commanding Officer with a tool to determine the ship's acoustic signature in order to evaluate the impact of proposed actions on the ship's counter-detection range and sensor performance. In this manner, the ship's protection is enhanced through insightful and timely signature management. DRDC has upgraded the SSMS hardware to state-of-the-art components to increase the number of sensors, the fidelity of the logged data, the dynamic range, and the processing power. This paper discusses some of the advanced SSMS features that have been developed, such as tonal detection and tracking, tonal association, and the diagnostics used to determine the cause of features in the acoustic signature.


Sensors ◽  
2019 ◽  
Vol 19 (15) ◽  
pp. 3371 ◽  
Author(s):  
Hossain ◽  
Lee

In recent years, demand has been increasing for target detection and tracking from aerial imagery via drones using onboard powered sensors and devices. We propose an effective method for this application based on a deep learning framework. A state-of-the-art embedded hardware system empowers small flying robots to carry out the real-time onboard computation necessary for object tracking. Two types of embedded modules were developed: one was designed using a Jetson TX or AGX Xavier, and the other was based on an Intel Neural Compute Stick. These are suitable for real-time onboard computing on small flying drones with limited space. A comparative analysis of current state-of-the-art deep learning-based multi-object detection algorithms was carried out on the designated GPU-based embedded computing modules to obtain detailed metric data about frame rates and computation power. We also introduce an effective target tracking approach for moving objects. The tracking algorithm extends simple online and real-time tracking (SORT) with a deep learning-based association metric (Deep SORT), combining a hypothesis tracking methodology with Kalman filtering. In addition, a guidance system that tracks the target position using a GPU-based algorithm is introduced. Finally, we demonstrate the effectiveness of the proposed algorithms through real-time experiments with a small multi-rotor drone.
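The tracking idea in this abstract, motion prediction combined with an appearance-based association metric, can be illustrated with a deliberately simplified sketch. It is not the authors' tracker and not the Deep SORT reference implementation: a plain constant-velocity prediction stands in for the full Kalman filter, a Euclidean distance gate replaces Mahalanobis gating, and greedy matching replaces the Hungarian assignment. The `associate` function, its data layout, and the `max_jump` and `max_cosine` thresholds are hypothetical.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity between two appearance feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def associate(tracks, detections, max_jump=50.0, max_cosine=0.3):
    """Match tracks to detections for one frame.

    tracks:     {track_id: {"pos": (x, y), "vel": (vx, vy), "feat": [...]}}
    detections: [{"pos": (x, y), "feat": [...]}, ...]
    returns:    {track_id: detection_index}
    """
    candidates = []
    for track_id, track in tracks.items():
        # Constant-velocity prediction: the mean update of a Kalman predict
        # step, with the covariance bookkeeping left out (simplification).
        px = track["pos"][0] + track["vel"][0]
        py = track["pos"][1] + track["vel"][1]
        for index, det in enumerate(detections):
            # Motion gate: ignore detections too far from the prediction.
            if math.hypot(det["pos"][0] - px, det["pos"][1] - py) > max_jump:
                continue
            cost = cosine_distance(track["feat"], det["feat"])
            if cost <= max_cosine:  # appearance gate
                candidates.append((cost, track_id, index))

    # Greedy assignment, cheapest appearance cost first.
    matches, used = {}, set()
    for cost, track_id, index in sorted(candidates):
        if track_id not in matches and index not in used:
            matches[track_id] = index
            used.add(index)
    return matches
```

In a real tracker the feature vectors would come from a learned re-identification network, and unmatched detections would start new tracks.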


2011 ◽  
Vol 328-330 ◽  
pp. 2234-2237
Author(s):  
Dong Sheng Liang ◽  
Zhao Hui Liu ◽  
Wen Liu

The detection and tracking of moving targets is widely applied across many fields. To address the shortcomings of traditional video tracking systems, this paper proposes a novel design for a video processing system based on FPGA and DSP hardware, with which moving targets in video can be detected and tracked. In this system, the DSP is the core and mainly executes the video and image processing algorithms, while the FPGA acts as a coprocessor responsible for external data handling and logic. The hardware structure, link configuration, program code, and other aspects of the system are optimized. Finally, in an experiment with an input video frame rate of 40 frames/s and an image resolution of 512 × 512 pixels, with a 16-bit quantized image sequence, the system completes the relevant real-time detection and tracking algorithm and correctly extracts the target positions from the image sequences. The results show that the system offers high processing speed, real-time operation, high accuracy, and stability.
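As a rough sense of scale, the raw input data rate implied by these figures can be worked out directly. The calculation below reads the stated quantization as 16 bits per pixel and assumes a single-channel, uncompressed stream; both are assumptions, and the function name is hypothetical.

```python
def raw_video_rate(width, height, bits_per_pixel, frames_per_second):
    """Uncompressed data rate in bytes per second for a single-channel stream."""
    bytes_per_frame = width * height * bits_per_pixel // 8
    return bytes_per_frame * frames_per_second


rate = raw_video_rate(512, 512, 16, 40)
print(rate, "bytes/s =", rate / 2**20, "MiB/s")  # 20971520 bytes/s = 20.0 MiB/s
```

Under these assumptions the system has to ingest about 20 MiB of pixel data every second.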


2010 ◽  
Vol 20 (1) ◽  
pp. 9-13 ◽  
Author(s):  
Glenn Tellis ◽  
Lori Cimino ◽  
Jennifer Alberti

The purpose of this article is to provide clinical supervisors with information pertaining to state-of-the-art clinic observation technology. We use a novel video-capture technology, the Landro Play Analyzer, to supervise clinical sessions as well as to train students to improve their clinical skills. We can observe four clinical sessions simultaneously from a central observation center. In addition, speech samples can be analyzed in real time; saved on a CD, DVD, or flash/jump drive; viewed in slow motion; paused; and analyzed with Microsoft Excel. Procedures for applying the technology for clinical training and supervision will be discussed.


2002 ◽  
Author(s):  
Wei Liu ◽  
Zeying Chi ◽  
Wenjian Chen


Author(s):  
Gabriel Wilkes ◽  
Roman Engelhardt ◽  
Lars Briem ◽  
Florian Dandl ◽  
Peter Vortisch ◽  
...  

This paper presents the coupling of a state-of-the-art ride-pooling fleet simulation package with the mobiTopp travel demand modeling framework. The coupling of both models enables a detailed agent- and activity-based demand model, in which travelers have the option to use ride-pooling based on real-time offers of an optimized ride-pooling operation. On the one hand, this approach allows the application of detailed mode-choice models based on agent-level attributes coming from mobiTopp functionalities. On the other hand, existing state-of-the-art ride-pooling optimization can be applied to utilize the full potential of ride-pooling. The introduced interface allows mode choice based on real-time fleet information and thereby does not require multiple iterations per simulated day to achieve a balance of ride-pooling demand and supply. The introduced methodology is applied to a case study of an example model in which a total of approximately 70,000 trips are performed. Simulations with a simplified mode-choice model with varying fleet size (0–150 vehicles), fares, and further fleet operators’ settings show that (i) ride-pooling can be a very attractive alternative to existing modes and (ii) the fare model can affect the mode shifts to ride-pooling. Depending on the scenario, the mode share of ride-pooling is between 7.6% and 16.8% and the average distance-weighted occupancy of the ride-pooling fleet varies between 0.75 and 1.17.
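The abstract does not define the occupancy figure. A common reading, assumed here, is passenger-distance divided by vehicle-distance, with empty legs included, which is how a value below 1 can arise. The function name and the per-leg input format are hypothetical.

```python
def distance_weighted_occupancy(segments):
    """Passenger-distance divided by vehicle-distance.

    segments: list of (distance_km, passengers_on_board), one entry per
    driven leg, including empty legs with zero passengers.
    """
    total_distance = sum(distance for distance, _ in segments)
    if total_distance == 0:
        return 0.0  # no driving recorded, so occupancy is reported as zero
    passenger_distance = sum(distance * riders for distance, riders in segments)
    return passenger_distance / total_distance
```

For example, a vehicle that drives 2 km empty, 3 km with one rider, and 5 km with two riders covers 13 passenger-km over 10 vehicle-km, for an occupancy of 1.3.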

