Toward Mass Video Data Analysis: Interactive and Immersive 4D Scene Reconstruction

Sensors ◽  
2020 ◽  
Vol 20 (18) ◽  
pp. 5426
Author(s):  
Matthias Kraus ◽  
Thomas Pollok ◽  
Matthias Miller ◽  
Timon Kilian ◽  
Tobias Moritz ◽  
...  

Technical progress over recent decades has made photo and video recording devices omnipresent. This change has a significant impact on police work, among other fields. It is no longer unusual for a myriad of digital data to accumulate after a criminal act, all of which must be reviewed by criminal investigators to collect evidence or solve the crime. This paper presents the VICTORIA Interactive 4D Scene Reconstruction and Analysis Framework (“ISRA-4D” 1.0), an approach for the visual consolidation of heterogeneous video and image data in a 3D reconstruction of the corresponding environment. First, by reconstructing the environment in which the materials were created, a shared spatial context for all available materials is established. Second, all footage is spatially and temporally registered within this 3D reconstruction. Third, a visualization of the resulting 4D reconstruction (3D scene + time) is provided, which can be analyzed interactively. Additional information on video and image content is also extracted and displayed, and can be analyzed with supporting visualizations. The presented approach facilitates filtering, annotating, analyzing, and getting an overview of large amounts of multimedia material. The framework is evaluated using four case studies, which demonstrate its broad applicability. Furthermore, the framework allows users to immerse themselves in the analysis by entering the scenario in virtual reality. This feature is evaluated qualitatively through interviews with criminal investigators, which outline potential benefits such as improved spatial understanding and the opening of new fields of application.

Author(s):  
M. Russo ◽  
S. Menconero ◽  
L. Baglioni

Augmented Reality (AR) represents a growing communication channel, responding to the need to expand reality with additional information and offering easy and engaging access to digital data. AR for architectural representation allows simple interaction with 3D models, facilitating spatial understanding of complex volumes and of topological relationships between parts, and overcoming some limitations of Virtual Reality. In the last decade, developments in the processing pipeline have brought significant advances in technological and algorithmic aspects, while paying less attention to 3D model generation. For this reason, the article explores the construction of basic geometries for 3D model generation, highlighting the relationship between geometry and topology, which is fundamental for a consistent distribution of normals. Moreover, a critical evaluation of corrective paths for existing 3D models is presented, analysing a complex architectural case study, the virtual model of Villa del Verginese, an emblematic example of the topological problems that can emerge. The final aim of the paper is to refocus attention on 3D model construction, suggesting some "good practices" useful for preventing, minimizing or correcting topological problems, and extending the accessibility of AR to people engaged in architectural representation.


2020 ◽  
pp. 1-10
Author(s):  
Bryce J. Dietrich

Although previous scholars have used image data to answer important political science questions, less attention has been paid to video-based measures. In this study, I use motion detection to understand the extent to which members of Congress (MCs) literally cross the aisle, but motion detection can be used to study a wide range of political phenomena, like protests, political speeches, campaign events, or oral arguments. I find not only that Democrats and Republicans are less willing to literally cross the aisle, but also that this behavior is predictive of future party voting, even when previous party voting is included as a control. However, this is only one of the many ways motion detection can be used by social scientists. In this way, the present study is not the end, but the beginning of an important new line of research in which video data is more actively used in social science research.
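The abstract does not say which motion-detection method is used. As a rough illustration of the general idea only, the sketch below uses simple frame differencing: it reports the share of pixels whose grayscale value changes by more than a threshold between two frames. The function name `motion_fraction`, the threshold value, and the toy frames are all assumptions made for this example, not taken from the study.

```python
def motion_fraction(prev_frame, curr_frame, threshold=25):
    """Return the fraction of pixels that changed by more than `threshold`.

    Frames are equally sized lists of rows of grayscale values (0-255).
    This is plain frame differencing, an assumed stand-in for whatever
    motion-detection method the study actually uses.
    """
    changed = 0
    total = 0
    for prev_row, curr_row in zip(prev_frame, curr_frame):
        for p, c in zip(prev_row, curr_row):
            total += 1
            if abs(c - p) > threshold:
                changed += 1
    return changed / total if total else 0.0


# Hypothetical 3x4 frames: a bright blob moves one column to the right.
frame_a = [
    [0, 0, 0, 0],
    [0, 200, 0, 0],
    [0, 0, 0, 0],
]
frame_b = [
    [0, 0, 0, 0],
    [0, 0, 200, 0],
    [0, 0, 0, 0],
]

score = motion_fraction(frame_a, frame_b)  # 2 of 12 pixels changed
```

A per-frame score like this could then be aggregated over a video, but how the study turns motion into a measure of aisle crossing is not described in the abstract.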


2011 ◽  
Vol 6 (4) ◽  
pp. 179-185 ◽  
Author(s):  
Michelle O'Reilly ◽  
Nicola Parker ◽  
Ian Hutchby

Using video to facilitate data collection has become increasingly common in health research. Using video in research, however, does raise additional ethical concerns. In this paper we utilize family therapy data to provide empirical evidence of how recording equipment is treated. We show that families made a distinction between what was observed through the video by the reflecting team and what was being recorded onto videotape. We show that all parties actively negotiated what should and should not go ‘on the record’, with particular attention to sensitive topics and the responsibility of the therapist. Our findings have important implications for both clinical professionals and researchers using video data. We maintain that informed consent should be an ongoing process and with this in mind we present some arguments pertaining to the current debates in this field of health-care practice.


Robotica ◽  
2000 ◽  
Vol 18 (3) ◽  
pp. 299-303 ◽  
Author(s):  
Carl-Henrik Oertel

Machine vision-based sensing enables automatic hover stabilization of helicopters. Evaluating the image data produced by a camera pointing straight down at the ground yields a drift-free, autonomous on-board position measurement system. No additional information about the appearance of the scenery seen by the camera (e.g. landmarks) is needed. The technique applied is a combination of the 4D approach with two-dimensional template tracking of a priori unknown features.
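To make the notion of two-dimensional template tracking concrete, here is a minimal sketch that locates a small template in a new frame by exhaustive sum-of-squared-differences (SSD) search. It is an assumption-laden toy, not the paper's tracker: the SSD criterion, the function names `ssd` and `track_template`, and the example data are all chosen for illustration, and the 4D approach itself is not modeled.

```python
def ssd(image, template, top, left):
    """Sum of squared differences between `template` and the image patch
    whose upper-left corner is at (top, left)."""
    total = 0
    for r, template_row in enumerate(template):
        for c, t in enumerate(template_row):
            d = image[top + r][left + c] - t
            total += d * d
    return total


def track_template(image, template):
    """Return (row, col) of the patch in `image` that best matches `template`."""
    th, tw = len(template), len(template[0])
    ih, iw = len(image), len(image[0])
    best = None
    for top in range(ih - th + 1):
        for left in range(iw - tw + 1):
            score = ssd(image, template, top, left)
            if best is None or score < best[0]:
                best = (score, top, left)
    return best[1], best[2]


# Hypothetical feature patch cut from an earlier frame at position (1, 1).
template = [
    [9, 5],
    [5, 9],
]
# Hypothetical next frame: the same patch now appears at (2, 3).
frame_next = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 9, 5, 0],
    [0, 0, 0, 5, 9, 0],
    [0, 0, 0, 0, 0, 0],
]

row, col = track_template(frame_next, template)  # (2, 3)
drift = (row - 1, col - 1)                       # image shift of (1, 2) pixels
```

The pixel shift of a tracked feature between frames is the kind of quantity a hover controller could use as a position error signal; converting it to metric displacement would need the camera geometry and height, which this sketch leaves out.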


2011 ◽  
Vol 383-390 ◽  
pp. 5193-5199 ◽  
Author(s):  
Jian Ying Yuan ◽  
Xian Yong Liu ◽  
Zhi Qiang Qiu

In an optical measuring system with a handheld digital camera, image point matching is very important for 3-dimensional (3D) reconstruction. Traditional matching algorithms are usually based on epipolar geometry or multiple baselines. Mistaken matching points cannot be eliminated by epipolar geometry, and many matching points are lost with multiple baselines. In this paper, a robust algorithm is presented to eliminate mistaken matching feature points in the process of 3D reconstruction from multiple images. The algorithm includes three steps: (1) pre-matching the feature points using the constraints of epipolar geometry and image topological structure; (2) eliminating the mistaken matching points by the principle of triangulation across multiple images; (3) refining the camera external parameters by bundle adjustment. After the external parameters of every image have been refined, steps (1) to (3) are repeated until all the feature points have been matched. Comparative experiments with real image data show that mistaken matching feature points can be effectively eliminated and that almost no matching points are lost, giving better performance than traditional matching algorithms.
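The epipolar part of step (1) can be sketched as follows, assuming a known fundamental matrix `F` between two images. A candidate pair is kept only if the point in the second image lies close to the epipolar line induced by the point in the first. The function names, the 1-pixel tolerance, and the sample matches are assumptions for illustration; the paper's topological-structure constraint, the triangulation check of step (2), and the bundle adjustment of step (3) are not covered here.

```python
import math


def epipolar_distance(F, p1, p2):
    """Distance (in pixels) from p2 to the epipolar line F @ [x1, y1, 1]."""
    x1, y1 = p1
    x2, y2 = p2
    # Epipolar line in image 2, written as a*x + b*y + c = 0.
    a = F[0][0] * x1 + F[0][1] * y1 + F[0][2]
    b = F[1][0] * x1 + F[1][1] * y1 + F[1][2]
    c = F[2][0] * x1 + F[2][1] * y1 + F[2][2]
    norm = math.hypot(a, b)
    if norm == 0:
        return float("inf")  # degenerate line: treat as a failed match
    return abs(a * x2 + b * y2 + c) / norm


def prefilter_matches(F, matches, max_dist=1.0):
    """Keep candidate matches that satisfy the epipolar constraint within max_dist."""
    return [(p1, p2) for p1, p2 in matches if epipolar_distance(F, p1, p2) <= max_dist]


# Fundamental matrix of an ideal rectified stereo pair (pure horizontal shift),
# for which the constraint reduces to "matching points share the same row".
F_rectified = [
    [0, 0, 0],
    [0, 0, -1],
    [0, 1, 0],
]
# Hypothetical candidate matches ((x1, y1), (x2, y2)).
candidates = [
    ((10, 20), (14, 20.3)),  # 0.3 px from its epipolar line: kept
    ((30, 40), (33, 47)),    # 7 px away: rejected
    ((5, 8), (9, 8)),        # exactly on the line: kept
]

kept = prefilter_matches(F_rectified, candidates)
```

A wrong match can still lie on the correct epipolar line, which is consistent with the abstract's point that epipolar geometry cannot eliminate every mismatch and that a further multi-image check is needed.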


Author(s):  
Daniel Danso Essel ◽  
Ben-Bright Benuwa ◽  
Benjamin Ghansah

Sparse Representation (SR) and Dictionary Learning (DL) based classifiers have shown promising results in classification tasks, with impressive recognition rates on image data. In Video Semantic Analysis (VSA), however, the local structure of video data contains significant discriminative information required for classification. To the best of our knowledge, this has not been fully explored by recent DL-based approaches. Further, video features from the same video category do not yield similar coding results. Based on the foregoing, a novel learning algorithm for VSA, Sparsity based Locality-Sensitive Discriminative Dictionary Learning (SLSDDL), is proposed in this paper. In the proposed algorithm, a discriminant loss function for the category, based on sparse coding of the sparse coefficients, is introduced into the structure of the Locality-Sensitive Dictionary Learning (LSDL) algorithm. Finally, the sparse coefficients for the testing video feature sample are solved by the optimized method of SLSDDL, and the classification result for the video semantics is obtained by minimizing the error between the original and reconstructed samples. The experimental results show that the proposed SLSDDL significantly improves the performance of video semantic detection compared with state-of-the-art approaches. The proposed approach also shows robustness to diverse video environments, indicating the generality of the approach.
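The final decision step, choosing the category whose atoms reconstruct the sample with the smallest error, can be illustrated with a small sketch. It assumes the dictionary, the per-atom class labels, and the sparse coefficients are already available; the SLSDDL training procedure and its coefficient solver are not reproduced. All names, labels, and numbers below are hypothetical.

```python
def reconstruct(atoms, coeffs):
    """Linear combination of dictionary atoms weighted by `coeffs`."""
    dim = len(atoms[0])
    out = [0.0] * dim
    for atom, w in zip(atoms, coeffs):
        for i in range(dim):
            out[i] += w * atom[i]
    return out


def classify_by_residual(sample, dictionary, labels, coeffs):
    """Pick the class whose atoms alone give the smallest reconstruction error.

    `labels[k]` is the class of atom `dictionary[k]`; `coeffs` are sparse
    coefficients assumed to have been computed elsewhere.
    """
    residuals = {}
    for label in set(labels):
        # Keep only the coefficients that belong to this class.
        class_coeffs = [w if l == label else 0.0 for w, l in zip(coeffs, labels)]
        approx = reconstruct(dictionary, class_coeffs)
        residuals[label] = sum((s - a) ** 2 for s, a in zip(sample, approx)) ** 0.5
    best = min(residuals, key=residuals.get)
    return best, residuals


# Hypothetical 3-atom dictionary with two video categories.
dictionary = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
labels = ["sports", "sports", "news"]
sample = [0.9, 0.1, 0.05]
coeffs = [0.9, 0.1, 0.05]  # assumed output of a sparse coding step

predicted, residuals = classify_by_residual(sample, dictionary, labels, coeffs)
```

Here the "sports" atoms explain almost all of the sample, so that category has the smaller residual and is selected.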


2019 ◽  
Vol 8 (1) ◽  
pp. 47 ◽  
Author(s):  
Franz Kurz ◽  
Seyed Azimi ◽  
Chun-Yu Sheu ◽  
Pablo d’Angelo

The 3D information of road infrastructures is growing in importance with the development of autonomous driving. In this context, the exact 2D position of road markings as well as height information play an important role in, e.g., lane-accurate self-localization of autonomous vehicles. In this paper, the overall task is divided into an automatic segmentation followed by a refined 3D reconstruction. For the segmentation task, we applied a wavelet-enhanced fully convolutional network on multiview high-resolution aerial imagery. Based on the resulting 2D segments in the original images, we propose a successive workflow for the 3D reconstruction of road markings based on least-squares line fitting in multiview imagery. The 3D reconstruction exploits the line character of road markings, optimizing the 3D line location by minimizing the distance from its back projection to the detected 2D line in all the covering images. Results showed an improved intersection over union (IoU) of the automatic road marking segmentation by exploiting the multiview character of the aerial images, and a more accurate 3D reconstruction of the road surface compared to the semiglobal matching (SGM) algorithm. Further, the approach avoids the matching problem in non-textured image parts and is not limited to lines of finite length. The approach is presented and validated on several aerial image data sets covering different scenarios such as motorways and urban regions.
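The segmentation result is reported as IoU (intersection over union). For readers unfamiliar with the metric, the sketch below computes it for two binary masks. The masks are hypothetical toy data, and the convention of returning 1.0 when both masks are empty is an assumption of this example.

```python
def iou(pred_mask, true_mask):
    """Intersection over union of two equally sized binary masks."""
    intersection = 0
    union = 0
    for pred_row, true_row in zip(pred_mask, true_mask):
        for p, t in zip(pred_row, true_row):
            if p and t:
                intersection += 1
            if p or t:
                union += 1
    # Assumed convention: two empty masks agree perfectly.
    return intersection / union if union else 1.0


# Hypothetical 2x4 masks: 1 marks a road-marking pixel.
predicted = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
]
reference = [
    [0, 1, 0, 0],
    [0, 1, 1, 0],
]

value = iou(predicted, reference)  # 3 shared pixels / 4 pixels in the union = 0.75
```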

