PGNet: Pipeline Guidance for Human Key-Point Detection

Entropy ◽  
2020 ◽  
Vol 22 (3) ◽  
pp. 369 ◽  
Author(s):  
Feng Hong ◽  
Changhua Lu ◽  
Chun Liu ◽  
Ruru Liu ◽  
Weiwei Jiang ◽  
...  

Human key-point detection is a challenging research field in computer vision. Convolutional neural models limit the number of parameters and mine local structure, and have made great progress in significant target detection and key-point detection. However, the features extracted by shallow layers mainly lack semantic information, while the features extracted by deep layers contain rich semantic information but lack spatial information, which results in information imbalance and feature extraction imbalance. As network structures grow more complex and the amount of computation increases, the balance between communication time and computation time becomes increasingly important. With improved hardware, network operation time has been greatly reduced by optimizing the network structure and data operation methods. However, as the network structure becomes deeper, the communication cost between networks also increases, even as network computing capacity is optimized. In addition, communication overhead has become a focus of recent attention. We propose a novel network structure, PGNet, which contains three parts: a pipeline guidance strategy (PGS), a Cross-Distance-IoU Loss (CIoU), and a Cascaded Fusion Feature Model (CFFM).

2021 ◽  
Vol 7 ◽  
pp. e704
Author(s):  
Wei Ma ◽  
Shuai Zhang ◽  
Jincai Huang

Unlike traditional visualization methods, augmented reality (AR) inserts virtual objects and information directly into digital representations of the real world, which makes these objects and data easier to understand and interact with. The integration of AR and GIS is a promising way to display spatial information in context. However, most existing AR-GIS applications only provide local spatial information at a fixed location, which exposes them to a set of problems: limited legibility, information clutter, and incomplete spatial relationships. In addition, indoor space structure is complex and GPS is unavailable, so indoor AR systems are further impeded by their limited capacity to detect and display location and semantic information. To address this problem, camera positions are tracked with a localization technique that fuses Bluetooth low energy (BLE) and pedestrian dead reckoning (PDR). The multi-sensor fusion-based algorithm employs a particle filter. Based on the direction and position of the phone, the spatial information is automatically registered onto a live camera view. The proposed algorithm extracts a bounding box of the indoor map and matches it to a real-world scene. Finally, the indoor map and semantic information are rendered into the real world, based on the spatial relationship between the indoor map and the live camera view computed in real time. Experimental results demonstrate that the average positioning error of our approach is 1.47 m, and that 80% of the errors of the proposed method are within approximately 1.8 m. The positioning results show that the AR and indoor map fusion technique can effectively link rich indoor spatial information to real-world scenes. The method is not only suitable for traditional tasks related to indoor navigation, but is also a promising method for crowdsourced data collection and indoor map reconstruction.
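The abstract does not give the details of the BLE/PDR fusion, so the following is only a minimal sketch of how a particle filter could combine the two sources, not the authors' algorithm. It assumes that PDR supplies a step length and heading for each detected step, that BLE supplies a coarse 2D position fix, and that both noise models are Gaussian. All function names and noise parameters are hypothetical.

```python
import math
import random

def pdr_predict(particles, step_length, heading,
                step_sigma=0.1, heading_sigma=0.1, rng=random):
    """Prediction: move every particle by one PDR step with added noise."""
    moved = []
    for x, y in particles:
        s = step_length + rng.gauss(0.0, step_sigma)
        h = heading + rng.gauss(0.0, heading_sigma)
        moved.append((x + s * math.cos(h), y + s * math.sin(h)))
    return moved

def ble_weights(particles, ble_fix, ble_sigma=2.0):
    """Update: weight particles by a Gaussian likelihood around the BLE fix."""
    bx, by = ble_fix
    weights = [
        math.exp(-((x - bx) ** 2 + (y - by) ** 2) / (2.0 * ble_sigma ** 2))
        for x, y in particles
    ]
    total = sum(weights)
    if total == 0.0:
        # All likelihoods underflowed: fall back to uniform weights.
        return [1.0 / len(particles)] * len(particles)
    return [w / total for w in weights]

def resample(particles, weights, rng=random):
    """Resampling: draw a new particle set in proportion to the weights."""
    return rng.choices(particles, weights=weights, k=len(particles))

def estimate(particles):
    """Position estimate: mean of the particle cloud."""
    n = len(particles)
    return (sum(p[0] for p in particles) / n,
            sum(p[1] for p in particles) / n)

def particle_filter_step(particles, step_length, heading, ble_fix, rng=random):
    """One predict / update / resample cycle."""
    particles = pdr_predict(particles, step_length, heading, rng=rng)
    weights = ble_weights(particles, ble_fix)
    particles = resample(particles, weights, rng=rng)
    return particles, estimate(particles)
```

In this sketch the PDR motion model keeps the estimate smooth between BLE fixes, while the BLE likelihood bounds the drift that dead reckoning accumulates over time.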


2021 ◽  
Vol 15 ◽  
Author(s):  
Liqun Gao ◽  
Yujia Liu ◽  
Hongwu Zhuang ◽  
Haiyang Wang ◽  
Bin Zhou ◽  
...  

With the rapid rise of agent technology, public opinion early warning agents have attracted wide attention. A deep learning model can make such an agent more automatic and efficient, so deep learning is well suited to public opinion early warning tasks such as popularity prediction or emergency outbreak. In this context, improving the ability to automatically analyze and predict the virality of information cascades is one of the tasks that deep learning approaches address. However, most existing studies address this task by analyzing the network structure underlying the cascade. Recent studies proposed network-agnostic cascade virality prediction (without network structure), but did not consider the fusion of more effective features. In this paper, we propose an innovative cascade virality prediction model named CasWarn. It can be quickly deployed in intelligent agents to effectively predict the virality of public opinion information for different industries. Inspired by the network-agnostic model, this model extracts the key features of an information cascade (independent of the underlying network structure), including dissemination scale, emotional polarity ratio, and semantic evolution. We use two improved neural network frameworks to embed these features, and then apply a classification task to predict the cascade virality. We conduct comprehensive experiments on two large social network datasets. The experimental results show that CasWarn makes timely and effective cascade virality predictions and verify that each feature model of CasWarn helps improve performance.


2020 ◽  
Vol 2020 ◽  
pp. 1-20
Author(s):  
Cheng Xu ◽  
Hengjie Luo ◽  
Hong Bao ◽  
Pengfei Wang

The Internet of Vehicles (IoV) is an important artificial intelligence research field for intelligent transportation applications. Complex event interactions are important methods for data flow processing in a Vehicle to Everything (V2X) environment. Unlike the classic Internet of Things (IoT) systems, data streams in V2X include both temporal information and spatial information. Thus, effectively expressing and addressing spatiotemporal data interactions in the IoV is an urgent problem. To solve this problem, we propose a spatiotemporal event interaction model (STEIM). STEIM uses a time period and a raster map for its temporal model and spatial model, respectively. In this paper, first, we provide a spatiotemporal operator and a complete STEIM grammar that effectively expresses the spatiotemporal information of the spatiotemporal event flow in the V2X environment. Second, we describe the design of the operational semantics of the STEIM from the formal semantics. In addition, we provide a spatiotemporal event-stream processing algorithm that is based on the Petri net model. The STEIM establishes a mechanism for V2X event-stream temporal and spatial processing. Finally, the effectiveness of the STEIM-based system is demonstrated experimentally.
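The abstract states only that STEIM pairs a time period with a raster map. The sketch below is therefore an illustration of that pairing, not the STEIM grammar or its operators. The class `SpatioTemporalEvent`, the helper `to_cell`, and the `co_occur` predicate are all hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class SpatioTemporalEvent:
    name: str
    t_start: float          # start of the time period (seconds)
    t_end: float            # end of the time period (seconds)
    cell: Tuple[int, int]   # (row, col) of the raster cell

def to_cell(x: float, y: float, cell_size: float) -> Tuple[int, int]:
    """Map a planar coordinate to a raster cell index (row, col)."""
    return (int(y // cell_size), int(x // cell_size))

def overlaps_in_time(a: SpatioTemporalEvent, b: SpatioTemporalEvent) -> bool:
    """True if the two closed time periods share at least one instant."""
    return a.t_start <= b.t_end and b.t_start <= a.t_end

def co_occur(a: SpatioTemporalEvent, b: SpatioTemporalEvent) -> bool:
    """Hypothetical operator: same raster cell and overlapping periods."""
    return a.cell == b.cell and overlaps_in_time(a, b)
```

For example, two vehicles reporting events in the same cell during overlapping periods would satisfy `co_occur`, which is the kind of condition a spatiotemporal operator could test before an event-stream rule fires.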


Author(s):  
M. Chi ◽  
Y. Liu

Abstract. Since the Tang Dynasty (618–907 AD), the Tang-Tibet Road has been the only way from inland China to Qinghai and Tibet, and even to other countries such as Nepal and India. It ties and bonds various ethnic groups and regions, integrates cultural memories and cross-cultural communication achievements from ancient times to the present, and witnesses the dynamic propagation of the culture. Affected by the environment, climate, and wars, the Tang-Tibet Road was intermittently impassable during its progressive development in history. The routes of each of its sections changed from time to time; eventually, an ancient road network was formed, consisting of one trunk road, two subsidiary roads in the north and south, several branches, and scattered auxiliary routes separated from the system, among which there were both outward-oriented international passages and inward-oriented passages. Nonetheless, research on the Tang-Tibet Road is insufficient at the current stage. Addressing the problems identified in a review of existing research, the present work probes deeper into the network structure of the Tang-Tibet Road. How the historical corridor was generated and evolved is examined from a regional perspective. In particular, strategies for designing a spatial information system for the Tibet section of the Tang-Tibet Road are explained, to promote the exploration and use of cultural heritage in Tibet, in an effort to preserve this heritage while developing Tibet’s society and economy.


2005 ◽  
Vol 47 (3) ◽  
Author(s):  
Thomas Barkowsky ◽  
John Bateman ◽  
Christian Freksa ◽  
Wolfram Burgard ◽  
Markus Knauff

Summary. The Transregional Collaborative Research Center SFB/TR 8 Spatial Cognition was established by the German Science Foundation (DFG) at the Universities of Bremen and Freiburg in January 2003. Thirteen research projects pursue interdisciplinary research on intelligent spatial information processing. This article introduces the research field of spatial cognition and reports on aspects from cognitive psychology, cognitive robotics, linguistics, and artificial intelligence.


2021 ◽  
Vol 18 (2) ◽  
pp. 172988142110076
Author(s):  
Tao Ku ◽  
Qirui Yang ◽  
Hao Zhang

Recently, convolutional neural networks (CNNs) have led to significant improvements in the field of computer vision, especially in the accuracy and speed of semantic segmentation tasks, which has greatly improved robot scene perception. In this article, we propose a multilevel feature fusion dilated convolution network (Refine-DeepLab). By improving the spatial pyramid pooling structure, we propose a multiscale hybrid dilated convolution module, which captures rich context information and effectively alleviates the contradiction between the receptive field size and the dilated convolution operation. At the same time, the high-level semantic information and low-level semantic information obtained through multilevel and multiscale feature extraction can effectively improve the capture of global information and the performance of large-scale target segmentation. The encoder–decoder gradually recovers spatial information while capturing high-level semantic information, resulting in sharper object boundaries. We evaluate our approach thoroughly on the PASCAL VOC 2012 data set without MS COCO data set pretraining. Extensive experiments verify the effectiveness of the proposed Refine-DeepLab model, which achieves a state-of-the-art result of 81.73% mean intersection-over-union on the validation set.
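Mean intersection-over-union, the metric reported above, can be sketched in a few lines. This is a simplified illustration that works on flattened label sequences for a single prediction. Benchmark evaluations typically accumulate counts over the whole data set before averaging, and the choice to skip classes absent from both inputs is an assumption made here.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over the classes that appear in pred or target.

    pred and target are equal-length sequences of integer class labels,
    for example flattened segmentation masks.
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if p == t:
            # Pixel counts once toward both intersection and union.
            intersection[p] += 1
            union[p] += 1
        else:
            # Mismatch enlarges the union of both classes involved.
            union[p] += 1
            union[t] += 1
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0
```

With `pred = [0, 0, 1, 1]` and `target = [0, 1, 1, 1]`, class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is 7/12.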


Author(s):  
F.-L. Krause ◽  
M. Ciesla ◽  
E. Rieger ◽  
M. Stephan ◽  
A. Ulbrich

Abstract. To integrate the tasks that must be mastered within the development process, it is necessary to take account of non-geometrical information, from product design to manufacturing process, alongside the geometrical design shape. The introduction of objects as carriers of semantic information leads to the use of features. In the present contribution, a concept and its realization are described that facilitate a flexible definition and computer-internal representation of features, as well as their interpretation as semantically endowed objects. Based on the feature model, approaches are introduced to support product development in the subtasks of conception, design, integrated planning of manufacture, and quality assurance.


2012 ◽  
Vol 18 (1) ◽  
Author(s):  
A. Nagy ◽  
P. Riczu ◽  
J. Tamás ◽  
Z. Szabó ◽  
J. Nyéki ◽  
...  

The research field was at Siófok, in Hungary, which is situated on the south-east side of Lake Balaton. The soil is sandy loam and loam, and the peach orchard is irrigated. Mainly Sweet Lady (early ripening), Red Heaven (medium ripening) and Weinberger (early ripening) varieties were planted. In order to achieve the optimal development level of the trees and the maximal yield and fruit diameter (Sweet Lady 60–75 mm, Red Heaven 60–70 mm, Weinberger 50–60 mm), a continuous water and nutrient supply is required. The irrigation modeling was set up in CROPWAT 8.0 based on the climatic, crop and soil data inputs of the last 10 years. Based on the results, a large amount of water is needed for optimal growth of the fruit trees, particularly in the summer months, both with active ground cover (+) and with bare soil (–). The irrigation requirement of a tree was found to be at most 4 l/hour in certain cases. This irrigation intensity can be achieved, calculated with a 12-hour operating time, by using a continuous water NAAN Tif drip tube with 2 l/h flux at 3 atm pressure and 16 mm pipe diameter. If a lower irrigation intensity is required, irrigation can be controlled by decreasing the operating time.


2011 ◽  
Vol 225-226 ◽  
pp. 827-830
Author(s):  
Ai Wen Jiang ◽  
Gao Rong Zeng

Video text provides important semantic information in video content analysis. However, OCR performs poorly on video text with a complex background. Most previous approaches to extracting overlay text from videos are based on traditional binarization and pay little attention to multi-information integration, especially to fusing the background information. This paper presents an effective method that precisely extracts characters from videos so that OCR can achieve good recognition performance. The proposed method combines multiple kinds of information, including background information, edge information, and the characters’ spatial information. Experimental results show that it is robust to complex backgrounds and various text appearances.


2018 ◽  
Vol 2018 ◽  
pp. 1-11
Author(s):  
Linyuan Xia ◽  
Qiumei Huang ◽  
Dongjin Wu

Contextual location prediction is an important topic in the field of personalized location recommendation in LBS (location-based services). With the advancement of mobile positioning techniques and the various sensors embedded in smartphones, it is convenient to obtain massive human mobility trajectories and to derive a large amount of valuable information from geospatial big data. Extracting and recognizing personally interesting places and predicting the next semantic location have become a research hot spot in LBS. In this paper, we propose an approach to predict the next personal semantic place using historical visiting patterns derived from mobile device logs. To address the problems of location imprecision and lack of semantic information, a modified trip-identify method is employed to extract key visit points from GPS trajectories more accurately, while semantic information is added through stay point detection and semantic place recognition. Finally, a decision tree model is adopted to explore the spatial, temporal, and sequential features in contextual location prediction. To validate the effectiveness of our approach, experiments were conducted on a trajectory collection from the Guangzhou downtown area. The results verified the feasibility of our approach for contextual location prediction from continuous mobile device logs.
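The abstract does not describe its modified trip-identify method, so the sketch below shows only a common threshold-based form of stay point detection, as a rough illustration of the general idea. A stay point is emitted when consecutive GPS fixes remain within a distance threshold of an anchor fix for at least a minimum duration. The thresholds (200 m, 20 minutes) and function names are assumptions chosen for illustration.

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 coordinates."""
    r = 6371000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def detect_stay_points(track, dist_thresh_m=200.0, time_thresh_s=1200.0):
    """Return stay points as (mean_lat, mean_lon, arrive_t, leave_t).

    track is a list of (lat, lon, t_seconds) tuples sorted by time.
    """
    stays = []
    i, n = 0, len(track)
    while i < n:
        # Extend j while fixes stay within the distance threshold of anchor i.
        j = i + 1
        while j < n and haversine_m(track[i][0], track[i][1],
                                    track[j][0], track[j][1]) <= dist_thresh_m:
            j += 1
        # Fixes i..j-1 form a candidate cluster; keep it if it lasted long enough.
        if track[j - 1][2] - track[i][2] >= time_thresh_s:
            cluster = track[i:j]
            lat = sum(p[0] for p in cluster) / len(cluster)
            lon = sum(p[1] for p in cluster) / len(cluster)
            stays.append((lat, lon, track[i][2], track[j - 1][2]))
            i = j
        else:
            i += 1
    return stays
```

Each detected stay point could then be matched against points of interest to attach a semantic label such as "home" or "office", which is the kind of input a downstream classifier like a decision tree would consume.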

