Towards a Distributed Large-Scale Dynamic Graph Data Store

Marbor: A novel large-scale graph data storage and processing framework

2014 IEEE 33rd International Performance Computing and Communications Conference (IPCCC) ◽

10.1109/pccc.2014.7017031 ◽

2014 ◽

Author(s):

Wei Zhou ◽

Yun Gao ◽

Jizhong Han ◽

Zhiyong Xu

Keyword(s):

Data Storage ◽

Large Scale ◽

Graph Data ◽

Processing Framework

Multi-modal transportation recommendation with unified route representation learning

Proceedings of the VLDB Endowment ◽

10.14778/3430915.3430924 ◽

2020 ◽

Vol 14 (3) ◽

pp. 342-350

Author(s):

Hao Liu ◽

Jindong Han ◽

Yanjie Fu ◽

Jingbo Zhou ◽

Xinjiang Lu ◽

...

Keyword(s):

Large Scale ◽

Transportation Networks ◽

Representation Learning ◽

Transportation Systems ◽

Graph Representation ◽

Dynamic Graph ◽

Arbitrary Length ◽

Task Learning ◽

Semantic Coherence ◽

Spatio Temporal

Multi-modal transportation recommendation aims to provide the most appropriate travel route with various transportation modes according to certain criteria. After analyzing large-scale navigation data, we find that route representations exhibit two patterns: spatio-temporal autocorrelations within transportation networks and the semantic coherence of route sequences. However, few studies consider both patterns when developing multi-modal transportation systems. To this end, in this paper, we study multi-modal transportation recommendation with unified route representation learning by exploiting both spatio-temporal dependencies in transportation networks and the semantic coherence of historical routes. Specifically, we propose to unify dynamic graph representation learning and hierarchical multi-task learning for multi-modal transportation recommendations. Along this line, we first transform the multi-modal transportation network into time-dependent multi-view transportation graphs and propose a spatiotemporal graph neural network module to capture the spatial and temporal autocorrelation. Then, we introduce a coherent-aware attentive route representation learning module to project arbitrary-length routes into fixed-length representation vectors, with explicit modeling of route coherence from historical routes. Moreover, we develop a hierarchical multi-task learning module to differentiate route representations for different transport modes, guided by the final recommendation feedback as well as multiple auxiliary tasks attached to different network layers. Extensive experimental results on two large-scale real-world datasets demonstrate that the proposed system outperforms eight baselines.
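
The idea of projecting arbitrary-length routes into fixed-length vectors can be illustrated with plain attention pooling. This is only a minimal sketch and not the paper's coherent-aware module: the function name `attentive_pool`, the dot-product scoring, and the `query` vector are assumptions made for illustration, and the route is assumed to contain at least one segment.

```python
import math

def attentive_pool(segment_vectors, query):
    """Softmax-weighted average of segment vectors (hypothetical helper).

    The output always has len(query) entries, however many segments the route has.
    """
    # Dot-product score of each segment against the (assumed) query vector.
    scores = [sum(q * v for q, v in zip(query, vec)) for vec in segment_vectors]
    peak = max(scores)  # subtract the max before exp() for numerical stability
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    dim = len(query)
    return [
        sum(w * vec[i] for w, vec in zip(weights, segment_vectors)) / total
        for i in range(dim)
    ]

short_route = [[1.0, 0.0], [3.0, 2.0]]
long_route = [[1.0, 0.0], [3.0, 2.0], [0.5, 0.5], [2.0, 1.0], [4.0, 4.0]]
# Both calls return a vector of length 2 despite the different route lengths.
pooled_short = attentive_pool(short_route, [0.0, 0.0])
pooled_long = attentive_pool(long_route, [0.1, 0.2])
```

With an all-zero query every segment gets the same weight, so the pooled vector reduces to the plain mean of the segments.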

Towards Massive RDF Storage in NoSQL Databases

Advances in Data Mining and Database Management - Emerging Technologies and Applications in Data Processing and Management ◽

10.4018/978-1-5225-8446-9.ch013 ◽

2019 ◽

pp. 263-284 ◽

Cited By ~ 2

Author(s):

Zongmin Ma ◽

Li Yan

Keyword(s):

Data Storage ◽

Large Scale ◽

Future Research ◽

Nosql Databases ◽

Current State ◽

Data Store ◽

Rdf Data ◽

Description Framework ◽

Resource Description ◽

The Web

The resource description framework (RDF) is a model for representing information resources on the web. With the widespread acceptance of RDF as the de facto standard recommended by W3C (World Wide Web Consortium) for the representation and exchange of information on the web, a huge amount of RDF data is proliferating and becoming available. RDF data management is therefore of increasing importance and has attracted attention in the database community as well as the Semantic Web community. Currently, much work has been devoted to proposing different solutions to store large-scale RDF data efficiently. In order to manage massive RDF data, NoSQL (not only SQL) databases have been used for scalable RDF data storage. This chapter focuses on using various NoSQL databases to store massive RDF data. An up-to-date overview of the current state of the art in RDF data storage in NoSQL databases is provided. The chapter also aims to offer suggestions for future research.
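
As a rough illustration of what storing RDF in a key-value style NoSQL layout can look like, the sketch below keeps three permutation indexes over (subject, predicate, object) triples in plain dictionaries. It is not taken from the chapter: the class `TripleIndex`, its method names, and the in-memory dictionaries standing in for a real store are all assumptions.

```python
from collections import defaultdict

class TripleIndex:
    """Toy in-memory stand-in for a key-value store holding RDF triples."""

    def __init__(self):
        # Three permutation indexes, so a lookup can start from any bound term.
        self.spo = defaultdict(set)  # subject   -> {(predicate, object)}
        self.pos = defaultdict(set)  # predicate -> {(object, subject)}
        self.osp = defaultdict(set)  # object    -> {(subject, predicate)}

    def add(self, s, p, o):
        self.spo[s].add((p, o))
        self.pos[p].add((o, s))
        self.osp[o].add((s, p))

    def match(self, s=None, p=None, o=None):
        """Return sorted triples matching the pattern; None acts as a wildcard."""
        if s is not None:
            candidates = [(s, p2, o2) for p2, o2 in self.spo.get(s, ())]
        elif p is not None:
            candidates = [(s2, p, o2) for o2, s2 in self.pos.get(p, ())]
        elif o is not None:
            candidates = [(s2, p2, o) for s2, p2 in self.osp.get(o, ())]
        else:
            candidates = [(s2, p2, o2)
                          for s2, pairs in self.spo.items() for p2, o2 in pairs]
        # Filter on any remaining bound terms not covered by the chosen index.
        return sorted(t for t in candidates
                      if (p is None or t[1] == p) and (o is None or t[2] == o))

store = TripleIndex()
store.add("ex:alice", "foaf:knows", "ex:bob")
store.add("ex:alice", "foaf:name", "Alice")
store.add("ex:bob", "foaf:knows", "ex:carol")
knows = store.match(p="foaf:knows")  # every "knows" edge, via the POS index
```

Keeping several orderings of the same triples trades storage for lookup speed, which is the kind of design choice a distributed store would also have to weigh.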

A Review of RDF Storage in NoSQL Databases

Advances in Systems Analysis, Software Engineering, and High Performance Computing - Managing Big Data in Cloud Computing Environments ◽

10.4018/978-1-4666-9834-5.ch009 ◽

2016 ◽

pp. 210-229 ◽

Cited By ~ 2

Author(s):

Zongmin Ma ◽

Li Yan

Keyword(s):

Data Storage ◽

Large Scale ◽

Future Research ◽

Nosql Databases ◽

Current State ◽

Data Store ◽

Rdf Data ◽

Description Framework ◽

Resource Description ◽

The Web

The Resource Description Framework (RDF) is a model for representing information resources on the Web. With the widespread acceptance of RDF as the de facto standard recommended by W3C (World Wide Web Consortium) for the representation and exchange of information on the Web, a huge amount of RDF data is proliferating and becoming available. RDF data management is therefore of increasing importance and has attracted attention in the database community as well as the Semantic Web community. Currently, much work has been devoted to proposing different solutions to store large-scale RDF data efficiently. In order to manage massive RDF data, NoSQL (“not only SQL”) databases have been used for scalable RDF data storage. This chapter focuses on using various NoSQL databases to store massive RDF data. An up-to-date overview of the current state of the art in RDF data storage in NoSQL databases is provided. The chapter also aims to offer suggestions for future research.

Learning Graph Convolutional Network for Skeleton-Based Human Action Recognition by Neural Searching

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i03.5652 ◽

2020 ◽

Vol 34 (03) ◽

pp. 2669-2676 ◽

Cited By ~ 11

Author(s):

Wei Peng ◽

Xiaopeng Hong ◽

Haoyu Chen ◽

Guoying Zhao

Keyword(s):

Action Recognition ◽

Large Scale ◽

Order Approximation ◽

Human Action Recognition ◽

Search Space ◽

Human Action ◽

Higher Order ◽

Dynamic Graph ◽

Convolutional Network ◽

Representational Capacity

Human action recognition from skeleton data, fuelled by the Graph Convolutional Network (GCN) with its powerful capability of modeling non-Euclidean data, has attracted a lot of attention. However, many existing GCNs provide a pre-defined graph structure and share it through the entire network, which can lose implicit joint correlations, especially for the higher-level features. Besides, the mainstream spectral GCN is approximated by a one-order hop, so higher-order connections are not well captured. Addressing these issues by hand requires huge effort to design a better GCN architecture. To address these problems, we turn to Neural Architecture Search (NAS) and propose the first automatically designed GCN for this task. Specifically, we explore the spatial-temporal correlations between nodes and build a search space with multiple dynamic graph modules. Besides, we introduce multiple-hop modules and expect to break the limitation of representational capacity caused by one-order approximation. Moreover, a corresponding sampling- and memory-efficient evolution strategy is proposed to search in this space. The resulting architecture proves the effectiveness of the higher-order approximation and the layer-wise dynamic graph modules. To evaluate the performance of the searched model, we conduct extensive experiments on two very large-scale skeleton-based action recognition datasets. The results show that our model achieves state-of-the-art results in terms of the given metrics.

Label Propagation-Based Parallel Graph Partitioning for Large-Scale Graph Data

IEEE Access ◽

10.1109/access.2020.2987355 ◽

2020 ◽

Vol 8 ◽

pp. 72801-72813

Author(s):

Minho Bae ◽

Minjoong Jeong ◽

Sangyoon Oh

Keyword(s):

Graph Partitioning ◽

Large Scale ◽

Label Propagation ◽

Graph Data ◽

Parallel Graph

DEM Extraction from ALS Point Clouds in Forest Areas via Graph Convolution Network

Remote Sensing ◽

10.3390/rs12010178 ◽

2020 ◽

Vol 12 (1) ◽

pp. 178 ◽

Cited By ~ 1

Author(s):

Jinming Zhang ◽

Xiangyun Hu ◽

Hengming Dai ◽

ShenRun Qu

Keyword(s):

Deep Learning ◽

Point Cloud ◽

Laser Scanning ◽

Large Scale ◽

Spatial Relationship ◽

Point Clouds ◽

Current Data ◽

Data Sampling ◽

Dynamic Graph ◽

Convolution Model

It is difficult to extract a digital elevation model (DEM) from an airborne laser scanning (ALS) point cloud in a forest area because of the irregular and uneven distribution of ground and vegetation points. Machine learning methods, especially deep learning, have shown powerful feature extraction capability in point cloud classification. However, most of the existing deep learning frameworks, such as PointNet, dynamic graph convolutional neural network (DGCNN), and SparseConvNet, do not account for the particular characteristics of ALS point clouds. For large-scene laser point clouds, the current data preprocessing methods are mostly based on random sampling, which is not suitable for DEM extraction tasks. In this study, we propose a novel data sampling algorithm, named T-Sampling, for the data preparation of patch-based training and classification. T-Sampling uses the set of the lowest points in a certain area as basic points, with other points added to supplement it, which can guarantee the integrity of the terrain in the sampling area. In the learning part, we propose a new terrain-based convolution model named Tin-EdgeConv that fully considers the spatial relationship between ground and non-ground points when constructing a directed graph. We design a new network based on Tin-EdgeConv to extract local features and use the PointNet architecture to extract global context information. Finally, we combine this information effectively with a designed attention fusion module. These aspects are important in achieving high classification accuracy. We evaluate the proposed method by using large-scale data from forest areas. Results show that our method is more accurate than existing algorithms.
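
The "lowest points as basic points, supplemented by other points" idea can be sketched as follows. This is a loose illustration rather than the published T-Sampling algorithm: the square grid cells, the `cell_size` parameter, the random supplement, and the function name `lowest_point_sample` are all assumptions.

```python
import random

def lowest_point_sample(points, cell_size, n_samples, seed=0):
    """Pick the lowest point of each grid cell, then top up with random points.

    points: list of (x, y, z) tuples. Returns at most n_samples points.
    """
    lowest = {}
    for pt in points:
        x, y, z = pt
        key = (int(x // cell_size), int(y // cell_size))  # cell containing (x, y)
        if key not in lowest or z < lowest[key][2]:
            lowest[key] = pt
    # Basic points: one per occupied cell, ordered from lowest elevation upward.
    basic = sorted(lowest.values(), key=lambda p: p[2])
    if len(basic) >= n_samples:
        return basic[:n_samples]
    # Supplement with randomly chosen remaining points (an assumed strategy).
    chosen = set(basic)
    rest = [pt for pt in points if pt not in chosen]
    rng = random.Random(seed)
    extra = rng.sample(rest, min(n_samples - len(basic), len(rest)))
    return basic + extra

cloud = [(0.5, 0.5, 3.0), (0.2, 0.1, 1.0), (1.5, 0.5, 2.0),
         (1.2, 0.3, 5.0), (0.9, 0.9, 4.0)]
sample = lowest_point_sample(cloud, cell_size=1.0, n_samples=3)
```

Because every occupied cell contributes its lowest point before any supplement is drawn, likely ground returns survive the sampling even when vegetation points dominate the patch.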

Optimal Representation of Large-Scale Graph Data Based on Grid Clustering and K2-Tree

Mathematical Problems in Engineering ◽

10.1155/2020/2354875 ◽

2020 ◽

Vol 2020 ◽

pp. 1-8

Author(s):

Fengying Li ◽

Enyi Yang ◽

Anqiao Ma ◽

Rongsheng Dong

Keyword(s):

Adjacency Matrix ◽

Large Scale ◽

Compact Representation ◽

Graph Data ◽

Storage Overhead ◽

Time Space ◽

Query Algorithm ◽

Representation Scheme ◽

The Given ◽

Density Threshold

The application of appropriate graph data compression technology to store and manipulate graph data with tens of thousands of nodes and edges is a prerequisite for analyzing large-scale graph data. The traditional K2-tree representation scheme mechanically partitions the adjacency matrix, which causes dense intervals to be split, resulting in additional storage overhead. As the size of the graph data increases, the query time of the K2-tree continues to increase. In view of the above problems, we propose a compact representation scheme for graph data based on grid clustering and K2-tree. First, we divide the adjacency matrix into several grids of the same size. Then, we continuously filter and merge these grids until the grid density satisfies the given density threshold. Finally, each large grid that meets the density threshold is given a K2-tree compact representation. On this basis, we further give the relevant node neighbor query algorithm. The experimental results show that, compared with the current best K2-BDC algorithm, our scheme achieves a better time/space tradeoff.
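
For readers unfamiliar with the underlying structure, the sketch below shows the basic K2-tree idea (k = 2) on a small adjacency matrix: all-zero blocks are pruned, and a neighbor query only descends into non-empty quadrants. It is a pointer-based simplification, since real K2-trees store the same tree as level-order bitmaps, and it does not implement the grid filtering and merging step of the paper. The function names and the power-of-two matrix size are assumptions.

```python
def build(matrix, row=0, col=0, size=None):
    """Return None for an all-zero block, True for a set cell, else 4 children.

    The matrix is assumed to be square with a power-of-two side length.
    """
    if size is None:
        size = len(matrix)
    if size == 1:
        return True if matrix[row][col] else None
    half = size // 2
    # Children in order: top-left, top-right, bottom-left, bottom-right.
    children = [build(matrix, row + dr, col + dc, half)
                for dr in (0, half) for dc in (0, half)]
    return children if any(c is not None for c in children) else None

def neighbors(node, row, size, col_offset=0):
    """Columns j with matrix[row][j] == 1, visiting only non-empty quadrants."""
    if node is None:
        return []
    if size == 1:
        return [col_offset]
    half = size // 2
    top = row < half
    base = 0 if top else 2            # pick the pair of quadrants covering `row`
    sub_row = row if top else row - half
    return (neighbors(node[base], sub_row, half, col_offset)
            + neighbors(node[base + 1], sub_row, half, col_offset + half))

adj = [
    [0, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 0, 0],
    [1, 0, 0, 0],
]
tree = build(adj)
out_edges = neighbors(tree, 1, len(adj))  # successors of node 1
```

The bottom-right quadrant of `adj` is empty, so it is stored as a single `None` and never visited, which is where the space and query savings on sparse matrices come from.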

Commonsense Knowledge Aware Conversation Generation with Graph Attention

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2018/643 ◽

2018 ◽

Cited By ~ 35

Author(s):

Hao Zhou ◽

Tom Young ◽

Minlie Huang ◽

Haizhou Zhao ◽

Jingfang Xu ◽

...

Keyword(s):

Language Processing ◽

Large Scale ◽

Semantic Information ◽

Attention Mechanism ◽

Generation Model ◽

Dynamic Graph ◽

Commonsense Knowledge ◽

Word Generation ◽

Proposed Model ◽

Knowledge Graphs

Commonsense knowledge is vital to many natural language processing tasks. In this paper, we present a novel open-domain conversation generation model to demonstrate how large-scale commonsense knowledge can facilitate language understanding and generation. Given a user post, the model retrieves relevant knowledge graphs from a knowledge base and then encodes the graphs with a static graph attention mechanism, which augments the semantic information of the post and thus supports better understanding of the post. Then, during word generation, the model attentively reads the retrieved knowledge graphs and the knowledge triples within each graph to facilitate better generation through a dynamic graph attention mechanism. This is the first attempt that uses large-scale commonsense knowledge in conversation generation. Furthermore, unlike existing models that use knowledge triples (entities) separately and independently, our model treats each knowledge graph as a whole, which encodes more structured, connected semantic information in the graphs. Experiments show that the proposed model can generate more appropriate and informative responses than state-of-the-art baselines.

A New Dynamic Graph Structure for Large-Scale Transportation Networks

Lecture Notes in Computer Science - Algorithms and Complexity ◽

10.1007/978-3-642-38233-8_26 ◽

2013 ◽

pp. 312-323 ◽

Cited By ~ 3

Author(s):

Georgia Mali ◽

Panagiotis Michail ◽

Andreas Paraskevopoulos ◽

Christos Zaroliagis

Keyword(s):

Large Scale ◽

Transportation Networks ◽

Graph Structure ◽

Dynamic Graph
