Optimal Representation of Large-Scale Graph Data Based on Grid Clustering and K2-Tree

2020 ◽  
Vol 2020 ◽  
pp. 1-8
Author(s):  
Fengying Li ◽  
Enyi Yang ◽  
Anqiao Ma ◽  
Rongsheng Dong

The application of appropriate graph data compression technology to store and manipulate graph data with tens of thousands of nodes and edges is a prerequisite for analyzing large-scale graph data. The traditional K2-tree representation scheme mechanically partitions the adjacency matrix, which splits dense intervals and results in additional storage overhead. As the size of the graph data increases, the query time of the K2-tree continues to increase. To address these problems, we propose a compact representation scheme for graph data based on grid clustering and K2-tree. First, we divide the adjacency matrix into several grids of the same size. Then, we repeatedly filter and merge these grids until the grid density satisfies the given density threshold. Finally, each large grid that meets the density threshold is given a compact K2-tree representation. On this basis, we further give the corresponding node neighbor query algorithm. The experimental results show that, compared with the current best K2-BDC algorithm, our scheme can achieve a better time/space tradeoff.
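For readers unfamiliar with the underlying structure, the following is a minimal sketch of a plain K2-tree (k = 2) built from a small adjacency matrix, together with a direct-neighbor query. It does not implement the grid clustering and merging steps of the abstract, whose exact rules are not given here. The function names `build_k2tree` and `neighbors`, the list-based bitmaps `T` and `L`, and the requirement that the matrix size be a power of k are assumptions made for illustration.

```python
def build_k2tree(matrix, k=2):
    """Build level-order bitmaps (T, L) of a k2-tree.

    Assumption: matrix is square, 0/1-valued, with side length a power of k.
    T holds the bits of all internal levels, L the bits of the last level.
    """
    n = len(matrix)
    T, L = [], []
    current = [(0, 0, n)]  # submatrices (row, col, size) still to subdivide
    while current:
        nxt, bits = [], []
        child = current[0][2] // k  # side length of the children at this level
        for r, c, size in current:
            for i in range(k):
                for j in range(k):
                    rr, cc = r + i * child, c + j * child
                    has_one = any(
                        matrix[x][y]
                        for x in range(rr, rr + child)
                        for y in range(cc, cc + child)
                    )
                    bits.append(1 if has_one else 0)
                    if has_one and child > 1:
                        nxt.append((rr, cc, child))
        # Children of size 1 are individual cells, so their bits go to L.
        (L if child == 1 else T).extend(bits)
        current = nxt
    return T, L


def neighbors(T, L, n, p, k=2):
    """Return the columns q with matrix[p][q] == 1, in ascending order."""
    bits = T + L
    rank = [0]  # rank[i] = number of 1 bits in T[0:i]
    for b in T:
        rank.append(rank[-1] + b)
    result = []

    def visit(size, row, col_offset, start):
        child = size // k
        i = row // child  # which row of children contains the query row
        for j in range(k):
            pos = start + i * k + j
            if bits[pos]:
                if child == 1:
                    result.append(col_offset + j)
                else:
                    # Children of the node at pos start at rank1(T, pos) * k^2.
                    visit(child, row % child, col_offset + j * child,
                          rank[pos + 1] * k * k)

    visit(n, p, 0, 0)
    return result


adjacency = [
    [0, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 0, 1],
]
T, L = build_k2tree(adjacency)
print(T, L)                    # [1, 0, 0, 1] [0, 1, 0, 0, 1, 1, 0, 1]
print(neighbors(T, L, 4, 2))   # [2, 3]
```

The two all-zero quadrants cost one bit each in `T` and nothing below it, which is where the compression comes from. A fixed partition of this kind can also cut a dense region across several quadrants, which is the overhead the abstract attributes to the traditional scheme.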

Author(s):  
P. Viswanath ◽  
M. Narasimha Murty ◽  
Shalabh Bhatnagar

Two major problems in applying any pattern recognition technique to large and high-dimensional data are (a) high computational requirements and (b) the curse of dimensionality (Duda, Hart, & Stork, 2000). Algorithmic improvements and approximate methods can solve the first problem, whereas feature selection (Guyon & Elisseeff, 2003), feature extraction (Terabe, Washio, Motoda, Katai, & Sawaragi, 2002), and bootstrapping techniques (Efron, 1979; Hamamoto, Uchimura, & Tomita, 1997) can tackle the second problem. We propose a novel and unified solution for these problems by deriving a compact and generalized abstraction of the data. By this term, we mean a compact representation of the given patterns from which one can retrieve not only the original patterns but also some artificial patterns. The compactness of the abstraction reduces the computational requirements, and its generalization reduces the curse of dimensionality effect. Pattern synthesis techniques accompanied by compact representations attempt to derive compact and generalized abstractions of the data. These techniques are applied with the nearest neighbor classifier (NNC), which has been a popular nonparametric classifier in many fields, including data mining, since its conception in the early 1950s (Dasarathy, 2002).


2021 ◽  
Author(s):  
Toshitake Asabuki ◽  
Tomoki Fukai

The brain performs various cognitive functions by learning the spatiotemporal salient features of the environment. This learning likely requires unsupervised segmentation of hierarchically organized spike sequences, but the underlying neural mechanism is only poorly understood. Here, we show that a recurrent gated network of neurons with dendrites can context-dependently solve difficult segmentation tasks. Dendrites in this model learn to predict somatic responses in a self-supervising manner while recurrent connections learn a context-dependent gating of dendro-somatic current flows to minimize a prediction error. These connections select particular information suitable for the given context from input features redundantly learned by the dendrites. The model selectively learned salient segments in complex synthetic sequences. Furthermore, the model was also effective for detecting multiple cell assemblies repeating in large-scale calcium imaging data of more than 6,500 cortical neurons. Our results suggest that recurrent gating and dendrites are crucial for cortical learning of context-dependent segmentation tasks.


Author(s):  
Arsenii Shirokov ◽  
Denis Kuplyakov ◽  
Anton Konushin

The article deals with the problem of counting cars in large-scale video surveillance systems. The proposed method is based on car tracking and counting the number of tracks intersecting the given signal line. We use a distributed tracking algorithm. It reduces the amount of necessary computational resources and increases performance up to real time by detecting vehicles in a sparse set of frames. We adapted and modified the approach previously proposed for people tracking. The proposed improvement of the speed estimation module and the refinement of the motion model reduced the detection frequency by a factor of 3. The experimental evaluation shows that the proposed algorithm reaches an acceptable counting quality with a detection frequency of 3 Hz.
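The counting step, counting tracks that intersect a signal line, can be sketched geometrically as follows. This is an illustration only and not the authors' implementation: the names `segments_cross` and `count_crossings`, the representation of a track as a list of (x, y) points, and the decision to ignore tracks that merely touch the line are assumptions made for the example.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def _cross(o: Point, a: Point, b: Point) -> float:
    """2D cross product of vectors OA and OB (sign gives orientation)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def segments_cross(p1: Point, p2: Point, q1: Point, q2: Point) -> bool:
    """True if segment p1-p2 properly crosses segment q1-q2.

    Assumption: touching or collinear overlap is not counted as a crossing.
    """
    d1 = _cross(q1, q2, p1)
    d2 = _cross(q1, q2, p2)
    d3 = _cross(p1, p2, q1)
    d4 = _cross(p1, p2, q2)
    return d1 * d2 < 0 and d3 * d4 < 0


def count_crossings(tracks: List[List[Point]],
                    line: Tuple[Point, Point]) -> int:
    """Count tracks with at least one step crossing the signal line.

    Each track is counted at most once, however often it crosses.
    """
    a, b = line
    return sum(
        any(segments_cross(t[i], t[i + 1], a, b) for i in range(len(t) - 1))
        for t in tracks
    )


signal_line = ((0.0, 0.0), (0.0, 10.0))
tracks = [
    [(-5.0, 5.0), (-1.0, 5.0), (3.0, 5.0)],   # crosses the line
    [(1.0, 1.0), (5.0, 1.0)],                 # stays on one side
    [(-2.0, 20.0), (2.0, 20.0)],              # passes beyond the line's end
]
print(count_crossings(tracks, signal_line))   # 1
```

Because the test works on consecutive track points, it still applies when detections are sparse in time, as long as the tracker supplies a position on each side of the line.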


Author(s):  
Josef Los ◽  
Jiří Fryč ◽  
Zdeněk Konrád

The method of drying maize for grain has been recently employed on a large scale in the Czech Republic not only thanks to new maize hybrids but also thanks to the existence of new models of drying plants. One of the new post-harvest lines is a plant in Lipoltice (mobile dryer installed in 2010, storage base in 2012) where basic operational measurements were made of the energy intensiveness of drying and operating parameters of the maize dryer were evaluated. The process of maize drying had two stages, i.e. pre-drying from the initial average grain humidity of 28.55% to 19.6% in the first stage, and the additional drying from 16.7% to a final storage grain humidity of 13.7%. Mean volumes of natural gas consumed per 1 t% for drying in the first and second stage amounted to 1.275 m³ and 1.56 m³, respectively. The total mean consumption of electric energy per 1 t% was calculated to be 1.372 kWh for the given configuration of the post-harvest line.


2020 ◽  
Author(s):  
Fayyaz Minhas ◽  
Dimitris Grammatopoulos ◽  
Lawrence Young ◽  
Imran Amin ◽  
David Snead ◽  
...  

Abstract: One of the challenges in the current COVID-19 crisis is the time and cost of performing tests, especially for large-scale population surveillance. Since the probability of testing positive in large population studies is expected to be small (<15%), most of the test outcomes will be negative. Here, we propose the use of agglomerative sampling, which can prune out multiple negative cases in a single test by intelligently combining samples from different individuals. The proposed scheme builds on the assumption that samples from the population may not be independent of each other. Our simulation results show that the proposed sampling strategy can significantly increase testing capacity under resource constraints: on average, a saving of ~40% of tests can be expected assuming a positive test probability of 10% across the given samples. The proposed scheme can also be used in conjunction with heuristic or Machine Learning guided clustering to further improve the efficiency of large-scale testing. The code for generating the simulation results for this work is available here: https://github.com/foxtrotmike/AS.
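To illustrate why combining samples saves tests when positives are rare, here is a minimal sketch of classic two-stage (Dorfman) pooling. This is a substitute technique chosen for its simplicity, not the agglomerative sampling scheme of the abstract, and it assumes independent samples and a perfectly accurate test. The function names `expected_tests_per_person` and `best_pool_size` are hypothetical.

```python
def expected_tests_per_person(p: float, k: int) -> float:
    """Expected number of tests per individual under two-stage pooling.

    Assumptions: each sample is positive independently with probability p,
    the test is error-free, and every member of a positive pool is retested.
    """
    if k == 1:
        return 1.0  # no pooling: one test per person
    # One pooled test shared by k people, plus one retest per person
    # whenever the pool is positive (probability 1 - (1 - p)^k).
    return 1.0 / k + 1.0 - (1.0 - p) ** k


def best_pool_size(p: float, max_k: int = 32) -> int:
    """Pool size in 1..max_k that minimises expected tests per person."""
    return min(range(1, max_k + 1),
               key=lambda k: expected_tests_per_person(p, k))


p = 0.10
k = best_pool_size(p)
saving = 1.0 - expected_tests_per_person(p, k)
print(k, round(saving, 4))   # 4 0.4061
```

Under these assumptions, a 10% positive rate gives a best pool size of 4 and about 0.59 tests per person. The scheme in the abstract differs in that it explicitly allows for dependence between samples, so its reported savings should not be read off this formula.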


Author(s):  
Hengyi Cai ◽  
Hongshen Chen ◽  
Yonghao Song ◽  
Xiaofang Zhao ◽  
Dawei Yin

Humans benefit from previous experiences when taking actions. Similarly, related examples from the training data also provide exemplary information for neural dialogue models when responding to a given input message. However, effectively fusing such exemplary information into dialogue generation is non-trivial: useful exemplars are required to be not only literally similar, but also topically related to the given context. Noisy exemplars impair the neural dialogue model's understanding of the conversation topics and even corrupt the response generation. To address these issues, we propose an exemplar guided neural dialogue generation model where exemplar responses are retrieved in terms of both the text similarity and the topic proximity through a two-stage exemplar retrieval model. In the first stage, a small subset of conversations is retrieved from a training set given a dialogue context. These candidate exemplars are then finely ranked by topical proximity to choose the best-matched exemplar response. To further induce the neural dialogue generation model to consult the exemplar response and the conversation topics more faithfully, we introduce a multi-source sampling mechanism to provide the dialogue model with both local exemplary semantics and global topical guidance during decoding. Empirical evaluations on a large-scale conversation dataset show that the proposed approach significantly outperforms the state-of-the-art in terms of both the quantitative metrics and human evaluations.


Author(s):  
Tymofii HAVRYLIV

This article is one of the first scholarly attempts to analyze the creative work of Ukrainian filmmaker and traveler Sofiia Yablonska-Uden. For the first time in Ukrainian and world literary studies, identical implications are analyzed in «From the Country of Rice and Opium» by S. Yablonska. The purpose of the article is to highlight the complex nature of identity issues in travel literature. In terms of identity, the journey performs two fundamental, closely interconnected tasks: knowledge of the other and self-knowledge. Hermeneutic approaches are used in the article. The main results can be summarized as follows: 1) the journey has its own time-spatial dimension, consisting of two disproportionate moments: preparation for travel and travel itself, and begins literally and symbolically with the overcoming, or the crossing of the border; 2) the intention of the trip contains an identity challenge that affects the preparation, organization, realization of the travel, the way and the content of documenting impressions; 3) such parameters of travel as an accident, an adventure, a game which formed the world of traveler's impressions, are subordinated to the identity problem in the given work; 4) the essay character of the book makes it possible to talk about implications as a response to an identity challenge. The book of travel essays «From the Country of Rice and Opium» by S. Yablonska-Uden is an example of a successful combination of the business and private aspects of travel, intentions of knowledge and self-knowledge, poetry and faculty; learning about other peoples and countries, the writer learns a lot about herself. Travel literature is an important study object of Ukrainian writing, which opens the prospects for further interdisciplinary studies. The study of travel literature, an identity issue, is extremely relevant both for the development of Ukrainian society and for the formation of optimal responses to the challenges of our time.
Keywords: travel, travel literature, identity, identical implications, time-space disposition.


2013 ◽  
Vol 441 ◽  
pp. 691-694
Author(s):  
Yi Qun Zeng ◽  
Jing Bin Wang

With the rapid development of information technology, data is growing explosively, and how to handle large-scale data is becoming more and more important. Based on the characteristics of RDF data, we propose to compress RDF data. We construct an index structure called the PAR-Tree Index, and then execute queries based on the MapReduce parallel computing framework and the PAR-Tree Index. Experimental results show that the algorithm can improve the efficiency of large data queries.


2006 ◽  
Vol 19 (7) ◽  
pp. 1238-1260 ◽  
Author(s):  
Hiroki Ichikawa ◽  
Tetsuzo Yasunari

Abstract: Five years of Tropical Rainfall Measuring Mission (TRMM) Precipitation Radar (PR) data were used to investigate the time and space characteristics of the diurnal cycle of rainfall over and around Borneo, an island in the Maritime Continent. The diurnal cycle shows a systematic modulation that is associated with intraseasonal variability in the large-scale circulation pattern, with regimes associated with low-level easterlies or westerlies over the island. The lower-tropospheric westerly (easterly) components correspond to periods of active (inactive) convection over the island that are associated with the passage of intraseasonal atmospheric disturbances related to the Madden–Julian oscillation. A striking feature is that rainfall activity propagates to the leeward side of the island between midnight and morning. The inferred phase speed of the propagation is about 3 m s⁻¹ in the easterly regime and 7 m s⁻¹ in the westerly regime. Propagation occurs over the entire island, causing a leeward enhancement of rainfall. The vertical structure of the developed convection/rainfall system differs remarkably between the two regimes. In the easterly regime, stratiform rains are widespread over the island at midnight, whereas in the westerly regime, local convective rainfall dominates. Over offshore regions, convective rainfall initially dominates then gradually decreases in both regimes, while the storms develop into deeper convective systems in the easterly regime. Aside from leeward rainfall propagation, shallow storms develop over the South China Sea region during the westerly regime, resulting in heavy precipitation from midnight through morning.

