FSees: Customized Enumeration of Chemical Subspaces with Limited Main Memory Consumption

2016 ◽  
Vol 56 (9) ◽  
pp. 1641-1653 ◽  
Author(s):  
Florian Lauck ◽  
Matthias Rarey
2014 ◽  
pp. 105-113 ◽  
Author(s):  
R. V. Nataraj ◽  
S. Selvan

In this paper, we propose POP-MBC (Parallel Order Preserving Maximal BiClique mining algorithm), a fast and memory-efficient parallel algorithm for mining large maximal bicliques from graph datasets. It enumerates all the maximal bicliques independently and concurrently across several processors without any synchronization between the processors. The POP-MBC algorithm is highly memory efficient since it does not store previously computed patterns in main memory and requires only the dataset to be held in memory. To improve load sharing among the nodes, POP-MBC uses a round-robin strategy, which achieves load balancing as high as 90%. We have also incorporated bit-vectors and numerous optimization techniques that exploit the symmetric property of the graph dataset to reduce the memory consumption and overall running time of the algorithm. Our comprehensive experimental analyses on publicly available datasets show that our algorithm distributes the load equally among the processors and requires less memory and less running time than other maximal biclique mining algorithms.
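The abstract does not give the POP-MBC procedure itself, so the following is only a minimal sketch of the general idea it describes: bit-vector adjacency, an order-preserving (canonicity) test that avoids storing earlier results, and round-robin assignment of independent branches to workers. It uses the well-known Close-by-One scheme as a stand-in, assumes a bipartite graph, and all names (`common_right`, `expand`, `worker`) are hypothetical.

```python
def common_right(left_mask, rows, n_right):
    """Right vertices adjacent to every left vertex in left_mask (bitwise AND)."""
    acc = (1 << n_right) - 1
    i = 0
    while left_mask:
        if left_mask & 1:
            acc &= rows[i]
        left_mask >>= 1
        i += 1
    return acc


def expand(A, B, y, rows, cols, n_right, out):
    """Emit biclique (A, B), then extend it with right vertices j >= y."""
    out.append((A, B))
    for j in range(y, n_right):
        if (B >> j) & 1:
            continue
        C = A & cols[j]            # left vertices that are also adjacent to j
        if C == 0:
            continue               # empty left side: not a biclique
        D = common_right(C, rows, n_right)
        below = (1 << j) - 1
        # Order-preserving test: accept only if no right vertex smaller than j
        # was newly added, so each biclique is produced exactly once and no
        # table of earlier results is needed.
        if (B & below) == (D & below):
            expand(C, D, j + 1, rows, cols, n_right, out)


def worker(w, n_workers, rows, n_right):
    """Enumerate the share of worker w; it needs only the dataset (rows)."""
    n_left = len(rows)
    cols = [sum(((rows[i] >> j) & 1) << i for i in range(n_left))
            for j in range(n_right)]
    A0 = (1 << n_left) - 1
    B0 = common_right(A0, rows, n_right)
    out = []
    if w == 0 and A0 and B0:
        out.append((A0, B0))       # root biclique, reported by worker 0 only
    for j in range(w, n_right, n_workers):   # round-robin over top-level branches
        if (B0 >> j) & 1:
            continue
        C = A0 & cols[j]
        if C == 0:
            continue
        D = common_right(C, rows, n_right)
        below = (1 << j) - 1
        if (B0 & below) == (D & below):
            expand(C, D, j + 1, rows, cols, n_right, out)
    return out


# Toy graph (an assumption): rows[i] is the bitmask of right neighbours of left vertex i.
rows = [0b011, 0b110, 0b111]
found = [b for w in range(2) for b in worker(w, 2, rows, 3)]
```

The calls to `worker` share no state, so in principle each could run on a separate processor; they are run in sequence here only to keep the sketch short.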


Author(s):  
Huazhuang Yao ◽  
Yongyan Wang ◽  
Shuai Wang ◽  
Kun Li ◽  
Chao Guo

Author(s):  
R. A. Morozov ◽  
P. V. Trifonov

Introduction: Practical implementation of a communication system that employs a family of polar codes requires either storing a number of large specifications or constructing the codes on request. The first approach entails extensive memory consumption, which is unacceptable for many applications, such as those on mobile devices. The second approach can be numerically unstable and hard to implement on low-end hardware. One solution is to specify a family of codes by a sequence of subchannels sorted by reliability. However, this solution makes it impossible to optimize each code in the family separately. Purpose: Developing a method for compact specification of polar codes and subcodes. Results: A method is proposed for compact specification of polar codes. It can be considered a trade-off between real-time construction and storing full-size specifications in memory. We propose to store compact specifications of polar codes which contain the frozen set differences between the original pre-optimized polar codes and the polar codes constructed for a binary erasure channel with some erasure probability. The full-size specification needed for decoding can be restored from a compact one by a low-complexity hardware-friendly procedure. The proposed method works with either polar codes or polar subcodes, reducing memory consumption by a factor of 15–50. Practical relevance: The method makes it possible to use families of individually optimized polar codes in devices with limited storage capacity.
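The idea of storing only a frozen-set difference can be illustrated with a small sketch. It assumes the standard Bhattacharyya-parameter recursion for the binary erasure channel and a plain symmetric difference as the compact form; the floating-point recursion, the function names, and the example "optimized" frozen set are illustrative assumptions, not the hardware-friendly procedure of the paper.

```python
def bec_bhattacharyya(n_levels, eps):
    """Bhattacharyya parameters of the 2**n_levels subchannels of a BEC(eps)."""
    z = [eps]
    for _ in range(n_levels):
        # Each subchannel splits into a worse (2z - z^2) and a better (z^2) one.
        z = [w for v in z for w in (2 * v - v * v, v * v)]
    return z


def bec_frozen_set(n_levels, k, eps):
    """Indices of the N - k least reliable subchannels (ties broken by index)."""
    z = bec_bhattacharyya(n_levels, eps)
    order = sorted(range(len(z)), key=lambda i: (-z[i], i))
    return set(order[:len(z) - k])


def compress(optimized_frozen, n_levels, k, eps):
    """Compact specification: only the indices where the two frozen sets differ."""
    return optimized_frozen ^ bec_frozen_set(n_levels, k, eps)


def restore(diff, n_levels, k, eps):
    """Rebuild the full frozen set from the BEC construction and the stored difference."""
    return bec_frozen_set(n_levels, k, eps) ^ diff


# Hypothetical pre-optimized (8, 4) code whose frozen set differs from the BEC one.
optimized = {0, 1, 2, 3}
diff = compress(optimized, n_levels=3, k=4, eps=0.5)
restored = restore(diff, n_levels=3, k=4, eps=0.5)
```

In this toy case only the two differing indices and the erasure probability need to be stored, instead of the whole frozen set.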


2021 ◽  
Vol 54 (1-2) ◽  
pp. 141-151
Author(s):  
Dragan Živanović ◽  
Milan Simić

An implementation of a two-stage piece-wise linearization method for reduction of the thermocouple approximation error is presented in the paper. First, the whole thermocouple measurement chain of a transducer is described, and the possible error is analysed to define the required level of accuracy for linearization of the transfer characteristics. Evaluation of linearization functions and analysis of approximation errors are performed with the virtual instrumentation software package LabVIEW. The method is appropriate for thermocouples and other sensors whose nonlinearity varies considerably over the range of input values. The basic principle of this method is to first transform the abscissa of the transfer function by a linear-segment look-up table in such a way that significantly nonlinear parts of the input range are expanded before a standard piece-wise linearization. In this way, applying equal-segment linearization twice has a similar effect to non-equal-segment linearization. For the given examples of thermocouple transfer functions, the suggested method provides a significantly better reduction of the approximation error than standard segment linearization, with equal memory consumption for look-up tables. The simple software implementation of this two-stage linearization method allows it to be applied in measurement transducers based on microcontrollers with low computing power, as a replacement for the standard piece-wise linear approximation method.
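The two-stage principle can be sketched in a few lines. The example below is an illustration under stated assumptions, not the implementation from the paper: the transfer function is a toy exponential rather than thermocouple data, and the first-stage table is built from a curvature-based warp (the cumulative square root of the absolute second difference), which is only one plausible way to expand the strongly nonlinear parts of the range.

```python
import math
from bisect import bisect_right


def linspace(lo, hi, n_segments):
    return [lo + (hi - lo) * i / n_segments for i in range(n_segments + 1)]


def interp(x, xp, fp):
    """Piece-wise linear look-up through the nodes (xp, fp); xp must be increasing."""
    if x <= xp[0]:
        return fp[0]
    if x >= xp[-1]:
        return fp[-1]
    i = bisect_right(xp, x) - 1
    t = (x - xp[i]) / (xp[i + 1] - xp[i])
    return fp[i] + t * (fp[i + 1] - fp[i])


def toy_characteristic(x):
    # Hypothetical stand-in for a sensor transfer function.
    return 1.0 - math.exp(-x / 10.0)


def curvature_warp(func, lo, hi, n_dense=2000):
    """Monotone map x -> u that grows fastest where func is most curved."""
    xs = linspace(lo, hi, n_dense)
    ys = [func(x) for x in xs]
    h = (hi - lo) / n_dense
    warp = [0.0]
    for i in range(1, n_dense + 1):
        k = min(max(i, 1), n_dense - 1)      # keep the 3-point stencil in range
        second = abs(ys[k - 1] - 2 * ys[k] + ys[k + 1]) / (h * h)
        warp.append(warp[-1] + (math.sqrt(second) + 1e-9) * h)
    return xs, warp


def build_two_stage(func, lo, hi, n1, n2):
    dense_x, dense_u = curvature_warp(func, lo, hi)
    xs = linspace(lo, hi, n1)                              # stage 1: equal segments in x
    us = [interp(x, dense_x, dense_u) for x in xs]
    u_nodes = linspace(us[0], us[-1], n2)                  # stage 2: equal segments in u
    y_nodes = [func(interp(u, us, xs)) for u in u_nodes]   # func at the pulled-back nodes
    return xs, us, u_nodes, y_nodes


def eval_two_stage(tables, x):
    xs, us, u_nodes, y_nodes = tables
    return interp(interp(x, xs, us), u_nodes, y_nodes)


def max_errors(lo=0.0, hi=100.0, n1=8, n2=8):
    """Maximum error of single-stage (n1 + n2 segments) and two-stage look-up."""
    tables = build_two_stage(toy_characteristic, lo, hi, n1, n2)
    xs1 = linspace(lo, hi, n1 + n2)
    ys1 = [toy_characteristic(x) for x in xs1]
    err_one = err_two = 0.0
    for x in linspace(lo, hi, 5000):
        exact = toy_characteristic(x)
        err_one = max(err_one, abs(interp(x, xs1, ys1) - exact))
        err_two = max(err_two, abs(eval_two_stage(tables, x) - exact))
    return err_one, err_two
```

Calling `max_errors()` compares the two approaches for roughly the same number of stored table entries; both stages use equally spaced nodes, so a look-up needs no search on a microcontroller (the `bisect_right` call here is only for brevity).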


2021 ◽  
Author(s):  
Bashar Romanous ◽  
Skyler Windh ◽  
Ildar Absalyamov ◽  
Prerna Budhkar ◽  
Robert Halstead ◽  
...  

Abstract: The join and group-by aggregation are two memory-intensive operators that affect the performance of relational databases. Hashing is a common approach used to implement both operators. Recent paradigm shifts in multi-core processor architectures have reinvigorated research into how the join and group-by aggregation operators can leverage these advances. However, the poor spatial locality of the hashing approach has hindered performance on multi-core processor architectures, which rely on large cache hierarchies for latency mitigation. Multithreaded architectures can better cope with poor spatial locality by masking memory latency with many outstanding requests. Nevertheless, the number of parallel threads, even in the most advanced multithreaded processors, such as UltraSPARC, is not enough to fully cover the main memory access latency. In this paper, we explore the hardware re-configurability of FPGAs to enable deeper execution pipelines that maintain hundreds (instead of tens) of outstanding memory requests across four FPGAs, drastically increasing concurrency and throughput. We present two end-to-end in-memory accelerators for the join and group-by aggregation operators using FPGAs. Both accelerators use massive multithreading to mask long memory delays of traversing linked-list data structures, while concurrently managing hundreds of thread states across four FPGAs locally. We explore how content addressable memories (CAMs) can be intermixed within our multithreaded designs to act as a synchronizing cache, which enforces locks and merges jobs together before they are written to memory. Throughput results for our hash-join operator accelerator show a speedup between 2× and 3.4× over the best multi-core approaches with comparable memory bandwidths on uniform and skewed datasets. The accelerator for the hash-based group-by aggregation operator demonstrates that leveraging CAMs achieves an average speedup of 3.3×, with a best case of 9.4×, in terms of throughput over CPU implementations across five types of data distributions.
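For readers unfamiliar with the two operators, the following is a minimal software sketch of a hash join and a hash-based group-by with chained buckets. It only illustrates the data structure whose chain traversals cause the memory latency discussed above; it is not the FPGA design, and the function names, bucket count, and SUM aggregate are assumptions.

```python
def hash_join(build_rows, probe_rows, n_buckets=8):
    """Equi-join two lists of (key, payload) rows using a chained hash table."""
    buckets = [[] for _ in range(n_buckets)]
    for key, payload in build_rows:                  # build phase
        buckets[hash(key) % n_buckets].append((key, payload))
    out = []
    for key, payload in probe_rows:                  # probe phase
        for bkey, bpayload in buckets[hash(key) % n_buckets]:
            if bkey == key:                          # walk the chain
                out.append((key, bpayload, payload))
    return out


def hash_group_by_sum(rows, n_buckets=8):
    """SUM(value) GROUP BY key using the same chained layout."""
    buckets = [[] for _ in range(n_buckets)]
    for key, value in rows:
        chain = buckets[hash(key) % n_buckets]
        for entry in chain:
            if entry[0] == key:
                entry[1] += value                    # merge into the existing group
                break
        else:
            chain.append([key, value])               # first row of a new group
    return {key: total for chain in buckets for key, total in chain}
```

Each probe or update follows a chain of entries at unpredictable addresses, which is the access pattern that benefits from many outstanding memory requests.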


2021 ◽  
Vol 11 (5) ◽  
pp. 2405
Author(s):  
Yuxiang Sun ◽  
Tianyi Zhao ◽  
Seulgi Yoon ◽  
Yongju Lee

The Semantic Web has recently gained traction with the use of Linked Open Data (LOD) on the Web. Although numerous state-of-the-art methodologies, standards, and technologies are applicable to the LOD cloud, many issues persist. Because the LOD cloud is based on graph-based resource description framework (RDF) triples and the SPARQL query language, we cannot directly adopt traditional techniques employed for database management systems or distributed computing systems. This paper addresses how the LOD cloud can be efficiently organized, retrieved, and evaluated. We propose a novel hybrid approach that combines the index and live exploration approaches for improved LOD join query performance. Using a two-step index structure combining a disk-based 3D R*-tree with the extended multidimensional histogram and flash memory-based k-d trees, we can efficiently discover interlinked data distributed across multiple resources. Because this method rapidly prunes numerous false hits, the performance of join query processing is remarkably improved. We also propose a hot-cold segment identification algorithm to identify regions of high interest. The proposed method is compared with existing popular methods on real RDF datasets. Results indicate that our method outperforms the existing methods because it can quickly obtain target results by reducing unnecessary data scanning and reducing the amount of main memory required to load filtering results.


Author(s):  
Muhammad Attahir Jibril ◽  
Philipp Götze ◽  
David Broneske ◽  
Kai-Uwe Sattler

Abstract: Since the introduction of Persistent Memory to the market in 2019 in the form of Intel’s Optane DC Persistent Memory, it has found its way into manifold applications and systems. As Google and other cloud infrastructure providers are starting to incorporate Persistent Memory into their portfolios, it is only logical that cloud applications have to exploit its inherent properties. Persistent Memory can serve as a DRAM substitute, but guarantees persistence at the cost of compromised read/write performance compared to standard DRAM. These properties particularly affect the performance of index structures, since they are subject to frequent updates and queries. However, adapting each and every index structure to exploit the properties of Persistent Memory is tedious. Hence, we require a general technique that hides this access gap, e.g., by using DRAM caching strategies. To exploit Persistent Memory properties for analytical index structures, we propose selective caching. It is based on a mixture of dynamic and static caching of tree nodes in DRAM to reach near-DRAM access speeds for index structures. In this paper, we evaluate selective caching on the OLAP-optimized main-memory index structure Elf, because its memory layout allows for easy caching. Our experiments show that, if configured well, selective caching with a suitable replacement strategy can keep pace with pure DRAM storage of Elf while guaranteeing persistence. These results are also reflected when selective caching is used for parallel workloads.
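A mixture of static and dynamic caching can be sketched as follows. This is a generic illustration, not the Elf implementation: the slow store is a plain dictionary standing in for Persistent Memory, the statically cached node ids are chosen by the caller, and LRU is assumed as the replacement strategy; the class and attribute names are hypothetical.

```python
from collections import OrderedDict


class SelectiveCache:
    """Static (pinned) plus dynamic (LRU) caching of tree nodes in fast memory."""

    def __init__(self, slow_store, pinned_ids, lru_capacity):
        self.slow_store = slow_store                      # stands in for Persistent Memory
        self.static = {nid: slow_store[nid] for nid in pinned_ids}
        self.dynamic = OrderedDict()                      # least recently used entry first
        self.capacity = lru_capacity
        self.slow_reads = 0                               # reads served by the slow store

    def get(self, nid):
        if nid in self.static:                            # e.g. upper tree levels
            return self.static[nid]
        if nid in self.dynamic:
            self.dynamic.move_to_end(nid)                 # mark as most recently used
            return self.dynamic[nid]
        self.slow_reads += 1
        node = self.slow_store[nid]
        if self.capacity > 0:
            self.dynamic[nid] = node
            if len(self.dynamic) > self.capacity:
                self.dynamic.popitem(last=False)          # evict the least recently used
        return node


# Toy usage: six nodes, the first two pinned, room for two more in the LRU part.
store = {i: "node%d" % i for i in range(6)}
cache = SelectiveCache(store, pinned_ids=[0, 1], lru_capacity=2)
for nid in [0, 2, 3, 2, 4, 3, 2]:
    cache.get(nid)
```

Pinning frequently visited nodes avoids evicting them under scan-heavy access, while the LRU part adapts to the current workload; how to split the DRAM budget between the two is a tuning decision.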

