Balanced counting Bloom filters: a space-efficient synoptic data structure for a high-performance network

2012 ◽  
Vol 6 (15) ◽  
pp. 2259-2266 ◽  
Author(s):  
Z. Zhang ◽  
J. Liu ◽  
B.Q. Wang

2018 ◽  
Vol 15 (10) ◽  
pp. 117-128 ◽  
Author(s):  
Jinyuan Zhao ◽  
Zhigang Hu ◽  
Bing Xiong ◽  
Keqin Li

2015 ◽  
Vol 2015 ◽  
pp. 1-8 ◽  
Author(s):  
Inanç Birol ◽  
Justin Chu ◽  
Hamid Mohamadi ◽  
Shaun D. Jackman ◽  
Karthika Raghavan ◽  
...  

De novo assembly of the genome of a species is essential in the absence of a reference genome sequence. Many scalable assembly algorithms use the de Bruijn graph (DBG) paradigm to reconstruct genomes, where a table of subsequences of a certain length is derived from the reads, and their overlaps are analyzed to assemble sequences. Although longer subsequences unlock longer genomic features for assembly, the associated increase in compute resources limits the practicability of DBG compared with other assembly archetypes already designed for longer reads. Here, we revisit the DBG paradigm to adapt it to the changing sequencing technology landscape and introduce three data structure designs for spaced seeds in the form of paired subsequences. These data structures address the memory and run time constraints imposed by longer reads. We observe that when a fixed distance separates seed pairs, sequence specificity increases with gap length. Further, we note that Bloom filters would be suitable for implicitly storing spaced seeds and would be tolerant to sequencing errors. Building on this concept, we describe a data structure for tracking the frequencies of observed spaced seeds. These data structure designs will have applications in genome, transcriptome and metagenome assemblies, and in read error correction.
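As a rough illustration of storing spaced seeds implicitly in a Bloom filter, the sketch below extracts paired subsequences separated by a fixed gap and inserts them into a small filter. It is not the authors' design: the class `BloomFilter`, the function `spaced_seeds`, the `|` separator, the SHA-256 double hashing and all parameter values are assumptions chosen for readability.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter; sizes and hashing scheme are illustrative."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions from one digest (double hashing).
        digest = hashlib.sha256(item.encode()).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, item):
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, item):
        # May return a false positive, never a false negative.
        return all(self.bits[p // 8] & (1 << (p % 8))
                   for p in self._positions(item))


def spaced_seeds(read, k, gap):
    """Yield seed pairs: two k-mers separated by `gap` ignored bases."""
    span = 2 * k + gap
    for i in range(len(read) - span + 1):
        yield read[i:i + k] + "|" + read[i + k + gap:i + span]


# Usage: index every spaced seed of one read.
seed_filter = BloomFilter(num_bits=1 << 16, num_hashes=3)
for seed in spaced_seeds("ACGTACGTTGCAACGT", k=4, gap=3):
    seed_filter.add(seed)
```

Only the two flanking k-mers are hashed, so a sequencing error that falls inside the gap does not change the stored seed. Tracking seed frequencies, as the abstract goes on to describe, would need counters rather than single bits.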


VLSI Design ◽  
2000 ◽  
Vol 11 (4) ◽  
pp. 405-415
Author(s):  
D. Torres ◽  
A. Larios ◽  
M. Guzmán

The design of a routing table circuit for Ethernet, IP and ATM applications is presented. The starting point for the design was an object-oriented model of the general behavior of the routing table. The data structure selected for the routing table is a modified trie that saves one search level and memory space. The hardware architecture for searching and sorting data is explained. The modified trie stores 64 K addresses and their associated data while also achieving high performance. The circuit, which can support a flow of 500,000 frames/s, is connected to the PCI bus. A FLEX10K100 from Altera was used for the implementation.
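For readers unfamiliar with tries as lookup tables, here is a minimal software sketch of a plain fixed-stride trie. It does not reproduce the authors' modification, which the abstract does not detail, nor their hardware architecture. The 16-bit key width (chosen because 2^16 matches the stated 64 K capacity), the 4-bit stride and the class name `NibbleTrie` are all assumptions.

```python
class NibbleTrie:
    """Plain fixed-stride trie over 16-bit keys, 4 bits per level."""

    LEVELS = 4  # 16 bits / 4 bits per level (assumed widths)

    def __init__(self):
        self.root = [None] * 16

    def insert(self, address, data):
        node = self.root
        # Walk the three upper nibbles, creating child tables on demand.
        for level in range(self.LEVELS - 1):
            shift = 4 * (self.LEVELS - 1 - level)
            idx = (address >> shift) & 0xF
            if node[idx] is None:
                node[idx] = [None] * 16
            node = node[idx]
        # The lowest nibble indexes the leaf table that holds the data.
        node[address & 0xF] = data

    def lookup(self, address):
        node = self.root
        for level in range(self.LEVELS - 1):
            shift = 4 * (self.LEVELS - 1 - level)
            node = node[(address >> shift) & 0xF]
            if node is None:
                return None  # no entry under this prefix
        return node[address & 0xF]


# Usage: associate an address with an output port label.
table = NibbleTrie()
table.insert(0xABCD, "port 3")
```

Each lookup costs a fixed number of table accesses regardless of how many addresses are stored, which is what makes this family of structures attractive for hardware.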


2019 ◽  
Vol 35 (23) ◽  
pp. 4907-4911 ◽  
Author(s):  
Jianglin Feng ◽  
Aakrosh Ratan ◽  
Nathan C Sheffield

Motivation: Genomic data is frequently stored as segments or intervals. Because this data type is so common, interval-based comparisons are fundamental to genomic analysis. As the volume of available genomic data grows, developing efficient and scalable methods for searching interval data is necessary. Results: We present a new data structure, the Augmented Interval List (AIList), to enumerate intersections between a query interval q and an interval set R. An AIList is constructed by first sorting R as a list by the interval start coordinate, then decomposing it into a few approximately flattened components (sublists), and then augmenting each sublist with the running maximum interval end. The query time for AIList is O(log₂ N + n + m), where n is the number of overlaps between R and q, N is the number of intervals in the set R and m is the average number of extra comparisons required to find the n overlaps. Tested on real genomic interval datasets, AIList code runs 5–18 times faster than standard high-performance code based on augmented interval trees, nested containment lists or R-trees (BEDTools). For large datasets, the memory usage for AIList is 4–60% of other methods. The AIList data structure, therefore, provides a significantly improved fundamental operation for highly scalable genomic data analysis. Availability and implementation: An implementation of the AIList data structure with both construction and search algorithms is available at http://ailist.databio.org. Supplementary information: Supplementary data are available at Bioinformatics online.
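The sort-and-augment idea can be sketched in a few lines. This is a simplified single-list variant, not the published AIList: it sorts by start and stores the running maximum end, but omits the decomposition into sublists, so the extra-comparison term m can be much larger here. Half-open intervals [start, end) and the names `build_index` and `find_overlaps` are assumptions.

```python
from bisect import bisect_left


def build_index(intervals):
    """Sort by start and record the running maximum of interval ends."""
    ivs = sorted(intervals)
    starts = [s for s, _ in ivs]
    max_end = []
    running = float("-inf")
    for _, end in ivs:
        running = max(running, end)
        max_end.append(running)
    return ivs, starts, max_end


def find_overlaps(index, q_start, q_end):
    """Return every stored interval that overlaps [q_start, q_end)."""
    ivs, starts, max_end = index
    hits = []
    # Binary search: intervals at or after this position start too late.
    i = bisect_left(starts, q_end) - 1
    # Scan backwards; stop once no earlier interval can reach q_start.
    while i >= 0 and max_end[i] > q_start:
        if ivs[i][1] > q_start:
            hits.append(ivs[i])
        i -= 1
    return hits


# Usage with hypothetical coordinates.
index = build_index([(1, 5), (3, 8), (10, 12), (2, 20)])
overlaps = find_overlaps(index, 6, 11)
```

The binary search accounts for the logarithmic term, and the running maximum lets the backward scan stop early. One long interval, such as (2, 20) above, keeps the running maximum high and forces extra comparisons, which is the situation the sublist decomposition in the abstract is meant to limit.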


2014 ◽  
Vol 50 (22) ◽  
pp. 1602-1604 ◽  
Author(s):  
P. Reviriego ◽  
J.A. Maestro

2001 ◽  
Vol 01 (02) ◽  
pp. 251-271
Author(s):  
KWANG-MAN OH ◽  
JEONG-DAN CHOI ◽  
CHAN-SU LEE ◽  
CHAN-JONG PARK ◽  
EE-TAEK LEE

This paper presents an efficient and simple quad edge conversion method for polygonal (manifold) objects. In a wide variety of applications such as scientific visualization, virtual reality and computer aided geometric design, polygonal objects are expected to be visualized and manipulated within a given time constraint. To meet these expectations, it is necessary to introduce an efficient data structure as well as high performance graphics hardware and real-time processing techniques such as simplification and level of detail. The quad edge data structure is very efficient for handling polygonal objects even though it was originally designed to handle the subdivisions of manifold objects such as Delaunay triangulations and Voronoi diagrams. It has not been used widely, however, because there is no efficient algorithm for quad edge conversion of conventional polygonal objects. In this paper, we propose a new incremental quad edge conversion algorithm that processes the triangles one by one. Since the quad edge has only the splice as a topological operator, the quad edge conversion of each triangle is done by applying three splice operations, one splice per vertex. As an application of the quad edge, a simplification of conventional polygonal objects is implemented. It includes the removal, movement, replacement, and insertion of vertices and edges.
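To make "three splice operations, one splice per vertex" concrete, the sketch below builds a single isolated triangle using the standard quad edge primitives (make an edge, then splice). It is only an illustration under stated assumptions: it does not handle edges shared between neighbouring triangles, which the paper's incremental conversion must do, and the names `QuarterEdge`, `make_edge`, `sym`, `lnext` and `make_triangle` are chosen here rather than taken from the paper.

```python
class QuarterEdge:
    """One of the four directed records that make up a quad edge."""

    def __init__(self):
        self.next = None  # next edge counterclockwise around the origin
        self.rot = None   # this edge rotated 90 degrees (its dual)


def make_edge():
    """Create an isolated edge and return its primal quarter edge."""
    q = [QuarterEdge() for _ in range(4)]
    for i in range(4):
        q[i].rot = q[(i + 1) % 4]
    q[0].next, q[2].next = q[0], q[2]  # each endpoint is its own ring
    q[1].next, q[3].next = q[3], q[1]  # both sides share one face
    return q[0]


def sym(e):
    """Same edge, opposite direction."""
    return e.rot.rot


def lnext(e):
    """Next edge counterclockwise around the left face of e."""
    return e.rot.rot.rot.next.rot


def splice(a, b):
    """Join or split the origin rings of a and b (and the dual rings)."""
    alpha, beta = a.next.rot, b.next.rot
    a.next, b.next = b.next, a.next
    alpha.next, beta.next = beta.next, alpha.next


def make_triangle():
    """Three edges joined with three splices, one per vertex."""
    a, b, c = make_edge(), make_edge(), make_edge()
    splice(sym(a), b)  # end of a meets start of b
    splice(sym(b), c)  # end of b meets start of c
    splice(sym(c), a)  # end of c meets start of a
    return a, b, c


a, b, c = make_triangle()
```

After the three splices, following `lnext` from `a` visits `b`, then `c`, then returns to `a`, so the three edges bound one face.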

