Balanced counting Bloom filters: a space-efficient synoptic data structure for a high-performance network

Z. Zhang; J. Liu; B.Q. Wang

doi:10.1049/iet-com.2011.0961

Accelerating packet classification with counting bloom filters for virtual OpenFlow switching

China Communications; 2018; Vol 15 (10); pp. 117-128

doi:10.1109/cc.2018.8485474

Cited By ~ 2

Author(s): Jinyuan Zhao; Zhigang Hu; Bing Xiong; Keqin Li

Keyword(s): Packet Classification; Bloom Filters; Counting Bloom Filters

Spaced Seed Data Structures for De Novo Assembly

International Journal of Genomics; 2015; Vol 2015; pp. 1-8

doi:10.1155/2015/196591

Cited By ~ 3

Author(s): Inanç Birol; Justin Chu; Hamid Mohamadi; Shaun D. Jackman; Karthika Raghavan; ...

Keyword(s): Data Structure; Data Structures; De Novo; Bloom Filters; De Bruijn Graph; Sequence Specificity; Sequencing Errors; Spaced Seeds; Read Error Correction; Seed Data

De novo assembly of the genome of a species is essential in the absence of a reference genome sequence. Many scalable assembly algorithms use the de Bruijn graph (DBG) paradigm to reconstruct genomes, where a table of subsequences of a certain length is derived from the reads, and their overlaps are analyzed to assemble sequences. Although longer subsequences unlock longer genomic features for assembly, the associated increase in compute resources limits the practicability of DBG over other assembly archetypes already designed for longer reads. Here, we revisit the DBG paradigm to adapt it to the changing sequencing technology landscape and introduce three data structure designs for spaced seeds in the form of paired subsequences. These data structures address memory and run time constraints imposed by longer reads. We observe that when a fixed distance separates seed pairs, it provides increased sequence specificity with increased gap length. Further, we note that Bloom filters would be suitable to implicitly store spaced seeds and be tolerant to sequencing errors. Building on this concept, we describe a data structure for tracking the frequencies of observed spaced seeds. These data structure designs will have applications in genome, transcriptome and metagenome assemblies, and read error correction.
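
The idea of tracking spaced-seed frequencies in a Bloom-filter-like structure can be illustrated with a minimal sketch. It is not the authors' design: the function `spaced_seeds`, the class `CountingBloomFilter`, the `left:right` key encoding, the table size, the number of hash functions and the use of salted BLAKE2b hashes are all assumptions made for illustration. A spaced seed is taken here to be two k-mers separated by a fixed gap.

```python
import hashlib


def spaced_seeds(read, k, gap):
    """Yield (left, right) k-mer pairs separated by a fixed gap of `gap` bases."""
    span = 2 * k + gap
    for i in range(len(read) - span + 1):
        yield read[i:i + k], read[i + k + gap:i + span]


class CountingBloomFilter:
    """Hypothetical counting Bloom filter: an array of counters, several hashes."""

    def __init__(self, size=1 << 16, num_hashes=3):
        self.size = size
        self.num_hashes = num_hashes
        self.counts = [0] * size

    def _indexes(self, item):
        # One salted BLAKE2b digest per hash function (an illustrative choice).
        for seed in range(self.num_hashes):
            salt = seed.to_bytes(8, "little")
            digest = hashlib.blake2b(item.encode(), salt=salt).digest()
            yield int.from_bytes(digest[:8], "little") % self.size

    def add(self, item):
        for idx in self._indexes(item):
            self.counts[idx] += 1

    def count(self, item):
        # The minimum counter is an upper bound on the true frequency.
        return min(self.counts[idx] for idx in self._indexes(item))


cbf = CountingBloomFilter()
for read in ["ACGTACGTAC", "ACGTACGTAC"]:
    for left, right in spaced_seeds(read, k=3, gap=2):
        cbf.add(left + ":" + right)
```

Because counters are shared between keys, `count` can overestimate a seed's frequency but never underestimates it, which is the usual trade-off of this kind of structure.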

A Chip for a Routing Table Based on a Novel Modified Trie Algorithm

VLSI Design; 2000; Vol 11 (4); pp. 405-415

doi:10.1155/2000/81057

Author(s): D. Torres; A. Larios; M. Guzmán

Keyword(s): Data Structure; High Performance; Object Oriented; PCI Bus; Memory Space; General Behavior; Routing Table; Starting Point; Associated Data

The design of a routing table circuit for Ethernet, IP and ATM applications is presented. The starting point for the design was an object-oriented general behavior of the routing table. The data structure selected for the routing table is a modification of the trie that saves one search level and memory space. The architecture for searching and sorting data, implemented in hardware, is explained. The modified trie stores 64 K addresses and their associated data while still achieving high performance. The circuit, which can support a flow of 500,000 frames/s, is connected to the PCI bus. The implementation used a FLEX10K100 device from Altera.
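
For readers unfamiliar with tries as lookup tables, the following is a sketch of a plain textbook trie that maps addresses to associated data. It does not reproduce the paper's modification (which saves one search level) or its hardware architecture. The names `TrieNode` and `AddressTrie`, the 16-bit key width and the 4-bit stride are assumptions chosen only to keep the example small.

```python
class TrieNode:
    """One trie level: child pointers plus optional associated data."""
    __slots__ = ("children", "data")

    def __init__(self):
        self.children = {}
        self.data = None


class AddressTrie:
    """Plain (unmodified) trie over 16-bit keys, consumed 4 bits per level."""
    STRIDE = 4   # bits examined per level (assumed)
    LEVELS = 4   # 16-bit key / 4-bit stride (assumed)

    def __init__(self):
        self.root = TrieNode()

    def _nibbles(self, address):
        # Walk the key from the most significant nibble to the least.
        for level in range(self.LEVELS):
            shift = (self.LEVELS - 1 - level) * self.STRIDE
            yield (address >> shift) & 0xF

    def insert(self, address, data):
        node = self.root
        for nib in self._nibbles(address):
            node = node.children.setdefault(nib, TrieNode())
        node.data = data

    def lookup(self, address):
        node = self.root
        for nib in self._nibbles(address):
            node = node.children.get(nib)
            if node is None:
                return None  # address not present
        return node.data


table = AddressTrie()
table.insert(0x1A2B, "port 3")
print(table.lookup(0x1A2B))
```

Each lookup visits a fixed number of levels regardless of how many entries are stored, which is why removing a level, as the abstract describes, shortens every search.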

A Multi-attribute Data Structure with Parallel Bloom Filters for Network Services

High Performance Computing - HiPC 2006 - Lecture Notes in Computer Science; 2006; pp. 277-288

doi:10.1007/11945918_30

Cited By ~ 9

Author(s): Yu Hua; Bin Xiao

Keyword(s): Data Structure; Bloom Filters; Network Services; Attribute Data

Definition of a Generic Mesh Data Structure in the High Performance Computing Context

Computational Science, Engineering & Technology Series - Developments in Engineering Computational Technology; 2010; pp. 49-80

doi:10.4203/csets.26.3

Cited By ~ 1

Author(s): F. Ledoux; J.-C. Weill; Y. Bertrand

Keyword(s): Data Structure; High Performance Computing; High Performance; Definition Of; Performance Computing

Augmented Interval List: a novel data structure for efficient genomic interval search

Bioinformatics; 2019; Vol 35 (23); pp. 4907-4911

doi:10.1093/bioinformatics/btz407

Cited By ~ 8

Author(s): Jianglin Feng; Aakrosh Ratan; Nathan C Sheffield

Keyword(s): Data Structure; High Performance; Genomic Analysis; Genomic Data; Interval Data; Supplementary Information; Genomic Interval; Interval Trees; Running Maximum; Scalable Methods

Motivation: Genomic data is frequently stored as segments or intervals. Because this data type is so common, interval-based comparisons are fundamental to genomic analysis. As the volume of available genomic data grows, developing efficient and scalable methods for searching interval data is necessary.

Results: We present a new data structure, the Augmented Interval List (AIList), to enumerate intersections between a query interval q and an interval set R. An AIList is constructed by first sorting R as a list by the interval start coordinate, then decomposing it into a few approximately flattened components (sublists), and then augmenting each sublist with the running maximum interval end. The query time for AIList is O(log₂N + n + m), where n is the number of overlaps between R and q, N is the number of intervals in the set R and m is the average number of extra comparisons required to find the n overlaps. Tested on real genomic interval datasets, AIList code runs 5–18 times faster than standard high-performance code based on augmented interval-trees, nested containment lists or R-trees (BEDTools). For large datasets, the memory-usage for AIList is 4–60% of other methods. The AIList data structure, therefore, provides a significantly improved fundamental operation for highly scalable genomic data analysis.

Availability and implementation: An implementation of the AIList data structure with both construction and search algorithms is available at http://ailist.databio.org.

Supplementary information: Supplementary data are available at Bioinformatics online.
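
The core of the construction described above (sort by start, augment with the running maximum end, binary-search then scan backward) can be sketched as follows. This is a simplified illustration and not the published AIList code: it uses a single sorted list and omits the decomposition into sublists. The function names `build` and `query` and the half-open `[start, end)` interval convention are assumptions.

```python
from bisect import bisect_left


def build(intervals):
    """Sort intervals by start and attach the running maximum of interval ends."""
    ordered = sorted(intervals)
    starts = [start for start, _ in ordered]
    max_end = []
    running = float("-inf")
    for _, end in ordered:
        running = max(running, end)
        max_end.append(running)
    return ordered, starts, max_end


def query(index, q_start, q_end):
    """Return all stored intervals overlapping the half-open query [q_start, q_end)."""
    ordered, starts, max_end = index
    hits = []
    # Binary search: last position whose start lies before the query end.
    i = bisect_left(starts, q_end) - 1
    # Scan backward; once the running maximum end no longer reaches the query,
    # no earlier interval can overlap, so the scan stops.
    while i >= 0 and max_end[i] > q_start:
        start, end = ordered[i]
        if end > q_start:
            hits.append((start, end))
        i -= 1
    return hits


index = build([(1, 5), (3, 4), (6, 10), (8, 9), (12, 15)])
print(sorted(query(index, 4, 7)))
```

In this single-list form, one long interval early in the list keeps the running maximum high and forces extra comparisons during the backward scan; the decomposition into flattened sublists mentioned in the abstract is what limits that cost.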

Implementing error detection in fast counting Bloom filters

Electronics Letters; 2014; Vol 50 (22); pp. 1602-1604

doi:10.1049/el.2014.3097

Cited By ~ 1

Author(s): P. Reviriego; J.A. Maestro

Keyword(s): Error Detection; Bloom Filters; Counting Bloom Filters

Discovery protocol for data distribution service in naval warships using extended counting bloom filters

2013 IEEE 18th Conference on Emerging Technologies & Factory Automation (ETFA); 2013

doi:10.1109/etfa.2013.6648047

Cited By ~ 2

Author(s): Handityo Aulia Putra; Dwi Agung Nugroho; Dong-Seong Kim; Yoon-Suk Choi

Keyword(s): Data Distribution; Bloom Filters; Data Distribution Service; Counting Bloom Filters

A Cache Architecture for Counting Bloom Filters

2007 15th IEEE International Conference on Networks; 2007

doi:10.1109/icon.2007.4444089

Cited By ~ 12

Author(s): Mahmood Ahmadi; Stephan Wong

Keyword(s): Bloom Filters; Cache Architecture; Counting Bloom Filters

AN EFFICIENT AND SIMPLE QUAD EDGE CONVERSION OF POLYGONAL MANIFOLD OBJECTS

International Journal of Image and Graphics; 2001; Vol 01 (02); pp. 251-271

doi:10.1142/s0219467801000165

Author(s): KWANG-MAN OH; JEONG-DAN CHOI; CHAN-SU LEE; CHAN-JONG PARK; EE-TAEK LEE

Keyword(s): Data Structure; High Performance; Graphics Hardware; Delaunay Triangulations; Computer Aided Geometric Design; Real Time Processing; Time Processing; Level Of Details; Efficient Data; Processing Techniques

This paper presents an efficient and simple quad edge conversion method for polygonal (manifold) objects. In a wide variety of applications such as scientific visualization, virtual reality and computer aided geometric design, polygonal objects are expected to be visualized and manipulated within a given time constraint. To achieve these expectations, it is necessary to introduce an efficient data structure as well as high performance graphics hardware and real-time processing techniques such as simplification and level of details. The quad edge data structure is very efficient for handling polygonal objects even though it was originally designed to handle the subdivisions of manifold objects such as Delaunay triangulations and Voronoi diagrams. It has not been used widely, however, because there is no efficient algorithm for quad edge conversion of conventional polygonal objects. In this paper, we propose a new incremental quad edge conversion algorithm that processes the triangles one by one. Since quad edge has only the splice as a topological operator, the quad edge conversion of each triangle is done by applying three splice operations, one splice per vertex. As an application of the quad edge, a simplification of conventional polygonal objects is implemented. It includes the removing, moving, replacing, and inserting of vertices and edges.
