Data Reduction for Maximum Matching on Real-World Graphs

2021 ◽  
Vol 26 ◽  
pp. 1-30
Author(s):  
Tomohiro Koana ◽  
Viatcheslav Korenwein ◽  
André Nichterlein ◽  
Rolf Niedermeier ◽  
Philipp Zschoche

Finding a maximum-cardinality or maximum-weight matching in (edge-weighted) undirected graphs is among the most prominent problems of algorithmic graph theory. For n-vertex and m-edge graphs, the best-known algorithms run in Õ(m√n) time. We build on recent theoretical work focusing on linear-time data reduction rules for finding maximum-cardinality matchings and complement the theoretical results by presenting and analyzing (thereby employing the kernelization methodology of parameterized complexity analysis) new (near-)linear-time data reduction rules for both the unweighted and the positive-integer-weighted case. Moreover, we experimentally demonstrate that these data reduction rules provide significant speedups over state-of-the-art implementations for computing matchings in real-world graphs: the average speedup factor is 4.7 in the unweighted case and 12.72 in the weighted case.
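The abstract leaves the reduction rules unspecified. As a flavor of what such preprocessing looks like, below is a minimal Python sketch of one classical ingredient, the Karp-Sipser degree-1 rule: a degree-1 vertex can always be matched to its unique neighbor in some maximum matching, so that edge can be committed and both endpoints deleted. The paper's rules go further (in particular to the weighted case); this sketch is an illustration, not the authors' implementation.

```python
from collections import deque

def degree_one_reduction(adj):
    """Apply the classical degree-1 rule exhaustively.

    adj: dict mapping each vertex to a set of neighbours (undirected).
    Returns (forced, reduced): `forced` lists edges that some maximum
    matching is guaranteed to contain (exchange argument: a degree-1
    vertex can always be matched to its unique neighbour), `reduced`
    is the remaining graph.
    """
    reduced = {v: set(ns) for v, ns in adj.items()}   # don't mutate input
    forced = []
    queue = deque(v for v, ns in reduced.items() if len(ns) == 1)
    while queue:
        v = queue.popleft()
        if v not in reduced or len(reduced[v]) != 1:
            continue                  # vertex already deleted or degree changed
        u = next(iter(reduced[v]))
        forced.append((v, u))
        for x in (v, u):              # delete both matched endpoints
            for w in reduced.pop(x, set()):
                if w in reduced:
                    reduced[w].discard(x)
                    if len(reduced[w]) == 1:
                        queue.append(w)
    return forced, reduced

# Toy usage: on a path a-b-c-d the rule alone finds a maximum matching.
g = {'a': {'b'}, 'b': {'a', 'c'}, 'c': {'b', 'd'}, 'd': {'c'}}
forced, rest = degree_one_reduction(g)
print(forced)   # [('a', 'b'), ('d', 'c')]
```

Each vertex is deleted at most once and each edge inspected a constant number of times, so the loop runs in linear time; any maximum matching of the residual graph, together with the forced edges, is maximum for the original graph.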

Algorithmica ◽  
2020 ◽  
Vol 82 (12) ◽  
pp. 3521-3565
Author(s):  
George B. Mertzios ◽  
André Nichterlein ◽  
Rolf Niedermeier

Abstract: Finding maximum-cardinality matchings in undirected graphs is arguably one of the most central graph primitives. For m-edge and n-vertex graphs, it is well known to be solvable in $$O(m\sqrt{n})$$ time; however, for several applications this running time is still too slow. We investigate how linear-time (and almost linear-time) data reduction (used as preprocessing) can alleviate the situation. More specifically, we focus on linear-time kernelization. We start a deeper and systematic study both for general graphs and for bipartite graphs. Our data reduction algorithms easily comply (in form of preprocessing) with every solution strategy (exact, approximate, heuristic), thus making them attractive in various settings.
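Because such kernelization only shrinks the input, it can sit in front of any matching solver. Below is a hedged sketch of that composition, reusing degree_one_reduction from the sketch above and using networkx's Blossom-based max_weight_matching as a stand-in exact solver (not the solvers studied in the paper):

```python
import networkx as nx

def matching_with_preprocessing(G: nx.Graph):
    """Reduce first, then solve the (hopefully much smaller) residual graph.

    The forced edges are vertex-disjoint from the residual graph, so the
    union of the two partial matchings is a maximum matching of G.
    """
    adj = {v: set(G.neighbors(v)) for v in G}
    forced, reduced = degree_one_reduction(adj)   # from the sketch above
    H = nx.Graph((u, v) for u, ns in reduced.items() for v in ns)
    rest = nx.max_weight_matching(H, maxcardinality=True)
    return forced + [tuple(e) for e in rest]
```

On tree-like inputs the residual graph is often empty and the solver is never invoked, which hints at why preprocessing can pay off on sparse real-world graphs.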


Author(s):  
Atheer Alahmed ◽  
Amal Alrasheedi ◽  
Maha Alharbi ◽  
Norah Alrebdi ◽  
Marwan Aleasa ◽  
...  

2019 ◽  
Author(s):  
Jaclyn Marjorie Smith ◽  
Melvin Lathara ◽  
Hollis Wright ◽  
Brian Hill ◽  
Nalini Ganapati ◽  
...  

Abstract:

Background: The affordability of next-generation genomic sequencing and the improvement of medical data management have contributed largely to the evolution of biological analysis from both a clinical and research perspective. Precision medicine is a response to these advancements that places individuals into better-defined subsets based on shared clinical and genetic features. The identification of personalized diagnosis and treatment options depends on the ability to draw insights from large-scale, multi-modal analysis of biomedical datasets. Driven by a real use case, we premise that platforms supporting precision medicine analysis should maintain data in their optimal data stores, should support distributed storage and query mechanisms, and should scale as more samples are added to the system.

Results: We extended a genomics-based columnar data store, GenomicsDB, for ease of use within a distributed analytics platform for clinical and genomic data integration, known as the ODA framework. The framework supports interaction from an i2b2 plugin as well as a notebook environment. We show that the ODA framework exhibits worst-case linear scaling for array size (storage), import time (data construction), and query time for an increasing number of samples. We go on to show worst-case linear time for both import of clinical data and aggregate query execution time within a distributed environment.

Conclusions: This work highlights the integration of a distributed genomic database with a distributed compute environment to support scalable and efficient precision medicine queries from a HIPAA-compliant cohort system in a real-world setting. The ODA framework is currently deployed in production to support precision medicine exploration and analysis by clinicians and researchers at the UCLA David Geffen School of Medicine.
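As rough intuition for the linear scaling figures, a sample-partitioned columnar layout touches each sample's value exactly once per aggregate query, so both import and aggregation grow linearly with the number of samples. The toy model below is entirely hypothetical (it is not GenomicsDB's actual interface) and only illustrates the shape of the argument:

```python
from collections import defaultdict

class ColumnarStore:
    """Toy, hypothetical sample-partitioned columnar store."""

    def __init__(self, partition_size=1000):
        self.partition_size = partition_size
        self.partitions = []   # each: {'n': row count, 'cols': {field: [values]}}

    def import_samples(self, records):
        """records: iterable of per-sample dicts; O(1) amortised per sample."""
        for rec in records:
            if not self.partitions or self.partitions[-1]['n'] >= self.partition_size:
                self.partitions.append({'n': 0, 'cols': defaultdict(list)})
            part = self.partitions[-1]
            part['n'] += 1
            for field, value in rec.items():
                part['cols'][field].append(value)

    def aggregate(self, field, fn=sum):
        """One sequential scan per partition: O(#samples) overall."""
        return fn(v for p in self.partitions for v in p['cols'].get(field, []))

store = ColumnarStore(partition_size=2)
store.import_samples([{'depth': 30}, {'depth': 42}, {'depth': 35}])
print(store.aggregate('depth'))   # 107
```

Real deployments add per-partition parallelism, which changes the constants but not the linear shape.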


2014 ◽  
Vol 61 (1) ◽  
pp. 1-23 ◽  
Author(s):  
Ran Duan ◽  
Seth Pettie

Author(s):  
Marwa F. Mohamed ◽  
Abd El-Rahman Shabayek ◽  
Mahmoud El-Gayyar ◽  
Hamed Nassar

2020 ◽  
pp. 1-10
Author(s):  
Fabian Gärtner ◽  
Felix Kühnl ◽  
Carsten R. Seemann ◽  
Christian Höner zu Siederdissen ◽  
Peter F. Stadler ◽  
...  

Abstract: Superbubbles are acyclic induced subgraphs of a digraph with single entrance and exit that naturally arise in the context of genome assembly and the analysis of genome alignments in computational biology. These structures can be computed in linear time and are confined to non-symmetric digraphs. We demonstrate empirically that graph parameters derived from superbubbles provide a convenient means of distinguishing different classes of real-world graphical models, while being largely unrelated to simple, commonly used parameters.
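The abstract only names the structure. For concreteness, below is a minimal Python sketch of a checker for the first three conditions of the standard superbubble definition of Onodera, Sadakane, and Shibuya (reachability, matching of the forward and backward reachable sets, and acyclicity); the minimality condition and the linear-time enumeration algorithm are omitted, and the plain adjacency-dict input format is an assumption of the sketch:

```python
from collections import deque

def _reach(adj, start, blocked):
    """Vertices reachable from `start`; `blocked` may be reached but is
    never expanded, i.e. paths must not pass *through* it."""
    seen, queue = {start}, deque([start])
    while queue:
        v = queue.popleft()
        if v == blocked:
            continue
        for w in adj.get(v, ()):
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return seen

def is_superbubble(adj, radj, s, t):
    """Check reachability, matching, and acyclicity for the pair (s, t).
    adj / radj: forward and reverse adjacency dicts of the digraph."""
    U = _reach(adj, s, blocked=t)    # forward from s, stopping at t
    W = _reach(radj, t, blocked=s)   # backward from t, stopping at s
    if t not in U or U != W:
        return False
    # Acyclicity of the induced subgraph, via Kahn's algorithm.
    indeg = {v: 0 for v in U}
    for v in U:
        for w in adj.get(v, ()):
            if w in U:
                indeg[w] += 1
    queue = deque(v for v in U if indeg[v] == 0)
    processed = 0
    while queue:
        v = queue.popleft()
        processed += 1
        for w in adj.get(v, ()):
            if w in U:
                indeg[w] -= 1
                if indeg[w] == 0:
                    queue.append(w)
    return processed == len(U)

# Toy bubble: s -> {a, b} -> t
adj  = {'s': ['a', 'b'], 'a': ['t'], 'b': ['t'], 't': []}
radj = {'t': ['a', 'b'], 'a': ['s'], 'b': ['s'], 's': []}
print(is_superbubble(adj, radj, 's', 't'))   # True
```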


1997 ◽  
Vol 7 ◽  
pp. 67-82 ◽  
Author(s):  
C. G. Nevill-Manning ◽  
I. H. Witten

SEQUITUR is an algorithm that infers a hierarchical structure from a sequence of discrete symbols by replacing repeated phrases with a grammatical rule that generates the phrase, and continuing this process recursively. The result is a hierarchical representation of the original sequence, which offers insights into its lexical structure. The algorithm is driven by two constraints (digram uniqueness and rule utility) that reduce the size of the grammar and produce structure as a by-product. SEQUITUR breaks new ground by operating incrementally. Moreover, the method's simple structure permits a proof that it operates in space and time that is linear in the size of the input. Our implementation can process 50,000 symbols per second and has been applied to an extensive range of real-world sequences.
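Genuine SEQUITUR maintains the digram-uniqueness and rule-utility constraints incrementally, in amortized constant time per symbol, using a digram index and linked rule bodies. The offline sketch below illustrates only the digram-replacement idea and is closer to Re-Pair than to SEQUITUR proper: it is not incremental and does not enforce rule utility.

```python
from collections import Counter

def infer_grammar(seq):
    """Repeatedly replace the most frequent repeated adjacent pair with a
    fresh nonterminal until no digram occurs twice (digram uniqueness)."""
    seq = list(seq)
    rules = {}                       # nonterminal -> the digram it expands to
    next_id = 0
    while True:
        digrams = Counter(zip(seq, seq[1:]))
        pair, count = digrams.most_common(1)[0] if digrams else (None, 0)
        if count < 2:
            break                    # every digram is now unique: stop
        nt = f'R{next_id}'
        next_id += 1
        rules[nt] = pair
        out, i = [], 0
        while i < len(seq):          # replace non-overlapping occurrences
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                out.append(nt)
                i += 2
            else:
                out.append(seq[i])
                i += 1
        seq = out
    return seq, rules

start, rules = infer_grammar('abcabdabcabd')
print(start)   # ['R3', 'R3'] -- the sequence compressed to two nonterminals
print(rules)   # hierarchical rules, e.g. R0 -> ('a', 'b'), R1 -> ('R0', 'c'), ...
```

Because the sequence strictly shrinks on every pass, the loop terminates; the hierarchy of rules mirrors the nested repetition structure the abstract describes.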

