Superbubbles as an empirical characteristic of directed networks

2020 ◽  
pp. 1-10
Author(s):  
Fabian Gärtner ◽  
Felix Kühnl ◽  
Carsten R. Seemann ◽  
Christian Höner Zu Siederdissen ◽  
Peter F. Stadler ◽  
...  

Abstract Superbubbles are acyclic induced subgraphs of a digraph with single entrance and exit that naturally arise in the context of genome assembly and the analysis of genome alignments in computational biology. These structures can be computed in linear time and are confined to non-symmetric digraphs. We demonstrate empirically that graph parameters derived from superbubbles provide a convenient means of distinguishing different classes of real-world graphical models, while being largely unrelated to simple, commonly used parameters.
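For illustration, the defining conditions of a superbubble ⟨s, t⟩ can be checked directly on a small digraph: the vertices reachable from s without passing t must match the vertices that reach t without passing s, and the induced subgraph must be acyclic. The sketch below is a quadratic sanity check of the definition (minimality omitted), not the linear-time algorithm referenced in the abstract:

```python
from collections import defaultdict, deque

def is_superbubble(adj, s, t):
    """Check the superbubble conditions for candidate entrance s and
    exit t: matching reachability sets, plus acyclicity of the
    induced subgraph (tested with Kahn's algorithm)."""
    def reach(g, start, blocked):
        seen, stack = {start}, [start]
        while stack:
            for w in g.get(stack.pop(), []):
                if w != blocked and w not in seen:
                    seen.add(w)
                    stack.append(w)
        return seen

    rev = defaultdict(list)                  # reversed digraph
    for v, ws in adj.items():
        for w in ws:
            rev[w].append(v)
    fwd = reach(adj, s, t)                   # reachable from s, avoiding t
    bwd = reach(rev, t, s)                   # reaching t, avoiding s
    if fwd - {s} != bwd - {t}:
        return False                         # matching condition violated
    bubble = fwd | {t}
    indeg = {v: 0 for v in bubble}
    for v in bubble:
        for w in adj.get(v, []):
            if w in bubble:
                indeg[w] += 1
    queue = deque(v for v in bubble if indeg[v] == 0)
    ordered = 0
    while queue:                             # acyclic iff every vertex
        v = queue.popleft()                  # gets topologically ordered
        ordered += 1
        for w in adj.get(v, []):
            if w in bubble:
                indeg[w] -= 1
                if indeg[w] == 0:
                    queue.append(w)
    return ordered == len(bubble)

# A diamond s -> {a, b} -> t is a superbubble; an edge escaping
# from the interior (a -> x) breaks the matching condition.
diamond = {'s': ['a', 'b'], 'a': ['t'], 'b': ['t'], 't': []}
leaky = {'s': ['a', 'b'], 'a': ['t', 'x'], 'b': ['t'], 't': [], 'x': []}
```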

2014 ◽  
Vol 10 (1) ◽  
pp. 1-19 ◽  
Author(s):  
J. Wang ◽  
J. Emile-Geay ◽  
D. Guillot ◽  
J. E. Smerdon ◽  
B. Rajaratnam

Abstract. Pseudoproxy experiments (PPEs) have become an important framework for evaluating paleoclimate reconstruction methods. Most existing PPE studies assume constant proxy availability through time and uniform proxy quality across the pseudoproxy network. Real multiproxy networks are, however, marked by pronounced disparities in proxy quality, and a steep decline in proxy availability back in time, either of which may have large effects on reconstruction skill. A suite of PPEs constructed from a millennium-length general circulation model (GCM) simulation is thus designed to mimic these various real-world characteristics. The new pseudoproxy network is used to evaluate four climate field reconstruction (CFR) techniques: truncated total least squares embedded within the regularized EM (expectation-maximization) algorithm (RegEM-TTLS), the Mann et al. (2009) implementation of RegEM-TTLS (M09), canonical correlation analysis (CCA), and Gaussian graphical models embedded within RegEM (GraphEM). Each method's risk properties are also assessed via a 100-member noise ensemble. Contrary to expectation, it is found that reconstruction skill does not vary monotonically with proxy availability, but is also a function of the type and amplitude of climate variability (forced events vs. internal variability). The use of realistic spatiotemporal pseudoproxy characteristics also exposes large inter-method differences. Despite comparable fidelity in reconstructing the global mean temperature, spatial skill varies considerably between CFR techniques. Both GraphEM and CCA efficiently exploit teleconnections, and produce consistent reconstructions across the ensemble. RegEM-TTLS and M09 appear advantageous for reconstructions on highly noisy data, but are subject to larger stochastic variations across different realizations of pseudoproxy noise.
Results collectively highlight the importance of designing realistic pseudoproxy networks and implementing multiple noise realizations of PPEs. The results also underscore the difficulty in finding the proper bias-variance tradeoff for jointly optimizing the spatial skill of CFRs and the fidelity of the global mean reconstructions.
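For intuition, pseudoproxies in such experiments are commonly generated by degrading model output with noise at a prescribed signal-to-noise ratio; the sketch below is a minimal pure-Python illustration with a hypothetical series, not the paper's network design:

```python
import random
import statistics

def make_pseudoproxy(signal, snr, seed=0):
    """Degrade a 'true' model temperature series into a pseudoproxy by
    adding Gaussian white noise scaled to a target signal-to-noise
    ratio, SNR = std(signal) / std(noise)."""
    rng = random.Random(seed)
    noise_sd = statistics.pstdev(signal) / snr
    return [x + rng.gauss(0.0, noise_sd) for x in signal]

# A hypothetical millennium-length "model truth": a weak trend plus
# short-period variability (a stand-in for a GCM grid-point series).
truth = [0.001 * t + 0.2 * ((t % 7) - 3) for t in range(1000)]
proxy = make_pseudoproxy(truth, snr=0.5)   # heavily degraded proxy
```

Varying the seed produces the multiple noise realizations the abstract argues for, and varying `snr` across sites mimics disparities in proxy quality.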


2020 ◽  
Vol 34 (02) ◽  
pp. 1644-1651
Author(s):  
Yuki Satake ◽  
Hiroshi Unno ◽  
Hinata Yanagi

In this paper, we present a novel constraint solving method for a class of predicate Constraint Satisfaction Problems (pCSP) where each constraint is represented by an arbitrary clause of first-order predicate logic over predicate variables. The class of pCSP properly subsumes the well-studied class of Constrained Horn Clauses (CHCs) where each constraint is restricted to a Horn clause. The class of CHCs has been widely applied to verification of linear-time safety properties of programs in different paradigms. In this paper, we show that pCSP further widens the applicability to verification of branching-time safety properties of programs that exhibit finitely-branching non-determinism. Solving pCSP (and CHCs) however is challenging because the search space of solutions is often very large (or unbounded), high-dimensional, and non-smooth. To address these challenges, our method naturally combines techniques studied separately in different literatures: counterexample guided inductive synthesis (CEGIS) and probabilistic inference in graphical models. We have implemented the presented method and obtained promising results on existing benchmarks as well as new ones that are beyond the scope of existing CHC solvers.
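For intuition, the CEGIS half of this combination alternates a synthesizer that proposes candidates with a verifier that searches for counterexamples; candidates refuted by previously collected counterexamples are discarded without verification. The sketch below is a toy loop over a finite candidate pool and domain (a hypothetical problem, not the paper's solver):

```python
def cegis(candidates, spec, domain):
    """Counterexample-guided inductive synthesis over a finite pool:
    keep only candidates consistent with all counterexamples seen so
    far; exhaustively verify each survivor over the domain."""
    examples = []                                   # counterexamples so far
    for c in candidates:
        if all(spec(c, x) for x in examples):       # consistent with history
            cex = next((x for x in domain if not spec(c, x)), None)
            if cex is None:
                return c                            # verified on the domain
            examples.append(cex)                    # learn from the failure
    return None

# Toy problem: find an integer k with k * x >= 2 * x for all x in 0..9,
# i.e. any k >= 2; the loop returns the first verified candidate.
spec = lambda k, x: k * x >= 2 * x
k = cegis(candidates=range(5), spec=spec, domain=range(10))
```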


2020 ◽  
Vol 21 (1) ◽  
pp. 139-162 ◽  
Author(s):  
Jordan M. Eizenga ◽  
Adam M. Novak ◽  
Jonas A. Sibbesen ◽  
Simon Heumos ◽  
Ali Ghaffaari ◽  
...  

Low-cost whole-genome assembly has enabled the collection of haplotype-resolved pangenomes for numerous organisms. In turn, this technological change is encouraging the development of methods that can precisely address the sequence and variation described in large collections of related genomes. These approaches often use graphical models of the pangenome to support algorithms for sequence alignment, visualization, functional genomics, and association studies. The additional information provided to these methods by the pangenome allows them to achieve superior performance on a variety of bioinformatic tasks, including read alignment, variant calling, and genotyping. Pangenome graphs stand to become a ubiquitous tool in genomics. Although it is unclear whether they will replace linear reference genomes, their ability to harmoniously relate multiple sequence and coordinate systems will make them useful irrespective of which pangenomic models become most common in the future.


2017 ◽  
Vol 2017 ◽  
pp. 1-4 ◽  
Author(s):  
Brahim Chaourar

Given a graph G=(V,E), a connected sides cut (U, V\U), also written δ(U), is the set of edges of E linking all vertices of U to all vertices of V\U such that the induced subgraphs G[U] and G[V\U] are connected. Given a positive weight function w defined on E, the maximum connected sides cut problem (MAX CS CUT) is to find a connected sides cut Ω such that w(Ω) is maximum. MAX CS CUT is NP-hard. In this paper, we give a linear-time algorithm to solve MAX CS CUT for series parallel graphs. We deduce a linear-time algorithm for the minimum cut problem in the same class of graphs without computing the maximum flow.
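For illustration, the problem definition can be checked by brute force on a tiny graph: enumerate every bipartition, keep those whose sides both induce connected subgraphs, and take the heaviest cut. This exponential sketch is only a reference implementation of the definition, not the paper's linear-time algorithm for series parallel graphs:

```python
from itertools import combinations

def connected(vertices, edges):
    """Check that the subgraph induced by `vertices` is connected."""
    vs = set(vertices)
    if not vs:
        return False
    adj = {v: set() for v in vs}
    for u, v in edges:
        if u in vs and v in vs:
            adj[u].add(v)
            adj[v].add(u)
    seen, stack = set(), [next(iter(vs))]
    while stack:
        v = stack.pop()
        if v not in seen:
            seen.add(v)
            stack.extend(adj[v] - seen)
    return seen == vs

def max_cs_cut(vertices, edges, w):
    """Brute-force MAX CS CUT: try every bipartition (U, V\\U) with
    both sides inducing connected subgraphs; usable on tiny graphs."""
    best, best_cut = None, None
    vs = list(vertices)
    for r in range(1, len(vs)):
        for U in combinations(vs, r):
            rest = [v for v in vs if v not in U]
            if connected(U, edges) and connected(rest, edges):
                cut = [e for e in edges if (e[0] in U) != (e[1] in U)]
                weight = sum(w[e] for e in cut)
                if best is None or weight > best:
                    best, best_cut = weight, cut
    return best, best_cut

# A weighted triangle: the best connected sides cut isolates 'a'
# (cut weight 3 + 2 = 5).
edges = [('a', 'b'), ('b', 'c'), ('a', 'c')]
w = {('a', 'b'): 3, ('b', 'c'): 1, ('a', 'c'): 2}
best, cut = max_cs_cut('abc', edges, w)
```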


2016 ◽  
Vol 609 ◽  
pp. 374-383 ◽  
Author(s):  
Ljiljana Brankovic ◽  
Costas S. Iliopoulos ◽  
Ritu Kundu ◽  
Manal Mohamed ◽  
Solon P. Pissis ◽  
...  

1997 ◽  
Vol 7 ◽  
pp. 67-82 ◽  
Author(s):  
C. G. Nevill-Manning ◽  
I. H. Witten

SEQUITUR is an algorithm that infers a hierarchical structure from a sequence of discrete symbols by replacing repeated phrases with a grammatical rule that generates the phrase, and continuing this process recursively. The result is a hierarchical representation of the original sequence, which offers insights into its lexical structure. The algorithm is driven by two constraints that reduce the size of the grammar, and produce structure as a by-product. SEQUITUR breaks new ground by operating incrementally. Moreover, the method's simple structure permits a proof that it operates in space and time that is linear in the size of the input. Our implementation can process 50,000 symbols per second and has been applied to an extensive range of real world sequences.
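The digram-replacement idea at the heart of SEQUITUR can be illustrated offline: repeatedly replace the most frequent repeated adjacent pair with a fresh rule. The sketch below is a simplified Re-Pair-style rendering of that idea; real SEQUITUR does this incrementally, in a single left-to-right pass, which this sketch does not attempt:

```python
from collections import Counter

def infer_grammar(text):
    """Repeatedly replace the most frequent repeated digram (adjacent
    symbol pair) with a fresh grammar rule, until no digram occurs
    twice. Returns the start sequence and the rule dictionary."""
    seq, rules, fresh = list(text), {}, 0
    while True:
        digrams = Counter(zip(seq, seq[1:]))
        if not digrams:
            break
        pair, count = max(digrams.items(), key=lambda kv: kv[1])
        if count < 2:
            break                       # digram uniqueness achieved
        name = f"R{fresh}"
        fresh += 1
        rules[name] = list(pair)
        out, i = [], 0
        while i < len(seq):             # greedy left-to-right rewrite
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                out.append(name)
                i += 2
            else:
                out.append(seq[i])
                i += 1
        seq = out
    return seq, rules

# "abcabcabc" compresses to a three-rule hierarchy whose expansion
# reproduces the input exactly.
start, rules = infer_grammar("abcabcabc")
```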


2016 ◽  
Vol 2016 ◽  
pp. 1-10 ◽  
Author(s):  
Bilal Wajid ◽  
Muhammad U. Sohail ◽  
Ali R. Ekti ◽  
Erchin Serpedin

Genome assembly, over its two decades of history, has produced significant research in both biotechnology and computational biology. This contribution delineates sequencing platforms and their characteristics, examines key steps involved in filtering and processing raw data, explains assembly frameworks, and discusses quality statistics for the assessment of the assembled sequence. Furthermore, the paper explores recent Ubuntu-based software environments oriented towards genome assembly as well as some avenues for future research.


Mathematics ◽  
2021 ◽  
Vol 9 (14) ◽  
pp. 1592
Author(s):  
Iztok Peterin ◽  
Gabriel Semanišin

A shortest path P of a graph G is maximal if P is not contained as a subpath in any other shortest path. A set S⊆V(G) is a maximal shortest paths cover if every maximal shortest path of G contains a vertex of S. The minimum cardinality of a maximal shortest paths cover is called the maximal shortest paths cover number and is denoted by ξ(G). We show that it is NP-hard to determine ξ(G). We establish a connection between ξ(G) and several other graph parameters. We present a linear-time algorithm that computes the exact value of ξ(T) for a tree T.
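For illustration, in a tree the shortest path between two vertices is unique, so the maximal shortest paths are exactly the leaf-to-leaf paths; ξ(T) is then a minimum hitting set over those paths. The sketch below computes it by exhaustive search on tiny trees, only as a reference against the definition (the paper's tree algorithm is linear-time):

```python
from itertools import combinations

def tree_path(adj, s, t):
    """Return the unique s-t path in a tree via DFS parent tracking."""
    parent, stack = {s: None}, [s]
    while stack:
        v = stack.pop()
        for w in adj[v]:
            if w not in parent:
                parent[w] = v
                stack.append(w)
    path, v = [], t
    while v is not None:
        path.append(v)
        v = parent[v]
    return path

def xi_tree_bruteforce(adj):
    """Brute-force maximal shortest paths cover number of a tree:
    enumerate all leaf-to-leaf paths, then find the smallest vertex
    set hitting every one of them (exponential in |V|)."""
    leaves = [v for v, nb in adj.items() if len(nb) == 1]
    paths = [set(tree_path(adj, s, t)) for s, t in combinations(leaves, 2)]
    vs = list(adj)
    for k in range(1, len(vs) + 1):
        for S in combinations(vs, k):
            if all(p & set(S) for p in paths):
                return k

# The star K_{1,3}: its center lies on every leaf-to-leaf path, so
# a single vertex covers all maximal shortest paths.
star = {'c': ['a', 'b', 'd'], 'a': ['c'], 'b': ['c'], 'd': ['c']}
xi = xi_tree_bruteforce(star)
```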


2021 ◽  
Vol 26 ◽  
pp. 1-30
Author(s):  
Tomohiro Koana ◽  
Viatcheslav Korenwein ◽  
André Nichterlein ◽  
Rolf Niedermeier ◽  
Philipp Zschoche

Finding a maximum-cardinality or maximum-weight matching in (edge-weighted) undirected graphs is among the most prominent problems of algorithmic graph theory. For n-vertex and m-edge graphs, the best-known algorithms run in Õ(m√n) time. We build on recent theoretical work focusing on linear-time data reduction rules for finding maximum-cardinality matchings and complement the theoretical results by presenting and analyzing new (near-)linear-time data reduction rules for both the unweighted and the positive-integer-weighted case, employing the kernelization methodology of parameterized complexity analysis. Moreover, we experimentally demonstrate that these data reduction rules provide significant speedups of the state-of-the-art implementations for computing matchings in real-world graphs: the average speedup factor is 4.7 in the unweighted case and 12.72 in the weighted case.
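One classic reduction rule of this kind, for the unweighted case, is the degree-1 rule: a vertex with a single neighbor can always be matched to that neighbor in some maximum matching, so the pair can be committed and deleted. The sketch below applies it exhaustively with a worklist; it is a standard rule in this literature, with no claim that it matches the paper's exact rule set:

```python
from collections import deque

def degree_one_reduction(adj):
    """Exhaustively apply the degree-1 rule: match each degree-1
    vertex v to its only neighbor u, then delete both. Returns the
    forced matching edges plus the reduced graph (the kernel) that a
    full matching solver would still have to handle."""
    adj = {v: set(nb) for v, nb in adj.items()}
    matching = []
    queue = deque(v for v in adj if len(adj[v]) == 1)
    while queue:
        v = queue.popleft()
        if v not in adj or len(adj[v]) != 1:
            continue                      # already deleted, or degree changed
        u = next(iter(adj[v]))
        matching.append((v, u))
        for x in (v, u):                  # delete both matched endpoints;
            for y in adj.pop(x):          # neighbors may drop to degree 1
                if y in adj:
                    adj[y].discard(x)
                    if len(adj[y]) == 1:
                        queue.append(y)
    return matching, adj

# On the path a-b-c-d the rule alone finds a maximum matching
# {a,b}, {c,d} and leaves an empty kernel.
path = {'a': ['b'], 'b': ['a', 'c'], 'c': ['b', 'd'], 'd': ['c']}
m, kernel = degree_one_reduction(path)
```

On graphs with many low-degree vertices the rule cascades, which is where the reported speedups of kernelization-style preprocessing come from.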

