On String Graph Limits and the Structure of a Typical String Graph

2016 ◽ Vol 84 (4) ◽ pp. 386-407
Author(s): Svante Janson ◽ Andrew J. Uzzell

2017 ◽ Vol 24 (10) ◽ pp. 953-968
Author(s): Paola Bonizzoni ◽ Gianluca Della Vedova ◽ Yuri Pirola ◽ Marco Previtali ◽ Raffaella Rizzi

2016 ◽ Vol 26 (4) ◽ pp. 2193-2210
Author(s): Siva Athreya ◽ Adrian Röllin

10.37236/4266 ◽ 2014 ◽ Vol 21 (3)
Author(s): Svante Janson ◽ Andrew J. Uzzell

Given a graph property $\mathcal{P}$, it is interesting to determine the typical structure of graphs that satisfy $\mathcal{P}$.  In this paper, we consider monotone properties, that is, properties that are closed under taking subgraphs.  Using results from the theory of graph limits, we show that if $\mathcal{P}$ is a monotone property and $r$ is the largest integer for which every $r$-colorable graph satisfies $\mathcal{P}$, then almost every graph with $\mathcal{P}$ is close to being a balanced $r$-partite graph.
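
As a hedged illustration of the parameter $r$ (the example below is not taken from the abstract), the definition can be written as follows, assuming the maximum exists:

$$
r = \max\{\, k \ge 1 : \text{every } k\text{-colorable graph satisfies } \mathcal{P} \,\}.
$$

Take $\mathcal{P}$ to be the property of being triangle-free, which is monotone. Every $2$-colorable (bipartite) graph is triangle-free, so $r \ge 2$. The triangle $K_3$ is $3$-colorable but not triangle-free, so $r < 3$. Hence $r = 2$, and the stated result says that almost every triangle-free graph is close to a balanced bipartite graph.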


2019 ◽ Vol 13 (S1)
Author(s): Alexander J. Paul ◽ Dylan Lawrence ◽ Myoungkyu Song ◽ Seung-Hwan Lim ◽ Chongle Pan ◽ ...

Abstract

Background: De novo genome assembly builds the genome of a specimen from overlaps between genomic fragments, without relying on a reference sequence. Sequence fragments (called reads) are assembled into contigs and scaffolds according to their overlaps. The quality of a de novo assembly depends on the length and continuity of the assembly. To enable faster and more accurate assembly, sequencing techniques such as high-throughput next-generation sequencing and long-read third-generation sequencing have been proposed. However, these techniques require large amounts of computer memory when very large overlap graphs are resolved, and the computation is difficult to parallelize.

Results: To address these limitations, we propose an algorithmic approach called Scalable Overlap-graph Reduction Algorithms (SORA). SORA is an algorithm package that performs string graph reduction using Apache Spark. Its implementations are designed to execute de novo genome assembly on either a single machine or a distributed computing platform. SORA efficiently reduces the number of edges on very long graph paths by adapting scalable features of the graph processing libraries provided by Apache Spark, GraphX and GraphFrames.

Conclusions: We share the algorithms and the experimental results at our project website, https://github.com/BioHPC/SORA. We evaluated SORA on human genome samples. First, it processed a graph of nearly one billion edges on a distributed cloud cluster. Second, it processed small to mid-size graphs on a single workstation within a short time frame. Overall, SORA scaled linearly as the number of computing instances increased.
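
To make "string graph reduction" concrete, here is a minimal single-machine sketch of one common reduction step, transitive edge removal, in plain Python. It is not SORA's implementation and does not use Apache Spark, GraphX or GraphFrames. The function name `reduce_transitive_edges` and the read labels are hypothetical, and the sketch ignores overlap lengths and read orientation, which a real assembler would need to track.

```python
from collections import defaultdict


def reduce_transitive_edges(edges):
    """Remove every edge u->w that is implied by a two-step path u->v->w.

    `edges` is an iterable of (source_read, target_read) overlap pairs.
    Returns the surviving edges as a sorted list.
    """
    # Build an adjacency map: read -> set of reads it overlaps into.
    successors = defaultdict(set)
    for u, v in edges:
        successors[u].add(v)

    # Mark edges that are redundant with respect to the original graph.
    redundant = set()
    for u, direct in successors.items():
        for v in direct:
            if v == u:
                continue  # ignore self-loops as intermediate steps
            for w in successors.get(v, ()):
                if w != v and w in direct:
                    redundant.add((u, w))

    kept = {(u, v) for u, targets in successors.items() for v in targets}
    return sorted(kept - redundant)


if __name__ == "__main__":
    # Hypothetical reads A, B, C where A overlaps B, B overlaps C,
    # and A also overlaps C directly.
    overlaps = [("A", "B"), ("B", "C"), ("A", "C")]
    print(reduce_transitive_edges(overlaps))
```

In this toy input the edge from A to C is dropped because the path through B already implies it. A distributed version would express the same neighborhood check as joins or message passing over a partitioned edge list.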

