CAMSA: a Tool for Comparative Analysis and Merging of Scaffold Assemblies

2016 ◽  
Author(s):  
Sergey S. Aganezov ◽  
Max A. Alekseyev

Motivation: Despite the recent progress in genome sequencing and assembly, many of the currently available assembled genomes come in a draft form. Such draft genomes consist of a large number of genomic fragments (scaffolds), whose positions and orientations along the genome are unknown. While there exist a number of methods for reconstruction of the genome from its scaffolds, utilizing various computational and wet-lab techniques, they often can produce only partial error-prone scaffold assemblies. It therefore becomes important to compare and merge scaffold assemblies produced by different methods, thus combining their advantages and highlighting present conflicts for further investigation. These tasks may be labor intensive if performed manually. Results: We present CAMSA, a tool for comparative analysis and merging of two or more given scaffold assemblies. The tool (i) creates an extensive report with several comparative quality metrics; (ii) constructs the most confident merged scaffold assembly; and (iii) provides an interactive framework for a visual comparative analysis of the given assemblies. Among the CAMSA features, only scaffold merging can be evaluated in comparison to existing methods. Namely, it resembles the functionality of assembly reconciliation tools, although their primary targets are somewhat different. Our evaluations show that CAMSA produces merged assemblies of comparable or better quality than existing assembly reconciliation tools while being the fastest in terms of the total running time. Availability: CAMSA is distributed under the MIT license and is available at http://cblab.org/camsa/.

2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Omar Abou Saada ◽  
Andreas Tsouris ◽  
Chris Eberlein ◽  
Anne Friedrich ◽  
Joseph Schacherer

Abstract: While genome sequencing and assembly are now routine, we do not have a full, precise picture of polyploid genomes. No existing polyploid phasing method provides accurate and contiguous haplotype predictions. We developed nPhase, a ploidy-agnostic tool that leverages long reads and accurate short reads to solve alignment-based phasing for samples of unspecified ploidy (https://github.com/OmarOakheart/nPhase). nPhase is validated by tests on simulated and real polyploids. nPhase obtains on average over 95% accuracy and a contiguous 1.25 haplotigs per haplotype to cover more than 90% of each chromosome (heterozygosity rate ≥ 0.5%). nPhase allows population genomics and hybrid studies of polyploids.


2017 ◽  
Author(s):  
Morgan N. Price ◽  
Adam P. Arkin

Abstract: Large-scale genome sequencing has identified millions of protein-coding genes whose function is unknown. Many of these proteins are similar to characterized proteins from other organisms, but much of this information is missing from annotation databases and is hidden in the scientific literature. To make this information accessible, PaperBLAST uses EuropePMC to search the full text of scientific articles for references to genes. PaperBLAST also takes advantage of curated resources that link protein sequences to scientific articles (Swiss-Prot, GeneRIF, and EcoCyc). PaperBLAST’s database includes over 700,000 scientific articles that mention over 400,000 different proteins. Given a protein of interest, PaperBLAST quickly finds similar proteins that are discussed in the literature and presents snippets of text from relevant articles or from the curators. PaperBLAST is available at http://papers.genomics.lbl.gov/.


2021 ◽  
Vol 7 (2) ◽  
Author(s):  
Ahmad-Kamal Ghazali ◽  
Su-Anne Eng ◽  
Jia-Shiun Khoo ◽  
Seddon Teoh ◽  
Chee-Choong Hoh ◽  
...  

Burkholderia pseudomallei, a soil-dwelling Gram-negative bacterium, is the causative agent of the endemic tropical disease melioidosis. Clinical manifestations of B. pseudomallei infection range from acute or chronic localized infection in a single organ to fulminant septicaemia in multiple organs. The diverse clinical manifestations are attributed to various factors, including the genome plasticity across B. pseudomallei strains. We previously characterized B. pseudomallei strains isolated in Malaysia and noted different levels of virulence in model hosts. We hypothesized that the difference in virulence might be a result of variance at the genome level. In this study, we sequenced and assembled four Malaysian clinical B. pseudomallei isolates, UKMR15, UKMPMC2000, UKMD286 and UKMH10. Phylogenomic analysis showed that Malaysian subclades emerged from the Asian subclade, suggesting that the Malaysian strains originated from the Asian region. Interestingly, the low-virulence strain, UKMH10, was the most distantly related compared to the other Malaysian isolates. Genomic island (GI) prediction analysis identified a new island of 23 kb, GI9c, which is present in B. pseudomallei and Burkholderia mallei, but not Burkholderia thailandensis. Genes encoding known B. pseudomallei virulence factors were present across all four genomes, but comparative analysis of the total gene content across the Malaysian strains identified 104 genes that are absent in UKMH10. We propose that these genes may encode novel virulence factors, which may explain the reduced virulence of this strain. Further investigation on the identity and role of these 104 proteins may aid in understanding B. pseudomallei pathogenicity to guide the design of new therapeutics for treating melioidosis.


PeerJ ◽  
2021 ◽  
Vol 9 ◽  
pp. e12446
Author(s):  
Darlene D. Wagner ◽  
Heather A. Carleton ◽  
Eija Trees ◽  
Lee S. Katz

Background: Whole genome sequencing (WGS) has gained increasing importance in responses to enteric bacterial outbreaks. Common analysis procedures for WGS, single nucleotide polymorphisms (SNPs) and genome assembly, are highly dependent upon WGS data quality. Methods: Raw, unprocessed WGS reads from Escherichia coli, Salmonella enterica, and Shigella sonnei outbreak clusters were characterized for four quality metrics: PHRED score, read length, library insert size, and ambiguous nucleotide composition. PHRED scores were strongly correlated with improved SNPs analysis results in E. coli and S. enterica clusters. Results: Assembly quality showed only moderate correlations with PHRED scores and library insert size, and then only for Salmonella. To improve SNP analyses and assemblies, we compared seven read-healing pipelines to improve these four quality metrics and to see how well they improved SNP analysis and genome assembly. The most effective read healing pipelines for SNPs analysis incorporated quality-based trimming, fixed-width trimming, or both. The Lyve-SET SNPs pipeline showed a more marked improvement than the CFSAN SNP Pipeline, but the latter performed better on raw, unhealed reads. For genome assembly, SPAdes enabled significant improvements in healed E. coli reads only, while Skesa yielded no significant improvements on healed reads. Conclusions: PHRED scores will continue to be a crucial quality metric albeit not of equal impact across all types of analyses for all enteric bacteria. While trimming-based read healing performed well for SNPs analyses, different read healing approaches are likely needed for genome assembly or other, emerging WGS analysis methodologies.


Author(s):  
V. Dzonic

This paper is devoted to identifying the specific characteristics of Russian and Serbian phraseological units. The author considers the phraseological units from structural and semantic aspects and pays special attention to their national and cultural component, which causes the greatest difficulties for foreigners. This component is identified through linguocultural analysis of the components of phraseologically related word combinations. The research material was drawn from lexicographical dictionaries of the Russian and Serbian languages. Phraseological units containing toponyms are reviewed as a separate group and are, in the author’s opinion, bearers of rich linguocultural information. The author identifies three main sources of imagery of these units: characteristics of the geographical position of the object; important historical and cultural events, as well as prominent historical figures, which brought fame to the region; and the lifestyle and crafts of local residents. The analysis allowed the author to identify specific national and cultural characteristics of a number of Russian and Serbian toponyms. This work is of an applied nature. The results of the study can be used in teaching Russian as a second Slavic language.


2018 ◽  
Vol 12 (6) ◽  
pp. e0006566 ◽  
Author(s):  
Elizabeth M. Batty ◽  
Suwittra Chaemchuen ◽  
Stuart Blacksell ◽  
Allen L. Richards ◽  
Daniel Paris ◽  
...  

Author(s):  
Kathleen Araújo

This chapter returns to the overarching questions of this book, namely, how can national energy transitions be explained, to what extent do patterns of change align and differ in the transitions of this study, and how does policy play a role, particularly with innovations that emerged amid the transitions. To broadly answer, the four cases are comparatively examined here. The conceptual tools from Chapter 3 are also elaborated based on the findings. Implications of the results are discussed, and will serve as a basis for further discussion in Chapter 9 on how to think about energy transitions as a planner, decision-maker, and researcher. Among the more significant findings are the following. Greater energy substitution (in relative terms) occurred initially within the countries that extended or repurposed existing energy systems versus the country (i.e., Denmark) that developed a new energy system from a nearly non-existent one. Cost improvements were evident in all cases; however, a number of caveats are worth noting. Among the energy technologies and their services that were studied, only Icelandic geothermal-based heating was competitive in its home market in the 1970s; nonetheless, the remaining energy technologies that were studied later became cost competitive. As the national industries of this book became globally recognized, increases in the quality of living within the given countries also occurred, as gauged by the Human Development Index (HDI). With respect to timescales, substantial energy transitions were evident in all cases within a period of 15 years or less. In terms of technology complexity, this attribute was not a confounding barrier to change. Finally, government was instrumental to change, but not always the driver. There are countless ways to compare national energy transitions. This section illustrates ways of doing so, first by describing broadly observed, socio-technical patterns with the tool typologies outlined in Chapter 3. 
A discussion of tool refinement follows. The section then turns to more systematically assess key, qualitative and quantitative dimensions of the four transition cases.


2015 ◽  
Vol 770 ◽  
pp. 739-743 ◽  
Author(s):  
A.S. Yuanyushkin ◽  
D.V. Lobanov ◽  
D.A. Rychkov

The key task of tool manufacturing is to create or choose a type of tool that provides high processing efficiency, the best tool workability, and high quality of the machined surfaces with minimum expenses and resources. Choosing the optimal tool design from a variety of options takes up much of the time required to prepare the tool for work. To solve this problem, we have developed software that allows the user to create, organize, and carry out a comparative analysis of tool designs in order to identify a rational option for the given production conditions. Ordering and selection of a rational tool design are carried out in accordance with established procedures for modeling and comparative analysis of design solutions. The software can reduce the design time of the technological process by 80 to 90% and yield a substantial annual economic effect.

