PMBD: a Comprehensive Plastics Microbial Biodegradation Database

Database ◽

10.1093/database/baz119 ◽

2019 ◽

Vol 2019 ◽

Cited By ~ 6

Author(s):

Zhiqiang Gan ◽

Houjin Zhang

Keyword(s):

Sequence Alignment ◽

Prediction Tool ◽

Literature Searching ◽

Environmentally Conscious ◽

Uniprot Database ◽

Online Resource ◽

Alignment Tool ◽

Microbial Biodegradation ◽

Tremendous Amount ◽

Natural Way

Abstract Since their invention over a hundred years ago, plastics have been used in many applications and now touch every aspect of our lives. The extensive use of plastics produces a tremendous amount of waste, which has become a severe burden on the environment. Several degradation approaches exist in nature to cope with ever-increasing plastic waste. Among these, biodegradation by microorganisms has emerged as a natural route that is favored by many environmentally conscious societies. To facilitate the study of plastics biodegradation, we developed an online resource, the Plastics Microbial Biodegradation Database (PMBD), to gather and present information about the microbial biodegradation of plastics. In this database, 949 microorganisms–plastics relationships and 79 genes involved in the biodegradation of plastics were manually collected and confirmed through literature searching. In addition, more than 8000 automatically annotated enzyme sequences predicted to be involved in plastics biodegradation were extracted from the TrEMBL section of the UniProt database. PMBD is available through a website at http://pmbd.genome-mining.cn/home. Data may be accessed by browsing or searching. The website also includes a sequence alignment tool and a function prediction tool.


The BioCyc collection of microbial genomes and metabolic pathways

Briefings in Bioinformatics ◽

10.1093/bib/bbx085 ◽

2017 ◽

Vol 20 (4) ◽

pp. 1085-1093 ◽

Cited By ~ 107

Author(s):

Peter D Karp ◽

Richard Billington ◽

Ron Caspi ◽

Carol A Fulcher ◽

Mario Latendresse ◽

...

Keyword(s):

Sequence Alignment ◽

Metabolic Pathways ◽

Biomedical Literature ◽

Analysis Software ◽

New Developments ◽

Microbial Genomes ◽

Additional Information ◽

Alignment Tool ◽

Extensive Range ◽

Types Of Information

Abstract BioCyc.org is a microbial genome Web portal that combines thousands of genomes with additional information inferred by computer programs, imported from other databases and curated from the biomedical literature by biologist curators. BioCyc also provides an extensive range of query tools, visualization services and analysis software. Recent advances in BioCyc include an expansion in the content of BioCyc in terms of both the number of genomes and the types of information available for each genome; an expansion in the amount of curated content within BioCyc; and new developments in the BioCyc software tools including redesigned gene/protein pages and metabolite pages; new search tools; a new sequence-alignment tool; a new tool for visualizing groups of related metabolic pathways; and a facility called SmartTables, which enables biologists to perform analyses that previously would have required a programmer’s assistance.


VirusDIP: Virus Data Integration Platform

10.1101/2020.06.08.139451 ◽

2020 ◽

Cited By ~ 1

Author(s):

Lina Wang ◽

Fengzhen Chen ◽

Xueqin Guo ◽

Lijin You ◽

Xiaoxia Yang ◽

...

Keyword(s):

Sequence Alignment ◽

Sequence Data ◽

Data Retrieval ◽

Viral Sequence ◽

Origin And Evolution ◽

Alignment Tool ◽

Public Data ◽

Virus Research ◽

Global Initiative ◽

Tree Building

Abstract Motivation: The Coronavirus Disease 2019 (COVID-19) pandemic poses a huge threat to public health. Viral sequence data plays an important role in the scientific prevention and control of epidemics. A comprehensive virus database will be vital for virus data retrieval and in-depth analysis. To promote the sharing of virus data, several virus databases and related analysis tools have been created. Results: To facilitate virus research and promote the global sharing of virus data, we present VirusDIP, a one-stop service platform for the archiving, integration, access and analysis of virus data. It accepts submissions of viral sequence data from all over the world and currently integrates data resources from the National GeneBank Database (CNGBdb), the Global Initiative on Sharing All Influenza Data (GISAID), and the National Center for Biotechnology Information (NCBI). Moreover, based on these comprehensive data resources, a BLAST sequence alignment tool and multi-party security computing tools are deployed for multi-sequence alignment, phylogenetic tree building and global trusted sharing. VirusDIP is gradually establishing cooperation with more databases and paving the way for the analysis of virus origin and evolution. All public data in VirusDIP are freely available to researchers worldwide. Availability: https://db.cngb.org/virus/


Refining pairwise sequence alignments of membrane proteins by the incorporation of anchors

10.1101/2020.09.16.299453 ◽

2020 ◽

Author(s):

René Staritzbichler ◽

Edoardo Sarti ◽

Emily Yaklich ◽

Antoniya Aleksandrova ◽

Marcus Stamm ◽

...

Keyword(s):

Membrane Proteins ◽

Sequence Alignment ◽

Ad Hoc ◽

Pairwise Alignment ◽

Low Complexity ◽

Pairwise Sequence Alignment ◽

Sequence Alignments ◽

Alignment Procedure ◽

Alignment Tool ◽

Optimum Alignment

Abstract The alignment of primary sequences is a fundamental step in the analysis of protein structure, function, and evolution. Integral membrane proteins pose a significant challenge for such sequence alignment approaches, because their evolutionary relationships can be very remote, and because a high content of hydrophobic amino acids reduces their complexity. Frequently, biochemical or biophysical data is available that informs the optimum alignment, for example, indicating specific positions that share common functional or structural roles. Currently, if those positions are not correctly aligned by a standard pairwise alignment procedure, the incorporation of such information into the alignment is typically addressed in an ad hoc manner, with manual adjustments. However, such modifications are problematic because they reduce the robustness and reproducibility of the alignment. An alternative approach is the use of restraints, or anchors, to incorporate such position-matching explicitly during alignment. Here we introduce position anchoring in the alignment tool AlignMe as an aid to pairwise sequence alignment of membrane proteins. Applying this approach to realistic scenarios involving distantly-related and low complexity sequences, we illustrate how the addition of even a single anchor can dramatically improve the accuracy of the alignments, while maintaining the reproducibility and rigor of the overall alignment.


Evaluating the computing efficiencies (specificity and sensitivity) of graphics processing unit (GPU)-accelerated DNA sequence alignment tools against central processing unit (CPU) alignment tool

Journal of Bioinformatics and Sequence Analysis ◽

10.5897/jbsa2018.0109 ◽

2018 ◽

Vol 9 (2) ◽

pp. 10-14 ◽

Cited By ~ 1

Author(s):

Pawar Shrikant ◽

Stanam Aditya ◽

Zhu Ying

Keyword(s):

Dna Sequence ◽

Sequence Alignment ◽

Graphics Processing Unit ◽

Central Processing Unit ◽

Processing Unit ◽

Central Processing ◽

Dna Sequence Alignment ◽

Specificity And Sensitivity ◽

Alignment Tool ◽

Graphics Processing


Constrained Multiple Sequence Alignment Tool Development and Its Application to RNase Family Alignment

Journal of Bioinformatics and Computational Biology ◽

10.1142/s0219720003000095 ◽

2003 ◽

Vol 01 (02) ◽

pp. 267-287 ◽

Cited By ~ 33

Author(s):

Chuan Yi Tang ◽

Chin Lung Lu ◽

Margaret Dah-Tsyr Chang ◽

Yin-Te Tsai ◽

Yuh-Ju Sun ◽

...

Keyword(s):

Sequence Alignment ◽

Heuristic Algorithm ◽

Multiple Sequence Alignment ◽

Time Complexity ◽

Software System ◽

Tool Development ◽

Multiple Sequence ◽

Rna Molecules ◽

Alignment Tool ◽

Multiple Sequence Alignment Tool

In this paper, we design a heuristic algorithm for computing a constrained multiple sequence alignment (CMSA for short), which guarantees that the generated alignment satisfies user-specified constraints that certain residues be aligned together. If the number of residues that must be aligned together is a constant α, then the time complexity of our CMSA algorithm for aligning K sequences is O(αKn^4), where n is the maximum sequence length. In addition, we have built a CMSA software system and run several experiments on RNase sequences, which mainly function in catalyzing the degradation of RNA molecules. The resulting alignments illustrate the practicability of our method.
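To make the notion of a user-specified constraint concrete, the following minimal sketch checks whether a finished multiple alignment places each constrained group of residues in a single column. It does not reproduce the paper's CMSA heuristic; the function names, the constraint encoding (one 0-based residue index per sequence) and the toy sequences are all assumptions made for illustration.

```python
from typing import List, Sequence

GAP = "-"


def residue_columns(row: str) -> List[int]:
    """Map each residue index (0-based, gaps skipped) to its alignment column."""
    return [col for col, ch in enumerate(row) if ch != GAP]


def satisfies_constraints(
    alignment: Sequence[str],
    constraints: Sequence[Sequence[int]],
) -> bool:
    """Return True if every constraint puts its residues in one column.

    Each constraint holds one 0-based residue index per sequence
    (a hypothetical encoding chosen for this sketch).
    """
    columns = [residue_columns(row) for row in alignment]
    for constraint in constraints:
        cols = {columns[k][pos] for k, pos in enumerate(constraint)}
        if len(cols) != 1:
            return False
    return True


# Toy data: require the histidine (H) of each sequence to share a column.
his_constraint = [(2, 2, 3)]
good_alignment = ["AC-HG", "A-CHG", "ACCH-"]
bad_alignment = ["ACH-G", "A-CHG", "ACCH-"]

print(satisfies_constraints(good_alignment, his_constraint))
print(satisfies_constraints(bad_alignment, his_constraint))
```

A checker of this kind is useful mainly as a test oracle: any constrained aligner, whatever its internals, should produce output that passes it.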


Fast and SNP-aware short read alignment with SALT

BMC Bioinformatics ◽

10.1186/s12859-021-04088-6 ◽

2021 ◽

Vol 22 (S9) ◽

Author(s):

Wei Quan ◽

Bo Liu ◽

Yadong Wang

Keyword(s):

Sequence Alignment ◽

Genetic Variants ◽

High Throughput Sequencing ◽

Reference Genome ◽

Graph Model ◽

Sequence Alignments ◽

Short Read ◽

Read Alignment ◽

Short Read Alignment ◽

Alignment Tool

Abstract Background: DNA sequence alignment is a common first step in most applications of high-throughput sequencing technologies. The accuracy of sequence alignments directly affects the accuracy of downstream analyses, such as variant calling and quantitative analysis of transcriptome; therefore, rapidly and accurately mapping reads to a reference genome is a significant topic in bioinformatics. Conventional DNA read aligners map reads to a linear reference genome (such as the GRCh38 primary assembly). However, such a linear reference genome represents the genome of only one or a few individuals and thus lacks information on variations in the population. This limitation can introduce bias and impact the sensitivity and accuracy of mapping. Recently, a number of aligners have begun to map reads to populations of genomes, which can be represented by a reference genome and a large number of genetic variants. However, compared to linear reference aligners, an aligner that can store and index all genetic variants has a high cost in memory (RAM) space and leads to extremely long run time. Aligning reads to a graph-model-based index that includes all types of variants is ultimately an NP-hard problem in theory. By contrast, considering only single nucleotide polymorphism (SNP) information will reduce the complexity of the index and improve the speed of sequence alignment. Results: The SNP-aware alignment tool (SALT) is a fast, memory-efficient, and SNP-aware short read alignment tool. SALT uses 5.8 GB of RAM to index a human reference genome (GRCh38) and incorporates 12.8M UCSC common SNPs. Compared with a state-of-the-art aligner, SALT has a similar speed but higher accuracy. Conclusions: Herein, we present an SNP-aware alignment tool (SALT) that aligns reads to a reference genome that incorporates an SNP database. We benchmarked SALT using simulated and real datasets. The results demonstrate that SALT can efficiently map reads to the reference genome with significantly improved accuracy. Incorporating SNP information can improve the accuracy of read alignment and can reveal novel variants. The source code is freely available at https://github.com/weiquan/SALT.
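The effect of SNP awareness on mapping accuracy can be illustrated with a toy mismatch counter. This is only a sketch of the idea, not SALT's index or algorithm: the function name, the dictionary of known alternate alleles and the example sequences are assumptions introduced here.

```python
from typing import Dict, Optional, Set


def count_mismatches(
    read: str,
    reference: str,
    offset: int,
    known_snps: Optional[Dict[int, Set[str]]] = None,
) -> int:
    """Count mismatches of a read placed ungapped at `offset` on the reference.

    A read base that equals a known alternate allele at that reference
    position is not counted as a mismatch (hypothetical scoring rule).
    """
    known_snps = known_snps or {}
    total = 0
    for i, base in enumerate(read):
        pos = offset + i
        if base == reference[pos]:
            continue
        if base in known_snps.get(pos, set()):
            continue
        total += 1
    return total


reference = "ACGTACGTAC"
known_snps = {3: {"C"}}  # position 3: reference T, known alternate C
read = "GCAC"            # carries the alternate allele, placed at offset 2

linear_score = count_mismatches(read, reference, offset=2)
snp_aware_score = count_mismatches(read, reference, offset=2, known_snps=known_snps)
print(linear_score, snp_aware_score)
```

Against the plain linear reference the read is penalized for carrying a common variant; once the variant is known, the same placement scores as a perfect match, which is the bias the abstract describes.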


Refining pairwise sequence alignments of membrane proteins by the incorporation of anchors

PLoS ONE ◽

10.1371/journal.pone.0239881 ◽

2021 ◽

Vol 16 (4) ◽

pp. e0239881

Author(s):

René Staritzbichler ◽

Edoardo Sarti ◽

Emily Yaklich ◽

Antoniya Aleksandrova ◽

Marcus Stamm ◽

...

Keyword(s):

Membrane Proteins ◽

Sequence Alignment ◽

Ad Hoc ◽

Low Complexity ◽

Pairwise Sequence Alignment ◽

Sequence Alignments ◽

Alignment Procedure ◽

Alignment Tool ◽

Hydrophobic Amino Acids ◽

Optimum Alignment

The alignment of primary sequences is a fundamental step in the analysis of protein structure, function, and evolution, and in the generation of homology-based models. Integral membrane proteins pose a significant challenge for such sequence alignment approaches, because their evolutionary relationships can be very remote, and because a high content of hydrophobic amino acids reduces their complexity. Frequently, biochemical or biophysical data is available that informs the optimum alignment, for example, indicating specific positions that share common functional or structural roles. Currently, if those positions are not correctly matched by a standard pairwise sequence alignment procedure, the incorporation of such information into the alignment is typically addressed in an ad hoc manner, with manual adjustments. However, such modifications are problematic because they reduce the robustness and reproducibility of the aligned regions either side of the newly matched positions. Previous studies have introduced restraints as a means to impose the matching of positions during sequence alignments, originally in the context of genome assembly. Here we introduce position restraints, or “anchors” as a feature in our alignment tool AlignMe, providing an aid to pairwise global sequence alignment of alpha-helical membrane proteins. Applying this approach to realistic scenarios involving distantly-related and low complexity sequences, we illustrate how the addition of anchors can be used to modify alignments, while still maintaining the reproducibility and rigor of the rest of the alignment. Anchored alignments can be generated using the online version of AlignMe available at www.bioinfo.mpg.de/AlignMe/.
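One simple way to see what a position anchor does is to treat it as a hard constraint: align the segments before and after the anchored pair independently and join them around the anchored column. The sketch below does this with a plain Needleman-Wunsch aligner. It is not AlignMe's implementation; the scoring values, the function names and the example sequences are assumptions chosen for illustration.

```python
from typing import Tuple


def needleman_wunsch(a: str, b: str, match: int = 1,
                     mismatch: int = -1, gap: int = -2) -> Tuple[str, str]:
    """Global pairwise alignment with a linear gap penalty."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal path.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def anchored_align(a: str, b: str, i: int, j: int) -> Tuple[str, str]:
    """Force residue a[i] to pair with b[j]; align both flanks independently."""
    left_a, left_b = needleman_wunsch(a[:i], b[:j])
    right_a, right_b = needleman_wunsch(a[i + 1:], b[j + 1:])
    return left_a + a[i] + right_a, left_b + b[j] + right_b


def column_of(row: str, residue_index: int) -> int:
    """Alignment column holding the residue with the given 0-based index."""
    seen = -1
    for col, ch in enumerate(row):
        if ch != "-":
            seen += 1
            if seen == residue_index:
                return col
    raise ValueError("residue index out of range")


seq_a, seq_b = "GHKLLA", "GKHLA"
row_a, row_b = anchored_align(seq_a, seq_b, 1, 2)  # anchor the two H residues
print(row_a)
print(row_b)
```

Because each flank is still aligned by the ordinary procedure, the result stays reproducible, which is the property the abstract contrasts with manual adjustment. A soft anchor, implemented as a score bonus rather than a hard split, would be an alternative design.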


ViralMSA: Massively scalable reference-guided multiple sequence alignment of viral genomes

10.1101/2020.04.20.052068 ◽

2020 ◽

Cited By ~ 1

Author(s):

Niema Moshiri

Keyword(s):

Sequence Alignment ◽

Multiple Sequence Alignment ◽

Genomic Sequence ◽

Sequence Data ◽

Software Project ◽

Multiple Sequence ◽

Viral Genomes ◽

Alignment Tool ◽

Multiple Sequence Alignment Tool ◽

Algorithmic Techniques

Abstract Motivation: In molecular epidemiology, the identification of clusters of transmissions typically requires the alignment of viral genomic sequence data. However, existing methods of multiple sequence alignment scale poorly with respect to the number of sequences. Results: ViralMSA is a user-friendly reference-guided multiple sequence alignment tool that leverages the algorithmic techniques of read mappers to enable the multiple sequence alignment of ultra-large viral genome datasets. It scales linearly with the number of sequences, and it is able to align tens of thousands of full viral genomes in seconds. Availability: ViralMSA is freely available at https://github.com/niemasd/ViralMSA as an open-source software project.
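The reason a reference-guided approach scales linearly can be sketched as follows: each genome is aligned to the reference once, and the pairwise results are then projected onto reference coordinates to form the rows of the multiple alignment. The code below assumes the pairwise alignments are already available as gapped string pairs and discards insertions relative to the reference so that every row has the reference length. These choices, the function names and the toy sequences are assumptions for illustration and do not reproduce ViralMSA's pipeline.

```python
from typing import Dict, Tuple


def project_onto_reference(ref_aln: str, qry_aln: str) -> str:
    """Keep only the query characters that sit under a reference residue."""
    if len(ref_aln) != len(qry_aln):
        raise ValueError("pairwise alignment rows must have equal length")
    return "".join(q for r, q in zip(ref_aln, qry_aln) if r != "-")


def reference_guided_msa(
    reference: str,
    pairwise: Dict[str, Tuple[str, str]],
) -> Dict[str, str]:
    """Build an MSA in reference coordinates, one pass per query sequence."""
    rows = {"reference": reference}
    for name, (ref_aln, qry_aln) in pairwise.items():
        if ref_aln.replace("-", "") != reference:
            raise ValueError(f"{name}: alignment does not cover the reference")
        rows[name] = project_onto_reference(ref_aln, qry_aln)
    return rows


reference = "ACGTAC"
pairwise = {
    "s1": ("ACG-TAC", "ACGGTAC"),  # insertion relative to the reference
    "s2": ("ACGTAC", "AC--AC"),    # deletion relative to the reference
}
msa = reference_guided_msa(reference, pairwise)
for name, row in msa.items():
    print(name, row)
```

Each query is processed independently of the others, so the total work grows with the number of sequences rather than with the number of sequence pairs.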


Distant homology detection using a LEngth and STructure-based sequence Alignment Tool (LESTAT)

Proteins Structure Function and Bioinformatics ◽

10.1002/prot.21830 ◽

2007 ◽

Vol 71 (3) ◽

pp. 1409-1419 ◽

Cited By ~ 4

Author(s):

Marianne M. Lee ◽

Ralf Bundschuh ◽

Michael K. Chan

Keyword(s):

Sequence Alignment ◽

Homology Detection ◽

Alignment Tool


TM-Aligner: Multiple sequence alignment tool for transmembrane proteins with reduced time and improved accuracy

Scientific Reports ◽

10.1038/s41598-017-13083-y ◽

2017 ◽

Vol 7 (1) ◽

Cited By ~ 7

Author(s):

Basharat Bhat ◽

Nazir A. Ganai ◽

Syed Mudasir Andrabi ◽

Riaz A. Shah ◽

Ashutosh Singh

Keyword(s):

Sequence Alignment ◽

Multiple Sequence Alignment ◽

Transmembrane Proteins ◽

Multiple Sequence ◽

Alignment Tool ◽

Multiple Sequence Alignment Tool ◽

Improved Accuracy ◽

Reduced Time
