Fast genome-wide functional annotation through orthology assignment by eggNOG-mapper

Molecular Biology and Evolution ◽

10.1093/molbev/msx148 ◽

2017 ◽

Vol 34 (8) ◽

pp. 2115-2122 ◽

Cited By ~ 461

Author(s):

Jaime Huerta-Cepas ◽

Kristoffer Forslund ◽

Luis Pedro Coelho ◽

Damian Szklarczyk ◽

Lars Juhl Jensen ◽

...

Keyword(s):

Functional Annotation ◽

Orthology Assignment ◽

Genome Wide

VCF2PopTree: a one-click client-side software to construct population phylogeny from genome-wide SNPs

10.7287/peerj.preprints.27682 ◽

2019 ◽

Author(s):

Sankar Subramanian ◽

Umayal Ramasamy ◽

David Chen

Keyword(s):

Phylogenetic Trees ◽

Large Scale ◽

Web Applications ◽

Third Party ◽

Genotype Data ◽

Whole Genome ◽

Genome Data ◽

Genome Wide ◽

Software Programs ◽

Computationally Intensive

In past decades, a number of software programs have been developed to deduce the phylogenetic relationship between populations. However, these programs are not suited for large-scale whole genome data. Recently, a few standalone or web applications have been developed to handle genome-wide data, but they were either computationally intensive, dependent on third-party software or required significant time and resources of a web server. In the post-genomic era, owing to the reduced cost of sequencing and analysis, researchers are able to obtain bioinformatically processed, high-quality, publication-ready whole genome data for many individuals in a population from next generation sequencing companies. Such genotype data is typically presented in the Variant Call Format (VCF), and there is no simple software available that uses this data to construct the phylogeny of populations in a short time. To address this limitation, we have developed a one-click, user-friendly software tool, VCF2PopTree, that uses genome-wide SNPs to construct and display phylogenetic trees in seconds to minutes. For example, it reads a 1 GB VCF file and draws a tree in less than 5 minutes. VCF2PopTree accepts genotype data from a local machine, constructs a tree using the UPGMA and Neighbour-Joining algorithms and displays it in a web browser. It also produces a pairwise-diversity matrix in MEGA and PHYLIP file formats as well as trees in the Newick format, which can be used directly by other popular phylogenetic software programs. The software, including the source code, a test VCF input file and short documentation, is available at: https://github.com/sansubs/vcf2pop.

VCF2PopTree: a one-click client-side software to construct population phylogeny from genome-wide SNPs

10.7287/peerj.preprints.27682v1 ◽

2019 ◽

Author(s):

Sankar Subramanian ◽

Umayal Ramasamy ◽

David Chen

Keyword(s):

Phylogenetic Trees ◽

Large Scale ◽

Web Applications ◽

Third Party ◽

Genotype Data ◽

Whole Genome ◽

Genome Data ◽

Genome Wide ◽

Software Programs ◽

Computationally Intensive

In past decades, a number of software programs have been developed to deduce the phylogenetic relationship between populations. However, these programs are not suited for large-scale whole genome data. Recently, a few standalone or web applications have been developed to handle genome-wide data, but they were either computationally intensive, dependent on third-party software or required significant time and resources of a web server. In the post-genomic era, owing to the reduced cost of sequencing and analysis, researchers are able to obtain bioinformatically processed, high-quality, publication-ready whole genome data for many individuals in a population from next generation sequencing companies. Such genotype data is typically presented in the Variant Call Format (VCF), and there is no simple software available that uses this data to construct the phylogeny of populations in a short time. To address this limitation, we have developed a one-click, user-friendly software tool, VCF2PopTree, that uses genome-wide SNPs to construct and display phylogenetic trees in seconds to minutes. For example, it reads a 1 GB VCF file and draws a tree in less than 5 minutes. VCF2PopTree accepts genotype data from a local machine, constructs a tree using the UPGMA and Neighbour-Joining algorithms and displays it in a web browser. It also produces a pairwise-diversity matrix in MEGA and PHYLIP file formats as well as trees in the Newick format, which can be used directly by other popular phylogenetic software programs. The software, including the source code, a test VCF input file and short documentation, is available at: http://sankarsubramanian.net/dat/index.html.

VCF2PopTree: a client-side software to construct population phylogeny from genome-wide SNPs

PeerJ ◽

10.7717/peerj.8213 ◽

2019 ◽

Vol 7 ◽

pp. e8213 ◽

Cited By ~ 5

Author(s):

Sankar Subramanian ◽

Umayal Ramasamy ◽

David Chen

Keyword(s):

Phylogenetic Trees ◽

Large Scale ◽

Web Applications ◽

Third Party ◽

Genotype Data ◽

Whole Genome ◽

Genome Data ◽

Genome Wide ◽

Software Programs ◽

Computationally Intensive

In past decades, a number of software programs have been developed to infer phylogenetic relationships between populations. However, most of these programs use alignments of gene sequences to build phylogenies. Recently, many standalone or web applications have been developed to handle large-scale whole genome data, but they are either computationally intensive, dependent on third-party software or require significant time and resources of a web server. In the post-genomic era, owing to the reduced cost of sequencing and analysis, researchers are able to obtain bioinformatically processed, high-quality, publication-ready whole genome data for many individuals in a population from next generation sequencing companies. Such genotype data is typically presented in the Variant Call Format (VCF), and there is no simple software available that directly uses this data format to construct the phylogeny of populations in a short time. To address this limitation, we have developed a user-friendly software tool, VCF2PopTree, that uses genome-wide SNPs to construct and display phylogenetic trees in seconds to minutes. For example, it reads a VCF file containing 4 million SNPs and draws a tree in less than 30 seconds. VCF2PopTree accepts genotype data from a local machine, constructs a tree using the UPGMA and Neighbour-Joining algorithms and displays it in a web browser. It also produces a pairwise-diversity matrix in MEGA and PHYLIP file formats as well as trees in the Newick format, which can be used directly by other popular phylogenetic software programs. The software, including the source code, a test VCF file and documentation, is available at: https://github.com/sansubs/vcf2pop.
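
The pipeline described in this abstract (genotypes, then a pairwise distance matrix, then a UPGMA tree in Newick format) can be illustrated with a minimal Python sketch. This is not VCF2PopTree's code: the function names `p_distance` and `upgma`, the encoding of genotypes as alternate-allele counts (0, 1 or 2 per SNP), and the distance formula are all assumptions made for illustration, and VCF parsing is omitted.

```python
from itertools import combinations


def p_distance(g1, g2):
    """Proportion of differing alleles between two diploid genotype vectors.

    Assumption: each genotype is a list of alternate-allele counts (0, 1, 2),
    one entry per SNP, with no missing data.
    """
    return sum(abs(x - y) for x, y in zip(g1, g2)) / (2 * len(g1))


def upgma(names, dist):
    """Build a UPGMA tree from a square distance matrix; return Newick text."""
    # Each cluster stores (newick string, number of leaves, node height).
    clusters = {i: (name, 1, 0.0) for i, name in enumerate(names)}
    d = {frozenset((i, j)): dist[i][j]
         for i, j in combinations(range(len(names)), 2)}
    next_id = len(names)
    while len(clusters) > 1:
        # Join the two closest clusters.
        pair = min(d, key=lambda p: d[p])
        a, b = sorted(pair)
        (na, sa, ha), (nb, sb, hb) = clusters.pop(a), clusters.pop(b)
        height = d.pop(pair) / 2
        newick = f"({na}:{height - ha:.4f},{nb}:{height - hb:.4f})"
        # Distance to the merged cluster is the size-weighted average.
        for c in clusters:
            dac = d.pop(frozenset((a, c)))
            dbc = d.pop(frozenset((b, c)))
            d[frozenset((next_id, c))] = (sa * dac + sb * dbc) / (sa + sb)
        clusters[next_id] = (newick, sa + sb, height)
        next_id += 1
    (newick, _, _), = clusters.values()
    return newick + ";"


if __name__ == "__main__":
    # Hypothetical toy data: three samples, four SNPs each.
    samples = {"A": [0, 0, 0, 0], "B": [0, 0, 0, 2], "C": [2, 2, 2, 2]}
    names = list(samples)
    matrix = [[p_distance(samples[x], samples[y]) for y in names] for x in names]
    print(upgma(names, matrix))
```

The resulting Newick string can be opened in any tree viewer. Neighbour-Joining, the other algorithm named in the abstract, would replace only the cluster-selection and distance-update steps.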

VCF2PopTree: a one-click client-side software to construct population phylogeny from genome-wide SNPs

10.7287/peerj.preprints.27682v2 ◽

2019 ◽

Author(s):

Sankar Subramanian ◽

Umayal Ramasamy ◽

David Chen

Keyword(s):

Phylogenetic Trees ◽

Large Scale ◽

Web Applications ◽

Third Party ◽

Genotype Data ◽

Whole Genome ◽

Genome Data ◽

Genome Wide ◽

Software Programs ◽

Computationally Intensive

In past decades, a number of software programs have been developed to deduce the phylogenetic relationship between populations. However, these programs are not suited for large-scale whole genome data. Recently, a few standalone or web applications have been developed to handle genome-wide data, but they were either computationally intensive, dependent on third-party software or required significant time and resources of a web server. In the post-genomic era, owing to the reduced cost of sequencing and analysis, researchers are able to obtain bioinformatically processed, high-quality, publication-ready whole genome data for many individuals in a population from next generation sequencing companies. Such genotype data is typically presented in the Variant Call Format (VCF), and there is no simple software available that uses this data to construct the phylogeny of populations in a short time. To address this limitation, we have developed a one-click, user-friendly software tool, VCF2PopTree, that uses genome-wide SNPs to construct and display phylogenetic trees in seconds to minutes. For example, it reads a 1 GB VCF file and draws a tree in less than 5 minutes. VCF2PopTree accepts genotype data from a local machine, constructs a tree using the UPGMA and Neighbour-Joining algorithms and displays it in a web browser. It also produces a pairwise-diversity matrix in MEGA and PHYLIP file formats as well as trees in the Newick format, which can be used directly by other popular phylogenetic software programs. The software, including the source code, a test VCF input file and short documentation, is available at: https://github.com/sansubs/vcf2pop.

Faculty Opinions recommendation of Genome-wide association and large-scale follow up identifies 16 new loci influencing lung function.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.13345956.14715054 ◽

2011 ◽

Author(s):

Craig Hersh

Keyword(s):

Lung Function ◽

Large Scale ◽

Genome Wide Association ◽

Genome Wide

Faculty Opinions recommendation of Large-scale genome-wide enrichment analyses identify new trait-associated genes and pathways across 31 human phenotypes.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.734261365.793558023 ◽

2019 ◽

Author(s):

Jason Flannick

Keyword(s):

Large Scale ◽

Genome Wide

Publisher Correction: Genome-wide association study of individual differences of human lymphocyte profiles using large-scale cytometry data

Journal of Human Genetics ◽

10.1038/s10038-020-00890-x ◽

2021 ◽

Author(s):

Daigo Okada ◽

Naotoshi Nakamura ◽

Kazuya Setoh ◽

Takahisa Kawaguchi ◽

Koichiro Higasa ◽

...

Keyword(s):

Individual Differences ◽

Association Study ◽

Human Lymphocyte ◽

Large Scale ◽

Genome Wide Association Study ◽

Genome Wide Association ◽

Genome Wide

BonMOLière: Small-Sized Libraries of Readily Purchasable Compounds, Optimized to Produce Genuine Hits in Biological Screens across the Protein Space

International Journal of Molecular Sciences ◽

10.3390/ijms22157773 ◽

2021 ◽

Vol 22 (15) ◽

pp. 7773

Author(s):

Neann Mathai ◽

Conrad Stork ◽

Johannes Kirchmair

Keyword(s):

Large Scale ◽

Computational Approach ◽

Large Sets ◽

Compound Libraries ◽

Wide Range ◽

Protein Space ◽

High Chance ◽

Large Scale Screening ◽

Early Drug ◽

Selection Of

Experimental screening of large sets of compounds against macromolecular targets is a key strategy to identify novel bioactivities. However, large-scale screening requires substantial experimental resources and is time-consuming and challenging. Therefore, small to medium-sized compound libraries with a high chance of producing genuine hits on an arbitrary protein of interest would be of great value to fields related to early drug discovery, in particular biochemical and cell research. Here, we present a computational approach that incorporates drug-likeness, predicted bioactivities, biological space coverage, and target novelty, to generate optimized compound libraries with maximized chances of producing genuine hits for a wide range of proteins. The computational approach evaluates drug-likeness with a set of established rules, predicts bioactivities with a validated, similarity-based approach, and optimizes the composition of small sets of compounds towards maximum target coverage and novelty. We found that, in comparison to the random selection of compounds for a library, our approach generates substantially improved compound sets. Quantified as the “fitness” of compound libraries, the calculated improvements ranged from +60% (for a library of 15,000 compounds) to +184% (for a library of 1000 compounds). The best of the optimized compound libraries prepared in this work are available for download as a dataset bundle (“BonMOLière”).

BiPSim: a flexible and generic stochastic simulator for polymerization processes

Scientific Reports ◽

10.1038/s41598-021-92833-5 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Stephan Fischer ◽

Marc Dinh ◽

Vincent Henry ◽

Philippe Robert ◽

Anne Goelzer ◽

...

Keyword(s):

Large Scale ◽

Stochastic Simulation Algorithm ◽

Specific Information ◽

Whole Cell ◽

Simulation Speed ◽

Genome Wide ◽

Cell Simulation ◽

Stochastic Simulator ◽

Modeling Formalisms ◽

Stochastic Phenomena

Detailed whole-cell modeling requires an integration of heterogeneous cell processes having different modeling formalisms, for which whole-cell simulation could remain tractable. Here, we introduce BiPSim, an open-source stochastic simulator of template-based polymerization processes, such as replication, transcription and translation. BiPSim combines an efficient abstract representation of reactions with an implementation of Gillespie's Stochastic Simulation Algorithm (SSA) that is constant-time with respect to the number of reactions, which makes it highly efficient for simulating large-scale polymerization processes stochastically. Moreover, multi-level descriptions of polymerization processes can be handled simultaneously, allowing the user to tune a trade-off between simulation speed and model granularity. We evaluated the performance of BiPSim by simulating genome-wide gene expression in bacteria for multiple levels of granularity. Finally, since no cell-type specific information is hard-coded in the simulator, models can easily be adapted to other organismal species. We expect that BiPSim should open new perspectives for the genome-wide simulation of stochastic phenomena in biology.
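
For readers unfamiliar with the Stochastic Simulation Algorithm named in this abstract, the following is a minimal Python sketch of Gillespie's plain direct method applied to a toy synthesis/degradation model. It is not BiPSim's implementation and does not reproduce its constant-time scheme; the function name `gillespie_birth_death` and the parameters `k_syn`, `k_deg`, `t_end` and `seed` are hypothetical.

```python
import random


def gillespie_birth_death(k_syn, k_deg, t_end, seed=0):
    """Simulate synthesis (rate k_syn) and degradation (rate k_deg * n).

    Returns a list of (time, molecule count) pairs, starting at (0.0, 0).
    Assumption: k_syn > 0, so at least one reaction is always possible.
    """
    rng = random.Random(seed)
    t, n = 0.0, 0
    trajectory = [(t, n)]
    while True:
        synthesis = k_syn
        degradation = k_deg * n
        total = synthesis + degradation
        if total == 0:
            break
        # Waiting time to the next reaction is exponential in the total rate.
        t += rng.expovariate(total)
        if t > t_end:
            break
        # Pick which reaction fires, in proportion to its rate.
        if rng.random() * total < synthesis:
            n += 1
        else:
            n -= 1
        trajectory.append((t, n))
    return trajectory


if __name__ == "__main__":
    traj = gillespie_birth_death(k_syn=2.0, k_deg=0.1, t_end=50.0, seed=1)
    print(f"{len(traj) - 1} reactions, final count {traj[-1][1]}")
```

In this direct method, each step scans every reaction to compute the total rate, so the cost per step grows with the number of reactions; that per-step cost is what a constant-time implementation avoids.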
