NCBI Taxonomy database
Recently Published Documents


Total documents: 7 (five years: 1)
H-index: 2 (five years: 0)

2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Tetsu Sakamoto ◽  
J. Miguel Ortega

Abstract
Background: NCBI Taxonomy is the main taxonomic source for several bioinformatics tools and databases, since all organisms with sequence accessions deposited in INSDC are organized in its hierarchical structure. Despite the extensive use and application of this data source, an alternative representation of the data as a table would make the information easier to use when processing bioinformatics data. Because some taxonomic ranks are missing in some lineages, building such a table requires an algorithm that proposes provisional names so that every taxonomic rank is filled.
Results: To address this issue, we developed an algorithm that takes the tree structure from NCBI Taxonomy and generates a hierarchically complete taxonomic table, maintaining its compatibility with the original tree. The algorithm attempts to assign a taxonomic rank to an existing clade or “no rank” node when possible, using its name as part of the created taxonomic-rank name (e.g. Ord_Ornithischia), or interpolates parent nodes when needed (e.g. Cla_of_Ornithischia); both examples are taken from the lineage of the dinosaur Brachylophosaurus. The new hierarchical structure was named Taxallnomy because it contains names for all taxonomic ranks. It has 41 hierarchical levels, corresponding to the 41 taxonomic ranks currently found in the NCBI Taxonomy database. From Taxallnomy, users can obtain the complete 41-node taxonomic lineage of every taxon available in the NCBI Taxonomy database without compromising the original tree information. In this work, we demonstrate its applicability by embedding taxonomic information of a specified rank into a phylogenetic tree and by producing metagenomics profiles.
Conclusion: Taxallnomy applies to any bioinformatics analysis that depends on information from NCBI Taxonomy. Taxallnomy is updated periodically, and users can also generate it locally with the distributed PERL script, using NCBI Taxonomy as input. All Taxallnomy resources are available at http://bioinfo.icb.ufmg.br/taxallnomy.
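
The tree-to-table idea described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' PERL implementation: the toy tree, the five-rank list, and the function names `lineage_row` and `lineage_table` are all hypothetical, and the sketch only shows how walking parent pointers yields one table row per taxon, with gaps wherever a lineage lacks a rank.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical toy tree: tax_id -> (parent_id, rank, name).
# The root is its own parent; names and IDs are placeholders, not NCBI records.
TOY_NODES: Dict[int, Tuple[int, str, str]] = {
    1: (1, "no rank", "root"),
    10: (1, "class", "Class_A"),
    20: (10, "no rank", "Clade_B"),
    30: (20, "family", "Family_C"),
    40: (30, "genus", "Genus_D"),
    50: (40, "species", "Genus_D species_e"),
}

# Illustrative subset of ranks, ordered from highest to lowest.
RANKS: List[str] = ["class", "order", "family", "genus", "species"]


def lineage_row(
    tax_id: int,
    nodes: Dict[int, Tuple[int, str, str]],
    ranks: List[str],
) -> Dict[str, Optional[str]]:
    """Walk from a taxon up to the root and record the name found at each rank."""
    row: Dict[str, Optional[str]] = {rank: None for rank in ranks}
    current = tax_id
    while True:
        parent, rank, name = nodes[current]
        if rank in row:
            row[rank] = name
        if parent == current:  # reached the root
            break
        current = parent
    return row


def lineage_table(
    nodes: Dict[int, Tuple[int, str, str]],
    ranks: List[str],
) -> Dict[int, Dict[str, Optional[str]]]:
    """Build one row per taxon; missing ranks stay as None."""
    return {tax_id: lineage_row(tax_id, nodes, ranks) for tax_id in nodes}


if __name__ == "__main__":
    for tax_id, row in lineage_table(TOY_NODES, RANKS).items():
        print(tax_id, row)
```

In this toy input the "order" column stays empty for every taxon and the unranked "Clade_B" never appears in the table, which is the kind of gap the abstract says the algorithm fills with provisional names.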


Author(s):  
Tetsu Sakamoto ◽  
J. Miguel Ortega

Abstract: NCBI Taxonomy is the main taxonomic source for several bioinformatics tools and databases, since all organisms with sequence accessions deposited in INSDC are organized in its hierarchical structure. Despite the extensive use and application of this data source, taking advantage of its taxonomic tree can be challenging because (1) some taxonomic ranks are missing in some lineages and (2) some nodes in the tree do not have a taxonomic rank assigned (referred to as “no rank”). To address this issue, we developed an algorithm that takes the tree structure from NCBI Taxonomy and generates a hierarchically complete taxonomic tree. The algorithm attempts to assign a taxonomic rank to “no rank” nodes and creates or deletes nodes throughout the tree. It also names each new node by borrowing the name of its ranked child or, if there is no child, of its ranked parent node. The new hierarchical structure was named taxallnomy, and it contains 33 hierarchical levels corresponding to the 33 taxonomic ranks currently used in the NCBI Taxonomy database. From taxallnomy, users can obtain the complete 33-node taxonomic lineage of every taxon available in the NCBI Taxonomy database. Taxallnomy is applicable to several bioinformatics analyses that depend on NCBI Taxonomy data. In this work, we demonstrate its applicability by embedding taxonomic information of a specified rank into a phylogenetic tree and by making metagenomics profiles. The taxallnomy algorithm was written in PERL, and all its resources are available at bioinfo.icb.ufmg.br/taxallnomy.
Database URL: http://bioinfo.icb.ufmg.br/taxallnomy
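
The naming rule in this abstract (borrow the name of the ranked child, otherwise of the ranked parent) can be sketched in Python as follows. This is a simplified illustration rather than the published algorithm: it works on a single lineage row, does not assign ranks to “no rank” nodes or create and delete tree nodes, and the three-letter prefixes with the `_of_` and `_in_` connectors are assumptions made for this example. The function name `fill_missing_ranks` is hypothetical.

```python
from typing import Dict, List, Optional


def fill_missing_ranks(
    row: Dict[str, Optional[str]],
    ranks: List[str],
) -> Dict[str, Optional[str]]:
    """Fill empty ranks with provisional names borrowed from neighbouring ranks.

    `ranks` is ordered from highest to lowest. For each empty rank, the name of
    the nearest named lower rank (ranked child) is used; if none exists, the
    nearest named higher rank (ranked parent) is used instead.
    """
    filled = dict(row)
    for i, rank in enumerate(ranks):
        if row.get(rank) is not None:
            continue
        prefix = rank[:3].capitalize()  # assumed abbreviation, e.g. "Ord"
        child = next(
            (row[r] for r in ranks[i + 1:] if row.get(r) is not None), None
        )
        if child is not None:
            filled[rank] = f"{prefix}_of_{child}"
            continue
        parent = next(
            (row[r] for r in reversed(ranks[:i]) if row.get(r) is not None), None
        )
        if parent is not None:
            filled[rank] = f"{prefix}_in_{parent}"  # assumed connector
    return filled


if __name__ == "__main__":
    ranks = ["class", "order", "family", "genus", "species"]
    # Hypothetical lineage row with gaps at order, genus and species.
    row = {
        "class": "Class_A",
        "order": None,
        "family": "Family_C",
        "genus": None,
        "species": None,
    }
    print(fill_missing_ranks(row, ranks))
```

Borrowing only from names present in the original row, never from names generated earlier in the loop, keeps every provisional name traceable to a real node.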


Zootaxa ◽  
2019 ◽  
Vol 4706 (3) ◽  
pp. 401-407 ◽  
Author(s):  
AKHIL GARG ◽  
DETLEF LEIPE ◽  
PETER UETZ

We compared the species names in the Reptile Database, a dedicated taxonomy database, with those in the NCBI taxonomy database, which provides the taxonomic backbone for the GenBank sequence database. About 67% of the ~11,000 known reptile species are represented with at least one DNA sequence and a binary species name in GenBank. However, a common problem arises through the submission of preliminary species names (such as “Pelomedusa sp. A CK-2014”) to GenBank and thus the NCBI taxonomy. These names cannot be assigned to any accepted species names and thus create a disconnect between DNA sequences and species. While these names of unknown taxonomic meaning sometimes get updated, they often remain in GenBank, which now contains sequences from ~1,300 such “putative” reptile species tagged by informal names (~15% of its reptile names). We estimate that NCBI/GenBank probably contain tens of thousands of such “disconnected” entries. We encourage sequence submitters to update informal species names after they have been published; otherwise the disconnect will cause increasing confusion and possibly misleading taxonomic conclusions.
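
A rough way to separate binary species names from informal ones such as “Pelomedusa sp. A CK-2014” is a pattern-based heuristic. The Python sketch below is an assumption-laden illustration, not the procedure used in the study: the marker list (`sp.`, `cf.`, `aff.`), the binomial pattern, and the function names `classify_name` and `informal_share` are all hypothetical choices for this example.

```python
import re
from typing import Iterable

# Assumed markers of informal or provisional names.
INFORMAL_MARKERS = re.compile(r"\b(?:sp|cf|aff)\.(?:\s|$)")
# Assumed shape of a binary name: capitalised genus plus lowercase epithet.
BINOMIAL = re.compile(r"^[A-Z][a-z]+ [a-z]+(?:-[a-z]+)?$")


def classify_name(name: str) -> str:
    """Return 'informal', 'binomial', or 'other' for a taxon name string."""
    name = name.strip()
    if INFORMAL_MARKERS.search(name):
        return "informal"
    if BINOMIAL.match(name):
        return "binomial"
    return "other"


def informal_share(names: Iterable[str]) -> float:
    """Fraction of names classified as informal (0.0 for an empty input)."""
    labels = [classify_name(n) for n in names]
    if not labels:
        return 0.0
    return labels.count("informal") / len(labels)


if __name__ == "__main__":
    for example in ["Pelomedusa sp. A CK-2014", "Pelomedusa subrufa"]:
        print(example, "->", classify_name(example))
```

A heuristic like this only flags candidates; deciding whether a flagged name corresponds to an accepted species still requires comparison against a curated source such as the Reptile Database.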


2017 ◽  
Author(s):  
Eneida Hatcher ◽  
Yiming Bao ◽  
Paolo Amedeo ◽  
Olga Blinkova ◽  
Guy Cochrane ◽  
...  

Currently the National Center for Biotechnology Information (NCBI) assigns individual taxonomy identifiers to each distinct influenza virus isolate submitted to GenBank. To support this practice, individual flu isolates must be manually added to the NCBI taxonomy database and unique taxonomy identifiers generated. This added layer of manual processing is unique to influenza virus and prevents automation of the flu sequence submission process. Here we outline a new NCBI policy that normalizes influenza virus taxonomy processing but maintains features supported by the previous approach. This change will reduce the amount of manual handling necessary for flu submissions and pave the way for increased automation of the submission process. While this automation may disrupt some historic practices, it will better align influenza virus data processing with that of other viruses and ultimately lower the submission burden on data providers.


2014 ◽  
Vol 43 (D1) ◽  
pp. D1086-D1098 ◽  
Author(s):  
Scott Federhen

2011 ◽  
Vol 40 (D1) ◽  
pp. D136-D143 ◽  
Author(s):  
S. Federhen
