Human genome annotation

Peter Little

doi:10.1038/896

GENCODE: The reference human genome annotation for The ENCODE Project

Genome Research ◽

10.1101/gr.135350.111 ◽

2012 ◽

Vol 22 (9) ◽

pp. 1760-1774 ◽

Cited By ~ 2787

Author(s):

J. Harrow ◽

A. Frankish ◽

J. M. Gonzalez ◽

E. Tapanari ◽

M. Diekhans ◽

...

Keyword(s):

Human Genome ◽

Genome Annotation ◽

Encode Project ◽

Reference Human Genome

The effect of human genome annotation complexity on RNA-Seq gene expression quantification

2012 IEEE International Conference on Bioinformatics and Biomedicine Workshops ◽

10.1109/bibmw.2012.6470224 ◽

2012 ◽

Cited By ~ 3

Author(s):

Po-Yen Wu ◽

John H. Phan ◽

May D. Wang

Keyword(s):

Gene Expression ◽

Human Genome ◽

Genome Annotation ◽

Rna Seq ◽

Gene Expression Quantification ◽

Expression Quantification

Assessing the impact of human genome annotation choice on RNA-seq expression estimates

BMC Bioinformatics ◽

10.1186/1471-2105-14-s11-s8 ◽

2013 ◽

Vol 14 (Suppl 11) ◽

pp. S8 ◽

Cited By ~ 28

Author(s):

Po-Yen Wu ◽

John H Phan ◽

May D Wang

Keyword(s):

Human Genome ◽

Genome Annotation ◽

Rna Seq ◽

The Impact

Myosin-I nomenclature

The Journal of Cell Biology ◽

10.1083/jcb.200110032 ◽

2001 ◽

Vol 155 (5) ◽

pp. 703-704 ◽

Cited By ~ 53

Author(s):

Peter G. Gillespie ◽

Joseph P. Albanesi ◽

Martin Bähler ◽

William M. Bement ◽

Jonathan S. Berg ◽

...

Keyword(s):

Human Genome ◽

Genome Organization ◽

Genome Annotation ◽

Nomenclature System ◽

Myosin I

We suggest that the vertebrate myosin-I field adopt a common nomenclature system based on the names adopted by the Human Genome Organization (HUGO). At present, the myosin-I nomenclature is very confusing; not only are several systems in use, but several different genes have been given the same name. Despite their faults, we believe that the names adopted by the HUGO nomenclature group for genome annotation are the best compromise, and we recommend universal adoption.

Protein-Coding Hotspots in the Human Genome: Annotation, Significance, and Their Conservation in Animal Models (mouse, fruit fly)

10.17918/00000400 ◽

2021 ◽

Author(s):

D. V. Klopfenstein

Keyword(s):

Animal Models ◽

Human Genome ◽

Genome Annotation ◽

Fruit Fly ◽

Protein Coding

Bioinformatics Research and Applications - Lecture Notes in Computer Science ◽

10.1007/978-3-642-13078-6_7 ◽

2010 ◽

pp. 50-51

Author(s):

Mark Gerstein

Keyword(s):

Human Genome ◽

Genome Annotation

The shrinking human protein coding complement: are there fewer than 20,000 genes?

10.1101/001909 ◽

2014 ◽

Cited By ~ 2

Author(s):

Iakes Ezkurdia ◽

David Juan ◽

Jose Manuel Rodriguez ◽

Adam Frankish ◽

Mark Diekhans ◽

...

Keyword(s):

Protein Expression ◽

Human Genome ◽

Genome Annotation ◽

Large Scale ◽

Cellular Protein ◽

Human Protein ◽

Protein Coding ◽

Detection Rates ◽

Protein Coding Genes ◽

Peptide Mass

Determining the full complement of protein-coding genes is a key goal of genome annotation. The most powerful approach for confirming protein coding potential is the detection of cellular protein expression through peptide mass spectrometry experiments. Here we map the peptides detected in 7 large-scale proteomics studies to almost 60% of the protein coding genes in the GENCODE annotation of the human genome. We find that conservation across vertebrate species and the age of the gene family are key indicators of whether a peptide will be detected in proteomics experiments. We find peptides for most highly conserved genes and for practically all genes that evolved before bilateria. At the same time there is almost no evidence of protein expression for genes that have appeared since primates, or for genes that do not have any protein-like features or cross-species conservation. We identify 19 non-protein-like features such as weak conservation, no protein features or ambiguous annotations in major databases that are indicators of low peptide detection rates. We use these features to describe a set of 2,001 genes that are potentially non-coding, and show that many of these genes behave more like non-coding genes than protein-coding genes. We detect peptides for just 3% of these genes. We suggest that many of these 2,001 genes do not code for proteins under normal circumstances and that they should not be included in the human protein coding gene catalogue. These potential non-coding genes will be revised as part of the ongoing human genome annotation effort.

A Statistical Framework to Predict Functional Non-Coding Regions in the Human Genome Through Integrated Analysis of Annotation Data

10.1101/018093 ◽

2015 ◽

Cited By ~ 1

Author(s):

Qiongshi Lu ◽

Yiming Hu ◽

Jiehuan Sun ◽

Yuwei Cheng ◽

Kei-Hoi Cheung ◽

...

Keyword(s):

Human Genome ◽

Genome Annotation ◽

Human Genetics ◽

Integrated Analysis ◽

Whole Genome ◽

Coding Regions ◽

Functional Regions ◽

Statistical Framework ◽

Annotation Data ◽

High Throughput Experiments

Identifying functional regions in the human genome is a major goal in human genetics. Great efforts have been made to functionally annotate the human genome either through computational predictions, such as genomic conservation, or high-throughput experiments, such as the ENCODE project. These efforts have resulted in a rich collection of functional annotation data of diverse types that need to be jointly analyzed for integrated interpretation and annotation. Here we present GenoCanyon, a whole-genome annotation method that performs unsupervised statistical learning using 22 computational and experimental annotations, thereby inferring the functional potential of each position in the human genome. With GenoCanyon, we are able to predict many of the known functional regions. Its ability to predict functional regions, together with its generalizable statistical framework, makes GenoCanyon a unique and powerful tool for whole-genome annotation. The GenoCanyon web server is available at http://genocanyon.med.yale.edu

DNA methylation in satellite repeats disorders

Essays in Biochemistry ◽

10.1042/ebc20190028 ◽

2019 ◽

Vol 63 (6) ◽

pp. 757-771 ◽

Cited By ~ 4

Author(s):

Claire Francastel ◽

Frédérique Magdinier

Keyword(s):

Dna Methylation ◽

Human Genome ◽

Repetitive Dna ◽

Dna Sequences ◽

Satellite Repeats ◽

Tremendous Progress ◽

Genes Encoding ◽

Dna Elements ◽

Near Future

Despite the tremendous progress made in recent years in assembling the human genome, tandemly repeated DNA elements remain poorly characterized. These sequences account for the vast majority of methylated sites in the human genome and their methylated state is necessary for this repetitive DNA to function properly and to maintain genome integrity. Furthermore, recent advances highlight the emerging role of these sequences in regulating the functions of the human genome and its variability during evolution, among individuals, or in disease susceptibility. In addition, a number of inherited rare diseases are directly linked to the alteration of some of these repetitive DNA sequences, either through changes in the organization or size of the tandem repeat arrays or through mutations in genes encoding chromatin modifiers involved in the epigenetic regulation of these elements. Although largely overlooked so far in the functional annotation of the human genome, satellite elements play key roles in its architectural and topological organization. This includes functions as boundary elements delimitating functional domains or assembly of repressive nuclear compartments, with local or distal impact on gene expression. Thus, the consideration of satellite repeats organization and their associated epigenetic landmarks, including DNA methylation (DNAme), will become unavoidable in the near future to fully decipher human phenotypes and associated diseases.

The human genome project: A generation's psyche and a society's revolution

Molecular Diagnosis ◽

10.1016/s1084-8592(00)80025-9 ◽

2000 ◽

Vol 5 (2) ◽

pp. 87-89

Author(s):

D. Cooper

Keyword(s):

Human Genome ◽

Human Genome Project ◽

Genome Project ◽

The Human Genome Project
