Phylogenetic approaches to identifying fragments of the same gene, with application to the wheat genome

2017 ◽  
Author(s):  
Ivana Piližota ◽  
Henning Redestig ◽  
Christophe Dessimoz

Abstract As the time and cost of sequencing decrease, the number of available genomes and transcriptomes rapidly increases. Yet the quality of the assemblies and the gene annotations varies considerably and often remains poor, affecting downstream analyses. This is particularly true when fragments of the same gene are annotated as distinct genes and consequently wrongly appear as paralogs. In this study, we introduce two novel phylogenetic tests to infer non-overlapping or partially overlapping genes that are in fact parts of the same gene. One approach collapses branches with low bootstrap support and the other computes a likelihood ratio test. We extensively validated these methods by 1) introducing and recovering fragmentation on the bread wheat, Triticum aestivum cv. Chinese Spring, chromosome 3B; 2) applying the methods to the low-quality 3B assembly and validating predictions against the high-quality 3B assembly; and 3) comparing the performance of the proposed methods to that of existing methods, namely Ensembl Compara and ESPRIT. Application of this combination to a draft shotgun assembly of the entire bread wheat genome revealed 1221 pairs of genes which are highly likely to be fragments of the same gene. Our approach demonstrates the power of fine-grained evolutionary inferences across multiple species for improving genome assemblies and annotations. An open source software tool is available at https://github.com/DessimozLab/esprit2.
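The branch-collapsing idea mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the `Node` class, the `collapse` and `share_parent` helpers, the threshold of 70, and the decision rule (two candidate fragments count as compatible if they hang off the same node once weakly supported branches are removed) are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical gene-tree node; leaves carry a name, internal nodes carry children."""
    name: Optional[str] = None
    support: float = 100.0  # bootstrap support of the branch above this node
    children: List["Node"] = field(default_factory=list)


def collapse(node: Node, threshold: float) -> Node:
    """Return a copy of the tree in which internal branches with support
    below `threshold` are removed, turning them into polytomies."""
    if not node.children:
        return node
    new_children: List[Node] = []
    for child in node.children:
        child = collapse(child, threshold)
        if child.children and child.support < threshold:
            # Weak branch: attach the grandchildren directly to this node.
            new_children.extend(child.children)
        else:
            new_children.append(child)
    return Node(name=node.name, support=node.support, children=new_children)


def share_parent(node: Node, a: str, b: str) -> bool:
    """True if leaves `a` and `b` are both direct children of some node.
    Assumed decision rule, not taken from the paper."""
    leaf_names = {c.name for c in node.children if not c.children}
    if a in leaf_names and b in leaf_names:
        return True
    return any(share_parent(c, a, b) for c in node.children)


# Toy tree: fragA is separated from fragB only by a branch with support 40.
tree = Node(children=[
    Node(support=40, children=[
        Node("fragA"),
        Node(support=95, children=[Node("ortho1"), Node("ortho2")]),
    ]),
    Node("fragB"),
])

before = share_parent(tree, "fragA", "fragB")
after = share_parent(collapse(tree, 70), "fragA", "fragB")
```

In this toy tree the two fragments are separated only by a weakly supported branch, so they end up under the same node after collapsing; a well-supported separating branch would keep them apart.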

2019 ◽  
Vol 100 (4) ◽  
pp. 801-812 ◽  
Author(s):  
Abdulqader Jighly ◽  
Reem Joukhadar ◽  
Deepmala Sehgal ◽  
Sukhwinder Singh ◽  
Francis C. Ogbonnaya ◽  
...  

GigaScience ◽  
2017 ◽  
Vol 6 (11) ◽  
Author(s):  
Aleksey V Zimin ◽  
Daniela Puiu ◽  
Richard Hall ◽  
Sarah Kingan ◽  
Bernardo J Clavijo ◽  
...  

2015 ◽  
Vol 16 (1) ◽  
Author(s):  
Jarrod A Chapman ◽  
Martin Mascher ◽  
Aydın Buluç ◽  
Kerrie Barry ◽  
Evangelos Georganas ◽  
...  

2020 ◽  
Vol 18 (3) ◽  
pp. 221-229 ◽  
Author(s):  
Jiantao Guan ◽  
Diego F. Garcia ◽  
Yun Zhou ◽  
Rudi Appels ◽  
Aili Li ◽  
...  

Nature Plants ◽  
2021 ◽  
Vol 7 (2) ◽  
pp. 172-183
Author(s):  
Alexandra M. Przewieslik-Allen ◽  
Paul A. Wilkinson ◽  
Amanda J. Burridge ◽  
Mark O. Winfield ◽  
Xiaoyang Dai ◽  
...  

2017 ◽  
Author(s):  
Aleksey V. Zimin ◽  
Daniela Puiu ◽  
Richard Hall ◽  
Sarah Kingan ◽  
Bernardo J. Clavijo ◽  
...  

Abstract Common bread wheat, Triticum aestivum, has one of the most complex genomes known to science, with 6 copies of each chromosome, enormous numbers of near-identical sequences scattered throughout, and an overall size of more than 15 billion bases. Multiple past attempts to assemble the genome have failed. Here we report the first successful assembly of T. aestivum, using deep sequencing coverage from a combination of short Illumina reads and very long Pacific Biosciences reads. The final assembly contains 15,344,693,583 bases and has a weighted average (N50) contig size of 232,659 bases. This represents by far the most complete and contiguous assembly of the wheat genome to date, providing a strong foundation for future genetic studies of this important food crop. We also report how we used the recently published genome of Aegilops tauschii, the diploid ancestor of the wheat D genome, to identify 4,179,762,575 bp of T. aestivum that correspond to its D genome components.
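For readers unfamiliar with the N50 statistic quoted in the abstract, the following is a minimal sketch of the usual definition: the length L such that contigs of length at least L together cover at least half of the assembly. The function name `n50` and the toy contig lengths are illustrative assumptions, not data from the paper.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of contig lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one contig length is required")
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Stop at the first contig that brings the cumulative
        # length to at least half of the total assembly size.
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable for non-empty input")


# Toy example: total length is 54, and 10 + 9 + 8 = 27 reaches half of it.
toy_n50 = n50([2, 3, 4, 5, 6, 7, 8, 9, 10])
```

Because the statistic is weighted by contig length, a few long contigs raise N50 far more than many short ones, which is why it is commonly used as a contiguity measure.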


Nature ◽  
2012 ◽  
Vol 491 (7426) ◽  
pp. 705-710 ◽  
Author(s):  
Rachel Brenchley ◽  
Manuel Spannagl ◽  
Matthias Pfeifer ◽  
Gary L. A. Barker ◽  
Rosalinda D’Amore ◽  
...  

Genetics ◽  
1998 ◽  
Vol 149 (4) ◽  
pp. 2007-2023 ◽  
Author(s):  
Marion S Röder ◽  
Victor Korzun ◽  
Katja Wendehake ◽  
Jens Plaschke ◽  
Marie-Hélène Tixier ◽  
...  

Abstract Hexaploid bread wheat (Triticum aestivum L. em. Thell) is one of the world's most important crop plants and displays a very low level of intraspecific polymorphism. We report the development of highly polymorphic microsatellite markers using procedures optimized for the large wheat genome. The isolation of microsatellite-containing clones from hypomethylated regions of the wheat genome increased the proportion of useful markers almost twofold. The majority (80%) of primer sets developed are genome-specific and detect only a single locus in one of the three genomes of bread wheat (A, B, or D). Only 20% of the markers detect more than one locus. A total of 279 loci amplified by 230 primer sets were placed onto a genetic framework map composed of RFLPs previously mapped in the reference population of the International Triticeae Mapping Initiative (ITMI) Opata 85 × W7984. Sixty-five microsatellites were mapped at a LOD >2.5, and 214 microsatellites were assigned to the most likely intervals. Ninety-three loci were mapped to the A genome, 115 to the B genome, and 71 to the D genome. The markers are randomly distributed along the linkage map, with clustering in several centromeric regions.


2004 ◽  
Vol 101 (7) ◽  
pp. 1916-1921 ◽  
Author(s):  
Sorin Istrail ◽  
Granger G. Sutton ◽  
Liliana Florea ◽  
Aaron L. Halpern ◽  
Clark M. Mobarry ◽  
...  

2022 ◽  
Vol 12 ◽  
Author(s):  
Qasim Raza ◽  
Awais Riaz ◽  
Rana Muhammad Atif ◽  
Babar Hussain ◽  
Iqrar Ahmad Rana ◽  
...  

MADS-box gene family members play multifarious roles in regulating the growth and development of crop plants and hold enormous promise for bolstering grain yield potential under changing global environments. Bread wheat (Triticum aestivum L.) is a key staple food crop around the globe. Until now, the available information concerning MADS-box genes in the wheat genome has been insufficient. Here, a comprehensive genome-wide analysis identified 300 high confidence MADS-box genes from the publicly available reference genome of wheat. Comparative phylogenetic analyses with Arabidopsis and rice MADS-box genes classified the wheat genes into 16 distinct subfamilies. Gene duplications were mainly identified in subfamilies containing unbalanced homeologs, pointing towards a potential mechanism for gene family expansion. Moreover, a more rapid evolution was inferred for M-type genes, as compared with MIKC-type genes, indicating their significance in understanding the evolutionary history of the wheat genome. We speculate that subfamily-specific distal telomeric duplications in unbalanced homeologs facilitate the rapid adaptation of wheat to changing environments. Furthermore, our in-silico expression data strongly suggest that MADS-box genes act as active guardians of plants against pathogen insurgency and harsh environmental conditions. In conclusion, we provide an entire complement of MADS-box genes identified in the wheat genome that could accelerate functional genomics efforts and possibly help bridge gaps in genotype-to-phenotype relationships through fine-tuning of agronomically important traits.

