Computational Tools for Brassica–Arabidopsis Comparative Genomics
Recent advances, such as the availability of extensive genome survey sequence (GSS) data and draft physical maps, are radically transforming the means by which we can dissect Brassica genome structure and systematically relate it to the Arabidopsis model. Hitherto, our view of the co-linearities between these closely related genomes had been largely inferred from comparative RFLP data, necessitating substantial interpolation and expert interpretation. Sequencing of the Brassica rapa genome by the Multinational Brassica Genome Project will, however, enable an entirely computational approach to this problem. Meanwhile, we have been developing databases and bioinformatics tools to support our work in Brassica comparative genomics, including a recently completed draft physical map of B. rapa integrated with anchor probes derived from the Arabidopsis genome sequence. We are also exploring new ways to display the emerging Brassica–Arabidopsis sequence homology data. We have mapped all publicly available Brassica sequences in silico to the Arabidopsis TIGR v5 genome sequence and published the results in the ATIDB database, which uses the Generic Genome Browser (GBrowse). This in silico approach potentially identifies all paralogous sequences, so we colour-code the significance of the mappings and offer an integrated, real-time multiple alignment tool to partition them into paralogous groups. The MySQL database driving GBrowse can also be interrogated directly, using the powerful API offered by the Perl Bio::DB::GFF methods, facilitating a wide range of data-mining possibilities.
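As an illustration of the colour-coding idea only, the following minimal Python sketch assigns a display colour to each in silico mapping according to its significance. The `Hit` record, the E-value cut-offs and the colour names are all assumptions made for this example; the abstract does not state which significance measure or thresholds are actually used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One Brassica-to-Arabidopsis mapping (hypothetical record layout)."""
    query_id: str   # Brassica sequence identifier
    chrom: str      # Arabidopsis chromosome name
    start: int      # start coordinate on the Arabidopsis sequence
    end: int        # end coordinate on the Arabidopsis sequence
    evalue: float   # significance of the match (assumed to be an E-value)


# Assumed cut-offs, ordered from most to least significant.
COLOUR_BINS = [(1e-50, "red"), (1e-20, "orange"), (1e-5, "yellow")]
DEFAULT_COLOUR = "grey"


def colour_for(evalue: float) -> str:
    """Return the colour of the first bin whose threshold the E-value meets."""
    for threshold, colour in COLOUR_BINS:
        if evalue <= threshold:
            return colour
    return DEFAULT_COLOUR


def colour_hits(hits):
    """Map each hit to its display colour, keyed by query and position."""
    return {(h.query_id, h.chrom, h.start): colour_for(h.evalue) for h in hits}
```

A sequence with several hits at different Arabidopsis positions would then appear in several colours, which gives a first visual cue as to which mappings are strong candidates and which are weaker, possibly paralogous, matches.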