Twelve Elements of Visualization and Analysis for Tertiary and Quaternary Structure of Biological Molecules

Mapping Intimacies ◽

10.1101/153528 ◽

2017 ◽

Cited By ~ 3

Author(s):

Philippe Youkharibache

Keyword(s):

Open Source ◽

Quaternary Structure ◽

Structural Information ◽

3D Visualization ◽

3D Structure ◽

Life Sciences ◽

Integrated Approach ◽

Sequence Information ◽

3D Graphics ◽

Molecular Graphics

During the last decades, 3D Molecular Graphics in Life Sciences has been used almost exclusively by experts through complex software and applications ranging from Structural Biology to Computer Aided Drug Design. The emergence of JavaScript and WebGL as a viable platform has enabled 3D visualization of biomolecular structures through Web browsers, without any need for specialized software. Although still in its infancy, Web Molecular Graphics opens new perspectives. This white paper proposes a set of Twelve Elements to consider to enable 3D visualization and structural analyses of biological systems in Web molecular viewers. The Elements go beyond 3D graphics and propose an integrated approach to visualize and analyze molecular entities and their interactions in multiple dimensions, at multiple levels of detail, for diverse users. The bridging of 1D sequence browsers and 3D structure viewers, possible under a Web browser, enables information flow where molecular biologists can use structural information directly at the sequence level. Given the tsunami of sequence information linked to diseases from next generation sequencing - in need of interpretation - making structural information readily available to research scientists is a tremendous opportunity for medical discovery. The Twelve Elements are conceptual and are intended to entice developers to architect software components and APIs, and to gather together as a community around common goals and open source software. A few features of emerging viewers, all available as open source, are highlighted. Speed and quality of 3D graphics for large molecular systems, the interoperability of Web components, and the instantaneous sharing of annotated visualizations through the Web are some of the most amazing and promising capabilities of 3D Web viewing, opening bright perspectives for Life Sciences research.

A Max-Margin Model for Predicting Residue—Base Contacts in Protein–RNA Interactions

Life ◽

10.3390/life11111135 ◽

2021 ◽

Vol 11 (11) ◽

pp. 1135

Author(s):

Shunya Kashiwagi ◽

Kengo Sato ◽

Yasubumi Sakakibara

Keyword(s):

Rna Binding ◽

Structural Information ◽

3D Structure ◽

Scoring Function ◽

Prediction Method ◽

Sequence Information ◽

Integer Programming Problem ◽

3D Structures ◽

Binding Residue ◽

Base Contact

Protein–RNA interactions (PRIs) are essential for many biological processes, so understanding aspects of the sequences and structures involved in PRIs is important for unraveling such processes. Because of the expensive and time-consuming techniques required for experimental determination of complex protein–RNA structures, various computational methods have been developed to predict PRIs. However, most of these methods focus on predicting only RNA-binding regions in proteins or only protein-binding motifs in RNA. Methods for predicting entire residue–base contacts in PRIs have not yet achieved sufficient accuracy. Furthermore, some of these methods require the identification of 3D structures or homologous sequences, which are not available for all protein and RNA sequences. Here, we propose a prediction method for predicting residue–base contacts between proteins and RNAs using only sequence information and structural information predicted from sequences. The method can be applied to any protein–RNA pair, even when rich information such as its 3D structure, is not available. In this method, residue–base contact prediction is formalized as an integer programming problem. We predict a residue–base contact map that maximizes a scoring function based on sequence-based features such as k-mers of sequences and the predicted secondary structure. The scoring function is trained using a max-margin framework from known PRIs with 3D structures. To verify our method, we conducted several computational experiments. The results suggest that our method, which is based on only sequence information, is comparable with RNA-binding residue prediction methods based on known binding data.
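The idea of choosing a contact map that maximizes a scoring function can be illustrated with a toy example. The sketch below is not the authors' formulation: the score matrix is made up (in the paper the scores come from trained sequence-based features), the constraints `max_per_residue` and `max_per_base` are hypothetical, and exhaustive enumeration stands in for an integer programming solver, which is only feasible for very small inputs.

```python
from itertools import product


def best_contact_map(scores, max_per_residue=1, max_per_base=1):
    """Return (best_score, contact_map) for a tiny residue-by-base score matrix.

    scores[i][j] is a made-up score for residue i contacting base j.
    Each binary variable x[i][j] says whether that contact is predicted.
    The per-residue and per-base limits are hypothetical constraints.
    """
    n_res, n_base = len(scores), len(scores[0])
    best_value = 0.0
    best_map = [[0] * n_base for _ in range(n_res)]
    # Enumerate every 0/1 assignment (exact, but exponential in matrix size).
    for bits in product((0, 1), repeat=n_res * n_base):
        x = [list(bits[i * n_base:(i + 1) * n_base]) for i in range(n_res)]
        if any(sum(row) > max_per_residue for row in x):
            continue
        if any(sum(x[i][j] for i in range(n_res)) > max_per_base
               for j in range(n_base)):
            continue
        value = sum(scores[i][j] * x[i][j]
                    for i in range(n_res) for j in range(n_base))
        if value > best_value:
            best_value, best_map = value, x
    return best_value, best_map


if __name__ == "__main__":
    toy_scores = [[2.0, -1.0, 0.5],
                  [1.5, 3.0, -2.0]]
    print(best_contact_map(toy_scores))
```

A real problem of this kind would be handed to an integer programming solver, since the number of candidate maps doubles with every residue-base pair.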

A max-margin model for predicting residue-base contacts in protein-RNA interactions

10.1101/022459 ◽

2015 ◽

Author(s):

Kengo Sato ◽

Shunya Kashiwagi ◽

Yasubumi Sakakibara

Keyword(s):

Rna Binding ◽

Structural Information ◽

3D Structure ◽

Scoring Function ◽

Prediction Method ◽

Sequence Information ◽

Integer Programming Problem ◽

3D Structures ◽

Binding Residue ◽

Base Contact

Motivation: Protein-RNA interactions (PRIs) are essential for many biological processes, so understanding aspects of the sequence and structure in PRIs is important for understanding those processes. Due to the expensive and time-consuming processes required for experimental determination of complex protein-RNA structures, various computational methods have been developed to predict PRIs. However, most of these methods focus on predicting only RNA-binding regions in proteins or only protein-binding motifs in RNA. Methods for predicting entire residue-base contacts in PRIs have not yet achieved sufficient accuracy. Furthermore, some of these methods require 3D structures or homologous sequences, which are not available for all protein and RNA sequences. Results: We propose a prediction method for residue-base contacts between proteins and RNAs using only sequence information and structural information predicted from only sequences. The method can be applied to any protein-RNA pair, even when rich information such as 3D structure is not available. Residue-base contact prediction is formalized as an integer programming problem. We predict a residue-base contact map that maximizes a scoring function based on sequence-based features such as k-mer of sequences and predicted secondary structure. The scoring function is trained by a max-margin framework from known PRIs with 3D structures. To verify our method, we conducted several computational experiments. The results suggest that our method, which is based on only sequence information, is comparable with RNA-binding residue prediction methods based on known binding data.

A Novel Prediction of Quaternary Structural Type of Proteins with Gene Ontology

Protein and Peptide Letters ◽

10.2174/0929866526666191014144618 ◽

2020 ◽

Vol 27 (4) ◽

pp. 313-320 ◽

Cited By ~ 1

Author(s):

Xuan Xiao ◽

Wei-Jie Chen ◽

Wang-Ren Qiu

Keyword(s):

Gene Ontology ◽

Feature Extraction ◽

Extraction Method ◽

Quaternary Structure ◽

Structural Type ◽

Sequence Information ◽

Prediction System ◽

Data Set ◽

Feature Extraction Method ◽

Prediction Rate

Background: Information on the quaternary structure attributes of proteins is very important because it is closely related to the biological functions of proteins. With the rapid development of new generation sequencing technology, we are facing a challenge: how to automatically identify the quaternary attributes of new polypeptide chains according to their sequence information (i.e., whether they form a monomer, a hetero-oligomer, or a homo-oligomer). Objective: In this article, our goal is to find a new way to represent protein sequences, thereby improving the prediction rate of protein quaternary structure. Methods: We developed a prediction system for protein quaternary structural type in which a protein sequence is expressed by combining Pfam functional-domain and gene ontology information. Protein features are turned into numerical sequences, and the prediction of quaternary structure is completed through specific machine learning and verification algorithms. Results: Our data set contains 5495 protein samples. With the method provided in this paper, we classify proteins as monomers, hetero-oligomers, or homo-oligomers, and the prediction rate is 74.38%, which is 3.24% higher than that of previous studies. Through this new feature extraction method, we can further classify the quaternary structure of proteins, and the results are also correspondingly improved. Conclusion: After applying the new prediction system, we have successfully improved the prediction rate compared with previous results. We have reason to believe that the feature extraction method in this paper has better practicability and can be used as a reference for other protein classification problems.
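One simple way to turn domain and ontology annotations into numbers is a presence/absence vector over a shared vocabulary. The sketch below assumes that encoding purely for illustration; the abstract does not specify the actual representation, and the protein names and annotation identifiers are placeholders, not real Pfam or GO entries.

```python
def build_vocabulary(proteins):
    """Collect every annotation term seen in any protein, in sorted order."""
    return sorted({term for terms in proteins.values() for term in terms})


def encode_annotations(annotations, vocabulary):
    """Return a 0/1 vector marking which vocabulary terms a protein carries."""
    present = set(annotations)
    return [1 if term in present else 0 for term in vocabulary]


if __name__ == "__main__":
    # Placeholder annotations; identifiers are hypothetical.
    proteins = {
        "protA": ["PFAM_1", "GO_2"],
        "protB": ["GO_2", "GO_3"],
    }
    vocab = build_vocabulary(proteins)
    for name, terms in proteins.items():
        print(name, encode_annotations(terms, vocab))
```

Vectors of this form can be passed to any standard classifier to separate monomers, hetero-oligomers, and homo-oligomers.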

An Interactive WebGIS Framework for Coastal Erosion Risk Management

Journal of Marine Science and Engineering ◽

10.3390/jmse9060567 ◽

2021 ◽

Vol 9 (6) ◽

pp. 567

Author(s):

Alessandra Capolupo ◽

Cristina Monterisi ◽

Alessandra Saponieri ◽

Fabio Addona ◽

Leonardo Damiani ◽

...

Keyword(s):

Open Source ◽

3D Visualization ◽

Erosion Risk ◽

Innovative Strategies ◽

Web Mapping ◽

Natural Processes ◽

Multi Scale ◽

Interactive Interface ◽

Multi Temporal ◽

User Friendly

The Italian coastline stretches over about 8350 km, with 3600 km of beaches, representing a significant resource for the country. Natural processes and anthropic interventions keep threatening its morphology, moulding its shape and triggering soil erosion phenomena. Thus, many scholars have been focusing their work on investigating and monitoring shoreline instability. Outcomes of such activities can be widely disseminated and shared with expert and non-expert users through Web mapping. This paper describes the performance of a WebGIS prototype designed to disseminate the results of the Italian project Innovative Strategies for the Monitoring and Analysis of Erosion Risk, known as the STIMARE project. While aiming to include the entire national coastline, three study areas along the regional coasts of Puglia and Emilia Romagna have already been implemented as pilot cases. This WebGIS was generated using Free and Open-Source Software for Geographic information systems (FOSS4G). The platform was designed by combining Apache HTTP Server and GeoServer as open-source servers and PostgreSQL (with the PostGIS extension) as the database. The pure JavaScript libraries OpenLayers and Cesium were used to obtain a hybrid 2D and 3D visualization. A user-friendly interactive interface was programmed to help users visualize and download geospatial data in several formats (pdf, kml and shp), in accordance with the European INSPIRE directives, satisfying both multi-temporal and multi-scale perspectives.

TomoSAR Mapping of 3D Forest Structure: Contributions of L-Band Configurations

Remote Sensing ◽

10.3390/rs13122255 ◽

2021 ◽

Vol 13 (12) ◽

pp. 2255

Author(s):

Matteo Pardini ◽

Victor Cazcarra-Bes ◽

Konstantinos Papathanassiou

Keyword(s):

Information Content ◽

Forest Structure ◽

Structural Information ◽

3D Structure ◽

Physical Structure ◽

Structure Mapping ◽

Structure Information ◽

Forest Sites ◽

L Band ◽

Structure Indices

Synthetic Aperture Radar (SAR) measurements are unique for mapping forest 3D structure and its changes in time. Tomographic SAR (TomoSAR) configurations exploit this potential by reconstructing the 3D radar reflectivity. The frequency of the SAR measurements is one of the main parameters determining the information content of the reconstructed reflectivity in terms of penetration and sensitivity to the individual vegetation elements. This paper attempts to review and characterize the structural information content of L-band TomoSAR reflectivity reconstructions and their potential for forest structure mapping. First, the challenges in accurately reconstructing the TomoSAR reflectivity of volume scatterers (which are expected to dominate at L-band) and in extracting physical structure information from the reconstructed reflectivity are addressed. Then, the L-band penetration capability is directly evaluated by means of the estimation performance of the sub-canopy ground topography. The information content of the reconstructed reflectivity is then evaluated in terms of complementary structure indices. Finally, the dependency of the TomoSAR reconstruction and of its structural information on both the TomoSAR acquisition geometry and the temporal change of the reflectivity that may occur in the time between the TomoSAR measurements in repeat-pass or bistatic configurations is evaluated. The analysis is supported by experimental results obtained by processing airborne acquisitions performed over temperate forest sites close to the city of Traunstein in the south of Germany.

A virtual alternative to molecular model sets: a beginners’ guide to constructing and visualizing molecules in open-source molecular graphics software

BMC Research Notes ◽

10.1186/s13104-021-05461-7 ◽

2021 ◽

Vol 14 (1) ◽

Author(s):

Siripreeya Phankingthongkum ◽

Taweetham Limpanuparb

Keyword(s):

General Chemistry ◽

Open Source ◽

Undergraduate Students ◽

Teaching And Learning ◽

Molecular Model ◽

Bond Line ◽

First Year ◽

Molecular Graphics ◽

Model Sets ◽

Graphics Software

Objective: The application of molecular graphics software as a simple and free alternative to molecular model sets for introductory-level chemistry learners is presented. Results: Based on either Avogadro or IQmol, we proposed four sets of tasks for students: building basic molecular geometries, visualizing orbitals and densities, predicting polarity of molecules and matching 3D structures with bond-line structures. These topics are typically covered in general chemistry for first-year undergraduate students. Detailed step-by-step procedures are provided for all tasks for both programs so that instructors and students can adopt one of the two programs in their teaching and learning as an alternative to molecular model sets.

Self-Supervised Representation Learning of Protein Tertiary Structures (PtsRep): Protein Engineering as A Case Study

10.1101/2020.12.22.423916 ◽

2020 ◽

Author(s):

Junwen Luo ◽

Yi Cai ◽

Jialin Wu ◽

Hongmin Cai ◽

Xiaofeng Yang ◽

...

Keyword(s):

Deep Learning ◽

Protein Engineering ◽

Structural Information ◽

Representation Learning ◽

Sequence Information ◽

Structural Representation ◽

Tertiary Structures ◽

Structural Space ◽

General Protein ◽

And Function

In recent years, deep learning has been increasingly used to decipher the relationships among protein sequence, structure, and function. Thus far deep learning of proteins has mostly utilized protein primary sequence information, while the vast amount of protein tertiary structural information remains unused. In this study, we devised a self-supervised representation learning framework to extract the fundamental features of unlabeled protein tertiary structures (PtsRep), and the embedded representations were transferred to two commonly recognized protein engineering tasks, protein stability and GFP fluorescence prediction. On both tasks, PtsRep significantly outperformed the two benchmark methods (UniRep and TAPE-BERT), which are based on protein primary sequences. Protein clustering analyses demonstrated that PtsRep can capture the structural signals in proteins. PtsRep reveals an avenue for general protein structural representation learning, and for exploring protein structural space for protein engineering and drug design.

Using Animation to Enhance 3D Visualization: A Strategy for a Production Environment

Microscopy and Microanalysis ◽

10.1017/s1431927600022388 ◽

1998 ◽

Vol 4 (S2) ◽

pp. 452-453

Author(s):

M. T. Dougherty ◽

W. Chiu

Keyword(s):

3D Visualization ◽

Three Dimensional ◽

Thought Experiments ◽

Visual Image ◽

Human Mind ◽

3D Graphics ◽

Production Environment ◽

Train Of Thought ◽

Advanced Computer ◽

Clear Presentation

The use of animation can significantly enhance the visualization of three-dimensional (3D) structures. It can present a focused train of thought, or it can be used to systematically scan through previously unfathomable quantities of data to examine for unknown features and consistencies. Establishing a modern animation facility requires a variety of technical, psychological and artistic skills, in addition to advanced computer graphics and related equipment. 3D graphics combined with animation has proven to be a very effective tool in storytelling. The dynamic visual image is uniquely suited for thought experiments, simulations and traversing vast quantities of data otherwise incomprehensible to the human mind. Sometimes animation is the only method that allows a clear presentation of complex empirical or theoretical information in many dimensions. Scientific visualization is by its nature an exploratory process, and animations are frequently refined and polished iteratively to enhance comprehensibility for the researchers and their peers.

Digital, Three-Dimensional Visualization of Root Systems in Peat

Soil Systems ◽

10.3390/soilsystems4010013 ◽

2020 ◽

Vol 4 (1) ◽

pp. 13 ◽

Cited By ~ 2

Author(s):

Stella Gribbe ◽

Gesche Blume-Werry ◽

John Couwenberg

Keyword(s):

Root System ◽

3D Visualization ◽

Large Diameter ◽

3D Structure ◽

Three Dimensional ◽

Plant Morphology ◽

Dimensional Structure ◽

Valuable Insight ◽

Similar Data ◽

Soil Matrix

Belowground plant structures are inherently difficult to observe in the field. Sedge peat that mainly consists of partly decayed roots and rhizomes offers a particularly challenging soil matrix to study (live) plant roots. To obtain information on belowground plant morphology, research commonly relies on rhizotrons, excavations, or computerized tomography scans (CT). However, all of these methods have certain limitations. For example, CT scans of peat cores cannot sharply distinguish between plant material and water, and rhizotrons do not provide a 3D structure of the root system. Here, we developed a low-cost approach for 3D visualization of the root system in peat monoliths. Two large diameter (20 cm) peat cores were extracted, frozen and two smaller peat monoliths (47 × 6.5 × 13 cm) were taken from each core. Slices of 0.5 mm or 1 mm were cut from one of the frozen monoliths, respectively, using a paper block cutter and the freshly cut surface of the monolith was photographed after each cut. A 3D model of the fresh (live) roots and rhizomes was reconstructed from the resulting images of the thinner slices based on computerized image analysis, including preprocessing, filtering, segmentation and 3D visualization using the open-source software Fiji, Drishti, and Ilastik. Digital volume measurements on the models produced similar data as manual washing out of roots from the adjacent peat monoliths. The constructed 3D models provide valuable insight into the three-dimensional structure of the root system in the peat matrix.
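A digital volume measurement on a stack of segmented slice images reduces to counting root pixels and multiplying by the volume of one voxel. The sketch below illustrates only that arithmetic and is not the Fiji, Drishti, or Ilastik workflow used in the study. The 0.5 mm slice thickness comes from the abstract, while the pixel size and the tiny binary masks are hypothetical values chosen for the example.

```python
def root_volume_mm3(masks, pixel_size_mm, slice_thickness_mm):
    """Estimate volume from a stack of binary masks (1 = root, 0 = peat).

    Each mask is a list of rows; every root pixel contributes one voxel of
    size pixel_size_mm x pixel_size_mm x slice_thickness_mm.
    """
    voxel_volume = pixel_size_mm * pixel_size_mm * slice_thickness_mm
    root_pixels = sum(sum(row) for mask in masks for row in mask)
    return root_pixels * voxel_volume


if __name__ == "__main__":
    # Two toy 2x2 slices; a real photograph would have millions of pixels.
    toy_masks = [
        [[0, 1],
         [1, 1]],
        [[0, 0],
         [1, 0]],
    ]
    # pixel_size_mm=0.1 is an assumed image resolution, not from the paper.
    print(root_volume_mm3(toy_masks, pixel_size_mm=0.1, slice_thickness_mm=0.5))
```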

4172 Introduction to R Programming and GitHub: Developing Automated Analysis of Complete Blood Count Data as a Translational Science Undergraduate Project

Journal of Clinical and Translational Science ◽

10.1017/cts.2020.215 ◽

2020 ◽

Vol 4 (s1) ◽

pp. 63-63

Author(s):

Jeffrey Robinson ◽

Annica Wayman

Keyword(s):

Software Development ◽

Open Source ◽

Complete Blood Count ◽

Life Sciences ◽

Statistical Testing ◽

Translational Science ◽

Development Environment ◽

Individual Parameter ◽

R Language ◽

R Programming

OBJECTIVES/GOALS: Introduce students to programming and software development practices in the life sciences by analyzing standard clinical diagnostic bloodwork for differential immune responses, through lectures and a semester project, with the goal of enhancing undergraduate students’ education to prepare them for careers in translational science. METHODS/STUDY POPULATION: The educational content was taught for the first time as a component of the newly developed course BTEC 330 “Software Applications in the Life Sciences” in UMBC’s Translational Life Science Technology (TLST) Bachelor’s degree program at the Universities at Shady Grove campus. Eleven students took the course. All were beginners with no programming background. Lectures provided background on the diagnostic components of the complete blood count (CBC), criteria for differential diagnosis in the clinical setting, and an introduction to hematology and flow cytometry, forming underpinnings for interpretation of the CBC results. Weekly computer lab practical sessions provided training in the fundamentals of the R programming language, the RStudio integrated development environment (IDE), and the GitHub.com open-source software development platform. RESULTS/ANTICIPATED RESULTS: The graded assignment consisted of a coding project in which students were each assigned an individual parameter from the CBC results. These include, for example, relative lymphocyte count or hemoglobin readouts. Students each created their own R-language script using RStudio, with functional code which: 1) Read in data from a file provided, 2) Performed statistical testing, 3) Read out statistical results as text, and charts as image files, 4) “Diagnosed” individuals in the dataset as being inside or outside the clinical normal range for that parameter. Each student also registered their own GitHub account and published their open-source code. Grading was performed on code functionality by downloading each student repository and running the code with the instructor as an outside developer using the resource. DISCUSSION/SIGNIFICANCE OF IMPACT: In this curriculum, students with no background in programming learned to code a basic R-language script and use GitHub to automate interpretation of CBC results. With advanced automation now becoming commonplace in translational science, such course content can provide an introductory level of literacy in development of clinical informatics software.
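The read, summarize, and flag steps of such a script can be sketched in a few lines. The students worked in R; the version below uses Python only to illustrate the logic and is not the course material. The inline data, the column names, and the `LOW` and `HIGH` bounds are placeholders for illustration (not clinical reference values), and a simple mean stands in for the statistical testing step.

```python
import csv
import io
import statistics

# Placeholder data standing in for the file provided to students.
SAMPLE_CSV = """id,hemoglobin
p1,13.5
p2,10.2
p3,18.1
"""

# Hypothetical bounds for the example only; not a clinical reference range.
LOW, HIGH = 12.0, 17.5


def flag_values(csv_text, column, low, high):
    """Read one parameter, summarize it, and flag each individual."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    values = [float(row[column]) for row in rows]
    flags = {
        row["id"]: "inside" if low <= float(row[column]) <= high else "outside"
        for row in rows
    }
    return statistics.mean(values), flags


if __name__ == "__main__":
    mean_value, flags = flag_values(SAMPLE_CSV, "hemoglobin", LOW, HIGH)
    print(f"mean hemoglobin: {mean_value:.2f}")
    for person, status in flags.items():
        print(person, status)
```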
