Can Bibliographic Pointers for Known Biological Data Be Found Automatically? Protein Interactions as a Case Study

2001 ◽  
Vol 2 (4) ◽  
pp. 196-206 ◽  
Author(s):  
Christian Blaschke ◽  
Alfonso Valencia

The Dictionary of Interacting Proteins (DIP) (Xenarios et al., 2000) is a large repository of protein interactions: its March 2000 release included 2379 protein pairs whose interactions have been detected by experimental methods. Even if many of these correspond to poorly characterized proteins, the result of massive yeast two-hybrid screenings, as many as 851 correspond to interactions detected using direct biochemical methods. We used information retrieval technology to search automatically for sentences in Medline abstracts that support these 851 DIP interactions. Surprisingly, we found correspondence between DIP protein pairs and Medline sentences describing their interactions in only 30% of the cases. This low coverage has interesting consequences regarding the quality of annotations (references) introduced in the database and the limitations of the application of information extraction (IE) technology to Molecular Biology. It is clear that the limitation of analyzing abstracts rather than full papers and the lack of standard protein names are difficulties of considerably more importance than the limitations of the IE methodology employed. A positive finding is the capacity of the IE system to identify new relations between proteins, even in a set of proteins previously characterized by human experts. These identifications are made with a considerable degree of precision. This is, to our knowledge, the first large-scale assessment of IE capacity to detect previously known interactions: we thus propose the use of the DIP data set as a biological reference to benchmark IE systems.

Land ◽  
2021 ◽  
Vol 10 (3) ◽  
pp. 295
Author(s):  
Yuan Gao ◽  
Anyu Zhang ◽  
Yaojie Yue ◽  
Jing’ai Wang ◽  
Peng Su

Suitable land is an important prerequisite for crop cultivation and, given the prospect of climate change, it is essential to assess such suitability to minimize crop production risks and to ensure food security. Although a variety of methods to assess the suitability are available, a comprehensive, objective, and large-scale screening of environmental variables that influence the results—and therefore their accuracy—of these methods has rarely been explored. An approach to the selection of such variables is proposed and the criteria established for large-scale assessment of land, based on big data, for its suitability to maize (Zea mays L.) cultivation as a case study. The predicted suitability matched the past distribution of maize with an overall accuracy of 79% and a Kappa coefficient of 0.72. The land suitability for maize is likely to decrease markedly at low latitudes and even at mid latitudes. The total area suitable for maize globally and in most major maize-producing countries will decrease, the decrease being particularly steep in those regions optimally suited for maize at present. Compared with earlier research, the method proposed in the present paper is simple yet objective, comprehensive, and reliable for large-scale assessment. The findings of the study highlight the necessity of adopting relevant strategies to cope with the adverse impacts of climate change.
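For readers unfamiliar with the two agreement measures quoted above, the following is a minimal sketch of how overall accuracy and Cohen's Kappa coefficient are computed from a confusion matrix. The function name and the two-class matrix are illustrative assumptions; they are not data or code from the study.

```python
def accuracy_and_kappa(matrix):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    matrix[i][j] is the number of samples whose reference class is i
    and whose predicted class is j.
    """
    total = sum(sum(row) for row in matrix)
    if total == 0:
        raise ValueError("confusion matrix is empty")
    n = len(matrix)
    # Observed agreement: share of samples on the diagonal.
    observed = sum(matrix[i][i] for i in range(n)) / total
    # Chance agreement: product of row and column marginals per class.
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical suitable/unsuitable example; the counts are made up.
example = [[40, 10],
           [5, 45]]
acc, kappa = accuracy_and_kappa(example)
```

Kappa is lower than raw accuracy because it discounts the agreement that would be expected by chance alone.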


2021 ◽  
Vol 10 (9) ◽  
pp. 144-147
Author(s):  
Huiling LI ◽  
Xuan SU ◽  
Shuaipeng ZHANG

Massive amounts of business process event logs are collected and stored by modern information systems. Model discovery aims to discover a process model from such event logs; however, most of the existing approaches still suffer from low efficiency when facing large-scale event logs. Event log sampling techniques provide an effective scheme to improve the efficiency of process discovery, but the existing techniques still cannot guarantee the quality of model mining. Therefore, a sampling approach based on a set coverage algorithm, named the set coverage sampling approach, is proposed. The proposed sampling approach has been implemented in the open-source process mining toolkit ProM. Furthermore, experiments using a real event log data set from conformance checking and time performance analysis show that the proposed event log sampling approach can greatly improve the efficiency of log sampling on the premise of ensuring the quality of model mining.
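The abstract does not say which behaviour the sample has to cover, so the following is only a sketch of the general idea, not the authors' algorithm or the ProM plug-in. It assumes that the items to cover are directly-follows pairs of activities and uses the classic greedy set cover heuristic; all names are hypothetical.

```python
def directly_follows(trace):
    """Return the set of (a, b) pairs where activity b directly follows a."""
    return set(zip(trace, trace[1:]))


def set_cover_sample(log):
    """Greedily pick traces until every directly-follows pair is covered.

    log: list of traces, each a tuple of activity names.
    Returns the selected trace variants in the order they were chosen.
    """
    variants = list(dict.fromkeys(log))  # unique traces, first-seen order
    behaviour = {v: directly_follows(v) for v in variants}
    uncovered = set().union(*behaviour.values())
    sample = []
    while uncovered:
        # Assumed selection rule: take the variant covering the most
        # still-uncovered pairs (ties go to the earliest variant).
        best = max(variants, key=lambda v: len(behaviour[v] & uncovered))
        sample.append(best)
        uncovered -= behaviour[best]
    return sample


# Made-up event log with four traces and three distinct variants.
log = [
    ("a", "b", "c", "d"),
    ("a", "c", "b", "d"),
    ("a", "b", "c", "d"),
    ("a", "b", "d"),
]
sample = set_cover_sample(log)
```

In this toy log the third variant adds no new directly-follows pair, so the sample contains two traces instead of four. Whether such a reduction preserves model quality on real logs is exactly what the experiments in the paper evaluate.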


2018 ◽  
Vol 64 (247) ◽  
pp. 811-821 ◽  
Author(s):  
STEFAN LIPPL ◽  
SAURABH VIJAY ◽  
MATTHIAS BRAUN

Despite their importance for mass-balance estimates and the progress in techniques based on optical and thermal satellite imagery, the mapping of debris-covered glacier boundaries remains a challenging task. Manual corrections hamper regular updates. In this study, we present an automatic approach to delineate glacier outlines using interferometrically derived synthetic aperture radar (InSAR) coherence, slope and morphological operations. InSAR coherence detects the temporally decorrelated surface (e.g. glacial extent) irrespective of its surface type and separates it from the highly coherent surrounding areas. We tested the impact of different processing settings, for example resolution, coherence window size and topographic phase removal, on the quality of the generated outlines. We found minor influence of the topographic phase, but a combination of strong multi-looking during interferogram generation and additional averaging during coherence estimation strongly deteriorated the coherence at the glacier edges. We analysed the performance of X-, C- and L-band radar data. The C-band Sentinel-1 data outlined the glacier boundary with the least misclassifications and a type II error of 0.47% compared with Global Land Ice Measurements from Space inventory data. Our study shows the potential of the Sentinel-1 mission together with our automatic processing chain to provide regular updates for land-terminating glaciers on a large scale.
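As background to the coherence measure mentioned above, here is a minimal sketch of the standard sample coherence estimator for one window of co-registered complex SAR samples. It illustrates only the coherence step, not the authors' processing chain (slope, morphological operations and thresholds are not shown), and the function name and sample values are assumptions.

```python
import cmath
import math


def coherence(first, second):
    """Estimate coherence magnitude (0..1) over one estimation window.

    first, second: equal-length lists of complex samples taken from the
    same window in two co-registered acquisitions.
    """
    if len(first) != len(second) or not first:
        raise ValueError("windows must be non-empty and of equal length")
    # Complex cross-correlation summed over the window.
    cross = sum(a * b.conjugate() for a, b in zip(first, second))
    # Product of the two windows' total powers, used for normalisation.
    power = sum(abs(a) ** 2 for a in first) * sum(abs(b) ** 2 for b in second)
    if power == 0:
        return 0.0
    return abs(cross) / math.sqrt(power)


# Made-up stable surface: second pass differs only by a constant phase.
stable_first = [1 + 1j, 2 - 1j, -1 + 0.5j, 0.5 + 2j]
stable_second = [s * cmath.exp(0.3j) for s in stable_first]
stable_value = coherence(stable_first, stable_second)

# Made-up changed surface: phase differences are spread around the circle.
changed_first = [1 + 0j, 1 + 0j, 1 + 0j, 1 + 0j]
changed_second = [1 + 0j, 1j, -1 + 0j, -1j]
changed_value = coherence(changed_first, changed_second)
```

A constant phase offset leaves the coherence near 1, whereas phase differences that vary across the window cancel in the sum and drive it towards 0. This is the contrast between stable terrain and a moving or melting glacier surface that the abstract relies on.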


Author(s):  
Sajad Badalkhani ◽  
Ramazan Havangi ◽  
Mohsen Farshad

There is an extensive literature regarding multi-robot simultaneous localization and mapping (MRSLAM). In most of the research, the environment is assumed to be static, while the dynamic parts of the environment degrade the estimation quality of SLAM algorithms and lead to inherently fragile systems. To enhance the performance and robustness of SLAM in dynamic environments (SLAMIDE), a novel cooperative approach named parallel-map (p-map) SLAM is introduced in this paper. The objective of the proposed method is to deal with the dynamics of the environment by detecting dynamic parts and preventing their inclusion in SLAM estimations. In this approach, each robot builds a limited map in its own vicinity, while the global map is built through a hybrid centralized MRSLAM. The restricted size of the local maps bounds the computational complexity and resources needed to handle a large-scale dynamic environment. Using a probabilistic index, the proposed method differentiates between stationary and moving landmarks, based on their relative positions with other parts of the environment. Stationary landmarks are then used to refine a consistent map. The proposed method is evaluated with different levels of dynamism and, for each level, the performance is measured in terms of accuracy, robustness, and the hardware resources needed for implementation. The method is also evaluated with a publicly available real-world data set. Experimental validation, along with simulations, indicates that the proposed method is able to perform consistent SLAM in a dynamic environment, suggesting its feasibility for MRSLAM applications.


Author(s):  
Ruey-Shiang Shaw ◽  
Sheng-Pao Shih ◽  
Ta-Yu Fu ◽  
Chia-Wen Tsai

The software industry faces drastic changes in technology and business operations. The research structure of this study is based on the business model for software industries proposed by Rajala in 2003. The researcher employed an ex post facto research design to conduct a case study of the Galaxy Software Service Co., a company that is representative of the software industry in Taiwan. The main research goal of this study is to explore how this particular company developed into a large software company in the Taiwanese software sector, which is characterized by a prevalence of small- and medium-sized businesses, over a period of 25 years. This study employs a case study design and relies on in-depth participation and interviews to acquire a complete data set of the company’s internal operations. The evolution of the business model from the company’s inception until the present day has been divided into four phases: the entrepreneur phase, the growth phase, the stable phase, and the innovative breakthrough phase. The company developed into a major player in the software industry for three reasons: it has always insisted on a product differentiation strategy based on the sole reliance on software products; it started out as a software products dealer and gradually developed its own research and development capability; and it built a large-scale project management capability and received CMMI certification. These factors make the company stand out from other System Integrated businesses in the Taiwanese software sector offering both hardware and software products.


2019 ◽  
Vol 44 (3) ◽  
pp. 472-498
Author(s):  
Huy Quan Vu ◽  
Jian Ming Luo ◽  
Gang Li ◽  
Rob Law

Understanding the differences and similarities in the activities of tourists from various cultures is important for tourism managers to develop appropriate plans and strategies that could support urban tourism marketing and management. However, tourism managers still face challenges in obtaining such understanding because the traditional approach of data collection, which relies on surveys and questionnaires, is incapable of capturing tourist activities at a large scale. In this article, we present a method for the study of tourist activities based on a new type of data, venue check-ins. The effectiveness of the presented approach is demonstrated through a case study of a major tourism country, France. Analysis based on a large-scale data set from 19 tourism cities in France reveals interesting differences and similarities in the activities of tourists from 14 markets (countries). Valuable insights are provided for various urban tourism applications.


2007 ◽  
Vol 2 (1) ◽  
pp. 41-55 ◽  
Author(s):  
Keshar J. Baral

Using the data set published by joint venture banks in their annual reports, and by NRB in its supervision annual reports, this paper examines the financial health of joint venture banks in the CAMEL framework. The health check-up, conducted on the basis of publicly available financial data, concludes that the health of joint venture banks is better than that of the other commercial banks. In addition, the perusal of indicators of different components of CAMEL indicates that the financial health of joint venture banks is only fair and is not strong enough to withstand possible large-scale shocks to their balance sheets. Journal of Nepalese Business Studies Vol. 2(1) 2005 pp. 41-55


2018 ◽  
Author(s):  
J Budis ◽  
J Gazdarica ◽  
J Radvanszky ◽  
M Harsanyova ◽  
I Gazdaricova ◽  
...  

Low-coverage massively parallel genome sequencing for non-invasive prenatal testing (NIPT) of common aneuploidies is one of the most rapidly adopted and relatively low-cost DNA tests. Since aggregation of reads from a large number of samples allows overcoming the problems of extremely low coverage of individual samples, we describe the possible re-use of the data generated during NIPT testing for genome-scale, population-specific frequency determination of small DNA variants, requiring no additional costs except those for the NIPT test itself. We applied our method to a data set comprising 1,548 original NIPT test results and evaluated the findings on different levels, from in silico population frequency comparisons up to wet lab validation analyses using a gold-standard method. The revealed high reliability of variant calling and allelic frequency determinations suggests that these NIPT data could serve as valuable alternatives to large-scale population studies even for smaller countries around the world.
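The aggregation idea can be illustrated with a small sketch. It assumes that, for one genomic site, each sample contributes a count of reads supporting the reference allele and a count supporting the alternate allele, and it simply pools those counts. The abstract does not describe the actual variant-calling pipeline, so the function name, input layout and counts are hypothetical.

```python
def pooled_allele_frequency(counts):
    """Estimate the alternate-allele frequency at one site from pooled reads.

    counts: iterable of (ref_reads, alt_reads) pairs, one pair per sample.
    Returns the pooled frequency, or None when no sample covers the site.
    """
    counts = list(counts)
    ref_total = sum(ref for ref, _ in counts)
    alt_total = sum(alt for _, alt in counts)
    depth = ref_total + alt_total
    if depth == 0:
        return None
    return alt_total / depth


# Made-up site: each sample has at most one read, far too few to call a
# genotype individually, yet the pool still yields a frequency estimate.
site_counts = [(1, 0), (0, 0), (0, 1), (1, 0), (1, 0)]
frequency = pooled_allele_frequency(site_counts)
```

With thousands of samples the pooled depth per site grows large enough for the estimate to stabilise, which is the property the abstract exploits.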


2016 ◽  
Vol 68 (3) ◽  
pp. 509-513 ◽  
Author(s):  
Chunhe Yang ◽  
Hongwu Du

The hierarchical clustering method has been used for exploration of gene expression and proteomic profiles; however, little research into its application in the examination of expression of multiple cytokine/chemokine responses to stimuli has been reported. Thus, little progress has been made on how phytohemagglutinin (PHA) affects cytokine expression profiling on a large scale in the human hematological system. To investigate the characteristic expression pattern under PHA stimulation, Luminex, a multiplex bead-based suspension array, was performed. The data set collected from human peripheral blood mononuclear cells (PBMC) was analyzed using the hierarchical clustering method. It was revealed that two specific chemokines (CCL3 and CCL4) underwent significantly greater quantitative changes during induction of expression than other tested cytokines/chemokines after PHA stimulation. This result indicates that hierarchical clustering is a useful tool for detecting fine patterns during exploration of biological data, and that it can play an important role in comparative studies.

