Can Bibliographic Pointers for Known Biological Data Be Found Automatically? Protein Interactions as a Case Study

Christian Blaschke; Alfonso Valencia

doi:10.1002/cfg.91

Comparative and Functional Genomics ◽

10.1002/cfg.91 ◽

2001 ◽

Vol 2 (4) ◽

pp. 196-206 ◽

Cited By ~ 33

Author(s):

Christian Blaschke ◽

Alfonso Valencia

Keyword(s):

Protein Interactions ◽

Large Scale ◽

Biological Data ◽

Positive Finding ◽

Data Set ◽

Large Scale Assessment ◽

Low Coverage ◽

Standard Protein

The Dictionary of Interacting Proteins (DIP) (Xenarios et al., 2000) is a large repository of protein interactions: its March 2000 release included 2379 protein pairs whose interactions have been detected by experimental methods. Even if many of these correspond to poorly characterized proteins, the result of massive yeast two-hybrid screenings, as many as 851 correspond to interactions detected using direct biochemical methods. We used information retrieval technology to search automatically for sentences in Medline abstracts that support these 851 DIP interactions. Surprisingly, we found correspondence between DIP protein pairs and Medline sentences describing their interactions in only 30% of the cases. This low coverage has interesting consequences regarding the quality of annotations (references) introduced in the database and the limitations of the application of information extraction (IE) technology to Molecular Biology. It is clear that the limitation of analyzing abstracts rather than full papers and the lack of standard protein names are difficulties of considerably more importance than the limitations of the IE methodology employed. A positive finding is the capacity of the IE system to identify new relations between proteins, even in a set of proteins previously characterized by human experts. These identifications are made with a considerable degree of precision. This is, to our knowledge, the first large scale assessment of IE capacity to detect previously known interactions: we thus propose the use of the DIP data set as a biological reference to benchmark IE systems.
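The coverage figure in this abstract (the share of known protein pairs for which a supporting sentence is found) can be illustrated with a minimal sketch. It is not the authors' IE system: it only checks whether both protein names co-occur in one sentence, and the function names, the naive sentence splitter, and the optional `synonyms` table are assumptions introduced for illustration.

```python
import re

def split_sentences(abstract):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", abstract) if s.strip()]

def mentions(sentence, names):
    """True if any of the given names appears as a whole word (case-insensitive)."""
    return any(
        re.search(r"\b" + re.escape(name) + r"\b", sentence, re.IGNORECASE)
        for name in names
    )

def supporting_sentences(pair, abstracts, synonyms=None):
    """Return sentences that mention both proteins of the pair.

    `synonyms` is a hypothetical dict mapping a protein name to alternative names.
    """
    synonyms = synonyms or {}
    names_a = {pair[0], *synonyms.get(pair[0], ())}
    names_b = {pair[1], *synonyms.get(pair[1], ())}
    hits = []
    for abstract in abstracts:
        for sentence in split_sentences(abstract):
            if mentions(sentence, names_a) and mentions(sentence, names_b):
                hits.append(sentence)
    return hits

def coverage(pairs, abstracts, synonyms=None):
    """Fraction of pairs with at least one co-occurrence sentence."""
    if not pairs:
        return 0.0
    found = sum(1 for p in pairs if supporting_sentences(p, abstracts, synonyms))
    return found / len(pairs)
```

Co-occurrence alone overestimates support, since a sentence can name two proteins without describing an interaction. The synonym table hints at why the lack of standard protein names lowers coverage in practice.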

Predicting Shifts in Land Suitability for Maize Cultivation Worldwide Due to Climate Change: A Modeling Approach

Land ◽

10.3390/land10030295 ◽

2021 ◽

Vol 10 (3) ◽

pp. 295

Author(s):

Yuan Gao ◽

Anyu Zhang ◽

Yaojie Yue ◽

Jing’ai Wang ◽

Peng Su

Keyword(s):

Climate Change ◽

Crop Production ◽

Large Scale ◽

Land Suitability ◽

Large Scale Assessment ◽

Scale Assessment ◽

Maize Cultivation ◽

Large Scale Screening ◽

Selection Of

Suitable land is an important prerequisite for crop cultivation and, given the prospect of climate change, it is essential to assess such suitability to minimize crop production risks and to ensure food security. Although a variety of methods to assess the suitability are available, a comprehensive, objective, and large-scale screening of environmental variables that influence the results—and therefore their accuracy—of these methods has rarely been explored. An approach to the selection of such variables is proposed and the criteria established for large-scale assessment of land, based on big data, for its suitability to maize (Zea mays L.) cultivation as a case study. The predicted suitability matched the past distribution of maize with an overall accuracy of 79% and a Kappa coefficient of 0.72. The land suitability for maize is likely to decrease markedly at low latitudes and even at mid latitudes. The total area suitable for maize globally and in most major maize-producing countries will decrease, the decrease being particularly steep in those regions optimally suited for maize at present. Compared with earlier research, the method proposed in the present paper is simple yet objective, comprehensive, and reliable for large-scale assessment. The findings of the study highlight the necessity of adopting relevant strategies to cope with the adverse impacts of climate change.
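The two agreement measures quoted in this abstract, overall accuracy and the Kappa coefficient, are both derived from a confusion matrix of predicted versus observed classes. The sketch below shows the standard Cohen's kappa calculation; the function name and the example matrix are illustrative assumptions and are not taken from the study.

```python
def accuracy_and_kappa(confusion):
    """Compute overall accuracy and Cohen's kappa from a square confusion matrix.

    confusion[i][j] = number of cases with observed class i and predicted class j.
    """
    size = len(confusion)
    total = sum(sum(row) for row in confusion)
    # Observed agreement: share of cases on the diagonal.
    observed = sum(confusion[i][i] for i in range(size)) / total
    # Chance agreement: product of row and column marginals, summed over classes.
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion)
        for i in range(size)
    ) / (total * total)
    if expected == 1.0:
        # Degenerate case: every case falls in a single class.
        return observed, 0.0
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa
```

Kappa is lower than accuracy because it discounts the agreement expected by chance, which is why the two reported values differ.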

https://ijsea.com/archive/volume10/issue9/IJSEA10091005.pdf

International Journal of Science and Engineering Applications ◽

10.7753/ijsea1009.1006 ◽

2021 ◽

Vol 10 (9) ◽

pp. 144-147

Author(s):

Huiling LI ◽

Xuan SU ◽

Shuaipeng ZHANG

Keyword(s):

Process Model ◽

Large Scale ◽

Process Mining ◽

Data Set ◽

Systems Model ◽

Event Logs ◽

Event Log ◽

Low Efficiency ◽

Sampling Approach

Massive amounts of business process event logs are collected and stored by modern information systems. Model discovery aims to discover a process model from such event logs; however, most existing approaches still suffer from low efficiency when facing large-scale event logs. Event log sampling techniques provide an effective scheme to improve the efficiency of process discovery, but the existing techniques still cannot guarantee the quality of model mining. Therefore, a sampling approach based on a set coverage algorithm, named the set coverage sampling approach, is proposed. The proposed sampling approach has been implemented in the open-source process mining toolkit ProM. Furthermore, experiments on a real event log data set, evaluated through conformance checking and time performance analysis, show that the proposed event log sampling approach can greatly improve the efficiency of log sampling while preserving the quality of model mining.
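The abstract does not say what the sampled traces must cover, so the following is only a sketch of one plausible reading: a greedy set cover in which each trace covers the directly-follows pairs of activities it contains, and traces are picked until every pair in the log is covered. The choice of directly-follows pairs as the coverage target and all function names are assumptions, not the ProM implementation.

```python
def directly_follows(trace):
    """Set of consecutive activity pairs in a trace, e.g. (a, b) for 'a then b'."""
    return set(zip(trace, trace[1:]))

def set_cover_sample(log):
    """Greedily pick trace variants until all directly-follows pairs are covered.

    `log` is a list of traces; each trace is a tuple of activity names.
    """
    uncovered = set()
    for trace in log:
        uncovered |= directly_follows(trace)
    # Work on distinct variants, keeping first-seen order for deterministic ties.
    remaining = list(dict.fromkeys(log))
    sample = []
    while uncovered:
        # Pick the variant that covers the most still-uncovered pairs.
        best = max(remaining, key=lambda t: len(directly_follows(t) & uncovered))
        sample.append(best)
        uncovered -= directly_follows(best)
        remaining.remove(best)
    return sample
```

Greedy set cover does not guarantee the smallest possible sample, but it is simple and fast. A discovery algorithm that relies only on directly-follows relations would see the same relations in the sample as in the full log.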

Ensuring Quality of Large Scale Industrial Process Collections: Experiences from a Case Study

Lecture Notes in Business Information Processing - The Practice of Enterprise Modeling ◽

10.1007/978-3-662-45501-2_2 ◽

2014 ◽

pp. 11-25 ◽

Cited By ~ 8

Author(s):

Merethe Heggset ◽

John Krogstie ◽

Harald Wesenberg

Keyword(s):

Large Scale ◽

Industrial Process

Automatic delineation of debris-covered glaciers using InSAR coherence derived from X-, C- and L-band radar data: a case study of Yazgyl Glacier

Journal of Glaciology ◽

10.1017/jog.2018.70 ◽

2018 ◽

Vol 64 (247) ◽

pp. 811-821 ◽

Cited By ~ 8

Author(s):

STEFAN LIPPL ◽

SAURABH VIJAY ◽

MATTHIAS BRAUN

Keyword(s):

Large Scale ◽

Window Size ◽

Radar Data ◽

Morphological Operations ◽

L Band ◽

Surrounding Areas ◽

Coherence Estimation ◽

The Impact

Despite their importance for mass-balance estimates and the progress in techniques based on optical and thermal satellite imagery, the mapping of debris-covered glacier boundaries remains a challenging task. Manual corrections hamper regular updates. In this study, we present an automatic approach to delineate glacier outlines using interferometrically derived synthetic aperture radar (InSAR) coherence, slope and morphological operations. InSAR coherence detects the temporally decorrelated surface (e.g. glacial extent) irrespective of its surface type and separates it from the highly coherent surrounding areas. We tested the impact of different processing settings, for example resolution, coherence window size and topographic phase removal, on the quality of the generated outlines. We found minor influence of the topographic phase, but a combination of strong multi-looking during interferogram generation and additional averaging during coherence estimation strongly deteriorated the coherence at the glacier edges. We analysed the performance of X-, C- and L-band radar data. The C-band Sentinel-1 data outlined the glacier boundary with the least misclassifications and a type II error of 0.47% compared with Global Land Ice Measurements from Space inventory data. Our study shows the potential of the Sentinel-1 mission together with our automatic processing chain to provide regular updates for land-terminating glaciers on a large scale.
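Coherence estimation over a window, as mentioned in this abstract, is commonly computed as the magnitude of the normalized complex cross-correlation between two co-registered SAR acquisitions. The sketch below applies that textbook estimator to a single window of complex samples; the function name and inputs are illustrative assumptions, and the study's actual processing chain is not described here.

```python
import math

def coherence(window_a, window_b):
    """Coherence magnitude for one estimation window.

    window_a, window_b: equal-length sequences of complex pixel values from
    two co-registered acquisitions. Returns a value between 0 and 1.
    """
    # Complex cross-correlation summed over the window.
    cross = sum(a * b.conjugate() for a, b in zip(window_a, window_b))
    # Signal power of each acquisition over the same window.
    power_a = sum(a.real ** 2 + a.imag ** 2 for a in window_a)
    power_b = sum(b.real ** 2 + b.imag ** 2 for b in window_b)
    if power_a == 0 or power_b == 0:
        return 0.0
    return abs(cross) / math.sqrt(power_a * power_b)
```

A surface that changes between acquisitions, such as moving ice, gives values near 0, while stable terrain gives values near 1. A larger window averages across the glacier edge, which is consistent with the reported loss of contrast at the boundaries when averaging is strong.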

Multi-Robot SLAM in Dynamic Environments with Parallel Maps

International Journal of Humanoid Robotics ◽

10.1142/s0219843621500110 ◽

2021 ◽

pp. 2150011

Author(s):

Sajad Badalkhani ◽

Ramazan Havangi ◽

Mohsen Farshad

Keyword(s):

Large Scale ◽

Dynamic Environment ◽

Dynamic Environments ◽

Extensive Literature ◽

Real World Data ◽

Data Set ◽

Cooperative Approach ◽

Localization And Mapping ◽

Multi Robot

There is an extensive literature regarding multi-robot simultaneous localization and mapping (MRSLAM). In most of this research, the environment is assumed to be static, while the dynamic parts of the environment degrade the estimation quality of SLAM algorithms and lead to inherently fragile systems. To enhance the performance and robustness of SLAM in dynamic environments (SLAMIDE), a novel cooperative approach named parallel-map (p-map) SLAM is introduced in this paper. The objective of the proposed method is to deal with the dynamics of the environment by detecting dynamic parts and preventing their inclusion in SLAM estimations. In this approach, each robot builds a limited map in its own vicinity, while the global map is built through a hybrid centralized MRSLAM. The restricted size of the local maps bounds the computational complexity and resources needed to handle a large-scale dynamic environment. Using a probabilistic index, the proposed method differentiates between stationary and moving landmarks, based on their relative positions with other parts of the environment. Stationary landmarks are then used to refine a consistent map. The proposed method is evaluated with different levels of dynamism and, for each level, the performance is measured in terms of accuracy, robustness, and the hardware resources needed for implementation. The method is also evaluated with a publicly available real-world data set. Experimental validation along with simulations indicates that the proposed method is able to perform consistent SLAM in a dynamic environment, suggesting its feasibility for MRSLAM applications.

From Entrepreneur to Big Player

International Journal of Strategic Information Technology and Applications ◽

10.4018/jsita.2013040102 ◽

2013 ◽

Vol 4 (2) ◽

pp. 21-34

Author(s):

Ruey-Shiang Shaw ◽

Sheng-Pao Shih ◽

Ta-Yu Fu ◽

Chia-Wen Tsai

Keyword(s):

Business Model ◽

Large Scale ◽

Software Industry ◽

Stable Phase ◽

Data Set ◽

Main Research ◽

Software Products ◽

Ex Post ◽

Major Player

The software industry faces drastic changes in technology and business operations. The research structure of this study is based on the business model for software industries proposed by Rajala in 2003. The researcher employed an ex post facto research design to conduct a case study of the Galaxy Software Service Co., a company that is representative of the software industry in Taiwan. The main research goal of this study is to explore how this particular company developed into a large software company in the Taiwanese software sector, which is characterized by a prevalence of small- and medium-sized businesses, over a period of 25 years. This study employs a case study design and relies on in-depth participation and interviews to acquire a complete data set of the company’s internal operations. The evolution of the business model from the company’s inception until the present day has been divided into four phases: the entrepreneur phase, the growth phase, the stable phase, and the innovative breakthrough phase. The company developed into a major player in the software industry for 3 reasons: it has always insisted on a product differentiation strategy based on the sole reliance on software products, it started out as a software products dealer and gradually developed its own research and development capability, and it built a large-scale project management capability and received CMMI certification. These factors make the company stand out from other System Integrated businesses in the Taiwanese software sector offering both hardware and software products.

Exploration of Tourist Activities in Urban Destination Using Venue Check-In Data

Journal of Hospitality & Tourism Research ◽

10.1177/1096348019889121 ◽

2019 ◽

Vol 44 (3) ◽

pp. 472-498

Author(s):

Huy Quan Vu ◽

Jian Ming Luo ◽

Gang Li ◽

Rob Law

Keyword(s):

Data Collection ◽

Large Scale ◽

Traditional Approach ◽

Urban Tourism ◽

Data Set ◽

Tourism Marketing ◽

Large Scale Data ◽

New Type ◽

Scale Data

Understanding the differences and similarities in the activities of tourists from various cultures is important for tourism managers to develop appropriate plans and strategies that could support urban tourism marketing and management. However, tourism managers still face challenges in obtaining such understanding because the traditional approach of data collection, which relies on surveys and questionnaires, is incapable of capturing tourist activities at a large scale. In this article, we present a method for the study of tourist activities based on a new type of data, venue check-ins. The effectiveness of the presented approach is demonstrated through a case study of a major tourism country, France. Analysis based on a large-scale data set from 19 tourism cities in France reveals interesting differences and similarities in the activities of tourists from 14 markets (countries). Valuable insights are provided for various urban tourism applications.

Health Check-up of Commercial Banks in the Framework of CAMEL: A Case Study of Joint Venture Banks in Nepal

Journal of Nepalese Business Studies ◽

10.3126/jnbs.v2i1.55 ◽

2007 ◽

Vol 2 (1) ◽

pp. 41-55 ◽

Cited By ~ 11

Author(s):

Keshar J. Baral

Keyword(s):

Commercial Banks ◽

Large Scale ◽

Joint Venture ◽

Balance Sheet ◽

Health Check ◽

Annual Reports ◽

Financial Health ◽

Data Set ◽

Business Studies

Using the data set published by joint venture banks in their annual reports, and by NRB in its supervision annual reports, this paper examines the financial health of joint venture banks in the CAMEL framework. The health check-up, conducted on the basis of publicly available financial data, concludes that the health of joint venture banks is better than that of the other commercial banks. In addition, the perusal of indicators of different components of CAMEL indicates that the financial health of joint venture banks is only fair and not strong enough to manage possible large-scale shocks to their balance sheets. Journal of Nepalese Business Studies Vol.2(1) 2005 pp.41-55

Non-invasive prenatal testing as a valuable source of population specific allelic frequencies

10.1101/348466 ◽

2018 ◽

Author(s):

J Budis ◽

J Gazdarica ◽

J Radvanszky ◽

M Harsanyova ◽

I Gazdaricova ◽

...

Keyword(s):

Large Scale ◽

High Reliability ◽

Low Cost ◽

Variant Calling ◽

Prenatal Testing ◽

Data Set ◽

Gold Standard Method ◽

Non Invasive ◽

Scale Population ◽

Low Coverage

Low-coverage massively parallel genome sequencing for non-invasive prenatal testing (NIPT) of common aneuploidies is one of the most rapidly adopted and relatively low-cost DNA tests. Since aggregation of reads from a large number of samples allows overcoming the problems of extremely low coverage of individual samples, we describe the possible re-use of the data generated during NIPT testing for genome-scale population-specific frequency determination of small DNA variants, requiring no additional costs except for those of the NIPT test itself. We applied our method to a data set comprising 1,548 original NIPT test results and evaluated the findings on different levels, from in silico population frequency comparisons up to wet lab validation analyses using a gold-standard method. The revealed high reliability of variant calling and allelic frequency determinations suggests that these NIPT data could serve as valuable alternatives to large-scale population studies even for smaller countries around the world.
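The core idea of this abstract, that pooling reads across many low-coverage samples yields usable allele frequencies, can be shown with a small sketch. It simply sums per-site allele read counts over all samples and reports read-count proportions. The data layout and function name are assumptions made for illustration, and the simplification ignores details the authors' pipeline would need to handle, such as the mixture of maternal and fetal DNA and sequencing errors.

```python
from collections import Counter

def pooled_allele_frequencies(samples):
    """Aggregate allele read counts across samples and return per-site frequencies.

    samples: iterable of dicts mapping a site, e.g. (chrom, position), to a
    dict of allele -> number of reads supporting it in that sample.
    """
    pooled = {}
    for sample in samples:
        for site, counts in sample.items():
            # Counter.update adds the counts rather than replacing them.
            pooled.setdefault(site, Counter()).update(counts)
    frequencies = {}
    for site, counts in pooled.items():
        depth = sum(counts.values())
        if depth == 0:
            continue  # no reads at this site in any sample
        frequencies[site] = {allele: n / depth for allele, n in counts.items()}
    return frequencies
```

With well under one read per site in each sample, no single sample supports a variant call, but the pooled depth grows with the number of samples.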

Evaluation by hierarchical clustering of multiple cytokine expression after phytohemagglutinin stimulation

Archives of Biological Sciences ◽

10.2298/abs150731040y ◽

2016 ◽

Vol 68 (3) ◽

pp. 509-513 ◽

Cited By ~ 1

Author(s):

Chunhe Yang ◽

Hongwu Du

Keyword(s):

Hierarchical Clustering ◽

Large Scale ◽

Mononuclear Cells ◽

Human Peripheral Blood ◽

Cytokine Expression ◽

Biological Data ◽

Clustering Method ◽

Data Set ◽

Suspension Array ◽

Pha Stimulation

The hierarchical clustering method has been used for exploration of gene expression and proteomic profiles; however, little research into its application in the examination of expression of multiple cytokine/chemokine responses to stimuli has been reported. Thus, little progress has been made on how phytohemagglutinin (PHA) affects cytokine expression profiling on a large scale in the human hematological system. To investigate the characteristic expression pattern under PHA stimulation, Luminex, a multiplex bead-based suspension array, was performed. The data set collected from human peripheral blood mononuclear cells (PBMC) was analyzed using the hierarchical clustering method. It was revealed that two specific chemokines (CCL3 and CCL4) underwent significantly greater quantitative changes during induction of expression than other tested cytokines/chemokines after PHA stimulation. This result indicates that hierarchical clustering is a useful tool for detecting fine patterns during exploration of biological data, and that it can play an important role in comparative studies.
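As an illustration of the clustering method named in this abstract, here is a minimal agglomerative sketch with average linkage and Euclidean distance. The abstract does not state which linkage or distance the authors used, so both are assumptions, as are the function names; any expression values passed in would be the caller's own.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length expression profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def average_linkage(cluster_a, cluster_b, profiles):
    """Mean pairwise distance between members of two clusters."""
    dists = [euclidean(profiles[i], profiles[j]) for i in cluster_a for j in cluster_b]
    return sum(dists) / len(dists)

def hierarchical_clustering(profiles, n_clusters):
    """Merge the two closest clusters repeatedly until n_clusters remain.

    profiles: dict mapping an analyte name to a list of expression values.
    Returns a list of clusters, each a list of analyte names.
    """
    clusters = [[name] for name in profiles]
    while len(clusters) > max(n_clusters, 1):
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = average_linkage(clusters[i], clusters[j], profiles)
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        # j > i, so deleting index j leaves index i valid.
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters
```

Analytes whose expression changes together end up in the same cluster, which is how a pair with a distinct response profile would separate from the rest.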
