An integrative data analysis platform for gene set analysis and knowledge discovery in a data warehouse framework

Analysis of Ultrafast Transient Absorption Spectroscopy Data using Integrative Data Analysis Platform: 1. Data Processing, Fitting, and Model Selection

10.1101/165498 ◽

2017 ◽

Author(s):

Evgenii L. Kovrigin

Keyword(s):

Data Analysis ◽

Transient Absorption ◽

Model Fitting ◽

Information Criterion ◽

Transient Absorption Spectroscopy ◽

Isosbestic Point ◽

Spectroscopy Data ◽

Integrative Data Analysis ◽

Protein Label ◽

Analysis Platform

This manuscript describes a workflow for analysis of transient absorption (TA) spectroscopy data using the Integrative Data Analysis Platform (IDAP) software package. Time-dependent spectral series are analyzed through evaluation of the isosbestic point and the kinetics of excited-state and ground-state bleach decays. Model fitting and selection based on Akaike’s Information Criterion are discussed. As a practical example, we analyze excitation decays of a common protein label, Alexa Fluor 647.
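As a rough illustration of the model selection step mentioned in this abstract, the sketch below compares two candidate decay models with Akaike's Information Criterion in its common least-squares form, AIC = n ln(RSS/n) + 2k. It is not taken from the IDAP package: the function names, the model labels, and the residual sums of squares are hypothetical values chosen only for demonstration.

```python
import math


def aic_least_squares(rss: float, n_points: int, n_params: int) -> float:
    """AIC for a least-squares fit with Gaussian errors (additive constant dropped)."""
    if rss <= 0 or n_points <= 0:
        raise ValueError("rss and n_points must be positive")
    return n_points * math.log(rss / n_points) + 2 * n_params


def select_model(candidates: dict, n_points: int):
    """Return the name of the lowest-AIC model and all AIC scores.

    `candidates` maps a model name to (residual sum of squares, number of parameters).
    """
    scores = {
        name: aic_least_squares(rss, n_points, k)
        for name, (rss, k) in candidates.items()
    }
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical fit results for one kinetic trace with 100 time points.
candidates = {
    "mono_exponential": (0.50, 2),  # amplitude + one time constant
    "bi_exponential": (0.49, 4),    # two amplitudes + two time constants
}
best, scores = select_model(candidates, n_points=100)
print(best, scores)
```

With these made-up numbers the bi-exponential model lowers the residual only slightly, so the penalty for its two extra parameters outweighs the gain and the simpler model has the lower AIC.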

Fifteen Years of Gene Set Analysis for High-Throughput Genomic Data: A Review of Statistical Approaches and Future Challenges

Entropy ◽

10.3390/e22040427 ◽

2020 ◽

Vol 22 (4) ◽

pp. 427

Author(s):

Samarendra Das ◽

Craig J. McClain ◽

Shesh N. Rai

Keyword(s):

Gene Expression ◽

Data Analysis ◽

Rna Sequencing ◽

Association Studies ◽

Genome Wide Association ◽

Gene Set Analysis ◽

Statistical Structure ◽

Gene Set ◽

Genome Wide ◽

Association Data

Over the last decade, gene set analysis has become the first choice for gaining insights into underlying complex biology of diseases through gene expression and gene association studies. It also reduces the complexity of statistical analysis and enhances the explanatory power of the obtained results. Although gene set analysis approaches are extensively used in gene expression and genome wide association data analysis, the statistical structure and steps common to these approaches have not yet been comprehensively discussed, which limits their utility. In this article, we provide a comprehensive overview, statistical structure and steps of gene set analysis approaches used for microarrays, RNA-sequencing and genome wide association data analysis. Further, we also classify the gene set analysis approaches and tools by the type of genomic study, null hypothesis, sampling model and nature of the test statistic, etc. Rather than reviewing the gene set analysis approaches individually, we provide the generation-wise evolution of such approaches for microarrays, RNA-sequencing and genome wide association studies and discuss their relative merits and limitations. Here, we identify the key biological and statistical challenges in current gene set analysis, which will be addressed by statisticians and biologists collectively in order to develop the next generation of gene set analysis approaches. Further, this study will serve as a catalog and provide guidelines to genome researchers and experimental biologists for choosing the proper gene set analysis approach based on several factors.

Evolutionary Intelligent Data Warehousing Approach to Knowledge Discovery Systems: Dynamic Cubing

Recent Advances in Computer Science and Communications ◽

10.2174/2666255813666191211113623 ◽

2019 ◽

Vol 13 ◽

Author(s):

Harkiran Kaur ◽

Kawaljeet Singh ◽

Tejinder Kaur

Keyword(s):

Knowledge Discovery ◽

Data Warehouse ◽

Business Intelligence ◽

End Users ◽

Processing Stage ◽

Information Update ◽

On Line ◽

Analytical Processing ◽

Warehouse Operations ◽

Made In

Background: Numerous E-Migrants databases help migrants locate their peers in various countries, contributing substantially to communication among migrants living overseas. At present, these traditional E-Migrants databases suffer from non-scalability, difficult search mechanisms and burdensome information update routines. Furthermore, the migrants’ profiles in these databases have not been analyzed to date, so the databases do not generate any knowledge. Objective: To design and develop an efficient, multidimensional knowledge discovery framework for E-Migrants databases. Method: In the proposed technique, the results of complex calculations for the On-Line Analytical Processing (OLAP) operations most likely to be required by end users are stored as Decision Trees at the pre-processing stage of data analysis. While the user browses the Cube, these pre-computed results are retrieved, offering a Dynamic Cubing feature to end users at runtime. This data-tuning step reduces query processing time and increases the efficiency of the required data warehouse operations. Results: Experiments conducted with a Data Warehouse of around 1000 migrants’ profiles confirm the knowledge discovery power of this proposal. Using the proposed methodology, the authors have designed a framework efficient enough to incorporate amendments made to the E-Migrants Data Warehouse systems at regular intervals, a capability that was entirely missing from traditional E-Migrants databases. Conclusion: The proposed methodology enables migrants to generate dynamic knowledge and visualize it in the form of dynamic cubes. By applying Business Intelligence mechanisms and blending them with tuned OLAP operations, the authors have transformed traditional datasets into an intelligent migrants Data Warehouse.

Integrative Data Analysis from a Unifying Research Synthesis Perspective

10.1093/oso/9780190676001.003.0020 ◽

2018 ◽

Author(s):

Eun-Young Mun ◽

Anne E. Ray

Keyword(s):

Data Analysis ◽

Large Scale ◽

Research Synthesis ◽

Alcohol Intervention ◽

Data Set ◽

Integrative Data Analysis ◽

Level Data ◽

Model Complex ◽

Wide Range ◽

Individual Participant

Integrative data analysis (IDA) is a promising new approach in psychological research and has been well received in the field of alcohol research. This chapter provides a larger unifying research synthesis framework for IDA. Major advantages of IDA of individual participant-level data include better and more flexible ways to examine subgroups, model complex relationships, deal with methodological and clinical heterogeneity, and examine infrequently occurring behaviors. However, between-study heterogeneity in measures, designs, and samples and systematic study-level missing data are significant barriers to IDA and, more broadly, to large-scale research synthesis. Based on the authors’ experience working on the Project INTEGRATE data set, which combined individual participant-level data from 24 independent college brief alcohol intervention studies, it is also recognized that IDA investigations require a wide range of expertise and considerable resources and that some minimum standards for reporting IDA studies may be needed to improve transparency and quality of evidence.

MEASUREMENT PROPERTIES OF A STANDARDIZED ELICITED IMITATION TEST: AN INTEGRATIVE DATA ANALYSIS

Studies in Second Language Acquisition ◽

10.1017/s0272263121000383 ◽

2021 ◽

pp. 1-27

Author(s):

Daniel R. Isbell ◽

Young-A Son

Keyword(s):

Second Language ◽

Data Analysis ◽

Second Language Acquisition ◽

Item Difficulty ◽

Measurement Properties ◽

Linguistic Features ◽

Elicited Imitation ◽

Integrative Data Analysis ◽

Rater Severity ◽

Study Participants

Elicited Imitation Tests (EITs) are commonly used in second language acquisition (SLA)/bilingualism research contexts to assess the general oral proficiency of study participants. While previous studies have provided valuable EIT construct-related validity evidence, some key gaps remain. This study uses an integrative data analysis to further probe the validity of the Korean EIT score interpretations by examining the performances of 318 Korean learners (198 second language, 79 foreign language, and 41 heritage) on the Korean EIT scored by five different raters. Expanding on previous EIT validation efforts, this study (a) examined both inter-rater reliability and differences in rater severity, (b) explored measurement bias across subpopulations of language learners, (c) identified relevant linguistic features which relate to item difficulty, and (d) provided a norm-referenced interpretation for Korean EIT scores. Overall, findings suggest that the Korean EIT can be used in diverse SLA/bilingualism research contexts, as it measures ability similarly across subgroups and raters.

Design and implementation of intelligent accounting data analysis platform based on industrial cloud computing

EURASIP Journal on Wireless Communications and Networking ◽

10.1186/s13638-020-1647-2 ◽

2020 ◽

Vol 2020 (1) ◽

Author(s):

Wang Ting ◽

Yang Liu

Keyword(s):

Cloud Computing ◽

Data Analysis ◽

Accounting Data ◽

Design And Implementation ◽

Analysis Platform ◽

Industrial Cloud

Gene set analysis and reduction for a continuous phenotype: Identifying markers of birth weight variation based on embryonic stem cells and immunologic signatures

Computers in Biology and Medicine ◽

10.1016/j.compbiomed.2019.103389 ◽

2019 ◽

Vol 113 ◽

pp. 103389

Author(s):

Shabnam Vatanpour ◽

Saumyadipta Pyne ◽

Ana Paula Leite ◽

Irina Dinu

Keyword(s):

Stem Cells ◽

Birth Weight ◽

Embryonic Stem Cells ◽

Embryonic Stem ◽

Gene Set Analysis ◽

Gene Set ◽

Weight Variation ◽

Continuous Phenotype

Variance component score test for time-course gene set analysis of longitudinal RNA-seq data

Biostatistics ◽

10.1093/biostatistics/kxx005 ◽

2017 ◽

Vol 18 (4) ◽

pp. 589-604 ◽

Cited By ~ 6

Author(s):

Denis Agniel ◽

Boris P. Hejblum

Keyword(s):

Variance Component ◽

Time Course ◽

Score Test ◽

Gene Set Analysis ◽

Rna Seq ◽

Gene Set ◽

Component Score

DSigDB: drug signatures database for gene set analysis

Bioinformatics ◽

10.1093/bioinformatics/btv313 ◽

2015 ◽

Vol 31 (18) ◽

pp. 3069-3071 ◽

Cited By ~ 81

Author(s):

Minjae Yoo ◽

Jimin Shin ◽

Jihye Kim ◽

Karen A. Ryall ◽

Kyubum Lee ◽

...

Keyword(s):

Gene Set Analysis ◽

Gene Set

Microarray-based gene set analysis: a comparison of current methods

BMC Bioinformatics ◽

10.1186/1471-2105-9-502 ◽

2008 ◽

Vol 9 (1) ◽

Cited By ~ 57

Author(s):

Sarah Song ◽

Michael A Black

Keyword(s):

Gene Set Analysis ◽

Gene Set