High Dimensional Classification of Structural MRI Alzheimer's Disease Data Based on Large Scale Regularization

Author(s):  
Ramon Casanova ◽  
Christopher T. Whitlow ◽  
Benjamin Wagner ◽  
Jeff Williamson ◽  
Sally A. Shumaker ◽  
...  
2004 ◽  
Vol 3 (1) ◽  
pp. 1-24 ◽  
Author(s):  
Markus Ruschhaupt ◽  
Wolfgang Huber ◽  
Annemarie Poustka ◽  
Ulrich Mansmann

We demonstrate a concept and implementation of a compendium for the classification of high-dimensional data from microarray gene expression profiles. A compendium is an interactive document that bundles primary data, statistical processing methods, figures, and derived data together with the textual documentation and conclusions. Interactivity allows the reader to modify and extend these components. We address the following questions: how much does the discriminatory power of a classifier depend on the choice of the algorithm used to identify it; what alternative classifiers could be used just as well; and how robust is the result? The answers to these questions are essential prerequisites for validation and biological interpretation of the classifiers. We show how to use this approach by examining these questions for a specific breast cancer microarray data set that was first studied by Huang et al. (2003).


Author(s):  
Pasi Luukka ◽  
Jouni Sampo

We have compared differential evolution and genetic algorithms in a study of weight optimization for different similarity measures in a classification task. With high-dimensional data, the weighting of similarity measures becomes very important, and suitable optimizers need to be studied. In this article we study proper weighting of similarity measures in the classification of high-dimensional, large-scale data. We show that in most cases the differential evolution algorithm should be used to find the weights instead of the genetic algorithm.
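The abstract above does not specify the authors' similarity measures or evolution parameters, so the following is only an illustrative sketch of the general idea: a classic DE/rand/1/bin loop searching for per-feature weights that maximize the leave-one-out accuracy of a feature-weighted 1-NN classifier on synthetic data. All names, the fitness function, and the parameter values are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny synthetic data set: two informative features plus two noise features.
n, d = 60, 4
X = rng.normal(size=(n, d))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

def weighted_accuracy(w, X, y):
    """Leave-one-out accuracy of a feature-weighted 1-NN classifier."""
    correct = 0
    for i in range(len(X)):
        diff = (X - X[i]) * w              # apply feature weights
        dist = (diff ** 2).sum(axis=1)     # squared weighted distance
        dist[i] = np.inf                   # exclude the query point itself
        correct += int(y[dist.argmin()] == y[i])
    return correct / len(X)

def de_optimize(fitness, d, pop_size=20, F=0.8, CR=0.9, gens=30):
    """Classic DE/rand/1/bin, maximizing `fitness` over [0, 1]^d."""
    pop = rng.random((pop_size, d))
    fit = np.array([fitness(ind) for ind in pop])
    for _ in range(gens):
        for i in range(pop_size):
            others = [j for j in range(pop_size) if j != i]
            a, b, c = pop[rng.choice(others, 3, replace=False)]
            mutant = np.clip(a + F * (b - c), 0.0, 1.0)
            cross = rng.random(d) < CR
            cross[rng.integers(d)] = True  # at least one gene from the mutant
            trial = np.where(cross, mutant, pop[i])
            f_trial = fitness(trial)
            if f_trial >= fit[i]:          # greedy one-to-one selection
                pop[i], fit[i] = trial, f_trial
    best = int(np.argmax(fit))
    return pop[best], fit[best]

w_best, acc = de_optimize(lambda w: weighted_accuracy(w, X, y), d)
print("best leave-one-out accuracy:", acc)
```

A genetic algorithm would replace the mutation/crossover step with tournament selection and recombination of parent pairs; the fitness function and weight encoding stay the same, which is what makes the two optimizers directly comparable.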


2014 ◽  
Vol 2014 ◽  
pp. 1-11 ◽  
Author(s):  
Ge Song ◽  
Yunming Ye

Textual stream classification has become a realistic and challenging issue, since large-scale, high-dimensional, and non-stationary streams with class imbalance are widely used in various real-life applications. Given the characteristics of textual streams, their classification is technically difficult, especially in an imbalanced environment. In this paper, we propose a new ensemble framework, clustering forest, for learning from imbalanced textual streams with concept drift (CFIM). The CFIM is based on ensemble learning by integrating a set of clustering trees (CTs). An adaptive selection method, which flexibly chooses the useful CTs according to the properties of the stream, is presented in CFIM. In particular, to deal with the problem of class imbalance, we collect and reuse both rare-class instances and misclassified instances from the historical chunks. Compared to most existing approaches, it is worth pointing out that our approach assumes that both the majority class and the rare class may suffer from concept drift; thus the distribution of resampled instances is similar to the current concept. The effectiveness of CFIM is examined on five real-world textual streams under an imbalanced non-stationary environment. Experimental results demonstrate that CFIM achieves better performance than four state-of-the-art ensemble models.
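The CFIM paper's clustering trees and adaptive selection are not detailed in the abstract, but the resampling idea it describes, collecting rare-class and misclassified instances from historical chunks and replaying them into the current chunk, can be sketched on its own. The class name, buffer policy, and demo data below are assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

class RareClassBuffer:
    """Bounded buffer of rare-class and misclassified instances collected
    from historical chunks, replayed into the current training chunk."""

    def __init__(self, capacity=200):
        self.capacity = capacity
        self.X, self.y = [], []

    def update(self, X_chunk, y_true, y_pred, rare_label):
        # Keep instances that belong to the rare class or were misclassified.
        keep = (y_true == rare_label) | (y_pred != y_true)
        self.X.extend(X_chunk[keep])
        self.y.extend(y_true[keep])
        # FIFO eviction once the buffer exceeds its capacity.
        self.X = self.X[-self.capacity:]
        self.y = self.y[-self.capacity:]

    def augment(self, X_chunk, y_chunk):
        """Append the buffered instances to the current chunk."""
        if not self.X:
            return X_chunk, y_chunk
        return (np.vstack([X_chunk, np.array(self.X)]),
                np.concatenate([y_chunk, np.array(self.y)]))

# One chunk of a heavily imbalanced stream (~10% rare class).
X = rng.normal(size=(100, 5))
y = (rng.random(100) < 0.1).astype(int)
y_pred = np.zeros(100, dtype=int)          # a naive all-majority predictor

buf = RareClassBuffer()
buf.update(X, y, y_pred, rare_label=1)
X_aug, y_aug = buf.augment(X, y)
print(len(X), "->", len(X_aug), "rare count:", int((y_aug == 1).sum()))
```

Because the buffer only holds recent rare-class and misclassified instances, its contents track the current concept rather than the stream's full history, which matches the abstract's point that the resampled distribution should resemble the current concept.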


2015 ◽  
Vol 112 (8) ◽  
pp. 2479-2484 ◽  
Author(s):  
Tian Ge ◽  
Thomas E. Nichols ◽  
Phil H. Lee ◽  
Avram J. Holmes ◽  
Joshua L. Roffman ◽  
...  

The discovery and prioritization of heritable phenotypes is a computational challenge in a variety of settings, including neuroimaging genetics and analyses of the vast phenotypic repositories in electronic health record systems and population-based biobanks. Classical estimates of heritability require twin or pedigree data, which can be costly and difficult to acquire. Genome-wide complex trait analysis is an alternative tool to compute heritability estimates from unrelated individuals, using genome-wide data that are increasingly ubiquitous, but it is computationally demanding and becomes difficult to apply when evaluating very large numbers of phenotypes. Here we present a fast and accurate statistical method for high-dimensional heritability analysis using genome-wide SNP data from unrelated individuals, termed massively expedited genome-wide heritability analysis (MEGHA), together with accompanying nonparametric sampling techniques that enable flexible inferences for arbitrary statistics of interest. MEGHA produces estimates and significance measures of heritability with several orders of magnitude less computational time than existing methods, making heritability-based prioritization of millions of phenotypes based on data from unrelated individuals tractable for the first time to our knowledge. As a demonstration of application, we conducted heritability analyses on global and local morphometric measurements derived from brain structural MRI scans, using genome-wide SNP data from 1,320 unrelated young healthy adults of non-Hispanic European ancestry. We also computed surface maps of heritability for cortical thickness measures and empirically localized cortical regions where thickness measures were significantly heritable. Our analyses demonstrate the unique capability of MEGHA for large-scale heritability-based screening and high-dimensional heritability profile construction.
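MEGHA itself is not specified in the abstract, but SNP-based heritability methods of this family start from a genetic relationship matrix (GRM) built from standardized genotypes. The sketch below shows only that shared preprocessing step on toy data; the sample sizes, allele frequencies, and standardization at the known frequency are illustrative assumptions, not MEGHA's pipeline.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy genotypes: n unrelated individuals x m SNPs, coded as 0/1/2
# minor-allele counts drawn at known per-SNP allele frequencies.
n, m = 50, 500
p = rng.uniform(0.1, 0.5, size=m)          # per-SNP minor-allele frequency
G = rng.binomial(2, p, size=(n, m)).astype(float)

# Standardize each SNP: z = (g - 2p) / sqrt(2p(1 - p)),
# so each column has mean ~0 and variance ~1.
Z = (G - 2 * p) / np.sqrt(2 * p * (1 - p))

# Empirical genetic relationship matrix, averaged over SNPs.
A = Z @ Z.T / m

print("GRM shape:", A.shape, "mean diagonal:", round(A.diagonal().mean(), 3))
```

For unrelated individuals the off-diagonal entries of the GRM scatter around zero and the diagonal around one; variance-component methods then estimate how much phenotypic covariance aligns with this matrix, which is the quantity that becomes expensive to recompute across millions of phenotypes.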

