A predictive epilepsy index based on probabilistic classification of interictal spike waveforms

2018 ◽  
Author(s):  
Jesse A. Pfammatter ◽  
Rachel A. Bergstrom ◽  
Eli P. Wallace ◽  
Rama K. Maganti ◽  
Mathew V. Jones

Abstract Quantification of interictal spikes in EEG may provide insight on epilepsy disease burden, but manual quantification of spikes is time-consuming and subject to bias. We present a probability-based, automated method for the classification and quantification of interictal events, using EEG data from kainate- and saline-injected mice (C57BL/6J background) several weeks post-treatment. We first detected high-amplitude events, then projected event waveforms into Principal Components space and identified clusters of spike morphologies using a Gaussian Mixture Model. We calculated the odds-ratio of events from kainate- versus saline-treated mice within each cluster, converted these values to probability scores, P(kainate), and calculated an Hourly Epilepsy Index for each animal by summing the probabilities for events where the cluster P(kainate) > 0.5 and dividing the resultant sum by the record duration. This Index is predictive of whether an animal received an epileptogenic treatment (i.e., kainate), even if a seizure was never observed. We applied this method to an out-of-sample dataset to assess epileptiform spike morphologies in five kainate mice monitored for ~1 month. The magnitude of the Index increased over time in a subset of animals and revealed changes in the prevalence of epileptiform (P(kainate) > 0.5) spike morphologies. Importantly, in both data sets, animals that had electrographic seizures also had a high Index. This analysis is fast, unbiased, and provides information regarding the salience of spike morphologies for disease progression. Future refinement will allow interictal spikes to be defined in quantitative and unambiguous terms.
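As a concrete illustration of the pipeline, a minimal sketch using scikit-learn is given below; it is not the authors' code, and the array names, component counts, and smoothing constant are assumptions. In the paper, clusters are fit on events pooled across treatment groups and the Index is then computed per animal.

```python
# Minimal sketch of the pipeline described above (not the authors' code):
# PCA + Gaussian Mixture clustering of detected spike waveforms, per-cluster
# kainate odds converted to probabilities, and an Hourly Epilepsy Index.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.mixture import GaussianMixture

def hourly_epilepsy_index(waveforms, is_kainate, hours, n_pcs=5, n_clusters=10):
    """waveforms: (n_events, n_samples) detected high-amplitude events;
    is_kainate: (n_events,) bool, treatment group of the animal each event came from;
    hours: duration of the record being scored, in hours."""
    is_kainate = np.asarray(is_kainate, dtype=bool)
    pcs = PCA(n_components=n_pcs).fit_transform(waveforms)
    labels = GaussianMixture(n_components=n_clusters, random_state=0).fit_predict(pcs)

    p_kainate = np.zeros(n_clusters)
    for k in range(n_clusters):
        in_k = labels == k
        # odds of kainate vs. saline events within the cluster (+1 avoids division by zero),
        # converted to a probability score P(kainate)
        odds = (is_kainate[in_k].sum() + 1) / ((~is_kainate[in_k]).sum() + 1)
        p_kainate[k] = odds / (1 + odds)

    event_p = p_kainate[labels]
    # sum probabilities of events in "epileptiform" clusters, normalize by record duration
    return event_p[event_p > 0.5].sum() / hours
```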

Author(s):  
Sarah Hernandez

Average payloads define the ton-to-truck conversion factors that are critical inputs to commodity-based freight forecasting models. However, average payloads are derived primarily from outdated, unrepresentative truck surveys. With increasing technological and methodological means of concurrently measuring truck configurations, commodity types, and weights, there are now viable alternatives to truck surveys. In this paper, a method to derive average payloads by truck body type using weight data from weigh-in-motion (WIM) sensors is presented. Average payloads by truck body type are found by subtracting an estimated average empty weight from an estimated average loaded weight. Empty and loaded weights are derived from a Gaussian mixture model fit to a gross vehicle weight distribution. An analysis of truck body type distributions, loaded weights, empty weights, and resulting payloads of five-axle tractor trailer (FHWA Class 9 or 3-S2) trucks is presented to compare national and state-level Vehicle Inventory and Use Survey (VIUS) data and the WIM-based approach. Results show statistically significant differences between the three data sets in each of the comparison categories. A challenge in this analysis is the definition of a correct set of payloads because the WIM and survey data are subject to their own inherent misrepresentations. WIM data, however, provide a continuous source of measured weights that overcomes the drawback of using out-of-date surveys. Overall, average payloads from measured weights are lower than those for the national or California VIUS estimates.
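A minimal sketch of the core step, assuming a two-component Gaussian mixture fit to the gross vehicle weight distribution of a single body type (variable names and units are assumptions, not the paper's):

```python
# Illustrative sketch (not the paper's implementation): estimate the average
# payload for one truck body type from WIM gross vehicle weights by fitting a
# two-component Gaussian mixture; the lower mode approximates the empty weight
# and the upper mode the loaded weight.
import numpy as np
from sklearn.mixture import GaussianMixture

def average_payload(gvw):
    """gvw: 1-D array of gross vehicle weights for one body type (e.g., in kips)."""
    gmm = GaussianMixture(n_components=2, random_state=0).fit(np.asarray(gvw).reshape(-1, 1))
    empty, loaded = np.sort(gmm.means_.ravel())   # lower mode ~ empty, upper mode ~ loaded
    return loaded - empty                         # average payload = loaded - empty weight
```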


Sensors ◽  
2021 ◽  
Vol 21 (3) ◽  
pp. 1020
Author(s):  
Mohamed Chiheb Ben Nasr ◽  
Sofia Ben Jebara ◽  
Samuel Otis ◽  
Bessam Abdulrazak ◽  
Neila Mezghani

This paper has two objectives: the first is to generate two binary flags indicating frames useful for measuring cardiac and respiratory rates from Ballistocardiogram (BCG) signals (human body activities during measurements can disturb the BCG signal content, leading to difficulties in vital sign measurement); the second is to achieve refined BCG signal segmentation according to these activities. The proposed framework makes use of two approaches: an unsupervised classification based on the Gaussian Mixture Model (GMM) and a supervised classification based on K-Nearest Neighbors (KNN). Both approaches consider two spectral features, namely the Spectral Flatness Measure (SFM) and Spectral Centroid (SC), determined during the feature extraction step. Unsupervised classification is used to explore the content of the BCG signals, justifying the existence of different classes and permitting the definition of useful hyper-parameters for effective segmentation. In contrast, the supervised classification approach aims to determine whether or not the BCG signal content allows the measurement of the heart rate (HR) and the respiratory rate (RR). Furthermore, two levels of supervised classification are used to classify human body activities into several realistic classes from the BCG signal (e.g., coughing, holding breath, air expiration, movement, etc.). The first level considers frame-by-frame classification, while the second, aiming to boost the segmentation performance, transforms the frame-by-frame SFM and SC features into temporal series that track the variation of these measures across the BCG signal. The proposed approach constitutes a novelty in this field and represents a powerful method to segment BCG signals according to human body activities, resulting in an accuracy of 94.6%.
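For illustration, the two spectral features can be computed per frame as follows; this is a sketch under assumed framing conventions, not the authors' pipeline:

```python
# Illustrative sketch: the two frame-wise spectral features used above,
# Spectral Flatness Measure (SFM) and Spectral Centroid (SC).
import numpy as np

def sfm_sc(frame, fs):
    """frame: 1-D BCG signal frame; fs: sampling rate in Hz."""
    power = np.abs(np.fft.rfft(frame)) ** 2 + 1e-12          # power spectrum (avoid log(0))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / fs)
    sfm = np.exp(np.mean(np.log(power))) / np.mean(power)    # geometric / arithmetic mean
    sc = np.sum(freqs * power) / np.sum(power)               # power-weighted mean frequency, Hz
    return sfm, sc
```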


Author(s):  
William A Freyman ◽  
Kimberly F McManus ◽  
Suyash S Shringarpure ◽  
Ethan M Jewett ◽  
Katarzyna Bryc ◽  
...  

Abstract Estimating the genomic location and length of identical-by-descent (IBD) segments among individuals is a crucial step in many genetic analyses. However, the exponential growth in the size of biobank and direct-to-consumer (DTC) genetic data sets makes accurate IBD inference a significant computational challenge. Here we present the templated positional Burrows-Wheeler transform (TPBWT) to make fast IBD estimates robust to genotype and phasing errors. Using haplotype data simulated over pedigrees with realistic genotyping and phasing errors, we show that the TPBWT outperforms other state-of-the-art IBD inference algorithms in terms of speed and accuracy. For each phase-aware method, we explore the false positive and false negative rates of inferring IBD by segment length and characterize the types of error commonly found. Our results highlight the fragility of most phased IBD inference methods; the accuracy of IBD estimates can be highly sensitive to the quality of haplotype phasing. Additionally, we compare the performance of the TPBWT against a widely used phase-free IBD inference approach that is robust to phasing errors. We introduce both in-sample and out-of-sample TPBWT-based IBD inference algorithms and demonstrate their computational efficiency on massive-scale data sets with millions of samples. Furthermore, we describe the binary file format for TPBWT-compressed haplotypes that enables fast and efficient out-of-sample IBD computations against very large cohort panels. Finally, we demonstrate the utility of the TPBWT in a brief empirical analysis exploring geographic patterns of haplotype sharing within Mexico. Hierarchical clustering of IBD shared across regions within Mexico reveals geographically structured haplotype sharing and a strong signal of isolation by distance. Our software implementation of the TPBWT is freely available for non-commercial use in the code repository https://github.com/23andMe/phasedibd.
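For context, a minimal sketch of the standard (untemplated) positional Burrows-Wheeler transform ordering is shown below; the TPBWT's template scheme for tolerating genotype and phasing errors is not reproduced here.

```python
# Minimal sketch of the standard positional Burrows-Wheeler transform ordering
# (Durbin 2014), which the TPBWT described above extends with a template scheme
# to tolerate genotype and phasing errors; the templated variant is not shown.
import numpy as np

def pbwt_prefix_arrays(haplotypes):
    """haplotypes: (n_haplotypes, n_sites) array of 0/1 alleles.
    Returns one positional prefix array per site: at site k, haplotypes are
    ordered by their reversed prefixes, so haplotypes sharing long matches
    ending at site k become adjacent and IBD candidates are easy to scan for."""
    n_hap, n_sites = haplotypes.shape
    ppa = list(range(n_hap))            # ordering before the first site
    arrays = []
    for k in range(n_sites):
        zeros, ones = [], []
        for idx in ppa:                 # stable partition by the allele at site k
            (zeros if haplotypes[idx, k] == 0 else ones).append(idx)
        ppa = zeros + ones
        arrays.append(np.array(ppa))
    return arrays
```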


Cells ◽  
2021 ◽  
Vol 10 (3) ◽  
pp. 576
Author(s):  
Maurizio Polano ◽  
Emanuele Fabbiani ◽  
Eva Adreuzzi ◽  
Federica Di Cintio ◽  
Luca Bedon ◽  
...  

Gliomas are the most common primary neoplasm of the central nervous system. A promising frontier in the definition of glioma prognosis and treatment is represented by epigenetics. In this study, we developed a machine learning classification model based on epigenetic data (CpG probes) to separate patients according to their state of immunosuppression. We considered 573 cases of low-grade glioma (LGG) and glioblastoma (GBM) from The Cancer Genome Atlas (TCGA). First, from gene expression data, we derived a novel binary indicator to flag patients with a favorable immune state. Then, based on previous studies, we selected the genes related to the immune state of the tumor microenvironment. Next, we refined the selection with a data-driven procedure based on Boruta. Finally, we tuned, trained, and evaluated both random forest and neural network classifiers on the resulting dataset. We found that a multi-layer perceptron network fed with the 338 probes selected by combining expert choice and Boruta yields the best performance, achieving an out-of-sample accuracy of 82.8%, a Matthews correlation coefficient of 0.657, and an area under the ROC curve of 0.9. Based on the proposed model, we provide a method to stratify glioma patients according to their epigenomic state.
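A hedged sketch of such a pipeline, combining Boruta-based probe selection with a multi-layer perceptron, is given below; hyper-parameters and variable names are assumptions rather than the study's published settings.

```python
# Illustrative sketch (not the study's exact pipeline): data-driven refinement of
# expert-selected CpG probes with Boruta, followed by a multi-layer perceptron
# classifier of immune state.
import numpy as np
from boruta import BorutaPy
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split

def train_immune_state_classifier(X, y):
    """X: (n_patients, n_probes) methylation values for the expert-selected probes;
    y: binary indicator of a favorable immune state derived from gene expression."""
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

    rf = RandomForestClassifier(n_jobs=-1, class_weight="balanced", max_depth=5)
    boruta = BorutaPy(rf, n_estimators="auto", random_state=0)
    boruta.fit(X_tr, y_tr)                       # keeps only probes confirmed as relevant

    mlp = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=1000, random_state=0)
    mlp.fit(X_tr[:, boruta.support_], y_tr)
    return mlp, mlp.score(X_te[:, boruta.support_], y_te)   # model and held-out accuracy
```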


2014 ◽  
Vol 2014 ◽  
pp. 1-7 ◽  
Author(s):  
Itziar Irigoien ◽  
Basilio Sierra ◽  
Concepción Arenas

In the problem of one-class classification (OCC), one of the classes, the target class, has to be distinguished from all other possible objects, considered as nontargets. This situation arises in many biomedical problems, for example in diagnosis, image-based tumor recognition, or the analysis of electrocardiogram data. In this paper, an approach to OCC based on a typicality test is experimentally compared with reference state-of-the-art OCC techniques (Gaussian, mixture of Gaussians, naive Parzen, Parzen, and support vector data description) using biomedical data sets. We evaluate the ability of the procedures using twelve experimental data sets with not necessarily continuous data. As there are few benchmark data sets for one-class classification, all data sets considered in the evaluation have multiple classes. Each class in turn is considered as the target class and the units in the other classes are considered as new units to be classified. The results of the comparison show the good performance of the typicality approach, which remains applicable to high-dimensional data; it is worth mentioning that it can be used with any kind of data (continuous, discrete, or nominal), whereas applying the state-of-the-art approaches is not straightforward when nominal variables are present.
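The one-vs-rest evaluation protocol can be sketched as follows, with a one-class SVM standing in for the compared methods (the typicality-test approach itself is not reproduced):

```python
# Illustrative sketch of the evaluation protocol described above: each class is
# taken in turn as the target and all remaining units as non-targets.
import numpy as np
from sklearn.svm import OneClassSVM
from sklearn.metrics import roc_auc_score

def occ_one_vs_rest(X, y):
    """X: (n_units, n_features); y: class labels. Returns per-target-class AUC.
    For simplicity the target units used for training are also scored."""
    aucs = {}
    for target in np.unique(y):
        model = OneClassSVM(gamma="scale").fit(X[y == target])
        scores = model.decision_function(X)                   # higher = more target-like
        aucs[target] = roc_auc_score((y == target).astype(int), scores)
    return aucs
```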


2011 ◽  
Vol 32 (1) ◽  
pp. 70-80 ◽  
Author(s):  
Federico E Turkheimer ◽  
Sudhakar Selvaraj ◽  
Rainer Hinz ◽  
Venkatesha Murthy ◽  
Zubin Bhagwagar ◽  
...  

This paper aims to build novel methodology for the use of a reference region with specific binding for the quantification of brain studies with radioligands and positron emission tomography (PET). In particular: (1) we introduce a definition of binding potential, BPD = DVR − 1, where DVR is the volume of distribution relative to a reference tissue that contains ligand in specifically bound form; (2) we validate a numerical methodology, rank-shaping regularization of exponential spectral analysis (RS-ESA), for the calculation of BPD that can cope with a reference region containing specifically bound ligand; (3) we demonstrate the use of RS-ESA for the accurate estimation of drug occupancies with the use of correction factors to account for the specific binding in the reference. [11C]-DASB with cerebellum as a reference was chosen as an example to validate the methodology. Two data sets were used: four normal subjects scanned after infusion of citalopram or placebo, and a further six test-retest data sets. In the drug occupancy study, the use of RS-ESA with cerebellar input plus corrections produced estimates of occupancy very close to those obtained with plasma input. Test-retest results demonstrated a tight linear relationship between BPD calculated with either plasma or reference input, as well as high reproducibility.
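For reference, the two standard relations used above can be written directly; the paper's correction factors for specific binding in the cerebellar reference are not reproduced.

```python
# Standard formulas only: binding potential relative to a reference region and
# the usual fractional occupancy estimate.
def binding_potential(dvr):
    """BPD = DVR - 1, with DVR the volume of distribution relative to the reference."""
    return dvr - 1.0

def occupancy(bp_baseline, bp_drug):
    """Fractional receptor occupancy from baseline and post-drug binding potentials."""
    return 1.0 - bp_drug / bp_baseline
```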


2002 ◽  
Vol 2 ◽  
pp. 169-189 ◽  
Author(s):  
Lawrence W. Barnthouse ◽  
Douglas G. Heimbuch ◽  
Vaughn C. Anthony ◽  
Ray W. Hilborn ◽  
Ransom A. Myers

We evaluated the impacts of entrainment and impingement at the Salem Generating Station on fish populations and communities in the Delaware Estuary. In the absence of an agreed-upon regulatory definition of “adverse environmental impact” (AEI), we developed three independent benchmarks of AEI based on observed or predicted changes that could threaten the sustainability of a population or the integrity of a community. Our benchmarks of AEI included: (1) disruption of the balanced indigenous community of fish in the vicinity of Salem (the “BIC” analysis); (2) a continued downward trend in the abundance of one or more susceptible fish species (the “Trends” analysis); and (3) occurrence of entrainment/impingement mortality sufficient, in combination with fishing mortality, to jeopardize the future sustainability of one or more populations (the “Stock Jeopardy” analysis). The BIC analysis utilized nearly 30 years of species presence/absence data collected in the immediate vicinity of Salem. The Trends analysis examined three independent data sets that document trends in the abundance of juvenile fish throughout the estuary over the past 20 years. The Stock Jeopardy analysis used two different assessment models to quantify potential long-term impacts of entrainment and impingement on susceptible fish populations. For one of these models, the compensatory capacities of the modeled species were quantified through meta-analysis of spawner-recruit data available for several hundred fish stocks. All three analyses indicated that the fish populations and communities of the Delaware Estuary are healthy and show no evidence of an adverse impact due to Salem. Although the specific models and analyses used at Salem are not applicable to every facility, we believe that a weight of evidence approach that evaluates multiple benchmarks of AEI using both retrospective and predictive methods is the best approach for assessing entrainment and impingement impacts at existing facilities.


2020 ◽  
Author(s):  
Yeon-Woo Choi ◽  
Alexandre Tuel ◽  
Elfatih A. B. Eltahir

Abstract The evident seasonality of influenza suggests a significant role for weather and climate as one of several determinants of viral respiratory disease (VRD), alongside social determinants that play a major role in shaping these phenomena. Based on the current mechanistic understanding of how VRDs are transmitted by small droplets, we identify an environmental variable, Air Drying Capacity (ADC), as an atmospheric state variable with significant and direct relevance to the transmission of VRD. ADC dictates the evolution and fate of droplets under given temperature and humidity conditions. The definition of this variable is rooted in the Maxwell theory of droplet evolution via coupled heat and mass transfer between droplets and the surrounding environment. We present the climatology of ADC and compare its observed distribution in space and time to the observed prevalence of influenza and COVID-19 from extensive global data sets. Globally, large ADC values appear to significantly constrain the observed transmission and spread of VRD, consistent with the significant coherency of the observed seasonal cycles of ADC and influenza. Our results introduce a new environmental determinant, rooted in the mechanism of VRD transmission, with potential implications for explaining the seasonality of influenza and for describing how environmental conditions may, to some degree, impact the evolution of similar VRDs, such as COVID-19.
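For intuition only, the sketch below evaluates a textbook Maxwell quasi-steady evaporation rate as a function of temperature and relative humidity; it is not the paper's ADC definition, which the abstract does not spell out, and the constants and isothermal-droplet assumption are simplifications.

```python
# Illustrative sketch only: a textbook Maxwell quasi-steady droplet evaporation
# rate, dm/dt = 4*pi*r*D*(rho_s - rho_inf), showing how temperature and relative
# humidity jointly set how fast small droplets dry. Not the paper's ADC index.
import math

R_V = 461.5        # gas constant of water vapour, J kg^-1 K^-1
D_V = 2.5e-5       # diffusivity of water vapour in air, m^2 s^-1 (approximate)

def sat_vapour_density(t_celsius):
    """Saturation vapour density (kg m^-3) from the Magnus approximation."""
    e_s = 611.2 * math.exp(17.67 * t_celsius / (t_celsius + 243.5))   # Pa
    return e_s / (R_V * (t_celsius + 273.15))

def evaporation_rate(radius_m, t_celsius, rel_humidity):
    """Mass-loss rate (kg s^-1) of a droplet of the given radius, assuming the
    droplet is at ambient temperature; larger values mean faster drying."""
    rho_s = sat_vapour_density(t_celsius)
    return 4.0 * math.pi * radius_m * D_V * rho_s * (1.0 - rel_humidity)
```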


2008 ◽  
Vol 3 (3) ◽  
Author(s):  
M. Möderl ◽  
D. Vanham ◽  
S. De Toffo ◽  
W. Rauch

One of the most important aspects in water supply management is supply security. In this article, a methodology is introduced to first identify vulnerable sites of a water supply system (WSS) and then to estimate the potential effect of alpine natural hazards on this system. The approach serves to define zones of low, medium, and high potential risk by combining vulnerability and hazard maps, which makes it possible to target prevention measures at risky sites within the available budget. A management support tool (VulNetWS - Vulnerability of Water Supply Networks) is developed that quantifies vulnerability based on hydraulic and water quality simulations, assuming the failure of each single WSS component in turn. Hazards of flooding, landslide, debris flow, and avalanches are calculated and categorized into low, medium, and high potential hazard zones. For this analysis, different GIS data sets (e.g., Austrian hazard zone maps, HORA “Flood Risk Zoning”) are used. The methodology is demonstrated by applying it to an alpine region encompassing the municipality of Kitzbühel (Tyrol, Austria) and four neighbouring municipalities. The combination of vulnerability and hazard is summarized using a risk matrix that highlights a zone of 0.42 square kilometres within the study area as being potentially risky.
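A minimal sketch of the vulnerability-hazard combination step is shown below; the specific risk-matrix assignments are assumptions for illustration, not those of the article.

```python
# Illustrative sketch (not the VulNetWS tool): combining vulnerability and hazard
# classes into a simple risk matrix as described above. The class assignments in
# this matrix are assumptions, not taken from the article.
RISK_MATRIX = {                                  # (vulnerability, hazard) -> potential risk
    ("low", "low"): "low",      ("low", "medium"): "low",       ("low", "high"): "medium",
    ("medium", "low"): "low",   ("medium", "medium"): "medium", ("medium", "high"): "high",
    ("high", "low"): "medium",  ("high", "medium"): "high",     ("high", "high"): "high",
}

def risk_zone(vulnerability, hazard):
    """Return the potential-risk class for one site/component of the supply network."""
    return RISK_MATRIX[(vulnerability, hazard)]
```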

