Analysis of Local Variability and Allostery in Macromolecular Assemblies using Cryo-EM and Focused Classification

Mapping Intimacies ◽

10.1101/365940 ◽

2018 ◽

Cited By ~ 1

Author(s):

Cheng Zhang ◽

William Cantara ◽

Youngmin Jeon ◽

Karin Musier-Forsyth ◽

Nikolaus Grigorieff ◽

...

Keyword(s):

Image Classification ◽

Single Particle ◽

Ribosomal Subunit ◽

Simulated Data ◽

Region Of Interest ◽

List Type ◽

Classification Methods ◽

Experimental Dataset ◽

Macromolecular Assemblies ◽

Structural Mobility

Abstract: Single-particle electron cryo-microscopy and computational image classification can be used to analyze structural variability in macromolecules and their assemblies. In some cases, a particle may contain different regions that each display a range of distinct conformations. We have developed strategies, implemented within the Frealign and cisTEM image processing packages, to focus classification on specific regions of a particle and detect potential covariance. The strategies are based on masking the region of interest using either a 2-D mask applied to reference projections and particle images, or a 3-D mask applied to the 3-D volume. We show that focused classification approaches can be used to study structural allostery, a concept that is likely to gain more importance as datasets grow in size, allowing the distinction of more structural states and smaller differences between states. Finally, we apply the approaches to an experimental dataset containing the HIV-1 Transactivation Response (TAR) element RNA fused into the large bacterial ribosomal subunit, to deconvolve structural mobility within localized regions of interest.

Highlights:

Description of different image classification strategies in single-particle cryo-EM

Quantitative evaluation of two classification methods using simulated data

Application of the two classification methods to an experimental dataset
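The 2-D masking idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the Frealign or cisTEM implementation; the function names `circular_mask` and `apply_mask`, the binary (hard-edged) mask, and the toy 5x5 image are all assumptions made for illustration.

```python
def circular_mask(size, center, radius):
    """Return a size x size binary mask: 1.0 inside the circle, 0.0 outside.

    Hypothetical helper; real packages typically use soft-edged masks.
    """
    cy, cx = center
    return [
        [1.0 if (y - cy) ** 2 + (x - cx) ** 2 <= radius ** 2 else 0.0
         for x in range(size)]
        for y in range(size)
    ]


def apply_mask(image, mask):
    """Multiply an image by a mask pixel by pixel, keeping only the region of interest."""
    return [
        [pixel * weight for pixel, weight in zip(img_row, mask_row)]
        for img_row, mask_row in zip(image, mask)
    ]


# Toy 5x5 "particle image" with constant intensity (illustrative data only).
image = [[2.0] * 5 for _ in range(5)]
mask = circular_mask(size=5, center=(2, 2), radius=1)
masked = apply_mask(image, mask)
```

In a focused-classification setting, the same mask would be applied to both the reference projection and the particle image so that only the region of interest contributes to their comparison.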


Particle Based Model for Airborne Disease Transmission

10.1101/2020.04.23.20076273 ◽

2020 ◽

Author(s):

Michael B Dillon ◽

Charles F Dillon

Keyword(s):

Single Particle ◽

Disease Transmission ◽

A Priori ◽

Region Of Interest ◽

Infectious Agent ◽

Geographic Region ◽

Disease Spread ◽

List Type ◽

Executive Summary ◽

Airborne Disease

Executive Summary: Prior literature documents cases of airborne infectious disease transmission at distances ranging from ≥ 2 m to inter-continental in scale. Physics- and biology-based models describe the key aspects of these airborne disease transmission events, but important gaps remain. This report extends current approaches by developing a new, single-particle based theory that (a) assesses the likelihood of rare airborne infections (where individuals inhale either one or no infectious particles) and (b) explicitly accounts for the variability in airborne exposures and population susceptibilities within a geographic region of interest. For these hazards, airborne particle fate and transport is independent of particulate concentration, and so results for complex releases can be determined from the results of many single-particle releases.

This work is intended to provide context for both (a) the initial stages of a disease outbreak and (b) larger scale (≥ 2 m) disease spread, including distant disease “sparks” (low probability, unexpected disease transmission events that infect remote, susceptible populations). The physics of airborne particulate dispersion inherently constrains airborne disease transmission. As such, this work suggests results that, a priori, may be applicable to many airborne diseases.

Model Predictions: Modeling predictions of the single-particle transmission kernel suggest that outdoor airborne disease transmission events may occur episodically, as the infection probabilities can vary over many orders of magnitude depending on the distance downwind; the specific virus, prion, or microorganism; and meteorological conditions.

Model results suggest that, under the right conditions, an indoor infected person could spread disease to a similar, or greater, number of people downwind than in the building they occupy. However, the downwind, per-person infection probability is predicted to be lower than the within-building, per-person infection probability. This finding is limited to airborne transmission considerations.

This work suggests a new relative disease probability metric for airborne transmitted diseases. This metric, which is distinct from the traditional relative risk metric, is applicable when the rate at which the infectious agent loses infectivity in the atmosphere is ≲ 1 h⁻¹.
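The statement that complex releases can be built up from single-particle results can be illustrated with a generic probability sketch. This is not the report's model: it assumes that each released particle independently infects a given person with the same small probability `p_single`, and the function names and numbers are hypothetical.

```python
import math


def infection_probability(p_single, n_particles):
    """Probability of at least one infection from n independent particles.

    Computes 1 - (1 - p_single) ** n_particles for 0 <= p_single < 1.
    log1p and expm1 preserve precision when p_single is tiny.
    """
    return -math.expm1(n_particles * math.log1p(-p_single))


def rare_event_approximation(p_single, n_particles):
    """Linear approximation, valid when n_particles * p_single is much less than 1."""
    return n_particles * p_single


# Illustrative values only: a very small per-particle probability.
exact = infection_probability(1e-9, 1000)
approx = rare_event_approximation(1e-9, 1000)
```

In the rare-infection regime the two values nearly coincide, which is why results from many single-particle releases can simply be summed under these assumptions.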


IMAGE CLASSIFICATION METHODS IN THE SPACE OF DESCRIPTIONS IN THE FORM OF A SET OF THE KEY POINT DESCRIPTORS

Telecommunications and Radio Engineering ◽

10.1615/telecomradeng.v77.i9.40 ◽

2018 ◽

Vol 77 (9) ◽

pp. 787-797 ◽

Cited By ~ 2

Author(s):

V. A. Gorokhovatskyi

Keyword(s):

Image Classification ◽

Classification Methods


Modeling Linkage Disequilibrium and Identifying Recombination Hotspots Using Single-Nucleotide Polymorphism Data

Genetics ◽

10.1093/genetics/165.4.2213 ◽

2003 ◽

Vol 165 (4) ◽

pp. 2213-2233 ◽

Cited By ~ 41

Author(s):

Na Li ◽

Matthew Stephens

Keyword(s):

Linkage Disequilibrium ◽

Recombination Rate ◽

Population Sample ◽

Simulated Data ◽

Region Of Interest ◽

Population Data ◽

Recombination Rates ◽

Single Nucleotide ◽

Recombination Hotspots ◽

Genomic Regions

Abstract: We introduce a new statistical model for patterns of linkage disequilibrium (LD) among multiple SNPs in a population sample. The model overcomes limitations of existing approaches to understanding, summarizing, and interpreting LD by (i) relating patterns of LD directly to the underlying recombination process; (ii) considering all loci simultaneously, rather than pairwise; (iii) avoiding the assumption that LD necessarily has a “block-like” structure; and (iv) being computationally tractable for huge genomic regions (up to complete chromosomes). We examine in detail one natural application of the model: estimation of underlying recombination rates from population data. Using simulation, we show that in the case where recombination is assumed constant across the region of interest, recombination rate estimates based on our model are competitive with the very best of the currently available methods. More importantly, we demonstrate, on real and simulated data, the potential of the model to help identify and quantify fine-scale variation in recombination rate from population data. We also outline how the model could be useful in other contexts, such as in the development of more efficient haplotype-based methods for LD mapping.


A survey of image classification methods and techniques

2014 International Conference on Control, Instrumentation, Communication and Computational Technologies (ICCICCT) ◽

10.1109/iccicct.2014.6993023 ◽

2014 ◽

Cited By ~ 26

Author(s):

Siddhartha Sankar Nath ◽

Girish Mishra ◽

Jajnyaseni Kar ◽

Sayan Chakraborty ◽

Nilanjan Dey

Keyword(s):

Image Classification ◽

Classification Methods ◽

Methods And Techniques


Effective combining of color and texture descriptors for indoor-outdoor image classification

Facta universitatis - series Electronics and Energetics ◽

10.2298/fuee1403399c ◽

2014 ◽

Vol 27 (3) ◽

pp. 399-410 ◽

Cited By ~ 2

Author(s):

Stevica Cvetkovic ◽

Sasa Nikolic ◽

Slobodan Ilic

Keyword(s):

Image Classification ◽

Accurate Method ◽

Optimal Combination ◽

Compact Representation ◽

Svm Classifier ◽

Classification Methods ◽

Empirical Tests ◽

Public Datasets ◽

Fast Extraction ◽

Classification Procedures

Although many indoor-outdoor image classification methods have been proposed in the literature, most of them have omitted comparison with basic methods to justify the need for complex feature extraction and classification procedures. In this paper we propose a relatively simple but highly accurate method for indoor-outdoor image classification, based on a combination of carefully engineered MPEG-7 color and texture descriptors. To determine the optimal combination of descriptors in terms of fast extraction, compact representation and high accuracy, we conducted comprehensive empirical tests over several color and texture descriptors. The descriptor combination was used for training and testing a binary SVM classifier. We have shown that proper descriptor preprocessing before SVM classification has a significant impact on the final result. Comprehensive experimental evaluation shows that the proposed method outperforms several more complex indoor-outdoor image classification techniques on a couple of public datasets.


Detection and Analysis of the Causes of Intensive Harmful Algal Bloom in Kamchatka Based on Satellite Data

Journal of Marine Science and Engineering ◽

10.3390/jmse9101092 ◽

2021 ◽

Vol 9 (10) ◽

pp. 1092

Author(s):

Valery Bondur ◽

Viktor Zamshin ◽

Olga Chvertkova ◽

Ekaterina Matrosova ◽

Vasilisa Khodaeva

Keyword(s):

Chlorophyll A ◽

Suspended Matter ◽

Algal Bloom ◽

Harmful Algal Bloom ◽

Simulated Data ◽

Region Of Interest ◽

Chlorophyll A Concentration ◽

Ocean Level ◽

Radar Satellite ◽

Sentinel 2A

In this paper, the causes of the anomalous harmful algal bloom which occurred in the fall of 2020 in Kamchatka have been detected and analyzed using a long-term time series of heterogeneous satellite and simulated data with respect to the sea surface height (HYCOM) and temperature (NOAA OISST), chlorophyll-a concentration (MODIS Ocean Color SMI), slick parameters (SENTINEL-1A/B), and suspended matter characteristics (SENTINEL-2A/B, C2RCC algorithm). It has been found that the harmful algal bloom was preceded by temperature anomalies (reaching 6 °C, exceeding the climatic norm by more than three standard deviation intervals) and intensive ocean level variability followed by the generation of vortices, mixing water masses and providing nutrients to the upper photic layer. The harmful algal bloom itself was manifested in an increase in the concentration of chlorophyll-a: its average monthly value for October 2020 (bloom peak) approached 15 mg/m³, exceeding the climatic norm almost four-fold for the region of interest (Avacha Gulf). The zones of accumulation of a large amount of biogenic surfactant films registered in radar satellite imagery correlate well with the local regions of the highest chlorophyll-a concentration. The harmful bloom was influenced by river runoff, which intensively brought mineral and biogenic suspensions into the marine environment (the concentration of total suspended matter within the plume of the Nalycheva River reached 10 mg/m³ and more in 2020), expanding food resources for microalgae.


Extracting image features for classification by two-tier genetic programming

10.26686/wgtn.13150940 ◽

2020 ◽

Author(s):

Harith Al-Sahaf ◽

A Song ◽

K Neshatian ◽

Mengjie Zhang

Keyword(s):

Genetic Programming ◽

Image Classification ◽

Domain Knowledge ◽

Extraction Process ◽

High Accuracy ◽

Classification Performance ◽

Image Features ◽

Classification Methods ◽

Feature Based ◽

Second Tier

Image classification is a complex but important task, especially in areas of machine vision and image analysis such as remote sensing and face recognition. One of the challenges in image classification is finding an optimal set of features for a particular task, because the choice of features has a direct impact on classification performance. However, the goodness of a feature is highly problem dependent, and domain knowledge is often required. To address these issues we introduce a Genetic Programming (GP) based image classification method, Two-Tier GP, which operates directly on raw pixels rather than features. The first tier in a classifier automatically defines features based on raw image input, while the second tier makes the decision. Compared to conventional feature-based image classification methods, Two-Tier GP achieved better accuracies on a range of different tasks. Furthermore, by using the features defined by the first tier of these Two-Tier GP classifiers, conventional classification methods obtained higher accuracies than when classifying on manually designed features. Analysis of evolved Two-Tier image classifiers shows that there are genuine features captured in the programs and that the mechanism of achieving high accuracy can be revealed. The Two-Tier GP method has clear advantages in image classification, such as high accuracy, good interpretability and the removal of an explicit feature extraction process. © 2012 IEEE.


AIRBORNE HYPERSPECTRAL REMOTE SENSING FOR IDENTIFICATION GRASSLAND VEGETATION

ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences ◽

10.5194/isprsarchives-xl-3-w3-427-2015 ◽

2015 ◽

Vol XL-3/W3 ◽

pp. 427-431 ◽

Cited By ~ 1

Author(s):

P. Burai ◽

T. Tomor ◽

L. Bekő ◽

B. Deák

Keyword(s):

Image Classification ◽

Learning Algorithm ◽

Training Sample ◽

Hyperspectral Data ◽

Training Dataset ◽

Classification Methods ◽

Grassland Vegetation ◽

Training Samples ◽

Almost All ◽

Noise Fraction

In our study we classified grassland vegetation types of an alkali landscape (Eastern Hungary), using different image classification methods for hyperspectral data. Our aim was to test the applicability of hyperspectral data in this complex system using various image classification methods. To reach the highest classification accuracy, we compared the performance of traditional image classifiers, machine learning algorithms, feature extraction (MNF transformation) and various sizes of training dataset. Hyperspectral images were acquired by an AISA EAGLE II hyperspectral sensor with 128 contiguous bands (400–1000 nm), a spectral sampling of 5 nm bandwidth and a ground pixel size of 1 m. We used twenty vegetation classes, which were compiled based on the characteristic dominant species, canopy height, and total vegetation cover. Image classification was applied to the original and MNF (minimum noise fraction) transformed datasets using various training sample sizes between 10 and 30 pixels. In the case of the original bands, both SVM and RF classifiers provided high accuracy for almost all classes irrespective of the number of training pixels. We found that SVM and RF produced the best accuracy with the first nine MNF transformed bands. Our results suggest that in complex open landscapes, application of SVM can be a feasible solution, as this method provides higher accuracies compared to RF and MLC. SVM was not sensitive to the size of the training samples, which makes it an adequate tool for cases when the available number of training pixels is limited for some classes.


The effects of data adequacy and calibration size on the accuracy of presence-only species distribution models

10.1101/775700 ◽

2019 ◽

Author(s):

Truly Santika ◽

Michael F. Hutchinson ◽

Kerrie A. Wilson

Keyword(s):

Performance Measures ◽

Species Distribution ◽

Species Distribution Models ◽

Simulated Data ◽

List Type ◽

Model Accuracy ◽

Distribution Models ◽

Accuracy Measure ◽

Data Coverage ◽

True Distribution

Abstract: Presence-only data used to develop species distribution models are often biased towards areas that are frequently surveyed. Furthermore, the size of the calibration area with respect to the area covered by the species occurrences has been shown to affect model accuracy. However, existing assessments of the effect of data inadequacy and calibration size on model accuracy have predominantly been conducted using empirical studies. These studies can give ambiguous results, since the data used to train and test the model can both be biased.

These limitations were addressed by applying simulated data to assess how inadequate data coverage and the size of the calibration area affect the accuracy of species distribution models generated by MaxEnt and BIOCLIM. The validity of four presence-only performance measures, Contrast Validation Index (CVI), Boyce index, AUC and AUCratio, was also assessed.

CVI, AUC and AUCratio ranked the accuracy of univariate models correctly according to the true importance of their defining environmental variable, a desirable property of an accuracy measure. In contrast, the Boyce index failed to rank the accuracy of univariate models correctly, and a high percentage of irrelevant variables produced models with a high Boyce index.

Inadequate data coverage and increased calibration area reduced model accuracy by reducing the correct identification of the dominant environmental determinant. BIOCLIM outperformed MaxEnt models in predicting the true distribution of simulated species with a symmetric dominant response. However, MaxEnt outperformed BIOCLIM in predicting the true distribution of simulated species with skew and linear dominant responses. Despite this, the standard performance measures consistently overestimated the performance of MaxEnt models and showed them as always having higher model accuracy than the BIOCLIM models.

It has been acknowledged that research should be directed towards testing and improving species distribution modelling tools, particularly how to handle the inevitable bias and scarcity of species occurrence data. Simulated data, as demonstrated here, provides a powerful approach to comprehensively test the performance of modelling tools and to disentangle the effects of data properties and modelling options on model accuracy. This may be impossible to achieve using real-world data.
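Of the performance measures named in the abstract, AUC has a simple rank-based definition that can be sketched directly. The example below is a generic illustration, not the study's evaluation code: it assumes the common presence-only convention of comparing model scores at presence sites against scores at background sites, and the function name `auc` and the sample scores are hypothetical.

```python
def auc(presence_scores, background_scores):
    """Rank-based AUC: the fraction of (presence, background) pairs in which
    the presence site receives the higher score, counting ties as one half."""
    wins = 0.0
    for p in presence_scores:
        for b in background_scores:
            if p > b:
                wins += 1.0
            elif p == b:
                wins += 0.5
    return wins / (len(presence_scores) * len(background_scores))


# Illustrative scores only: three of the four pairs rank the presence site higher.
example = auc([0.9, 0.4], [0.6, 0.2])
```

A value of 0.5 corresponds to a model that ranks presence and background sites no better than chance, and 1.0 to perfect separation.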
