Machine-learning for cluster analysis of localization microscopy data.

Mapping Intimacies ◽

10.1101/505719 ◽

2018 ◽

Author(s):

David J Williamson ◽

Garth L Burn ◽

Juliette Griffie ◽

Daniel M Davis ◽

Dylan M Owen

Keyword(s):

Machine Learning ◽

Cluster Analysis ◽

Single Molecule ◽

Large Scale ◽

Supervised Machine Learning ◽

Point Pattern ◽

Data Set ◽

Localization Microscopy ◽

Human T Cell ◽

Microscopy Data

Quantifying the clustering of points within single-molecule localization microscopy data is useful to understanding the spatial relationships of the molecules in the underlying sample. The conversion of point pattern data into a meaningful description of clustering is difficult, especially for biologically derived data, as the definitions of clustering are often subjective or simplistic. Many existing computational approaches are also limited in their ability to process large-scale data-sets or to deal effectively with inhomogeneities in clustering. Here we have developed a supervised machine-learning approach to cluster analysis which is fast and accurate. Trained on a variety of simulated clustered data, the network can then classify all points from a typical localization microscopy data-set (several million points from the entire field of view) as being either clustered or not-clustered, with the potential to include additional classifiers to describe different types of clusters. Clustered points can then be further refined into like-clusters for the measurement of cluster area, shape, and point-density. We demonstrate the performance on simulated data and experimental data of the kinase Csk and the adaptor PAG in both naive and pre-stimulated primary human T cell synapses.

Download Full-text

Machine learning for cluster analysis of localization microscopy data

Nature Communications ◽

10.1038/s41467-020-15293-x ◽

2020 ◽

Vol 11 (1) ◽

Cited By ~ 7

Author(s):

David J. Williamson ◽

Garth L. Burn ◽

Sabrina Simoncelli ◽

Juliette Griffié ◽

Ruby Peters ◽

...

Keyword(s):

Machine Learning ◽

Cluster Analysis ◽

Localization Microscopy ◽

Microscopy Data

Download Full-text

A Bayesian cluster analysis method for single-molecule localization microscopy data

Nature Protocols ◽

10.1038/nprot.2016.149 ◽

2016 ◽

Vol 11 (12) ◽

pp. 2499-2514 ◽

Cited By ~ 28

Author(s):

Juliette Griffié ◽

Michael Shannon ◽

Claire L Bromley ◽

Lies Boelen ◽

Garth L Burn ◽

...

Keyword(s):

Cluster Analysis ◽

Single Molecule ◽

Analysis Method ◽

Localization Microscopy ◽

Bayesian Cluster Analysis ◽

Cluster Analysis Method ◽

Microscopy Data ◽

Bayesian Cluster ◽

Single Molecule Localization Microscopy

Download Full-text

Automated Analysis of Fluorescence Kinetics in Single-Molecule Localization Microscopy Data Reveals Protein Stoichiometry

The Journal of Physical Chemistry B ◽

10.1021/acs.jpcb.1c01130 ◽

2021 ◽

Author(s):

Alon Saguy ◽

Tim N. Baldering ◽

Lucien E. Weiss ◽

Elias Nehme ◽

Christos Karathanasis ◽

...

Keyword(s):

Single Molecule ◽

Automated Analysis ◽

Fluorescence Kinetics ◽

Localization Microscopy ◽

Microscopy Data ◽

Single Molecule Localization Microscopy

Download Full-text

Exploring fake news identification using word and sentence embeddings

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-189865 ◽

2021 ◽

pp. 1-8

Author(s):

V.T Priyanga ◽

J.P Sanjanasri ◽

Vijay Krishna Menon ◽

E.A Gopalakrishnan ◽

K.P Soman

Keyword(s):

Machine Learning ◽

Social Media ◽

Network Analysis ◽

Supervised Machine Learning ◽

Breeding Ground ◽

Fake News ◽

Data Set ◽

Highly Correlated ◽

Use Of Social Media ◽

The Liar

The widespread use of social media like Facebook, Twitter, Whatsapp, etc. has changed the way News is created and published; accessing news has become easy and inexpensive. However, the scale of usage and inability to moderate the content has made social media, a breeding ground for the circulation of fake news. Fake news is deliberately created either to increase the readership or disrupt the order in the society for political and commercial benefits. It is of paramount importance to identify and filter out fake news especially in democratic societies. Most existing methods for detecting fake news involve traditional supervised machine learning which has been quite ineffective. In this paper, we are analyzing word embedding features that can tell apart fake news from true news. We use the LIAR and ISOT data set. We churn out highly correlated news data from the entire data set by using cosine similarity and other such metrices, in order to distinguish their domains based on central topics. We then employ auto-encoders to detect and differentiate between true and fake news while also exploring their separability through network analysis.

Download Full-text

Machine learning identifies an immunological pattern associated with multiple juvenile idiopathic arthritis subtypes

Annals of the Rheumatic Diseases ◽

10.1136/annrheumdis-2018-214354 ◽

2019 ◽

Vol 78 (5) ◽

pp. 617-628 ◽

Cited By ~ 5

Author(s):

Erika Van Nieuwenhove ◽

Vasiliki Lagou ◽

Lien Van Eyck ◽

James Dooley ◽

Ulrich Bodenhofer ◽

...

Keyword(s):

Machine Learning ◽

Juvenile Idiopathic Arthritis ◽

Large Scale ◽

Inflammatory Diseases ◽

Adaptive Immune System ◽

Healthy Children ◽

Learning Approaches ◽

Data Set ◽

Immune Signature ◽

Systemic Jia

ObjectivesJuvenile idiopathic arthritis (JIA) is the most common class of childhood rheumatic diseases, with distinct disease subsets that may have diverging pathophysiological origins. Both adaptive and innate immune processes have been proposed as primary drivers, which may account for the observed clinical heterogeneity, but few high-depth studies have been performed.MethodsHere we profiled the adaptive immune system of 85 patients with JIA and 43 age-matched controls with indepth flow cytometry and machine learning approaches.ResultsImmune profiling identified immunological changes in patients with JIA. This immune signature was shared across a broad spectrum of childhood inflammatory diseases. The immune signature was identified in clinically distinct subsets of JIA, but was accentuated in patients with systemic JIA and those patients with active disease. Despite the extensive overlap in the immunological spectrum exhibited by healthy children and patients with JIA, machine learning analysis of the data set proved capable of discriminating patients with JIA from healthy controls with ~90% accuracy.ConclusionsThese results pave the way for large-scale immune phenotyping longitudinal studies of JIA. The ability to discriminate between patients with JIA and healthy individuals provides proof of principle for the use of machine learning to identify immune signatures that are predictive to treatment response group.

Download Full-text

Leveraging Road Characteristics and Contributor Behaviour for Assessing Road Type Quality in OSM

ISPRS International Journal of Geo-Information ◽

10.3390/ijgi10070436 ◽

2021 ◽

Vol 10 (7) ◽

pp. 436

Author(s):

Amerah Alghanim ◽

Musfira Jilani ◽

Michela Bertolotto ◽

Gavin McArdle

Keyword(s):

Machine Learning ◽

Spatial Data ◽

Classification Accuracy ◽

Supervised Machine Learning ◽

Machine Learning Techniques ◽

Data Set ◽

Semantic Inference ◽

Road Type ◽

The Impact

Volunteered Geographic Information (VGI) is often collected by non-expert users. This raises concerns about the quality and veracity of such data. There has been much effort to understand and quantify the quality of VGI. Extrinsic measures which compare VGI to authoritative data sources such as National Mapping Agencies are common but the cost and slow update frequency of such data hinder the task. On the other hand, intrinsic measures which compare the data to heuristics or models built from the VGI data are becoming increasingly popular. Supervised machine learning techniques are particularly suitable for intrinsic measures of quality where they can infer and predict the properties of spatial data. In this article we are interested in assessing the quality of semantic information, such as the road type, associated with data in OpenStreetMap (OSM). We have developed a machine learning approach which utilises new intrinsic input features collected from the VGI dataset. Specifically, using our proposed novel approach we obtained an average classification accuracy of 84.12%. This result outperforms existing techniques on the same semantic inference task. The trustworthiness of the data used for developing and training machine learning models is important. To address this issue we have also developed a new measure for this using direct and indirect characteristics of OSM data such as its edit history along with an assessment of the users who contributed the data. An evaluation of the impact of data determined to be trustworthy within the machine learning model shows that the trusted data collected with the new approach improves the prediction accuracy of our machine learning technique. Specifically, our results demonstrate that the classification accuracy of our developed model is 87.75% when applied to a trusted dataset and 57.98% when applied to an untrusted dataset. Consequently, such results can be used to assess the quality of OSM and suggest improvements to the data set.

Download Full-text

Different firm responses to the COVID-19 pandemic shocks: machine-learning evidence on the Vietnamese labor market

International Journal of Emerging Markets ◽

10.1108/ijoem-02-2021-0292 ◽

2021 ◽

Vol ahead-of-print (ahead-of-print) ◽

Author(s):

Lam Hoang Viet Le ◽

Toan Luu Duc Huynh ◽

Bryan S. Weber ◽

Bao Khac Quoc Nguyen

Keyword(s):

Machine Learning ◽

Labor Market ◽

Large Scale ◽

Government Support ◽

Policy Implications ◽

Machine Learning Techniques ◽

Firm Characteristics ◽

Data Set ◽

Content Type ◽

Firm Responses

PurposeThis paper aims to identify the disproportionate impacts of the COVID-19 pandemic on labor markets.Design/methodology/approachThe authors conduct a large-scale survey on 16,000 firms from 82 industries in Ho Chi Minh City, Vietnam, and analyze the data set by using different machine-learning methods.FindingsFirst, job loss and reduction in state-owned enterprises have been significantly larger than in other types of organizations. Second, employees of foreign direct investment enterprises suffer a significantly lower labor income than those of other groups. Third, the adverse effects of the COVID-19 pandemic on the labor market are heterogeneous across industries and geographies. Finally, firms with high revenue in 2019 are more likely to adopt preventive measures, including the reduction of labor forces. The authors also find a significant correlation between firms' revenue and labor reduction as traditional econometrics and machine-learning techniques suggest.Originality/valueThis study has two main policy implications. First, although government support through taxes has been provided, the authors highlight evidence that there may be some additional benefit from targeting firms that have characteristics associated with layoffs or other negative labor responses. Second, the authors provide information that shows which firm characteristics are associated with particular labor market responses such as layoffs, which may help target stimulus packages. Although the COVID-19 pandemic affects most industries and occupations, heterogeneous firm responses suggest that there could be several varieties of targeted policies-targeting firms that are likely to reduce labor forces or firms likely to face reduced revenue. In this paper, the authors outline several industries and firm characteristics which appear to more directly be reducing employee counts or having negative labor responses which may lead to more cost–effect stimulus.

Download Full-text

A Machine Learning Pipeline for Demand Response Capacity Scheduling

Energies ◽

10.3390/en13071848 ◽

2020 ◽

Vol 13 (7) ◽

pp. 1848 ◽

Cited By ~ 1

Author(s):

Gautham Krishnadas ◽

Aristides Kiprakis

Keyword(s):

Machine Learning ◽

Energy Balance ◽

Smart Grid ◽

Demand Response ◽

Large Scale ◽

Performance Metrics ◽

Supervised Machine Learning ◽

Algorithm Selection ◽

Load Forecast ◽

Forecast Models

Demand response (DR) is an integral component of smart grid operations that offers the necessary flexibility to support its decarbonisation. In incentive-based DR programs, deviations from the scheduled DR capacity affect the grid’s energy balance and result in revenue losses for the DR participants. This issue aggravates with increasing DR delivery from participants such as large consumer buildings who have limited standard methods to follow for DR capacity scheduling. Load curtailment based DR capacity availability from such consumers can be forecasted reliably with the help of supervised machine learning (ML) models. This study demonstrates the development of data-driven ML based total and flexible load forecast models for a retail building. The ML model development tasks such as data pre-processing, training-testing dataset preparation, cross-validation, algorithm selection, hyperparameter optimisation, feature ranking, model selection and model evaluation are guided by deployment-centric design criteria such as reliability, computational efficiency and scalability. Based on the selected performance metrics, the day-ahead and week-ahead ML based load forecast models developed for the retail building are shown to outperform the timeseries persistence models used for benchmarking. Furthermore, the deployment of these models for DR capacity scheduling is proposed as an ML pipeline that can be realised with the help of ML workflows, computational resources as well as systems for monitoring and visualisation. The ML pipeline ensures faster, cost-effective and large-scale deployment of forecast models that support reliable DR capacity scheduling without affecting the grid’s energy balance. Minimisation of revenue losses encourages increased DR participation from large consumer buildings, ensuring further flexibility in the smart grid.

Download Full-text

SMoLR: visualization and analysis of single-molecule localization microscopy data in R

BMC Bioinformatics ◽

10.1186/s12859-018-2578-3 ◽

2019 ◽

Vol 20 (1) ◽

Cited By ~ 5

Author(s):

Maarten W. Paul ◽

H. Martijn de Gruiter ◽

Zhanmin Lin ◽

Willy M. Baarends ◽

Wiggert A. van Cappellen ◽

...

Keyword(s):

Single Molecule ◽

Localization Microscopy ◽

Microscopy Data ◽

Single Molecule Localization Microscopy

Download Full-text

On the Influence of Contextual Features for the Identification of Complex Words

International Journal of Semantic Computing ◽

10.1142/s1793351x17400207 ◽

2017 ◽

Vol 11 (04) ◽

pp. 497-511

Author(s):

Elnaz Davoodi ◽

Leila Kosseim ◽

Matthew Mongrain

Keyword(s):

Machine Learning ◽

Natural Language ◽

Target Word ◽

Supervised Machine Learning ◽

Learning Models ◽

Data Set ◽

Contextual Features ◽

Complex Words ◽

Machine Learning Models

This paper evaluates the effect of the context of a target word on the identification of complex words in natural language texts. The approach automatically tags words as either complex or not, based on two sets of features: base features that only pertain to the target word, and contextual features that take the context of the target word into account. We experimented with several supervised machine learning models, and trained and tested the approach with the 2016 SemEval Word Complexity Data Set. Results show that when discriminating base features are used, the words around the target word can supplement those features and improve the recognition of complex words.

Download Full-text