Leveraging Collaborative-Filtering for Personalized Behavior Modeling

Xuhai Xu; Prerna Chikersal; Janine M. Dutcher; Yasaman S. Sefidgar; Woosuk Seo; Michael J. Tumminia; Daniella K. Villalba; Sheldon Cohen; Kasey G. Creswell; J. David Creswell; Afsaneh Doryab; Paula S. Nurius; Eve Riskin; Anind K. Dey; Jennifer Mankoff

doi:10.1145/3448107

Proceedings of the ACM on Interactive Mobile Wearable and Ubiquitous Technologies ◽

10.1145/3448107 ◽

2021 ◽

Vol 5 (1) ◽

pp. 1-27

Author(s):

Xuhai Xu ◽

Prerna Chikersal ◽

Janine M. Dutcher ◽

Yasaman S. Sefidgar ◽

Woosuk Seo ◽

...

Keyword(s):

College Students ◽

Collaborative Filtering ◽

Human Behavior ◽

Work Performance ◽

Performance Metrics ◽

Learning Algorithm ◽

Model Performance ◽

Majority Voting ◽

Behavior Modeling ◽

Depression Detection

The prevalence of mobile phones and wearable devices enables the passive capturing and modeling of human behavior at an unprecedented resolution and scale. Past research has demonstrated the capability of mobile sensing to model aspects of physical health, mental health, education, and work performance. However, most of the algorithms and models proposed in previous work follow a one-size-fits-all (i.e., population modeling) approach that looks for common behaviors among all users. This disregards the fact that individuals can behave very differently, and it reduces model performance. Further, the black-box models that are often used do not allow for interpretability or an understanding of human behavior. We present a new method to address the problems of personalized behavior classification and interpretability, and apply it to depression detection among college students. Inspired by the idea of collaborative filtering, our method is a type of memory-based learning algorithm. It leverages the relevance of mobile-sensed behavior features among individuals to calculate personalized relevance weights, which are used to impute missing data and select features according to a specific modeling goal (e.g., whether the student has depressive symptoms) in different time epochs, i.e., times of the day and days of the week. It then compiles features from epochs using majority voting to obtain the final prediction. We apply our algorithm to a depression detection dataset collected from first-year college students with low data-missing rates and show that our method outperforms the state-of-the-art machine learning model by 5.1% in accuracy and 5.5% in F1 score. We further verify the pipeline-level generalizability of our approach by achieving similar results on a second dataset, with an average improvement of 3.4% across performance metrics. Beyond achieving better classification performance, our novel approach can also generate personalized interpretations of the models for each individual. These interpretations are supported by existing depression-related literature and can potentially inspire automated and personalized depression intervention design in the future.
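The two mechanisms named in this abstract, similarity-weighted imputation of missing features and majority voting over time epochs, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the cosine similarity measure, the function names, and the toy data are all assumptions made for illustration.

```python
from collections import Counter
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity over the positions where both vectors are observed."""
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not pairs:
        return 0.0
    dot = sum(x * y for x, y in pairs)
    norm_a = sqrt(sum(x * x for x, _ in pairs))
    norm_b = sqrt(sum(y * y for _, y in pairs))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def impute_missing(target, others):
    """Fill None entries in `target` with a similarity-weighted average of `others`.

    Assumption: negative similarities are clipped to zero so that dissimilar
    users contribute nothing to the imputed value.
    """
    weights = [max(cosine_similarity(target, o), 0.0) for o in others]
    filled = list(target)
    for i, value in enumerate(target):
        if value is not None:
            continue
        num = sum(w * o[i] for w, o in zip(weights, others) if o[i] is not None)
        den = sum(w for w, o in zip(weights, others) if o[i] is not None)
        filled[i] = num / den if den > 0 else None
    return filled


def majority_vote(epoch_predictions):
    """Return the most common label among the per-epoch predictions."""
    return Counter(epoch_predictions).most_common(1)[0][0]


# Hypothetical feature vectors: the third feature is missing for the target user.
target_user = [1.0, 0.0, None]
other_users = [[1.0, 0.0, 4.0], [0.0, 1.0, 10.0]]
imputed = impute_missing(target_user, other_users)

# Hypothetical per-epoch labels (e.g. morning, afternoon, evening).
final_label = majority_vote(["depressed", "not_depressed", "depressed"])
```

In this toy case the first neighbour matches the target user on the observed features and the second does not, so the missing value is taken entirely from the first neighbour.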

The Segmented Colour Feature Extreme Learning Machine: Applications in Agricultural Robotics

Agronomy ◽

10.3390/agronomy11112290 ◽

2021 ◽

Vol 11 (11) ◽

pp. 2290

Author(s):

Edmund J. Sadgrove ◽

Greg Falzon ◽

David Miron ◽

David W. Lamb

Keyword(s):

Extreme Learning Machine ◽

Learning Algorithm ◽

Model Performance ◽

Image Features ◽

Majority Voting ◽

Decision Matrix ◽

Colour Image ◽

Colour Feature ◽

Learning Machine ◽

Performance Results

This study presents the Segmented Colour Feature Extreme Learning Machine (SCF-ELM). The SCF-ELM is inspired by the Extreme Learning Machine (ELM), which is known for its rapid training and inference times. The ELM is therefore an ideal candidate for an ensemble learning algorithm. The Colour Feature Extreme Learning Machine (CF-ELM) is used in this study due to its additional ability to extract colour image features. The SCF-ELM is an ensemble learner that utilizes feature mapping via k-means clustering, a decision matrix and majority voting. It has been evaluated on a range of challenging agricultural object classification scenarios including weed, livestock and machinery detection. SCF-ELM model performance results were excellent in terms of both detection (90 to 99% accuracy) and inference time (around 0.01 s per image). The SCF-ELM was able to compete with or improve upon established algorithms in its class, indicating its potential for remote computing applications in agriculture.
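The abstract does not define the decision matrix. As a rough sketch, assume each row holds the class scores that one image segment's classifier produced, and that each segment casts one vote for its highest-scoring class. The function name, class names, and scores below are hypothetical.

```python
def vote_from_decision_matrix(scores, class_names):
    """Majority vote over a decision matrix.

    scores[s][c] is the (hypothetical) score of class c for image segment s.
    Each segment votes for its highest-scoring class; ties go to the class
    that appears first in class_names.
    """
    tally = [0] * len(class_names)
    for row in scores:
        winner = max(range(len(row)), key=row.__getitem__)
        tally[winner] += 1
    best = max(range(len(tally)), key=tally.__getitem__)
    return class_names[best], tally


# Three segments of one image, two candidate classes.
segment_scores = [
    [0.9, 0.1],
    [0.2, 0.8],
    [0.7, 0.3],
]
label, votes = vote_from_decision_matrix(segment_scores, ["weed", "crop"])
```

Two of the three segments favour the first class, so the image as a whole is labelled with it.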

Human Behavior Modeling Method Based on the Causality Between the Situation and the Behavior

IEEJ Transactions on Electronics Information and Systems ◽

10.1541/ieejeiss.131.635 ◽

2011 ◽

Vol 131 (3) ◽

pp. 635-643 ◽

Cited By ~ 6

Author(s):

Kohjiro Hashimoto ◽

Kae Doki ◽

Shinji Doki ◽

Shigeru Okuma ◽

Akihiro Torii

Keyword(s):

Human Behavior ◽

Modeling Method ◽

Behavior Modeling ◽

Human Behavior Modeling

Faculty Opinions recommendation of Understanding increments in model performance metrics.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.718099078.793483265 ◽

2013 ◽

Author(s):

Ewout Steyerberg

Keyword(s):

Performance Metrics ◽

Model Performance

Exploiting Social Networks for Large-Scale Human Behavior Modeling

IEEE Pervasive Computing ◽

10.1109/mprv.2011.70 ◽

2011 ◽

Vol 10 (4) ◽

pp. 45-53 ◽

Cited By ~ 29

Author(s):

Nicholas D. Lane ◽

Ye Xu ◽

Hong Lu ◽

Andrew T. Campbell ◽

Tanzeem Choudhury ◽

...

Keyword(s):

Social Networks ◽

Human Behavior ◽

Large Scale ◽

Behavior Modeling ◽

Human Behavior Modeling

Exploring the Uncertainty Space of Ensemble Classifiers in Face Recognition

International Journal of Pattern Recognition and Artificial Intelligence ◽

10.1142/s0218001415560029 ◽

2015 ◽

Vol 29 (03) ◽

pp. 1556002 ◽

Cited By ~ 7

Author(s):

Juan Luis Fernández-Martínez ◽

Ana Cernea

Keyword(s):

Face Recognition ◽

Nearest Neighbor ◽

Learning Algorithm ◽

Majority Voting ◽

Borda Count ◽

Discrete Wavelet ◽

Final Decision ◽

Ensemble Classifiers ◽

Classification Problems ◽

Uncertainty Space

In this paper, we present a supervised ensemble learning algorithm, called SCAV1, and its application to face recognition. This algorithm exploits the uncertainty space of the ensemble classifiers. Its design includes six different nearest-neighbor (NN) classifiers that are based on different and diverse image attributes: histogram, variogram, texture analysis, edges, bidimensional discrete wavelet transform and Zernike moments. In this approach, each attribute, together with its corresponding type of analysis (local or global) and distance criterion (p-norm), induces a different individual NN classifier. The ensemble classifier SCAV1 depends on a set of parameters: the number of candidate images used by each individual method to perform the final classification and the weights given to each individual classifier. SCAV1 parameters are optimized/sampled using a supervised approach via the regressive particle swarm optimization algorithm (RR-PSO). The final classifier exploits the uncertainty space of SCAV1 and uses majority voting (Borda Count) as a final decision rule. We show the application of this algorithm to the ORL and PUT image databases, obtaining very high and stable accuracies (100% median accuracy and almost null interquartile range). In conclusion, exploring the uncertainty space of ensemble classifiers provides optimum results and seems to be the appropriate strategy to adopt for face recognition and other classification problems.
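For readers unfamiliar with the Borda count used as the final decision rule, the following Python sketch shows the plain, unweighted version. Each classifier submits a ranked list of candidates, and a candidate at position p in a list of n receives n - 1 - p points. The per-classifier weights and the RR-PSO parameter sampling described in the abstract are left out, and the candidate identities are hypothetical.

```python
from collections import defaultdict


def borda_count(rankings):
    """Aggregate ranked candidate lists with an unweighted Borda count.

    Returns (candidate, score) pairs sorted by descending score, with the
    candidate name as a deterministic tie-breaker.
    """
    scores = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, candidate in enumerate(ranking):
            scores[candidate] += n - 1 - position
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical output of three nearest-neighbour classifiers, best match first.
rankings = [
    ["A", "B", "C"],
    ["B", "A", "C"],
    ["A", "C", "B"],
]
result = borda_count(rankings)
winner = result[0][0]
```

Candidate A collects 2 + 1 + 2 = 5 points, B collects 3, and C collects 1, so A is chosen.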

Effects of feature type, learning algorithm and speaking style for depression detection from speech

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) ◽

10.1109/icassp.2015.7178877 ◽

2015 ◽

Cited By ~ 6

Author(s):

Vikramjit Mitra ◽

Elizabeth Shriberg

Keyword(s):

Learning Algorithm ◽

Depression Detection ◽

Speaking Style

Integrating Human Behavior Modeling and Data Mining Techniques to Predict Human Errors in Numerical Typing

IEEE Transactions on Human-Machine Systems ◽

10.1109/thms.2014.2357178 ◽

2015 ◽

Vol 45 (1) ◽

pp. 39-50 ◽

Cited By ~ 14

Author(s):

Cheng-Jhe Lin ◽

Changxu Wu ◽

Wanpracha A. Chaovalitwongse

Keyword(s):

Data Mining ◽

Human Behavior ◽

Behavior Modeling ◽

Human Errors ◽

Data Mining Techniques ◽

Human Behavior Modeling

Mind the gap: preventing circularity in missense variant prediction

10.1101/2020.05.06.080424 ◽

2020 ◽

Author(s):

Stephan Heijl ◽

Bas Vroling ◽

Tom van den Bergh ◽

Henk-Jan Joosten

Keyword(s):

Real World ◽

Performance Metrics ◽

Model Performance ◽

Predictive Performance ◽

Missense Variant ◽

Training Methods ◽

The Real ◽

Evaluation Scores ◽

Evaluation Strategies ◽

Generic Strategy

Despite advances in the field of missense variant effect prediction, the real clinical utility of current computational approaches remains rather limited. There is a large difference in performance metrics reported by developers and those observed in the real world. Most currently available predictors suffer from one or more types of circularity in their training and evaluation strategies that lead to overestimation of predictive performance. We present a generic strategy that is independent of dataset properties and algorithms used, to deal with circularity in the training phase. This results in more robust predictors and evaluation scores that accurately reflect the real-world performance of predictive models. Additionally, we show that commonly used training methods can have an adverse impact on model performance and lead to gross overestimation of true predictive performance.
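The abstract does not describe the strategy itself. One widely used precaution against this kind of leakage, shown here as an assumption and not as the authors' method, is to split data by group so that all variants from one gene fall entirely in the training set or entirely in the test set. The record layout and gene names in this Python sketch are hypothetical.

```python
import random


def group_split(records, group_key, test_fraction=0.25, seed=0):
    """Split records so that no group appears in both train and test.

    group_key maps a record to its group (here, a gene name). Whole groups
    are assigned to the test set until roughly test_fraction of the groups
    is reached.
    """
    groups = sorted({group_key(r) for r in records})
    rng = random.Random(seed)
    rng.shuffle(groups)
    n_test = max(1, round(len(groups) * test_fraction))
    test_groups = set(groups[:n_test])
    train = [r for r in records if group_key(r) not in test_groups]
    test = [r for r in records if group_key(r) in test_groups]
    return train, test


# Hypothetical (gene, variant, label) records.
variants = [
    ("GENE1", "p.A10T", 1), ("GENE1", "p.R22H", 0),
    ("GENE2", "p.G5S", 1), ("GENE2", "p.L40P", 1),
    ("GENE3", "p.V7M", 0), ("GENE4", "p.D99N", 0),
]
train_set, test_set = group_split(variants, group_key=lambda r: r[0])
train_genes = {r[0] for r in train_set}
test_genes = {r[0] for r in test_set}
```

Because the split is made on genes and not on individual variants, a model cannot score well on the test set merely by memorising gene-level label tendencies seen during training.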

Quantifying location error to define uncertainty in volcanic mass flow hazard simulations

Natural Hazards and Earth System Science ◽

10.5194/nhess-21-2447-2021 ◽

2021 ◽

Vol 21 (8) ◽

pp. 2447-2460

Author(s):

Stuart R. Mead ◽

Jonathan Procter ◽

Gabor Kereszturi

Keyword(s):

Mass Flow ◽

Performance Metrics ◽

Numerical Models ◽

Model Performance ◽

Flow Simulation ◽

Model Complexity ◽

Model Parameters ◽

Spatial Covariance ◽

Trade Offs ◽

Pixel Pair

The use of mass flow simulations in volcanic hazard zonation and mapping is often limited by model complexity (i.e. uncertainty in correct values of model parameters), a lack of model uncertainty quantification, and limited approaches to incorporate this uncertainty into hazard maps. When quantified, mass flow simulation errors are typically evaluated on a pixel-pair basis, using the difference between simulated and observed (“actual”) map-cell values to evaluate the performance of a model. However, these comparisons conflate location and quantification errors, neglecting possible spatial autocorrelation of evaluated errors. As a result, model performance assessments typically yield moderate accuracy values. In this paper, similarly moderate accuracy values were found in a performance assessment of three depth-averaged numerical models using the 2012 debris avalanche from the Upper Te Maari crater, Tongariro Volcano, as a benchmark. To provide a fairer assessment of performance and evaluate spatial covariance of errors, we use a fuzzy set approach to indicate the proximity of similarly valued map cells. This “fuzzification” of simulated results yields improvements in targeted performance metrics relative to a length scale parameter at the expense of decreases in opposing metrics (e.g. fewer false negatives result in more false positives) and a reduction in resolution. The use of this approach to generate hazard zones incorporating the identified uncertainty and associated trade-offs is demonstrated and indicates a potential use for informed stakeholders by reducing the complexity of uncertainty estimation and supporting decision-making from simulated data.
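The difference between a strict pixel-pair comparison and a location-tolerant one can be shown with a simplified Python sketch. It uses binary inundation grids and a square neighbourhood whose radius stands in for the length scale parameter. This is an assumption for illustration: the paper's fuzzy set formulation is not given in the abstract, and real fuzzy comparisons usually weight cells by distance.

```python
def neighbourhood_any(grid, row, col, radius):
    """True if any cell within `radius` (Chebyshev distance) of (row, col) is set."""
    rows, cols = len(grid), len(grid[0])
    for r in range(max(0, row - radius), min(rows, row + radius + 1)):
        for c in range(max(0, col - radius), min(cols, col + radius + 1)):
            if grid[r][c]:
                return True
    return False


def tolerant_confusion(simulated, observed, radius):
    """Count hits, false positives and false negatives with spatial tolerance.

    radius=0 reproduces the strict pixel-pair comparison. A larger radius
    forgives small location errors between simulated and observed cells.
    """
    hits = false_pos = false_neg = 0
    for r, row in enumerate(simulated):
        for c, sim in enumerate(row):
            obs = observed[r][c]
            if sim and neighbourhood_any(observed, r, c, radius):
                hits += 1
            elif sim:
                false_pos += 1
            elif obs and not neighbourhood_any(simulated, r, c, radius):
                false_neg += 1
    return hits, false_pos, false_neg


# The simulated flow is displaced by one cell relative to the observation.
simulated = [[1, 0, 0], [0, 0, 0]]
observed = [[0, 1, 0], [0, 0, 0]]
strict = tolerant_confusion(simulated, observed, radius=0)
tolerant = tolerant_confusion(simulated, observed, radius=1)
```

With radius 0 the one-cell displacement is counted as both a false positive and a false negative. With radius 1 it is counted as a single hit, which separates the location error from the quantification error.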

Blue-collared Workers’ Travel Behavior Modeling using “exPlainable” Machine Learning Model: The Case of Qatar

10.29117/quarfe.2021.0198 ◽

2021 ◽

Author(s):

Aya Alkhereibi ◽

Ali AbuZaid ◽

Tadesse Wakjira

Keyword(s):

Machine Learning ◽

Travel Behavior ◽

Total Population ◽

Performance Metrics ◽

Predictive Accuracy ◽

Mode Choice ◽

Behavior Modeling ◽

Significant Feature ◽

Machine Learning Model ◽

Occupation Level

This paper presents a novel study examining an explainable machine learning (ML) technique for predicting mode choice in communities with a majority of blue-collared workers. A total of 4875 trip records for 1050 blue-collared workers have been used to predict their travel mode choices based on 11 trip and socio-economic attributes. The data used in this paper are obtained from the Ministry of Transportation and Communication (MoTC), which targeted blue-collared workers as they represent 89% of the total population in the State of Qatar. A total of four ML models are evaluated using different performance metrics to identify the best predictive model. The prediction results showed that the random forest (RF) model performed best, with a predictive accuracy of 0.97. Moreover, the SHapley Additive exPlanations (SHAP) approach is used to investigate the significance of the input features and explain the output of the RF model. The results of the SHAP analysis revealed that occupation level is the most significant feature influencing mode choice, followed by occupation section, arrival time, and arrival municipality.
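SHAP attributes a prediction to input features using Shapley values. The idea can be shown exactly on a tiny model by averaging each feature's marginal contribution over every order in which features are switched from a baseline value to the instance value. This Python sketch is illustrative only: the toy model and its three unnamed features are hypothetical, and it is not the study's random forest or the SHAP library's algorithm.

```python
from itertools import permutations


def shapley_values(predict, instance, baseline):
    """Exact Shapley values by enumerating all feature orderings.

    Features not yet "switched on" keep their baseline value. The cost grows
    factorially with the number of features, so this is only for small demos.
    """
    n = len(instance)
    totals = [0.0] * n
    orders = list(permutations(range(n)))
    for order in orders:
        current = list(baseline)
        previous = predict(current)
        for i in order:
            current[i] = instance[i]
            value = predict(current)
            totals[i] += value - previous
            previous = value
    return [t / len(orders) for t in totals]


def toy_model(x):
    """Hypothetical model: one additive feature plus an interaction term."""
    return 2.0 * x[0] + x[1] * x[2]


attributions = shapley_values(toy_model, instance=[1.0, 1.0, 1.0],
                              baseline=[0.0, 0.0, 0.0])
```

The additive feature receives its full effect of 2.0, and the interaction term is shared equally between the other two features at 0.5 each. The attributions sum to the difference between the prediction for the instance and for the baseline.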
