Leveraging Collaborative-Filtering for Personalized Behavior Modeling

Author(s):  
Xuhai Xu ◽  
Prerna Chikersal ◽  
Janine M. Dutcher ◽  
Yasaman S. Sefidgar ◽  
Woosuk Seo ◽  
...  

The prevalence of mobile phones and wearable devices enables the passive capturing and modeling of human behavior at an unprecedented resolution and scale. Past research has demonstrated the capability of mobile sensing to model aspects of physical health, mental health, education, and work performance. However, most of the algorithms and models proposed in previous work follow a one-size-fits-all (i.e., population modeling) approach that looks for common behaviors amongst all users, disregarding the fact that individuals can behave very differently, which reduces model performance. Further, the black-box models that are often used do not allow for interpretability or an understanding of human behavior. We present a new method to address the problems of personalized behavior classification and interpretability, and apply it to depression detection among college students. Inspired by the idea of collaborative filtering, our method is a type of memory-based learning algorithm. It leverages the relevance of mobile-sensed behavior features among individuals to calculate personalized relevance weights, which are used to impute missing data and select features according to a specific modeling goal (e.g., whether the student has depressive symptoms) in different time epochs, i.e., times of the day and days of the week. It then compiles features from epochs using majority voting to obtain the final prediction. We apply our algorithm to a depression detection dataset collected from first-year college students with low data-missing rates and show that our method outperforms the state-of-the-art machine learning model by 5.1% in accuracy and 5.5% in F1 score. We further verify the pipeline-level generalizability of our approach by achieving similar results on a second dataset, with an average improvement of 3.4% across performance metrics.
Beyond achieving better classification performance, our novel approach is further able to generate personalized interpretations of the models for each individual. These interpretations are supported by existing depression-related literature and can potentially inspire automated and personalized depression intervention design in the future.
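The two mechanisms named in the abstract above, relevance-weighted imputation and majority voting across epochs, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `impute_feature` and `majority_vote`, the data layout, and the use of a plain weighted average are assumptions made for illustration only.

```python
from collections import Counter


def impute_feature(column, relevance, target):
    """Fill a missing value for `target` with a relevance-weighted average.

    column:    one feature's value per user (None where missing).
    relevance: hypothetical relevance weight of each user to the target user.
    target:    index of the user whose value should be filled in.
    """
    if column[target] is not None:
        return column[target]
    num = den = 0.0
    for user, (value, weight) in enumerate(zip(column, relevance)):
        # Skip the target, users with missing data, and irrelevant users.
        if user != target and value is not None and weight > 0:
            num += weight * value
            den += weight
    return num / den if den > 0 else None


def majority_vote(epoch_predictions):
    """Return the label predicted by most epochs (first seen wins ties)."""
    return Counter(epoch_predictions).most_common(1)[0][0]


# User 0 is missing the feature; users 1 and 2 have weights 1.0 and 3.0.
filled = impute_feature([None, 4.0, 8.0], [0.0, 1.0, 3.0], target=0)
print(filled)  # (1*4 + 3*8) / 4 = 7.0

# One hypothetical prediction per epoch (e.g., morning, afternoon, weekend).
print(majority_vote([True, False, True]))  # True
```

In memory-based collaborative filtering the weights would typically come from a similarity measure between users; here they are simply passed in, since the abstract does not specify how they are computed.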

Agronomy ◽  
2021 ◽  
Vol 11 (11) ◽  
pp. 2290
Author(s):  
Edmund J. Sadgrove ◽  
Greg Falzon ◽  
David Miron ◽  
David W. Lamb

This study presents the Segmented Colour Feature Extreme Learning Machine (SCF-ELM). The SCF-ELM is inspired by the Extreme Learning Machine (ELM), which is known for its rapid training and inference times. The ELM is therefore an ideal candidate for an ensemble learning algorithm. The Colour Feature Extreme Learning Machine (CF-ELM) is used in this study due to its additional ability to extract colour image features. The SCF-ELM is an ensemble learner that utilizes feature mapping via k-means clustering, a decision matrix and majority voting. It has been evaluated on a range of challenging agricultural object classification scenarios including weed, livestock and machinery detection. SCF-ELM performance was excellent in terms of both detection (90 to 99% accuracy) and inference time (around 0.01 s per image). The SCF-ELM was able to compete with or improve upon established algorithms in its class, indicating its potential for remote computing applications in agriculture.


2011 ◽  
Vol 131 (3) ◽  
pp. 635-643 ◽  
Author(s):  
Kohjiro Hashimoto ◽  
Kae Doki ◽  
Shinji Doki ◽  
Shigeru Okuma ◽  
Akihiro Torii

2011 ◽  
Vol 10 (4) ◽  
pp. 45-53 ◽  
Author(s):  
Nicholas D. Lane ◽  
Ye Xu ◽  
Hong Lu ◽  
Andrew T. Campbell ◽  
Tanzeem Choudhury ◽  
...  

Author(s):  
Juan Luis Fernández-Martínez ◽  
Ana Cernea

In this paper, we present a supervised ensemble learning algorithm, called SCAV1, and its application to face recognition. This algorithm exploits the uncertainty space of the ensemble classifiers. Its design includes six different nearest-neighbor (NN) classifiers that are based on different and diverse image attributes: histogram, variogram, texture analysis, edges, bidimensional discrete wavelet transform and Zernike moments. In this approach, each attribute, together with its corresponding type of analysis (local or global) and the distance criterion (p-norm), induces a different individual NN classifier. The ensemble classifier SCAV1 depends on a set of parameters: the number of candidate images used by each individual method to perform the final classification and the individual weights given to each individual classifier. SCAV1 parameters are optimized/sampled using a supervised approach via the regressive particle swarm optimization algorithm (RR-PSO). The final classifier exploits the uncertainty space of SCAV1 and uses majority voting (Borda Count) as a final decision rule. We show the application of this algorithm to the ORL and PUT image databases, obtaining very high and stable accuracies (100% median accuracy and almost null interquartile range). In conclusion, exploring the uncertainty space of ensemble classifiers provides optimum results and seems to be the appropriate strategy to adopt for face recognition and other classification problems.
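As a rough illustration of the Borda Count decision rule mentioned in the abstract above, the following Python sketch combines ranked candidate lists from several classifiers. It is a generic weighted Borda count, not the SCAV1 code: the function name `borda_count`, the scoring scheme (n - 1 points for first place down to 0 for last), and the example identities are all assumptions.

```python
from collections import defaultdict


def borda_count(rankings, weights=None):
    """Combine best-first candidate rankings from several classifiers.

    rankings: list of lists, one per classifier, ordered best to worst.
    weights:  optional per-classifier weights (assumed; default 1.0 each).
    """
    if weights is None:
        weights = [1.0] * len(rankings)
    scores = defaultdict(float)
    for ranking, weight in zip(rankings, weights):
        n = len(ranking)
        for position, candidate in enumerate(ranking):
            # First place earns n - 1 points, last place earns 0.
            scores[candidate] += weight * (n - 1 - position)
    # Highest total wins; sorting first makes tie-breaking deterministic.
    return max(sorted(scores), key=lambda c: scores[c])


# Three hypothetical NN classifiers each return their top-3 candidates.
rankings = [
    ["anna", "ben", "cara"],
    ["ben", "anna", "cara"],
    ["ben", "cara", "anna"],
]
print(borda_count(rankings))             # ben (5 points vs. 3 and 1)
print(borda_count(rankings, [5, 1, 1]))  # anna (11 points vs. 9 and 1)
```

The second call shows how per-classifier weights, one of the parameter groups the abstract describes, can change the final decision.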


2020 ◽  
Author(s):  
Stephan Heijl ◽  
Bas Vroling ◽  
Tom van den Bergh ◽  
Henk-Jan Joosten

Abstract. Despite advances in the field of missense variant effect prediction, the real clinical utility of current computational approaches remains rather limited. There is a large difference in performance metrics reported by developers and those observed in the real world. Most currently available predictors suffer from one or more types of circularity in their training and evaluation strategies that lead to overestimation of predictive performance. We present a generic strategy that is independent of dataset properties and algorithms used, to deal with circularity in the training phase. This results in more robust predictors and evaluation scores that accurately reflect the real-world performance of predictive models. Additionally, we show that commonly used training methods can have an adverse impact on model performance and lead to gross overestimation of true predictive performance.


2021 ◽  
Vol 21 (8) ◽  
pp. 2447-2460
Author(s):  
Stuart R. Mead ◽  
Jonathan Procter ◽  
Gabor Kereszturi

Abstract. The use of mass flow simulations in volcanic hazard zonation and mapping is often limited by model complexity (i.e. uncertainty in correct values of model parameters), a lack of model uncertainty quantification, and limited approaches to incorporate this uncertainty into hazard maps. When quantified, mass flow simulation errors are typically evaluated on a pixel-pair basis, using the difference between simulated and observed (“actual”) map-cell values to evaluate the performance of a model. However, these comparisons conflate location and quantification errors, neglecting possible spatial autocorrelation of evaluated errors. As a result, model performance assessments typically yield moderate accuracy values. In this paper, similarly moderate accuracy values were found in a performance assessment of three depth-averaged numerical models using the 2012 debris avalanche from the Upper Te Maari crater, Tongariro Volcano, as a benchmark. To provide a fairer assessment of performance and evaluate spatial covariance of errors, we use a fuzzy set approach to indicate the proximity of similarly valued map cells. This “fuzzification” of simulated results yields improvements in targeted performance metrics relative to a length scale parameter at the expense of decreases in opposing metrics (e.g. fewer false negatives result in more false positives) and a reduction in resolution. The use of this approach to generate hazard zones incorporating the identified uncertainty and associated trade-offs is demonstrated and indicates a potential use for informed stakeholders by reducing the complexity of uncertainty estimation and supporting decision-making from simulated data.
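The trade-off described in the abstract above (fewer false negatives at the cost of more false positives as the length scale grows) can be reproduced with a small Python sketch. This is only one possible fuzzification, assuming a linear distance-decay membership function; the names `fuzzy_membership` and `fuzzy_errors` and the toy grid are hypothetical and do not come from the paper.

```python
import math


def fuzzy_membership(cell, simulated_cells, length_scale):
    """Degree (0 to 1) to which `cell` counts as simulated-inundated.

    Assumed linear decay: 1 at a simulated cell, 0 at `length_scale` away.
    """
    best = 0.0
    for sim in simulated_cells:
        distance = math.dist(cell, sim)
        best = max(best, max(0.0, 1.0 - distance / length_scale))
    return best


def fuzzy_errors(observed_cells, simulated_cells, grid_cells, length_scale):
    """Return (fuzzy false negatives, fuzzy false positives)."""
    false_neg = sum(
        1.0 - fuzzy_membership(c, simulated_cells, length_scale)
        for c in observed_cells
    )
    false_pos = sum(
        fuzzy_membership(c, simulated_cells, length_scale)
        for c in grid_cells
        if c not in observed_cells
    )
    return false_neg, false_pos


# Toy 1 x 5 map: the simulated deposit is offset one cell from the observed one.
grid = [(0, col) for col in range(5)]
observed = {(0, 1), (0, 2)}
simulated = {(0, 2), (0, 3)}

print(fuzzy_errors(observed, simulated, grid, length_scale=0.5))  # (1.0, 1.0)
print(fuzzy_errors(observed, simulated, grid, length_scale=2.0))  # (0.5, 1.5)
```

With a length scale below the cell spacing the comparison is effectively the crisp pixel-pair test; widening it credits the near miss at (0, 1) but also spreads partial membership onto the non-observed cell (0, 4).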


2021 ◽  
Author(s):  
Aya Alkhereibi ◽  
Ali AbuZaid ◽  
Tadesse Wakjira

This paper examines an explainable machine learning (ML) technique for predicting travel mode choice in communities with a majority of blue-collar workers. A total of 4875 trip records for 1050 blue-collar workers were used to predict their travel mode choices based on 11 trip and socio-economic attributes. The data were obtained from the Ministry of Transportation and Communication (MoTC), which targeted blue-collar workers because they represent 89% of the total population of the State of Qatar. Four ML models were evaluated using different performance metrics to identify the best predictive model. The random forest (RF) model performed best, with a predictive accuracy of 0.97. Moreover, the SHapley Additive exPlanations (SHAP) approach was used to investigate the significance of the input features and explain the output of the RF model. The SHAP analysis revealed that occupation level is the most significant feature influencing mode choice, followed by occupation section, arrival time, and arrival municipality.
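SHAP attributions are based on Shapley values from cooperative game theory. The Python sketch below computes exact Shapley values for a toy three-feature function by enumerating every feature ordering. It is meant only to show what the attributions represent: it does not use the SHAP library or the paper's random forest, and the function `shapley_values`, the `toy_model`, and the all-zero baseline are assumptions.

```python
from itertools import permutations


def shapley_values(model, x, baseline):
    """Exact Shapley values by averaging marginal gains over all orderings.

    model:    callable taking a list of feature values, returning a number.
    x:        the instance to explain.
    baseline: reference values used for features that are "absent".
    """
    n = len(x)
    phi = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)
        previous_output = model(current)
        for i in order:
            current[i] = x[i]  # switch feature i from baseline to actual
            output = model(current)
            phi[i] += output - previous_output
            previous_output = output
    return [value / len(orderings) for value in phi]


def toy_model(z):
    """Hypothetical model: one additive term plus one interaction term."""
    return 2 * z[0] + z[1] * z[2]


phi = shapley_values(toy_model, x=[1, 1, 1], baseline=[0, 0, 0])
print(phi)  # [2.0, 0.5, 0.5]
```

The attributions sum to the difference between the model output at `x` and at the baseline (3 here), and the interaction term is split evenly between the two features involved. Exact enumeration grows factorially with the number of features, which is why practical tools rely on approximations or model-specific algorithms.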

