Psychometric and Validity Issues in Machine Learning Approaches to Personality Assessment: A Focus on Social Media Text Mining

Louis Tay; Sang Eun Woo; Louis Hickman; Rachel M. Saef

doi:10.1002/per.2290

European Journal of Personality ◽

10.1002/per.2290 ◽

2020 ◽

Vol 34 (5) ◽

pp. 826-844 ◽

Cited By ~ 1

Author(s):

Louis Tay ◽

Sang Eun Woo ◽

Louis Hickman ◽

Rachel M. Saef

Keyword(s):

Machine Learning ◽

Social Media ◽

Text Mining ◽

Personality Assessment ◽

Ground Truth ◽

Psychometric Validation ◽

Learning Approaches ◽

Text Data ◽

Personality Psychology ◽

Social Media Text

In the age of big data, substantial research is now moving toward using digital footprints like social media text data to assess personality. Nevertheless, there are concerns and questions regarding the psychometric and validity evidence of such approaches. We seek to address this issue by focusing on social media text data and (i) conducting a review of psychometric validation efforts in social media text mining (SMTM) for personality assessment and discussing additional work that needs to be done; (ii) considering additional validity issues from the standpoint of reference (i.e. ‘ground truth’) and causality (i.e. how personality determines variations in scores derived from SMTM); and (iii) discussing the unique issues of generalizability when validating SMTM for personality assessment across different social media platforms and populations. In doing so, we explicate the key validity and validation issues that need to be considered as a field to advance SMTM for personality assessment, and, more generally, machine learning personality assessment methods. © 2020 European Association of Personality Psychology

Cross-platform comparison of framed topics in Twitter and Weibo: machine learning approaches to social media text mining

Social Network Analysis and Mining ◽

10.1007/s13278-021-00772-w ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Yi Yang ◽

Jia-Huey Hsu ◽

Karl Löfgren ◽

Wonhyuk Cho

Keyword(s):

Machine Learning ◽

Social Media ◽

Text Mining ◽

Learning Approaches ◽

Social Media Text ◽

Platform Comparison ◽

Cross Platform

CyberCan: A New Dictionary for Cantonese Social Media Text Segmentation

10.31235/osf.io/tyjr7 ◽

2021 ◽

Author(s):

Fei Shen ◽

Wenting Yu ◽

Chen Min ◽

Qianying Ye ◽

Chuanli Xia ◽

...

Keyword(s):

Social Media ◽

Text Mining ◽

Word Segmentation ◽

Unstructured Data ◽

Text Segmentation ◽

Chinese Word ◽

Chinese Word Segmentation ◽

Text Data ◽

Social Media Text

Text mining has been a dominant approach to extracting useful information from massive unstructured data online. But existing tools for Chinese word segmentation are not ideal for processing social media text data in Cantonese. This project developed CyberCan (https://github.com/shenfei1010/CyberCan), a lexicon of contemporary Cantonese based on more than 100 million pieces of internet text. We compared the word segmentation performance of CyberCan with that of existing Mandarin and Cantonese lexicons. Findings suggest that CyberCan outperforms all existing lexicons by a considerable margin.
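
The abstract does not say which segmentation algorithm was paired with the lexicon, so the following is only a minimal sketch of how a lexicon can drive word segmentation, assuming greedy forward maximum matching. The function name `segment`, the `max_len` parameter, and the toy lexicon are hypothetical and are not taken from CyberCan.

```python
def segment(text, lexicon, max_len=4):
    """Greedy forward maximum matching over a word lexicon.

    At each position, take the longest substring (up to max_len characters)
    found in the lexicon; fall back to a single character when nothing matches.
    """
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking down to one character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in lexicon:
                tokens.append(piece)
                i += size
                break
    return tokens


# Toy lexicon for illustration only (hypothetical entries, not CyberCan data).
toy_lexicon = {"香港", "大學", "香港大學", "學生"}
tokens = segment("香港大學學生", toy_lexicon)
```

With a richer lexicon the same input is split into fewer, longer words, which is one reason lexicon coverage can matter so much for segmentation quality.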

Using Machine Learning to Advance Personality Assessment and Theory

Personality and Social Psychology Review ◽

10.1177/1088868318772990 ◽

2018 ◽

Vol 23 (2) ◽

pp. 190-203 ◽

Cited By ~ 27

Author(s):

Wiebke Bleidorn ◽

Christopher James Hopwood

Keyword(s):

Machine Learning ◽

Social Media ◽

Personality Assessment ◽

Construct Validation ◽

Assessment Tools ◽

Learning Approaches ◽

Psychological Science ◽

Validation Framework ◽

Learning Research ◽

Applications Of Machine Learning

Machine learning has led to important advances in society. One of the most exciting applications of machine learning in psychological science has been the development of assessment tools that can powerfully predict human behavior and personality traits. Thus far, machine learning approaches to personality assessment have focused on the associations between social media and other digital records with established personality measures. The goal of this article is to expand the potential of machine learning approaches to personality assessment by embedding it in a more comprehensive construct validation framework. We review recent applications of machine learning to personality assessment, place machine learning research in the broader context of fundamental principles of construct validation, and provide recommendations for how to use machine learning to advance our understanding of personality.

Rapid Assessment of Customer Marketplace in Disaster Settings through Machine Learning, Geospatial Information, and Social Media Text Mining: An Abstract

Developments in Marketing Science: Proceedings of the Academy of Marketing Science - Finding New Ways to Engage and Satisfy Global Customers ◽

10.1007/978-3-030-02568-7_133 ◽

2019 ◽

pp. 479-480

Author(s):

Rajiv Garg ◽

Patrick Brockett ◽

Linda L. Golden ◽

Yuxin Zhang

Keyword(s):

Machine Learning ◽

Social Media ◽

Text Mining ◽

Rapid Assessment ◽

Geospatial Information ◽

Social Media Text

A Framework for Sentiment Analysis of Telugu Tweets

International Journal of Engineering and Advanced Technology - Regular Issue ◽

10.35940/ijeat.f1602.089620 ◽

2020 ◽

Vol 9 (6) ◽

pp. 523-525

Keyword(s):

Neural Network ◽

Machine Learning ◽

Social Media ◽

Deep Learning ◽

Sentiment Analysis ◽

Recurrent Neural Network ◽

English Language ◽

Research Work ◽

Learning Approaches ◽

Text Data

Nowadays, social media platforms such as Facebook, Twitter, and Instagram are major channels through which people share their emotions about current situations in society. By identifying the interesting patterns in this data, a government or other responsible party can make good and useful decisions. Sentiment analysis is a method for extracting useful information from text, such as people's emotions (happy, sad, or neutral). Much research is under way in the area of sentiment analysis, and machine learning and deep learning approaches play a major role in it. Existing work on sentiment analysis focuses on the English language. This paper proposes a novel framework specifically designed for sentiment analysis of text data in the Telugu language. The proposed framework integrates the word embedding model Word2Vec, a language translator, and deep learning approaches such as Recurrent Neural Networks, along with Naive Bayes algorithms, to collect and analyse the sentiment of Twitter data written in Telugu. The results show that the framework is effective in terms of accuracy, precision, and specificity.

Using Machine Learning to Advance Personality Assessment and Theory

10.31234/osf.io/ctr5g ◽

2018 ◽

Cited By ~ 3

Author(s):

Wiebke Bleidorn ◽

Christopher James Hopwood

Keyword(s):

Machine Learning ◽

Social Media ◽

Personality Assessment ◽

Construct Validation ◽

Assessment Tools ◽

Learning Approaches ◽

Psychological Science ◽

Validation Framework ◽

Learning Research ◽

Applications Of Machine Learning

Machine learning has led to important advances in society. One of the most exciting applications of machine learning in psychological science has been the development of assessment tools that can powerfully predict human behavior and personality traits. Thus far, machine learning approaches to personality assessment have been focused on the associations between social media and other digital records with established personality measures. The goal of this paper is to expand the potential of machine learning approaches to personality assessment by embedding it in a more comprehensive construct validation framework. We review recent applications of machine learning to personality assessment, place machine learning research in the broader context of fundamental principles of construct validation and provide recommendations for how to use machine learning to advance our understanding of personality.

Suspicious Tweet Identification Using Machine Learning Approaches for Improving Social Media Marketing Analysis

International Journal of Business Intelligence and Data Mining ◽

10.1504/ijbidm.2022.10040478 ◽

2022 ◽

Vol 1 (1) ◽

pp. 1

Author(s):

Thamaraiselvan Natarajan ◽

Senthil Arasu Balasubramanian ◽

Jonath BackiaSeelan

Keyword(s):

Machine Learning ◽

Social Media ◽

Social Media Marketing ◽

Learning Approaches ◽

Marketing Analysis

Detection of Economy-Related Turkish Tweets Based on Machine Learning Approaches

10.4018/978-1-7998-8413-2.ch008 ◽

2022 ◽

pp. 171-195

Author(s):

Jale Bektaş

Keyword(s):

Machine Learning ◽

Text Mining ◽

Text Classification ◽

Integration Method ◽

Classification Problem ◽

Feature Representation ◽

Learning Approaches ◽

Machine Learning Methods ◽

Linguistic Approach ◽

Turkish Language

Conducting NLP for Turkish is considerably harder than for other Latin-based languages such as English. In this study, text mining techniques are used to build a pre-processing framework in which TF-IDF values are calculated, following a linguistic approach, on 7,731 tweets shared by 13 famous economists in Turkey and retrieved from Twitter. The classification results of four common machine learning methods (SVM, Naive Bayes, LR, and an integration of LR with SVM) are then compared. The TF-IDF features are tested with different n-grams. The findings show that the success of text classification depends on the feature representation method, and that SVM outperforms the other ML methods with unigram feature representation. The best results are obtained by integrating SVM with LR, with an accuracy of 82.9%. These results show that these methodologies are satisfactory for the Turkish language.
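
The abstract does not give the exact TF-IDF weighting or tokenization used, so the sketch below only illustrates the general idea of TF-IDF features over different n-grams. It assumes whitespace tokenization, term frequency normalized by document length, and an unsmoothed `log(N / df)` inverse document frequency; the helper names `ngrams` and `tfidf` and the toy English documents are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return the contiguous n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tfidf(docs, n=1):
    """Compute one TF-IDF weight dictionary per document for n-gram features.

    tf  = count of the n-gram in the document / number of n-grams in the document
    idf = log(number of documents / number of documents containing the n-gram)
    """
    grams = [ngrams(doc.lower().split(), n) for doc in docs]
    # Document frequency: in how many documents each n-gram appears.
    df = Counter(g for doc in grams for g in set(doc))
    total = len(docs)
    weights = []
    for doc in grams:
        counts = Counter(doc)
        weights.append(
            {g: (c / len(doc)) * math.log(total / df[g]) for g, c in counts.items()}
        )
    return weights


# Toy corpus for illustration only (not data from the study).
docs = ["inflation rises again", "inflation falls", "markets rally again"]
unigram_weights = tfidf(docs, n=1)
bigram_weights = tfidf(docs, n=2)
```

In this toy corpus, "rises" occurs in one document and "inflation" in two, so "rises" receives the larger weight in the first document. Vectors like these would then be passed to a classifier such as SVM or logistic regression.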

A Generalized Relationship Mining Method for Social Media Text Data

Machine Learning and Data Mining in Pattern Recognition - Lecture Notes in Computer Science ◽

10.1007/978-3-319-08979-9_28 ◽

2014 ◽

pp. 376-392 ◽

Cited By ~ 1

Author(s):

Tuhin Sharma ◽

Durga Toshniwal

Keyword(s):

Social Media ◽

Mining Method ◽

Text Data ◽

Social Media Text

Active Learning Approaches for Labeling Text: Review and Assessment of the Performance of Active Learning Approaches

Political Analysis ◽

10.1017/pan.2020.4 ◽

2020 ◽

Vol 28 (4) ◽

pp. 532-551

Author(s):

Blake Miller ◽

Fridolin Linder ◽

Walter R. Mebane

Keyword(s):

Machine Learning ◽

Active Learning ◽

Random Sampling ◽

Supervised Machine Learning ◽

Learning Approaches ◽

Simulation Studies ◽

Text Data ◽

Passive Learning ◽

Machine Learning Model ◽

The Cost

Supervised machine learning methods are increasingly employed in political science. Such models require costly manual labeling of documents. In this paper, we introduce active learning, a framework in which data to be labeled by human coders are not chosen at random but rather targeted in such a way that the required amount of data to train a machine learning model can be minimized. We study the benefits of active learning using text data examples. We perform simulation studies that illustrate conditions where active learning can reduce the cost of labeling text data. We perform these simulations on three corpora that vary in size, document length, and domain. We find that in cases where the document class of interest is not balanced, researchers can label a fraction of the documents one would need using random sampling (or “passive” learning) to achieve equally performing classifiers. We further investigate how varying levels of intercoder reliability affect the active learning procedures and find that even with low reliability, active learning performs more efficiently than does random sampling.
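
The abstract does not state which query strategy the authors evaluated, so the following is a minimal sketch of one common form of active learning, uncertainty sampling, set against random ("passive") sampling. The function names, the `seed` parameter, and the predicted probabilities are all hypothetical.

```python
import random


def uncertainty_sample(probs, k):
    """Pick the k unlabeled documents whose predicted P(class = 1) is closest to 0.5.

    These are the documents the current classifier is least sure about, so a
    human label for them is expected to be the most informative.
    """
    ranked = sorted(range(len(probs)), key=lambda i: abs(probs[i] - 0.5))
    return ranked[:k]


def random_sample(probs, k, seed=0):
    """Passive-learning baseline: pick k documents uniformly at random."""
    rng = random.Random(seed)
    return rng.sample(range(len(probs)), k)


# Hypothetical classifier outputs for six unlabeled documents.
pool_probs = [0.02, 0.48, 0.91, 0.55, 0.10, 0.97]
active_batch = uncertainty_sample(pool_probs, 2)
passive_batch = random_sample(pool_probs, 2)
```

In a full loop, the selected documents would be labeled by human coders, added to the training set, and the classifier retrained before the next batch is chosen.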
