Lender Trust on the P2P Lending: Analysis Based on Sentiment Analysis of Comment Text

Beibei Niu; Jinzheng Ren; Ansa Zhao; Xiaotao Li

doi:10.3390/su12083293

Sustainability ◽

10.3390/su12083293 ◽

2020 ◽

Vol 12 (8) ◽

pp. 3293 ◽

Cited By ~ 2

Author(s):

Beibei Niu ◽

Jinzheng Ren ◽

Ansa Zhao ◽

Xiaotao Li

Keyword(s):

Theoretical Basis ◽

Analytical Approach ◽

Latent Dirichlet Allocation ◽

Topic Model ◽

Text Data ◽

The Core ◽

Operational Level ◽

Core Subject ◽

P2P Lending ◽

Subject Areas

Lender trust is important to ensure the sustainability of P2P lending. This paper uses web crawling to collect more than 240,000 unique pieces of comment text data. Based on the mapping relationship between emotion and trust, we use a lexicon-based method and deep learning to assess the trust of a given lender in P2P lending. Further, we use the Latent Dirichlet Allocation (LDA) topic model to mine the topics relevant to this research. The results show that lenders are positive about P2P lending, though this tendency fluctuates downward with time. The security, rate of return, and compliance of P2P lending are the issues of greatest concern to lenders. This study reveals the core subject areas that influence a lender’s emotions and trust and provides a theoretical basis and empirical reference for relevant platforms to improve their operational level while enhancing competitiveness. This analytical approach offers insights for researchers to understand the hidden content behind the text data.
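
The lexicon-based step mentioned in the abstract above can be illustrated with a minimal sketch. The word lists, the negation rule, and the function names `lexicon_score` and `label` are all hypothetical: the abstract does not say which lexicon or scoring rule the authors used, and whitespace tokenization of English text is assumed here only for readability.

```python
# Hypothetical toy lexicon; the paper's actual lexicon is not given in the abstract.
POSITIVE = {"safe", "reliable", "good", "stable"}
NEGATIVE = {"risky", "fraud", "late", "bad"}
NEGATORS = {"not", "never", "no"}


def lexicon_score(comment: str) -> int:
    """Add +1 per positive word and -1 per negative word.

    A negator flips the polarity of the next token only (assumed rule).
    """
    score = 0
    negate = False
    for token in comment.lower().split():
        word = token.strip(".,!?;:")
        if word in NEGATORS:
            negate = True
            continue
        polarity = (word in POSITIVE) - (word in NEGATIVE)
        if polarity:
            score += -polarity if negate else polarity
        negate = False
    return score


def label(comment: str) -> str:
    """Map the numeric score to a coarse sentiment label."""
    score = lexicon_score(comment)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

A comment that matches no lexicon entry is labeled neutral, which is one reason lexicon methods are often paired with a learned classifier, as the abstract describes.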

An exploration of text mining of narrative reports of injury incidents to assess risk

MATEC Web of Conferences ◽

10.1051/matecconf/201825106020 ◽

2018 ◽

Vol 251 ◽

pp. 06020 ◽

Cited By ~ 4

Author(s):

David Passmore ◽

Chungil Chae ◽

Yulia Kustikova ◽

Rose Baker ◽

Jeong-Ha Yim

Keyword(s):

Latent Dirichlet Allocation ◽

Topic Model ◽

Surface Mining ◽

Modeling Processes ◽

Free Text ◽

Text Data ◽

Injury Occurrence ◽

The USA ◽

Musculoskeletal Systems ◽

Topic Mining

A topic model was explored using unsupervised machine learning to summarize free-text narrative reports of 77,215 injuries that occurred in coal mines in the USA between 2000 and 2015. Latent Dirichlet Allocation modeling processes identified six topics from the free-text data. One topic, a theme describing primarily injury incidents resulting in strains and sprains of musculoskeletal systems, revealed differences in topic emphasis by the location of the mine property at which injuries occurred, the degree of injury, and the year of injury occurrence. Text narratives clustered around this topic refer most frequently to surface or other locations rather than underground locations that resulted in disability and that, also, increased secularly over time. The modeling success enjoyed in this exploratory effort suggests that additional topic mining of these injury text narratives is justified, especially using a broad set of covariates to explain variations in topic emphasis and for comparison of surface mining injuries with injuries occurring during site preparation for construction.

A New Vector Representation of Short Texts for Classification

The International Arab Journal of Information Technology ◽

10.34028/iajit/17/2/12 ◽

2019 ◽

Vol 17 (2) ◽

pp. 241-249

Author(s):

Yangyang Li ◽

Bo Liu

Keyword(s):

Text Classification ◽

Web Search ◽

Latent Dirichlet Allocation ◽

Topic Model ◽

Classification Performance ◽

New Method ◽

Data Sets ◽

Text Data ◽

Short Text ◽

Space Model

Short and sparse characteristics, synonyms, and homonyms are the main obstacles to short-text classification. In recent years, research on short-text classification has focused on expanding short texts but has barely guaranteed the validity of expanded words. This study proposes a new method to weaken these effects without external knowledge. The proposed method analyses short texts by using the topic model based on Latent Dirichlet Allocation (LDA), represents each short text by using a vector space model and presents a new method to adjust the vector of short texts. In the experiments, two open short-text data sets composed of Google News and web search snippets are utilised to evaluate the classification performance and prove the effectiveness of our method.

Ldagibbs: A Command for Topic Modeling in Stata Using Latent Dirichlet Allocation

The Stata Journal Promoting communications on statistics and Stata ◽

10.1177/1536867x1801800107 ◽

2018 ◽

Vol 18 (1) ◽

pp. 101-117 ◽

Cited By ~ 10

Author(s):

Carlo Schwarz

Keyword(s):

Machine Learning ◽

Probability Distribution ◽

Topic Modeling ◽

Latent Dirichlet Allocation ◽

Topic Model ◽

Topic Models ◽

Text Documents ◽

Text Data ◽

Dirichlet Allocation

In this article, I introduce the ldagibbs command, which implements latent Dirichlet allocation in Stata. Latent Dirichlet allocation is the most popular machine-learning topic model. Topic models automatically cluster text documents into a user-chosen number of topics. Latent Dirichlet allocation represents each document as a probability distribution over topics and represents each topic as a probability distribution over words. Therefore, latent Dirichlet allocation provides a way to analyze the content of large unclassified text data and an alternative to predefined document classifications.
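
The two distributions described in the abstract above (documents over topics, topics over words) can be estimated with a collapsed Gibbs sampler. The following is a generic pure-Python sketch for illustration, not the `ldagibbs` implementation; the function name `lda_gibbs`, its parameters, and the default hyperparameters are assumptions.

```python
import random


def lda_gibbs(docs, num_topics, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Collapsed Gibbs sampling for LDA on tokenized documents.

    Returns (theta, phi, vocab): theta[d] is document d's distribution
    over topics, phi[k] is topic k's distribution over vocab.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    num_words = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]               # count of topic k in doc d
    topic_word = [[0] * num_words for _ in range(num_topics)]  # count of word v in topic k
    topic_total = [0] * num_topics                             # tokens assigned to topic k
    assignments = []

    # Random initial topic for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # Conditional probability of each topic, up to a constant.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + num_words * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    theta = [
        [(c + alpha) / (len(doc) + num_topics * alpha) for c in doc_topic[d]]
        for d, doc in enumerate(docs)
    ]
    phi = [
        [(c + beta) / (topic_total[k] + num_words * beta) for c in topic_word[k]]
        for k in range(num_topics)
    ]
    return theta, phi, vocab
```

Each row of `theta` and `phi` sums to one, matching the description of documents and topics as probability distributions. The number of topics is supplied by the caller, as the abstract notes.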

CLDA: An Effective Topic Model for Mining User Interest Preference under Big Data Background

Complexity ◽

10.1155/2018/2503816 ◽

2018 ◽

Vol 2018 ◽

pp. 1-10 ◽

Cited By ~ 5

Author(s):

Lirong Qiu ◽

Jia Yu

Keyword(s):

Big Data ◽

Topic Modeling ◽

Latent Dirichlet Allocation ◽

Topic Model ◽

User Interest ◽

Text Data ◽

Data Set ◽

Data Sparsity ◽

Short Text ◽

Text Filtering

In the era of big data, extracting useful information effectively remains an open problem. The purpose of this study is to construct a more effective method for mining the interest preferences of users in a particular field. We mainly study a large body of user text data from microblogs. LDA is an effective text-mining method, but it performs poorly when applied directly to the large number of short texts found in microblogs. In current topic-modeling practice, short texts are aggregated into long texts to avoid data sparsity. However, aggregated short texts are mixed with a lot of noise, which reduces the accuracy of mining users' interest preferences. In this paper, we propose Combining Latent Dirichlet Allocation (CLDA), a new topic model that can learn the latent topics of microblog short texts and long texts simultaneously. The data sparsity of short texts is avoided by using aggregated long texts to assist in learning short texts. The short texts are in turn used to filter the long texts, which improves mining accuracy, so that long texts and short texts are effectively combined. Experimental results on a real microblog data set show that CLDA outperforms many advanced models in mining user interest, and we also confirm that CLDA performs well in recommender systems.
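
The aggregation step described in the abstract above, in which short texts are pooled into longer pseudo-documents before topic modeling, can be sketched as follows. This shows only plain pooling by author, which is an assumed aggregation key; it is not the CLDA model, and the function name `pool_by_user` and the `(user_id, text)` input shape are hypothetical.

```python
from collections import defaultdict


def pool_by_user(posts):
    """Concatenate each user's short posts into one long token list.

    posts: iterable of (user_id, text) pairs (assumed input shape).
    Returns a dict mapping user_id to the pooled tokens, in post order.
    """
    pooled = defaultdict(list)
    for user_id, text in posts:
        pooled[user_id].extend(text.lower().split())
    return dict(pooled)


# Hypothetical example data.
example_posts = [
    ("u1", "New phone camera test"),
    ("u2", "Match tonight"),
    ("u1", "Camera lens review"),
]
pooled_docs = pool_by_user(example_posts)
```

Pooling gives each pseudo-document more word co-occurrences to learn from, but it also mixes unrelated posts by the same user, which is the noise problem the abstract raises.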

A Text-Mining Analysis on the Review of the Non-Financial Reporting Directive: Bringing Value Creation for Stakeholders into Accounting

Sustainability ◽

10.3390/su13020763 ◽

2021 ◽

Vol 13 (2) ◽

pp. 763

Author(s):

Simona Fiandrino ◽

Alberto Tonelli

Keyword(s):

Text Mining ◽

Financial Reporting ◽

Value Creation ◽

Latent Dirichlet Allocation ◽

Topic Model ◽

Current Debate ◽

The Core ◽

Probabilistic Topic Model ◽

Integrated Logic

The recent Review of the Non-Financial Reporting Directive (NFRD) aims to enhance adequate non-financial information (NFI) disclosure and improve accountability for stakeholders. This study focuses on this regulatory intervention and has a twofold objective: First, it aims to understand the main underlying issues at stake; second, it suggests areas of possible amendment considering the current debates on sustainability accounting and accounting for stakeholders. In keeping with these aims, the research analyzes the documents annexed to the contribution on the Review of the NFRD by conducting a text-mining analysis with latent Dirichlet allocation (LDA) probabilistic topic model (PTM). Our findings highlight four main topics at the core of the current debate: quality of NFI, standardization, materiality, and assurance. The research suggests ways of improving managerial policies to achieve more comparable, relevant, and reliable information by bringing value creation for stakeholders into accounting. It further addresses an integrated logic of accounting for stakeholders that contributes to sustainable development.

Federated Latent Dirichlet Allocation: A Local Differential Privacy Based Framework

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i04.6096 ◽

2020 ◽

Vol 34 (04) ◽

pp. 6283-6290 ◽

Cited By ~ 2

Author(s):

Yansheng Wang ◽

Yongxin Tong ◽

Dingyuan Shi

Keyword(s):

Data Privacy ◽

Latent Dirichlet Allocation ◽

Differential Privacy ◽

Topic Model ◽

Text Data ◽

Data Collector ◽

Industrial Grade ◽

Model Training ◽

Open Datasets ◽

Dirichlet Allocation

Latent Dirichlet Allocation (LDA) is a widely adopted topic model for industrial-grade text mining applications. However, its performance heavily relies on the collection of a large amount of text data from users' everyday life for model training. Such data collection risks severe privacy leakage if the data collector is untrustworthy. To protect text data privacy while allowing accurate model training, we investigate federated learning of LDA models. That is, the model is collaboratively trained between an untrustworthy data collector and multiple users, where raw text data of each user are stored locally and not uploaded to the data collector. To this end, we propose FedLDA, a local differential privacy (LDP) based framework for federated learning of LDA models. Central in FedLDA is a novel LDP mechanism called Random Response with Priori (RRP), which provides theoretical guarantees on both data privacy and model accuracy. We also design techniques to reduce the communication cost between the data collector and the users during model training. Extensive experiments on three open datasets verified the effectiveness of our solution.
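
For readers unfamiliar with local differential privacy, the classic binary randomized response mechanism gives a sense of how a user can perturb data before it leaves the device. The sketch below is that textbook mechanism, not the RRP mechanism proposed in the paper, whose details are not given in the abstract; the function names are hypothetical.

```python
import math
import random


def keep_probability(epsilon: float) -> float:
    """Probability of reporting the true bit under epsilon-LDP."""
    return math.exp(epsilon) / (1.0 + math.exp(epsilon))


def randomize_bit(bit: int, epsilon: float, rng: random.Random) -> int:
    """Report the true bit with probability p, the flipped bit otherwise."""
    p = keep_probability(epsilon)
    return bit if rng.random() < p else 1 - bit


def estimate_frequency(reports, epsilon: float) -> float:
    """Unbiased estimate of the true fraction of 1s from noisy reports.

    Requires epsilon > 0; at epsilon == 0 the reports carry no signal.
    """
    p = keep_probability(epsilon)
    observed = sum(reports) / len(reports)
    return (observed - (1.0 - p)) / (2.0 * p - 1.0)
```

Each individual report is deniable, yet the collector can still recover aggregate statistics. Smaller values of `epsilon` give stronger privacy and noisier estimates.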

Predicting the citation and impact factor of terms for scientific publications using machine learning algorithms

CPT2020 The 8th International Scientific Conference on Computing in Physics and Technology Proceedings ◽

10.30987/conferencearticle_5fd755c0ea6458.82600196 ◽

2020 ◽

Author(s):

Aleksey Klokov ◽

Evgenii Slobodyuk ◽

Michael Charnine

Keyword(s):

Machine Learning ◽

Semantic Processing ◽

The Body ◽

Machine Learning Algorithms ◽

Scientific Publications ◽

Text Data ◽

Semantic Relationships ◽

Subject Areas ◽

The Subject ◽

Scientific Environment

The object of this research is the body of text data collected together with the scientific advisor and the natural language processing algorithms used to analyze it. A series of hypotheses has been tested against computer science scientific publications through a series of simulation experiments described in this dissertation. The subject of the research is algorithms and their results, aimed at predicting promising topics and terms that appear over time in the scientific environment. The result of this work is a set of machine learning models, with the help of which experiments were carried out to identify promising terms and semantic relationships in the text corpus. The resulting models can be used for semantic processing and analysis of other subject areas.

Intelligent radar software defect classification approach based on the latent Dirichlet allocation topic model

EURASIP Journal on Advances in Signal Processing ◽

10.1186/s13634-021-00761-3 ◽

2021 ◽

Vol 2021 (1) ◽

Author(s):

Xi Liu ◽

Yongfeng Yin ◽

Haifeng Li ◽

Jiabin Chen ◽

Chang Liu ◽

...

Keyword(s):

Latent Dirichlet Allocation ◽

Topic Model ◽

Recall Rate ◽

Defect Classification ◽

Software Defects ◽

Classification Approach ◽

Software Defect ◽

Model Combining ◽

Dirichlet Allocation

Existing software intelligent defect classification approaches do not consider radar characters and prior statistics information. Thus, when applying these approaches to radar software testing and validation, the precision rate and recall rate of defect classification are poor, which affects the reuse effectiveness of software defects. To solve this problem, a new intelligent defect classification approach based on the latent Dirichlet allocation (LDA) topic model is proposed for radar software in this paper. The proposed approach includes the defect text segmentation algorithm based on the dictionary of radar domain, the modified LDA model combining radar software requirement, and the top acquisition and classification approach of radar software defect based on the modified LDA model. The proposed approach is applied to typical radar software defects to validate its effectiveness and applicability. The application results illustrate that the prediction precision rate and recall rate of the proposed approach are improved by up to 15 ~ 20% compared with the other defect classification approaches. Thus, the proposed approach can be applied in the segmentation and classification of radar software defects effectively to improve the identifying adequacy of the defects in radar software.
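
The precision rate and recall rate reported in the abstract above are standard classification metrics. A minimal sketch of how they are computed for one defect class follows; the function name `precision_recall` and the class labels in the example are hypothetical and do not come from the paper.

```python
def precision_recall(true_labels, predicted_labels, target):
    """Precision and recall for a single class `target`.

    precision = TP / (TP + FP), recall = TP / (TP + FN);
    both are reported as 0.0 when their denominator is zero.
    """
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == target and p == target)
    fp = sum(1 for t, p in pairs if t != target and p == target)
    fn = sum(1 for t, p in pairs if t == target and p != target)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical defect classes, for illustration only.
truth = ["timing", "timing", "display", "display", "timing"]
predicted = ["timing", "display", "display", "timing", "timing"]
timing_precision, timing_recall = precision_recall(truth, predicted, "timing")
```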

Research progress and trend of leader member exchange based on social complex network and latent dirichlet allocation topic model

2020 2nd International Conference on Economic Management and Model Engineering (ICEMME) ◽

10.1109/icemme51517.2020.00090 ◽

2020 ◽

Author(s):

Zhang chunyang ◽

Ding kun ◽

Zhang chunbo ◽

Zhang li

Keyword(s):

Complex Network ◽

Latent Dirichlet Allocation ◽

Topic Model ◽

Research Progress ◽

Leader Member Exchange ◽

Member Exchange ◽

Dirichlet Allocation

Technology Hotspot Tracking: Topic Discovery and Evolution of China’s Blockchain Patents Based on a Dynamic LDA Model

Symmetry ◽

10.3390/sym13030415 ◽

2021 ◽

Vol 13 (3) ◽

pp. 415

Author(s):

Jinli Wang ◽

Yong Fan ◽

Hui Zhang ◽

Libo Feng

Keyword(s):

Latent Dirichlet Allocation ◽

Topic Model ◽

Graph Model ◽

Representation Learning ◽

Research Direction ◽

Calculation Model ◽

Topic Evolution ◽

Blockchain Technology ◽

The Status ◽

Research Hotspots

Tracking scientific and technological (S&T) research hotspots can help scholars to grasp the status of current research and develop regular patterns in the field over time. It contributes to the generation of new ideas and plays an important role in promoting the writing of scientific research projects and scientific papers. Patents are important S&T resources, which can reflect the development status of the field. In this paper, we use topic modeling, topic intensity, and evolutionary computing models to discover research hotspots and development trends in the field of blockchain patents. First, we propose a time-based dynamic latent Dirichlet allocation (TDLDA) modeling method based on a probabilistic graph model and knowledge representation learning for patent text mining. Second, we present a computational model, topic intensity (TI), that expresses the topic strength and evolution. Finally, the point-wise mutual information (PMI) value is used to evaluate topic quality. We obtain 20 hot topics through TDLDA experiments and rank them according to the strength calculation model. The topic evolution model is used to analyze the topic evolution trend from the perspectives of rising, falling, and stable. From the experiments we found that 8 topics showed an upward trend, 6 topics showed a downward trend, and 6 topics became stable or fluctuated. Compared with the baseline method, TDLDA can have the best effect when K is 40 or less. TDLDA is an effective topic model that can extract hot topics and evolution trends of blockchain patent texts, which helps researchers to more accurately grasp the research direction and improves the quality of project application and paper writing in the blockchain technology domain.
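
The point-wise mutual information (PMI) score used in the abstract above to evaluate topic quality can be sketched as the average PMI over pairs of a topic's top words, with probabilities estimated from document co-occurrence. The averaging scheme, the smoothing constant `eps`, and the function name `topic_pmi` are assumptions; the abstract does not specify how the authors computed the score.

```python
import math
from itertools import combinations


def topic_pmi(top_words, docs, eps=1e-12):
    """Average pairwise PMI of a topic's top words.

    PMI(a, b) = log(P(a, b) / (P(a) * P(b))), where each probability is
    the fraction of documents containing the word(s). `eps` keeps the
    logarithm finite when a pair never co-occurs (assumed smoothing).
    """
    if len(top_words) < 2:
        raise ValueError("need at least two top words")
    doc_sets = [set(doc) for doc in docs]
    total = len(doc_sets)

    def prob(*words):
        return sum(1 for s in doc_sets if all(w in s for w in words)) / total

    scores = []
    for a, b in combinations(top_words, 2):
        joint = prob(a, b)
        scores.append(math.log((joint + eps) / (prob(a) * prob(b) + eps)))
    return sum(scores) / len(scores)
```

Words that tend to appear in the same documents give a higher score, so a topic whose top words co-occur often is rated as more coherent than one whose top words rarely meet.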
