Word Vector Models Approach to Text Regression of Financial Risk Prediction

Hsiang-Yuan Yeh; Yu-Ching Yeh; Da-Bai Shen

doi:10.3390/sym12010089

Symmetry ◽

10.3390/sym12010089 ◽

2020 ◽

Vol 12 (1) ◽

pp. 89 ◽

Cited By ~ 1

Author(s):

Hsiang-Yuan Yeh ◽

Yu-Ching Yeh ◽

Da-Bai Shen

Keyword(s):

Financial Risk ◽

Information Economics ◽

Bag Of Words ◽

Word Embeddings ◽

Textual Information ◽

Financial Reports ◽

Domain Specific ◽

Regulatory Changes ◽

Wide Range ◽

Vector Representations

Linking textual information in financial reports to stock return volatility offers useful insights for risk management. We model textual information with three kinds of word vector representations: bag-of-words, pre-trained word embeddings, and domain-specific word embeddings. We apply linear and non-linear methods to build a text regression model for volatility prediction. The experiments use a large collection of annually published financial reports from 1996 to 2013. We demonstrate that domain-specific word vectors learned from the data not only capture lexical semantics but also outperform pre-trained word embeddings and the traditional bag-of-words model. Our approach achieves a significantly smaller prediction error in the regression task and a 4%–10% improvement in the ranking task compared with state-of-the-art methods. These improvements suggest that textual information may have measurable effects on long-term volatility forecasting. We also find that variations and regulatory changes in reports make older reports less relevant for volatility prediction. Our approach opens a new line of research in information economics and can be applied to a wide range of finance-related applications.
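As a rough illustration of text regression on a bag-of-words representation, the sketch below fits a linear model that maps term counts to a volatility value. It is not the authors' implementation: ridge regression stands in for the unspecified linear method, and the vocabulary, documents, and volatility values are invented toy data.

```python
from typing import List

Vector = List[float]

def bag_of_words(doc: str, vocab: List[str]) -> Vector:
    """Count how often each vocabulary term occurs in the document."""
    tokens = doc.lower().split()
    return [float(tokens.count(term)) for term in vocab]

def fit_ridge(X: List[Vector], y: Vector, lam: float = 1e-3) -> Vector:
    """Solve (X^T X + lam * I) w = X^T y with Gauss-Jordan elimination."""
    d = len(X[0])
    # Augmented matrix [X^T X + lam * I | X^T y].
    M = [
        [sum(r[i] * r[j] for r in X) + (lam if i == j else 0.0) for j in range(d)]
        + [sum(r[i] * t for r, t in zip(X, y))]
        for i in range(d)
    ]
    for col in range(d):
        # Partial pivoting keeps the elimination numerically stable.
        pivot = max(range(col, d), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        pv = M[col][col]
        M[col] = [v / pv for v in M[col]]
        for r in range(d):
            if r != col:
                f = M[r][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][d] for i in range(d)]

def predict(w: Vector, x: Vector) -> float:
    return sum(wi * xi for wi, xi in zip(w, x))

# Toy data (assumed): each "report" is paired with a made-up volatility value.
vocab = ["loss", "growth"]
reports = ["loss loss growth", "growth growth", "loss"]
volatility = [0.5, 0.2, 0.2]

X = [bag_of_words(doc, vocab) for doc in reports]
weights = fit_ridge(X, volatility)
forecast = predict(weights, bag_of_words("loss growth growth", vocab))
```

Replacing `bag_of_words` with a function that averages pre-trained or domain-specific word vectors would leave the regression step unchanged, which is the comparison the abstract describes.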

Domain Heuristic Fusion of Multi-Word Embeddings for Nutrient Value Prediction

Mathematics ◽

10.3390/math9161941 ◽

2021 ◽

Vol 9 (16) ◽

pp. 1941

Author(s):

Gordana Ispirova ◽

Tome Eftimov ◽

Barbara Koroušić Seljak

Keyword(s):

Machine Learning ◽

Prediction Models ◽

Nutrient Content ◽

Relevant Information ◽

Word Embeddings ◽

Short Text ◽

Domain Specific ◽

Nutrient Value ◽

Protein Prediction ◽

Vector Representations

Being both a poison and a cure for many lifestyle and non-communicable diseases, food is becoming a prime focus of precision medicine. Monitoring a few groups of nutrients is crucial for some patients, and methods for easing their calculation are emerging. Our proposed machine learning pipeline predicts nutrient values from learned vector representations of short texts (recipe names). In this study, we explored how the prediction results change when, instead of the vector representations of the recipe description, we use the embeddings of the list of ingredients. The nutrient content of a food depends on its ingredients; therefore, the ingredient text contains more relevant information. We define a domain-specific heuristic for merging the ingredient embeddings, which incorporates the quantity of each ingredient, and use the merged embeddings as features in machine learning models for nutrient prediction. The experimental results indicate that prediction improves when the domain-specific heuristic is used. The models for protein prediction were highly effective, with accuracies up to 97.98%. Implementing a domain-specific heuristic for combining multi-word embeddings yields better results than conventional merging heuristics, with up to 60% higher accuracy in some cases.
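The abstract does not spell out the merging heuristic, so the following is only a sketch of one plausible reading: each ingredient embedding is weighted by its share of the total quantity, and the result is compared with a conventional unweighted average. The function names, the two-dimensional embeddings, and the gram quantities are all hypothetical.

```python
from typing import List, Tuple

Vector = List[float]

def average_merge(vectors: List[Vector]) -> Vector:
    """Conventional heuristic: unweighted mean of the ingredient vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def quantity_weighted_merge(items: List[Tuple[Vector, float]]) -> Vector:
    """Assumed domain heuristic: weight each vector by its share of total quantity."""
    total = sum(quantity for _, quantity in items)
    merged = [0.0] * len(items[0][0])
    for vector, quantity in items:
        for i, value in enumerate(vector):
            merged[i] += (quantity / total) * value
    return merged

# Toy embeddings and quantities in grams, invented for illustration.
embeddings = {"flour": [1.0, 0.0], "egg": [0.0, 1.0]}
recipe = [("flour", 300.0), ("egg", 100.0)]

plain = average_merge([embeddings[name] for name, _ in recipe])
weighted = quantity_weighted_merge([(embeddings[name], grams) for name, grams in recipe])
# plain is [0.5, 0.5]; weighted is [0.75, 0.25], so flour dominates the recipe vector.
```

Either merged vector could then serve as the feature input to a nutrient prediction model.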

FRIDAYS: A Financial Risk Information Detecting and Analyzing System

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33019853 ◽

2019 ◽

Vol 33 ◽

pp. 9853-9854

Author(s):

Chi-Han Du ◽

Yi-Shyuan Chiang ◽

Kun-Che Tsai ◽

Liang-Chih Liu ◽

Ming-Feng Tsai ◽

...

Keyword(s):

Financial Risk ◽

Risk Information ◽

Financial Reports ◽

Domain Specific ◽

Different Levels

We present FRIDAYS, a financial risk information detecting and analyzing system that enables financial professionals to efficiently comprehend financial reports in terms of risk and domain-specific sentiment cues. Our system is designed to integrate multiple NLP models trained on financial reports but on different levels (i.e., word, multi-word, and sentence levels) and to illustrate the prediction results generated by the models. The system is available online at https://cfda.csie.org/FRIDAYS/.

Interim Reporting Frequency and the Mispricing of Accruals

Accounting Horizons ◽

10.2308/acch-52097 ◽

2018 ◽

Vol 32 (3) ◽

pp. 29-47

Author(s):

Shou-Min Tsao ◽

Hsueh-Tien Lu ◽

Edmund C. Keung

Keyword(s):

Financial Reporting ◽

Stock Prices ◽

Data Availability ◽

Financial Reports ◽

Regulatory Changes ◽

Sample Composition ◽

Reporting Frequency ◽

Over Time ◽

Jel Classifications ◽

Accrual Mispricing

SYNOPSIS: This study examines the association between mandatory financial reporting frequency and the accrual anomaly. Based on regulatory changes in reporting frequency requirements in Taiwan, we divide our sample period into three reporting regimes: a semiannual reporting regime from 1982 to 1985, a quarterly reporting regime from 1986 to 1987, and a monthly reporting regime (both quarterly financial reports and monthly revenue disclosure) from 1988 to 1993. We find that although both switches (from the semiannual reporting regime to the quarterly reporting regime and from the quarterly reporting regime to the monthly reporting regime) hasten the dissemination of the information contained in annual accruals into stock prices and reduce annual accrual mispricing, the switch to monthly reporting has a lesser effect. Our results are robust to controlling for risk factors, transaction costs, and potential changes in accrual, cash flow persistence, and sample composition over time. These results imply that more frequent reporting is one possible mechanism to reduce accrual mispricing. JEL Classifications: G14; L51; M41; M48. Data Availability: Data are available from sources identified in the paper.

Development and Evaluation of Novel Ophthalmology Domain-Specific Neural Word Embeddings to Predict Visual Prognosis

International Journal of Medical Informatics ◽

10.1016/j.ijmedinf.2021.104464 ◽

2021 ◽

pp. 104464

Author(s):

Sophia Wang ◽

Benjamin Tseng ◽

Tina Hernandez-Boussard

Keyword(s):

Word Embeddings ◽

Visual Prognosis ◽

Domain Specific

Audit quality within adverse selection markets

Asian Review of Accounting ◽

10.1108/ara-12-2015-0127 ◽

2016 ◽

Vol 24 (1) ◽

pp. 2-18 ◽

Cited By ~ 3

Author(s):

Bharat Sarath

Keyword(s):

Private Information ◽

Stock Price ◽

Audit Quality ◽

Information Economics ◽

Economic Market ◽

Financial Reports ◽

Content Type ◽

General Terms ◽

Audit Regulation ◽

Short Run

Purpose – Auditing may be viewed as an arrangement for reducing inefficiencies arising from the fundamental market conflict between a seller who wants as high a price as possible and a buyer who wants to pay as low a price as possible. In more general terms, sellers prefer policies that boost the stock price in the short run whereas buyers would prefer the price to peak when they are ready to sell some time in the future. By framing audited financial reports within this context, the purpose of this paper is to provide some insights regarding both audit institutions and audit regulation. Design/methodology/approach – This paper relies on conceptual arguments and a simple analytical model. Findings – The basic findings are that a unique definition of audit quality is not compatible with the economics of a market where there are conflicts across traders as well as a possibility that some traders hold superior information to others. Even an identification of quality with accuracy fails in this setting of conflict. The inference is that audit quality should be approached from a multi-dimensional perspective rather than a unique measure. Research limitations/implications – While the paper points out difficulties in constructing measures of audit quality extant in the literature, it does not provide any clear empirical suggestions for better measures. Originality/value – The paper brings back into focus issues from information economics that form the bedrock for the study of audited financial statements in equity markets. While the paper is partially a survey and synthesis of some of the latest empirical findings, it describes them within the context of a rational economic market where traders may possess private information. Within such a market, the paper outlines both the conflicts and the benefits inherent to the current institutional arrangements where auditors are paid by incumbent shareholders and overseen by regulators.

Learning Domain-Specific Word Embeddings from COVID-19 Tweets

10.1109/bigdata52589.2021.9671817 ◽

2021 ◽

Author(s):

Steve Aibuedefe Aigbe ◽

Christoph Eick

Keyword(s):

Word Embeddings ◽

Domain Specific

Pelatihan Manajemen Keuangan bagi UMKM "Kelompok Binaan Handayani Catering" di Tengah Covid 19

Jurnal Surya Masyarakat ◽

10.26714/jsm.4.1.2021.60-68 ◽

2021 ◽

Vol 4 (1) ◽

pp. 60

Author(s):

Ahmad Rudi Yulianto ◽

Wahyu Setiawan

Keyword(s):

Capacity Building ◽

Community Service ◽

Cash Flow ◽

Financial Management ◽

Financial Risk ◽

Food Sector ◽

Financial Reports ◽

Business Sustainability ◽

Building Activities ◽

Financial Aspect

MSMEs are a driving sector that contributes greatly to the Indonesian economy. One weakness of MSMEs is that they still lack knowledge and understanding of financial management. Coupled with the Covid-19 epidemic, this has left MSMEs facing various problems, so they need to be strengthened through various skills in order to survive during and after the pandemic. One of the business groups quite affected by Covid-19 is MSMEs in the culinary or food sector, which are indicated to still have weaknesses in the financial aspect. Our MSME partner in this community service program is the Handayani Catering assisted group. The PKM team assists the group through mentoring, empowerment, and capacity building activities in financial management, starting with financial records and bookkeeping as well as various ways to mitigate financial risk. As a result of this activity, the participants began to prepare financial reports and to implement financial management, especially of cash flow, which had previously received little attention. Participants were greatly helped by the preparation of financial reports as an indicator of business sustainability and health.

Tissue-specific cis-regulatory divergence implicates a fatty acid elongase necessary for inhibiting interspecies mating in Drosophila

10.1101/344754 ◽

2018 ◽

Author(s):

Peter A. Combs ◽

Joshua J. Krupp ◽

Neil M. Khosla ◽

Dennis Bua ◽

Dmitri A. Petrov ◽

...

Keyword(s):

Fatty Acid ◽

Candidate Genes ◽

Molecular Mechanisms ◽

Sister Species ◽

F1 Hybrids ◽

Rna Seq ◽

Fatty Acid Elongase ◽

Tissue Specific ◽

Regulatory Changes ◽

Wide Range

Abstract: Pheromones known as cuticular hydrocarbons are a major component of reproductive isolation in Drosophila. Individuals from morphologically similar sister species produce different sets of hydrocarbons that allow potential mates to identify them as a suitable partner. In order to explore the molecular mechanisms underlying speciation, we performed RNA-seq in F1 hybrids to measure tissue-specific cis-regulatory divergence between the sister species D. simulans and D. sechellia. By focusing on cis-regulatory changes specific to female oenocytes, we rapidly identified a small number of candidate genes. We found that one of these, the fatty acid elongase eloF, broadly affects both the complement of hydrocarbons present on D. sechellia females and the propensity of D. simulans males to mate with those females. In addition, knockdown of eloF in the more distantly related D. melanogaster led to a similar shift in hydrocarbons as well as lower interspecific mate discrimination by D. simulans males. Thus, cis-regulatory changes in eloF appear to be a major driver in the sexual isolation of D. simulans from multiple other species. More generally, our RNA-seq approach proved to be far more efficient than QTL mapping in identifying candidate genes; the same framework can be used to pinpoint cis-regulatory drivers of divergence in a wide range of traits differing between any interfertile species.

Inferring Multilingual Domain-Specific Word Embeddings From Large Document Corpora

IEEE Access ◽

10.1109/access.2021.3118093 ◽

2021 ◽

Vol 9 ◽

pp. 137309-137321

Author(s):

Luca Cagliero ◽

Moreno La Quatra

Keyword(s):

Word Embeddings ◽

Domain Specific

Measuring Models

Model-Driven Software Development ◽

10.4018/978-1-60566-006-6.ch007 ◽

2009 ◽

pp. 147-169 ◽

Cited By ~ 6

Author(s):

Martin Monperrus ◽

Jean-Marc Jézéquel ◽

Joël Champeau ◽

Brigitte Hoeltzener

Keyword(s):

Quality Assurance ◽

Software Development ◽

Point Of View ◽

Specific Information ◽

Model Driven Engineering ◽

Text Documents ◽

Model Driven ◽

Domain Specific ◽

Wide Range

Model-Driven Engineering (MDE) is an approach to software development that uses models as primary artifacts, from which code, documentation and tests are derived. One way of assessing quality assurance in a given domain is to define domain metrics. We show that some of these metrics are supported by models. As text documents, models can be considered from a syntactic point of view i.e., thought of as graphs. We can readily apply graph-based metrics to them, such as the number of nodes, the number of edges or the fan-in/fan-out distributions. However, these metrics cannot leverage the semantic structuring enforced by each specific metamodel to give domain specific information. Contrary to graph-based metrics, more specific metrics do exist for given domains (such as LOC for programs), but they lack genericity. Our contribution is to propose one metric, called s, that is generic over metamodels and allows the easy specification of an open-ended wide range of model metrics.
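The graph-based metrics mentioned above (node count, edge count, fan-in and fan-out) can be sketched as follows, treating a model as a list of directed edges between element names. This is an illustrative sketch only: the `graph_metrics` helper and the toy model are assumptions, and it does not reproduce the generic metric the chapter proposes.

```python
from collections import Counter
from typing import Dict, List, Tuple

def graph_metrics(edges: List[Tuple[str, str]]) -> Dict[str, object]:
    """Compute basic graph metrics for a model given as directed (source, target) edges."""
    nodes = {name for edge in edges for name in edge}
    fan_out = Counter(source for source, _ in edges)  # outgoing references per node
    fan_in = Counter(target for _, target in edges)   # incoming references per node
    return {
        "nodes": len(nodes),
        "edges": len(edges),
        "fan_in": {name: fan_in[name] for name in sorted(nodes)},
        "fan_out": {name: fan_out[name] for name in sorted(nodes)},
    }

# Toy model (assumed): an Order references a Customer and an Item; an Item references a Product.
model_edges = [("Order", "Customer"), ("Order", "Item"), ("Item", "Product")]
metrics = graph_metrics(model_edges)
```

Because these counts ignore what each node means in its metamodel, they illustrate the limitation the abstract points out: the numbers are generic but carry no domain-specific information.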