Improved CCG Parsing with Semi-supervised Supertagging

Transactions of the Association for Computational Linguistics ◽

10.1162/tacl_a_00186 ◽

2014 ◽

Vol 2 ◽

pp. 327-338 ◽

Cited By ~ 7

Author(s):

Mike Lewis ◽

Mark Steedman

Keyword(s):

Wall Street Journal ◽

State Of The Art ◽

Training Data ◽

Word Embeddings ◽

Important Goal ◽

Wall Street ◽

Feature Sets ◽

Lexical Categories ◽

Unlabelled Data ◽

Pos Tagger

Current supervised parsers are limited by the size of their labelled training data, making improving them with unlabelled data an important goal. We show how a state-of-the-art CCG parser can be enhanced, by predicting lexical categories using unsupervised vector-space embeddings of words. The use of word embeddings enables our model to better generalize from the labelled data, and allows us to accurately assign lexical categories without depending on a POS-tagger. Our approach leads to substantial improvements in dependency parsing results over the standard supervised CCG parser when evaluated on Wall Street Journal (0.8%), Wikipedia (1.8%) and biomedical (3.4%) text. We compare the performance of two recently proposed approaches for classification using a wide variety of word embeddings. We also give a detailed error analysis demonstrating where using embeddings outperforms traditional feature sets, and showing how including POS features can decrease accuracy.

Download Full-text

Finding Convincing Arguments Using Scalable Bayesian Preference Learning

Transactions of the Association for Computational Linguistics ◽

10.1162/tacl_a_00026 ◽

2018 ◽

Vol 6 ◽

pp. 357-371 ◽

Cited By ~ 4

Author(s):

Edwin Simpson ◽

Iryna Gurevych

Keyword(s):

State Of The Art ◽

Training Data ◽

Preference Learning ◽

Word Embeddings ◽

Inference Method ◽

Linguistic Features ◽

Bayesian Approaches ◽

Stochastic Variational Inference ◽

Crowdsourced Data ◽

Previous State

We introduce a scalable Bayesian preference learning method for identifying convincing arguments in the absence of gold-standard ratings or rankings. In contrast to previous work, we avoid the need for separate methods to perform quality control on training data, predict rankings and perform pairwise classification. Bayesian approaches are an effective solution when faced with sparse or noisy training data, but have not previously been used to identify convincing arguments. One issue is scalability, which we address by developing a stochastic variational inference method for Gaussian process (GP) preference learning. We show how our method can be applied to predict argument convincingness from crowdsourced data, outperforming the previous state-of-the-art, particularly when trained with small amounts of unreliable data. We demonstrate how the Bayesian approach enables more effective active learning, thereby reducing the amount of data required to identify convincing arguments for new users and domains. While word embeddings are principally used with neural networks, our results show that word embeddings in combination with linguistic features also benefit GPs when predicting argument convincingness.

Download Full-text

Towards Robust Semantic Role Labeling

Computational Linguistics ◽

10.1162/coli.2008.34.2.289 ◽

2008 ◽

Vol 34 (2) ◽

pp. 289-310 ◽

Cited By ~ 31

Author(s):

Sameer S. Pradhan ◽

Wayne Ward ◽

James H. Martin

Keyword(s):

Wall Street Journal ◽

State Of The Art ◽

Structural Features ◽

Semantic Features ◽

Semantic Role ◽

Semantic Role Labeling ◽

Identification Task ◽

Wall Street ◽

Starting Point ◽

Corpus Data

Most semantic role labeling (SRL) research has been focused on training and evaluating on the same corpus. This strategy, although appropriate for initiating research, can lead to overtraining to the particular corpus. This article describes the operation of assert, a state-of-the art SRL system, and analyzes the robustness of the system when trained on one genre of data and used to label a different genre. As a starting point, results are first presented for training and testing the system on the PropBank corpus, which is annotated Wall Street Journal (WSJ) data. Experiments are then presented to evaluate the portability of the system to another source of data. These experiments are based on comparisons of performance using PropBanked WSJ data and PropBanked Brown Corpus data. The results indicate that whereas syntactic parses and argument identification transfer relatively well to a new corpus, argument classification does not. An analysis of the reasons for this is presented and these generally point to the nature of the more lexical/semantic features dominating the classification task where more general structural features are dominant in the argument identification task.

Download Full-text

Gated POS-Level Language Model for Authorship Verification

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2020/557 ◽

2020 ◽

Author(s):

Linshu Ouyang ◽

Yongzheng Zhang ◽

Hui Liu ◽

Yige Chen ◽

Yipeng Wang

Keyword(s):

State Of The Art ◽

Language Model ◽

Training Data ◽

Language Models ◽

Effective Parameters ◽

Part Of Speech ◽

Authorship Verification ◽

Verification Methods ◽

Optimal Accuracy ◽

Pos Tagger

Authorship verification is an important problem that has many applications. The state-of-the-art deep authorship verification methods typically leverage character-level language models to encode author-specific writing styles. However, they often fail to capture syntactic level patterns, leading to sub-optimal accuracy in cross-topic scenarios. Also, due to imperfect cross-author parameter sharing, it's difficult for them to distinguish author-specific writing style from common patterns, leading to data-inefficient learning. This paper introduces a novel POS-level (Part of Speech) gated RNN based language model to effectively learn the author-specific syntactic styles. The author-agnostic syntactic information obtained from the POS tagger pre-trained on large external datasets greatly reduces the number of effective parameters of our model, enabling the model to learn accurate author-specific syntactic styles with limited training data. We also utilize a gated architecture to learn the common syntactic writing styles with a small set of shared parameters and let the author-specific parameters focus on each author's special syntactic styles. Extensive experimental results show that our method achieves significantly better accuracy than state-of-the-art competing methods, especially in cross-topic scenarios (over 5\% in terms of AUC-ROC).

Download Full-text

Semi-Supervised Learning under Class Distribution Mismatch

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i04.5763 ◽

2020 ◽

Vol 34 (04) ◽

pp. 3569-3576

Author(s):

Yanbei Chen ◽

Xiatian Zhu ◽

Wei Li ◽

Shaogang Gong

Keyword(s):

Image Classification ◽

Supervised Learning ◽

Error Propagation ◽

State Of The Art ◽

Training Data ◽

Class Distribution ◽

Unlabelled Data ◽

Popular Image ◽

Soft Targets ◽

Novel Algorithm

Semi-supervised learning (SSL) aims to avoid the need for collecting prohibitively expensive labelled training data. Whilst demonstrating impressive performance boost, existing SSL methods artificially assume that small labelled data and large unlabelled data are drawn from the same class distribution. In a more realistic scenario with class distribution mismatch between the two sets, they often suffer severe performance degradation due to error propagation introduced by irrelevant unlabelled samples. Our work addresses this under-studied and realistic SSL problem by a novel algorithm named Uncertainty-Aware Self-Distillation (UASD). Specifically, UASD produces soft targets that avoid catastrophic error propagation, and empower learning effectively from unconstrained unlabelled data with out-of-distribution (OOD) samples. This is based on joint Self-Distillation and OOD filtering in a unified formulation. Without bells and whistles, UASD significantly outperforms six state-of-the-art methods in more realistic SSL under class distribution mismatch on three popular image classification datasets: CIFAR10, CIFAR100, and TinyImageNet.

Download Full-text

Do all fragments count?

Natural Language Engineering ◽

10.1017/s1351324903003140 ◽

2003 ◽

Vol 9 (4) ◽

pp. 307-323 ◽

Cited By ~ 1

Author(s):

RENS BOD

Keyword(s):

Wall Street Journal ◽

Training Data ◽

Minimal Set ◽

Wall Street ◽

Statistical Parsing ◽

History Of ◽

Parse Trees ◽

Dependency Relations

We aim at finding the minimal set of fragments that achieves maximal parse accuracy in Data Oriented Parsing (DOP). Experiments with the Penn Wall Street Journal (WSJ) treebank show that counts of almost arbitrary fragments within parse trees are important, leading to improved parse accuracy over previous models tested on this treebank. We isolate a number of dependency relations which previous models neglect but which contribute to higher accuracy. We show that the history of statistical parsing models displays a tendency towards using more and larger fragments from training data.

Download Full-text

Reading the Image of Race: Neurocriminology, Medical Imaging Technologies and Literary Intervention

10.3366/edinburgh/9781474400046.003.0013 ◽

2018 ◽

Author(s):

Lindsey Andrews ◽

Jonathan M. Metzl

Keyword(s):

Twentieth Century ◽

Wall Street Journal ◽

Genetic Basis ◽

Three Dimensional ◽

Criminal Behaviour ◽

Dominant Model ◽

Wall Street ◽

Behavioural Genetics ◽

Black And White ◽

Brain Scan

On 26 April 2013, the Wall Street Journal published an essay by neurocriminologist Adrian Raine promoting his newest book, The Anatomy of Violence: The Biological Roots of Crime. On the newspaper’s website, an image of a black-and-white brain scan overlaid with handcuffs headed the essay. Clicking ‘play’ turned the image into a video filled with three-dimensional brain illustrations and Raine’s claims that some brains are simply more biologically prone to violence than others. Rejecting what he describes as ‘the dominant model for understanding criminal behaviour in the twentieth century’ – a model based ‘almost exclusively on social and sociological’ explanations – Raine wrote that ‘the genetic basis of criminal behaviour is now well established’ through molecular and behavioural genetics.

Download Full-text

Wall Street Journal Announcements and the Securities Markets

Financial Analysts Journal ◽

10.2469/faj.v38.n2.69 ◽

1982 ◽

Vol 38 (2) ◽

pp. 69-76 ◽

Cited By ~ 16

Author(s):

Dale Morse

Keyword(s):

Wall Street Journal ◽

Securities Markets ◽

Wall Street

Download Full-text

ANALISIS DIVERSIFIKASI INTERNASIONAL : PEMBENTUKAN PORTOFOLIO OPTIMAL INDEKS SAHAM DUNIA (Studi Kasus Pada Indeks Saham Aktif Dunia Versi The Wall Street Journal)

JMM17: Jurnal Ilmu Ekonomi dan Manajemen ◽

10.30996/jmm.v6i02.2995 ◽

2019 ◽

Vol 6 (02) ◽

Author(s):

Rony Mahendra ◽

Erwin Dyah Astawinetu

Keyword(s):

Wall Street Journal ◽

Optimal Portfolio ◽

Stock Index ◽

Wall Street ◽

Expected Return ◽

Single Index ◽

Index Model ◽

Single Index Model ◽

Stock Indices ◽

The Wall Street Journal

The research objective is to establish an optimal portfolio and know the difference between risk and return stock index portfolio candidates and non-candidates. Method used in the preparation of this research portfolio is the single index model, while the samples of this study are active world stock indices version of The Wall Street Journal during the period August 2012 - August 2016 and The Global Dow is used as the benchmark stock index. In establishing the optimal portfolio is used two perspectives: the Rupiah perspective and the U.S. Dollar perspective. The results showed there were three stock indices from the perspective of Rupiah and 8 share index menurutperspektif U.S. Dollar that make up the optimal portfolio, with the cut-of-pointsebesar 0,01393menurut Rupiah perspective and the perspective of 0.0078 US Dollars Based on the perspective of return expectations Rupiah obtained by 0.0258 with a risk of 0.06512. Berdarkan perspective of US Dollars, obtained return expectations at 0.0154 with a risk of 0.0292. From the test results showed that the hypothesis, the return on both perspectives there are significant differences between the index of the candidate, with a non-candidate. Then the risk of stock index, among the candidates, with a non-candidate, the Rupiah perspective there is no difference, but in the perspective of US Dollars, there are significant differences.Keywords: Single Index Model, candidate portfolio, optimal portfolio, expected return, excess return to beta, cut-off-point

Download Full-text

Weibo vs Wall Street Journal: How are Social & Mass Media Different in the Stock Market?

SSRN Electronic Journal ◽

10.2139/ssrn.3434161 ◽

2019 ◽

Author(s):

Hang Dong ◽

Jie Ren ◽

Balaji Padmanabhan ◽

Jeffrey V. Nickerson

Keyword(s):

Mass Media ◽

Stock Market ◽

Wall Street Journal ◽

Wall Street

Download Full-text

Improving Semi-Supervised Learning for Audio Classification with FixMatch

Electronics ◽

10.3390/electronics10151807 ◽

2021 ◽

Vol 10 (15) ◽

pp. 1807

Author(s):

Sascha Grollmisch ◽

Estefanía Cano

Keyword(s):

Neural Networks ◽

Supervised Learning ◽

Transfer Learning ◽

Data Transfer ◽

State Of The Art ◽

Training Data ◽

Audio Classification ◽

Image Domain ◽

Full Dataset ◽

Audio Data

Including unlabeled data in the training process of neural networks using Semi-Supervised Learning (SSL) has shown impressive results in the image domain, where state-of-the-art results were obtained with only a fraction of the labeled data. The commonality between recent SSL methods is that they strongly rely on the augmentation of unannotated data. This is vastly unexplored for audio data. In this work, SSL using the state-of-the-art FixMatch approach is evaluated on three audio classification tasks, including music, industrial sounds, and acoustic scenes. The performance of FixMatch is compared to Convolutional Neural Networks (CNN) trained from scratch, Transfer Learning, and SSL using the Mean Teacher approach. Additionally, a simple yet effective approach for selecting suitable augmentation methods for FixMatch is introduced. FixMatch with the proposed modifications always outperformed Mean Teacher and the CNNs trained from scratch. For the industrial sounds and music datasets, the CNN baseline performance using the full dataset was reached with less than 5% of the initial training data, demonstrating the potential of recent SSL methods for audio data. Transfer Learning outperformed FixMatch only for the most challenging dataset from acoustic scene classification, showing that there is still room for improvement.

Download Full-text