Learning Tractable Word Alignment Models with Complex Constraints

João V. Graça; Kuzman Ganchev; Ben Taskar

doi:10.1162/coli_a_00007

Computational Linguistics ◽

10.1162/coli_a_00007 ◽

2010 ◽

Vol 36 (3) ◽

pp. 481-504 ◽

Cited By ~ 6

Author(s):

João V. Graça ◽

Kuzman Ganchev ◽

Ben Taskar

Keyword(s):

Probabilistic Models ◽

Learning Algorithm ◽

Word Alignment ◽

Word Level ◽

Word Alignments ◽

Symmetry Constraints ◽

Critical Resource ◽

Complex Constraints ◽

Bilingual Text ◽

Efficient Learning

Word-level alignment of bilingual text is a critical resource for a growing variety of tasks. Probabilistic models for word alignment present a fundamental trade-off between richness of captured constraints and correlations versus efficiency and tractability of inference. In this article, we use the Posterior Regularization framework (Graça, Ganchev, and Taskar 2007) to incorporate complex constraints into probabilistic models during learning without changing the efficiency of the underlying model. We focus on the simple and tractable hidden Markov model, and present an efficient learning algorithm for incorporating approximate bijectivity and symmetry constraints. Models estimated with these constraints produce a significant boost in performance as measured by both precision and recall of manually annotated alignments for six language pairs. We also report experiments on two different tasks where word alignments are required: phrase-based machine translation and syntax transfer, and show promising improvements over standard methods.
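As a rough illustration of the bijectivity constraint mentioned in the abstract, the sketch below checks an "at most one in expectation" condition on a matrix of alignment posteriors. It is not the authors' learning algorithm: the function names, the row/column layout of the posteriors, and the toy numbers are all assumptions made for illustration, and the sketch only detects violations rather than projecting the posteriors onto the constraint set.

```python
# Sketch: check an "at most one in expectation" (bijectivity-style) constraint
# on alignment posteriors. All names and numbers are illustrative assumptions.

def expected_counts(posteriors):
    """posteriors[i][j] = P(word i on one side aligns to word j on the other).

    Returns, for each j, the expected number of words i aligned to j.
    """
    n_cols = len(posteriors[0])
    return [sum(row[j] for row in posteriors) for j in range(n_cols)]


def bijectivity_violations(posteriors, limit=1.0, tol=1e-9):
    """Return {j: excess} for every j whose expected count exceeds `limit`."""
    counts = expected_counts(posteriors)
    return {j: c - limit for j, c in enumerate(counts) if c > limit + tol}


# Toy posteriors for 3 words over 3 candidate words (each row sums to 1).
toy = [
    [0.7, 0.2, 0.1],
    [0.6, 0.3, 0.1],
    [0.1, 0.1, 0.8],
]
counts = expected_counts(toy)
violations = bijectivity_violations(toy)
print(counts)      # expected number of words aligned to each candidate
print(violations)  # candidates that attract more than one word in expectation
```

In this toy case only the first candidate word is over-subscribed, which is the kind of situation a bijectivity constraint is meant to discourage during learning.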

Graph-Based Word Alignment for Clinical Language Evaluation

Computational Linguistics ◽

10.1162/coli_a_00232 ◽

2015 ◽

Vol 41 (4) ◽

pp. 549-578 ◽

Cited By ~ 7

Author(s):

Emily Prud'hommeaux ◽

Brian Roark

Keyword(s):

Language Processing ◽

Expectation Maximization ◽

Automated Analysis ◽

Screening Tools ◽

Word Alignment ◽

Word Level ◽

Word Alignments ◽

Language Data ◽

Novel Method ◽

Time Required

Among the more recent applications for natural language processing algorithms has been the analysis of spoken language data for diagnostic and remedial purposes, fueled by the demand for simple, objective, and unobtrusive screening tools for neurological disorders such as dementia. The automated analysis of narrative retellings in particular shows potential as a component of such a screening tool since the ability to produce accurate and meaningful narratives is noticeably impaired in individuals with dementia and its frequent precursor, mild cognitive impairment, as well as other neurodegenerative and neurodevelopmental disorders. In this article, we present a method for extracting narrative recall scores automatically and highly accurately from a word-level alignment between a retelling and the source narrative. We propose improvements to existing machine translation–based systems for word alignment, including a novel method of word alignment relying on random walks on a graph that achieves alignment accuracy superior to that of standard expectation maximization–based techniques for word alignment in a fraction of the time required for expectation maximization. In addition, the narrative recall score features extracted from these high-quality word alignments yield diagnostic classification accuracy comparable to that achieved using manually assigned scores and significantly higher than that achieved with summary-level text similarity metrics used in other areas of NLP. These methods can be trivially adapted to spontaneous language samples elicited with non-linguistic stimuli, thereby demonstrating the flexibility and generalizability of these methods.

Neural network modelling of flow stress and mechanical properties for hot strip rolling of TRIP steel using efficient learning algorithm

Ironmaking & Steelmaking ◽

10.1179/1743281212y.0000000047 ◽

2013 ◽

Vol 40 (4) ◽

pp. 298-304 ◽

Cited By ~ 8

Author(s):

S K Das

Keyword(s):

Neural Network ◽

Mechanical Properties ◽

Flow Stress ◽

Trip Steel ◽

Learning Algorithm ◽

Hot Strip Rolling ◽

Strip Rolling ◽

Network Modelling ◽

Hot Strip ◽

Efficient Learning

Max–min fuzzy Hopfield neural networks and an efficient learning algorithm

Fuzzy Sets and Systems ◽

10.1016/s0165-0114(98)00091-8 ◽

2000 ◽

Vol 112 (1) ◽

pp. 41-49 ◽

Cited By ~ 8

Author(s):

Puyin Liu

Keyword(s):

Neural Networks ◽

Learning Algorithm ◽

Hopfield Neural Networks ◽

Efficient Learning

Contextual-Bandit Based Personalized Recommendation with Time-Varying User Interests

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i04.6125 ◽

2020 ◽

Vol 34 (04) ◽

pp. 6518-6525

Author(s):

Xiao Xu ◽

Fang Dong ◽

Yanghua Li ◽

Shaojian He ◽

Xin Li

Keyword(s):

Learning Algorithm ◽

General Setting ◽

Personalized Recommendation ◽

Time Varying ◽

Bandit Problem ◽

User Interests ◽

Specific Preference ◽

Coefficient Vector ◽

Real World Datasets ◽

Efficient Learning

A contextual bandit problem is studied in a highly non-stationary environment, which is ubiquitous in various recommender systems due to the time-varying interests of users. Two models with disjoint and hybrid payoffs are considered to characterize the phenomenon that users' preferences towards different items vary differently over time. In the disjoint payoff model, the reward of playing an arm is determined by an arm-specific preference vector, which is piecewise-stationary with asynchronous and distinct changes across different arms. An efficient learning algorithm that is adaptive to abrupt reward changes is proposed and theoretical regret analysis is provided to show that a sublinear scaling of regret in the time length T is achieved. The algorithm is further extended to a more general setting with hybrid payoffs where the reward of playing an arm is determined by both an arm-specific preference vector and a joint coefficient vector shared by all arms. Empirical experiments are conducted on real-world datasets to verify the advantages of the proposed learning algorithms against baseline ones in both settings.

An SVM-Based Ensemble Approach for Intrusion Detection

International Journal of Information Technology and Web Engineering ◽

10.4018/ijitwe.2019010104 ◽

2019 ◽

Vol 14 (1) ◽

pp. 66-84 ◽

Cited By ~ 4

Author(s):

Santosh Kumar Sahu ◽

Akanksha Katiyar ◽

Kanchan Mala Kumari ◽

Govind Kumar ◽

Durga Prasad Mohapatra

Keyword(s):

Support Vector Machine ◽

Intrusion Detection ◽

Learning Algorithm ◽

Majority Voting ◽

Support Vector ◽

Detection Model ◽

Ensemble Approach ◽

Efficient Learning

The objective of this article is to develop an intrusion detection model that distinguishes attacks in a network. Building an intrusion detection system (IDS) relies on preprocessing the intrusion data, choosing the most relevant features, and designing an efficient learning algorithm that properly groups normal and malicious examples. In this experiment, the detection model uses an ensemble of a supervised method (SVM) and an unsupervised method (K-Means) to detect the patterns. The technique first divides the data into two clusters using K-Means and then labels the clusters using the Support Vector Machine (SVM). The parameters of K-Means and SVM are tuned and optimized using an intrusion dataset. Individually, the SVM provides up to 88% accuracy and K-Means up to 83%. However, the ensemble of K-Means and SVM provides more than 99% accuracy on three benchmark datasets in less time. The SVM classifies only three randomly chosen instances from each cluster, and the cluster is labeled by majority voting. The proposed approach outperforms earlier ensemble approaches on intrusion datasets.
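The cluster-then-label procedure described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are made-up scalar features, the two-cluster k-means is a bare-bones version, and `stand_in_classifier` is a hypothetical placeholder for a trained SVM.

```python
import random


def kmeans_1d(values, iters=20):
    """Two-cluster k-means on scalars; returns a cluster index per value."""
    centers = [min(values), max(values)]
    assign = [0] * len(values)
    for _ in range(iters):
        # Assignment step: nearest center wins.
        assign = [
            0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            for v in values
        ]
        # Update step: each center moves to the mean of its members.
        for c in (0, 1):
            members = [v for v, a in zip(values, assign) if a == c]
            if members:
                centers[c] = sum(members) / len(members)
    return assign


def stand_in_classifier(v):
    """Hypothetical placeholder for a trained SVM (assumption, not the paper's model)."""
    return "attack" if v > 50 else "normal"


def label_clusters(values, assign, classify, n_samples=3, seed=0):
    """Label each cluster by majority vote over a few randomly sampled members."""
    rng = random.Random(seed)
    labels = {}
    for c in sorted(set(assign)):
        members = [v for v, a in zip(values, assign) if a == c]
        sample = rng.sample(members, min(n_samples, len(members)))
        votes = [classify(v) for v in sample]
        labels[c] = max(set(votes), key=votes.count)
    return labels


traffic = [3, 5, 4, 6, 2, 95, 90, 99, 92]  # made-up feature values
assign = kmeans_1d(traffic)
cluster_labels = label_clusters(traffic, assign, stand_in_classifier)
predictions = [cluster_labels[a] for a in assign]
print(predictions)
```

Only a handful of instances per cluster reach the classifier, so the cost of the supervised step does not grow with the size of the clusters.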

Theoretical and Empirical Analysis of a Spatial EA Parallel Boosting Algorithm

Evolutionary Computation ◽

10.1162/evco_a_00202 ◽

2018 ◽

Vol 26 (1) ◽

pp. 43-66 ◽

Cited By ~ 1

Author(s):

Uday Kamath ◽

Carlotta Domeniconi ◽

Kenneth De Jong

Keyword(s):

Real World ◽

Learning Algorithm ◽

Learning Algorithms ◽

Real World Data ◽

Meta Level ◽

Meta Learning ◽

Robustness To Noise ◽

Boosting Algorithm ◽

Efficient Learning ◽

Empirical Analyses

Many real-world problems involve massive amounts of data. Under these circumstances learning algorithms often become prohibitively expensive, making scalability a pressing issue to be addressed. A common approach is to perform sampling to reduce the size of the dataset and enable efficient learning. Alternatively, one customizes learning algorithms to achieve scalability. In either case, the key challenge is to obtain algorithmic efficiency without compromising the quality of the results. In this article we discuss a meta-learning algorithm (PSBML) that combines concepts from spatially structured evolutionary algorithms (SSEAs) with concepts from ensemble and boosting methodologies to achieve the desired scalability property. We present both theoretical and empirical analyses which show that PSBML preserves a critical property of boosting, specifically, convergence to a distribution centered around the margin. We then present additional empirical analyses showing that this meta-level algorithm provides a general and effective framework that can be used in combination with a variety of learning classifiers. We perform extensive experiments to investigate the trade-off achieved between scalability and accuracy, and robustness to noise, on both synthetic and real-world data. These empirical results corroborate our theoretical analysis, and demonstrate the potential of PSBML in achieving scalability without sacrificing accuracy.

Adaptive Online Learning Algorithms for Blind Separation: Maximum Entropy and Minimum Mutual Information

Neural Computation ◽

10.1162/neco.1997.9.7.1457 ◽

1997 ◽

Vol 9 (7) ◽

pp. 1457-1482 ◽

Cited By ~ 218

Author(s):

Howard Hua Yang ◽

Shun-ichi Amari

Keyword(s):

Mutual Information ◽

Maximum Entropy ◽

Learning Algorithm ◽

Adaptive Method ◽

Descent Method ◽

Stochastic Gradient Descent ◽

Gradient Descent Method ◽

Natural Gradient ◽

Blind Separation ◽

Efficient Learning

There are two major approaches for blind separation: maximum entropy (ME) and minimum mutual information (MMI). Both can be implemented by the stochastic gradient descent method for obtaining the demixing matrix. The MI is the contrast function for blind separation; the entropy is not. To justify the ME, the relation between ME and MMI is first elucidated by calculating the first derivative of the entropy and proving two points: mean subtraction is necessary in applying the ME, and, at the solution points determined by the MI, the ME will not update the demixing matrix in the directions of increasing the cross-talking. Second, the natural gradient instead of the ordinary gradient is introduced to obtain efficient algorithms, because the parameter space is a Riemannian space consisting of matrices. The mutual information is calculated by applying the Gram-Charlier expansion to approximate probability density functions of the outputs. Finally, we propose an efficient learning algorithm that incorporates an adaptive method of estimating the unknown cumulants. It is shown by computer simulation that the convergence of the stochastic descent algorithms is improved by using the natural gradient and the adaptively estimated cumulants.
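For orientation, the step from the ordinary to the natural gradient is commonly written as below. The symbols are assumptions introduced here, since the abstract gives none: x is the observed mixture, W the demixing matrix, y = Wx the output, η a step size, and φ an elementwise nonlinearity whose exact form depends on the density approximation used.

```latex
% Ordinary stochastic-gradient update of the demixing matrix W (y = W x):
\Delta W_{\mathrm{ord}} = \eta \left( W^{-\top} - \varphi(y)\, x^{\top} \right)

% Natural-gradient update: right-multiply the ordinary update by W^{\top} W.
\Delta W_{\mathrm{nat}}
  = \Delta W_{\mathrm{ord}}\, W^{\top} W
  = \eta \left( W - \varphi(y)\, (W x)^{\top} W \right)
  = \eta \left( I - \varphi(y)\, y^{\top} \right) W
```

The natural-gradient form needs no matrix inversion, which is one reason it is usually the cheaper and better-behaved update.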

An efficient learning algorithm for improving generalization performance of radial basis function neural networks

Neural Networks ◽

10.1016/s0893-6080(00)00029-0 ◽

2000 ◽

Vol 13 (4-5) ◽

pp. 545-553 ◽

Cited By ~ 31

Author(s):

Zheng-ou Wang ◽

Tao Zhu

Keyword(s):

Neural Networks ◽

Radial Basis Function ◽

Basis Function ◽

Learning Algorithm ◽

Generalization Performance ◽

Radial Basis ◽

Efficient Learning

Deep Hurdle Networks for Zero-Inflated Multi-Target Regression: Application to Multiple Species Abundance Estimation

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2020/603 ◽

2020 ◽

Author(s):

Shufeng Kong ◽

Junwen Bai ◽

Jae Hee Lee ◽

Di Chen ◽

Andrew Allyn ◽

...

Keyword(s):

Large Scale ◽

Learning Algorithm ◽

Species Abundance ◽

Probit Model ◽

Regression Problem ◽

Deep Model ◽

The Difference ◽

Log Normal ◽

Efficient Learning ◽

Prediction Problems

A key problem in computational sustainability is to understand the distribution of species across landscapes over time. This question gives rise to challenging large-scale prediction problems since (i) hundreds of species have to be simultaneously modeled and (ii) the survey data are usually inflated with zeros due to the absence of species for a large number of sites. The problem of tackling both issues simultaneously, which we refer to as the zero-inflated multi-target regression problem, has not been addressed by previous methods in statistics and machine learning. In this paper, we propose a novel deep model for the zero-inflated multi-target regression problem. To this end, we first model the joint distribution of multiple response variables as a multivariate probit model and then couple the positive outcomes with a multivariate log-normal distribution. By penalizing the difference between the two distributions’ covariance matrices, a link between both distributions is established. The whole model is cast as an end-to-end learning framework and we provide an efficient learning algorithm for our model that can be fully implemented on GPUs. We show that our model outperforms the existing state-of-the-art baselines on two challenging real-world species distribution datasets concerning bird and fish populations.
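To make the two-part structure concrete, a single-species (univariate) hurdle density can be written as below. This is a simplified sketch, not the paper's model, which is multivariate and additionally ties the covariance matrices of the two parts together. The symbols ρ, f, μ and σ are assumptions introduced here: ρ(x) is the presence probability given features x, f and μ are arbitrary (for example, neural) functions of x, and Φ is the standard normal CDF.

```latex
% Presence/absence part (probit link):
\rho(x) = \Phi\bigl(f(x)\bigr)

% Hurdle density: a point mass at zero, log-normal on the positive counts.
p(y \mid x) =
\begin{cases}
  1 - \rho(x), & y = 0, \\[4pt]
  \rho(x)\, \dfrac{1}{y\,\sigma\sqrt{2\pi}}
    \exp\!\left( -\dfrac{\bigl(\ln y - \mu(x)\bigr)^{2}}{2\sigma^{2}} \right), & y > 0.
\end{cases}
```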

An Intrinsically-Motivated Approach for Learning Highly Exploring and Fast Mixing Policies

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i04.5968 ◽

2020 ◽

Vol 34 (04) ◽

pp. 5232-5239

Author(s):

Mirco Mutti ◽

Marcello Restelli

Keyword(s):

Optimal Policy ◽

Optimization Problems ◽

Learning Algorithm ◽

Empirical Evaluation ◽

Steady State Distribution ◽

State Distribution ◽

State Action ◽

Uniform State ◽

Minimum Number ◽

Efficient Learning

What is a good exploration strategy for an agent that interacts with an environment in the absence of external rewards? Ideally, we would like to get a policy driving towards a uniform state-action visitation (highly exploring) in a minimum number of steps (fast mixing), in order to ease efficient learning of any goal-conditioned policy later on. Unfortunately, it is remarkably arduous to directly learn an optimal policy of this nature. In this paper, we propose a novel surrogate objective for learning highly exploring and fast mixing policies, which focuses on maximizing a lower bound to the entropy of the steady-state distribution induced by the policy. In particular, we introduce three novel lower bounds, which lead to as many optimization problems and trade off theoretical guarantees against computational complexity. Then, we present a model-based reinforcement learning algorithm, IDE3AL, to learn an optimal policy according to the introduced objective. Finally, we provide an empirical evaluation of this algorithm on a set of hard-exploration tasks.
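The quantity behind the objective, the entropy of the steady-state distribution induced by a policy, can be computed directly for a small Markov chain. The sketch below does only that; it does not implement IDE3AL or any of the lower bounds from the paper, and the two transition matrices are invented examples standing in for the state dynamics under two hypothetical policies.

```python
import math


def stationary_distribution(P, iters=1000):
    """Power iteration for the stationary distribution of a row-stochastic matrix P."""
    n = len(P)
    d = [1.0 / n] * n
    for _ in range(iters):
        d = [sum(d[s] * P[s][t] for s in range(n)) for t in range(n)]
    return d


def entropy(d):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in d if p > 0.0)


# Invented 3-state chains induced by two hypothetical policies.
well_mixing = [[1 / 3] * 3 for _ in range(3)]
sticky = [
    [0.90, 0.05, 0.05],
    [0.50, 0.40, 0.10],
    [0.50, 0.10, 0.40],
]

max_entropy = math.log(3)  # entropy of the uniform distribution over 3 states
h_mixing = entropy(stationary_distribution(well_mixing))
h_sticky = entropy(stationary_distribution(sticky))
d_sticky = stationary_distribution(sticky)
print(h_mixing, h_sticky, max_entropy)
```

The chain that spends most of its time in one state has a much lower steady-state entropy, which is the sense in which it explores poorly.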
