Calibration of Constraint Promotion Does Not Help with Learning Variation in Stochastic Optimality Theory

2020 ◽  
Vol 51 (1) ◽  
pp. 97-123
Author(s):  
Giorgio Magri ◽  
Benjamin Storme

The Calibrated Error-Driven Ranking Algorithm (CEDRA; Magri 2012) is shown to fail on two test cases of phonologically conditioned variation from Boersma and Hayes 2001. The failure of the CEDRA raises a serious unsolved challenge for learnability research in stochastic Optimality Theory, because the CEDRA itself was proposed to repair a learnability problem (Pater 2008) encountered by the original Gradual Learning Algorithm. This result is supported by both simulation results and a detailed analysis whereby a few constraints and a few candidates at a time are recursively “peeled off” until we are left with a “core” small enough that the behavior of the learner is easy to interpret.

2001 ◽  
Vol 32 (1) ◽  
pp. 45-86 ◽  
Author(s):  
Paul Boersma ◽  
Bruce Hayes

The Gradual Learning Algorithm (Boersma 1997) is a constraint-ranking algorithm for learning optimality-theoretic grammars. The purpose of this article is to assess the capabilities of the Gradual Learning Algorithm, particularly in comparison with the Constraint Demotion algorithm of Tesar and Smolensky (1993, 1996, 1998, 2000), which initiated the learnability research program for Optimality Theory. We argue that the Gradual Learning Algorithm has a number of special advantages: it can learn free variation, deal effectively with noisy learning data, and account for gradient well-formedness judgments. The case studies we examine involve Ilokano reduplication and metathesis, Finnish genitive plurals, and the distribution of English light and dark /l/.
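The abstract above describes the core mechanics of the Gradual Learning Algorithm: constraints carry numeric ranking values, evaluation perturbs them with noise, and errors trigger small promotions and demotions. The following is a minimal sketch of that loop, with invented constraint names, candidates and violation profiles (not drawn from the article):

```python
import random

# Sketch of stochastic-OT evaluation with a GLA-style update.
# Constraints, candidates and violation counts are hypothetical.

ranking = {"Max": 100.0, "Dep": 100.0, "*Coda": 100.0}  # ranking values

# violations[candidate][constraint] = number of violation marks
violations = {
    "pat": {"Max": 0, "Dep": 0, "*Coda": 1},
    "pa":  {"Max": 1, "Dep": 0, "*Coda": 0},
}

def evaluate(noise=2.0):
    """Pick the optimal candidate under a noisily perturbed total ranking."""
    points = {c: r + random.gauss(0.0, noise) for c, r in ranking.items()}
    order = sorted(points, key=points.get, reverse=True)
    # Optimality: lexicographically fewest violations down the hierarchy.
    return min(violations, key=lambda cand: [violations[cand][c] for c in order])

def gla_update(winner, plasticity=0.1):
    """On an error, nudge ranking values toward the observed winner."""
    loser = evaluate()
    if loser == winner:
        return  # no error, no update
    for c in ranking:
        if violations[loser][c] > violations[winner][c]:
            ranking[c] += plasticity  # promote constraints favouring the winner
        elif violations[loser][c] < violations[winner][c]:
            ranking[c] -= plasticity  # demote constraints favouring the loser

for _ in range(1000):
    gla_update("pa")  # the learner repeatedly observes the coda-less form
```

After enough updates the ranking value of *Coda drifts above that of Max, so the learner's errors on this datum die out, illustrating the gradual, error-driven character of the algorithm.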


Author(s):  
Karen Jesney

Many error-driven learning algorithms for constraint-based phonological grammars, including the Gradual Learning Algorithm for Optimality Theory and Harmonic Grammar, predict that more frequent input forms will be acquired earlier than less frequent input forms – a fact that has been commonly taken as a virtue of these models. These models also predict, however, that the rate of learning for more frequent input forms should be faster than the rate of learning for less frequent input forms. In other words, these models predict that sequence and rate of acquisition are related; structures acquired earlier in the course of learning will be acquired more rapidly, while those that are acquired relatively later will be acquired more slowly. This paper explicates these predictions and argues that they are not consistently supported by child language data. Evidence from six children’s acquisition of consonant clusters is presented, demonstrating that, contrary to the predictions of the learning models, learning sequence and rate of acquisition are largely dissociated.


2013 ◽  
Vol 44 (4) ◽  
pp. 569-609 ◽  
Author(s):  
Giorgio Magri

Various authors have recently endorsed Harmonic Grammar (HG) as a replacement for Optimality Theory (OT). One argument for this move is that OT seems not to have close correspondents within machine learning while HG allows methods and results from machine learning to be imported into computational phonology. Here, I prove that this argument in favor of HG and against OT is wrong. In fact, I show that any algorithm for HG can be turned into an algorithm for OT. Hence, HG has no computational advantages over OT. This result allows tools from machine learning to be systematically adapted to OT. As an illustration of this new toolkit for computational OT, I prove convergence for a slight variant of Boersma’s (1998) (nonstochastic) Gradual Learning Algorithm.
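The contrast at issue in the abstract above is between HG's weighted-sum evaluation and OT's strict (lexicographic) domination. A small sketch over invented violation profiles shows how the two modes can pick different winners from the same tableau; the constraints, weights and candidates are illustrative, not from the article:

```python
# Contrast of HG (weighted-sum) and OT (lexicographic) evaluation
# over the same hypothetical violation profiles.

constraints = ["C1", "C2"]          # OT: ranked left-to-right
weights = {"C1": 2.0, "C2": 1.0}    # HG: numeric weights

candidates = {
    "a": {"C1": 0, "C2": 3},  # many violations of the low-ranked constraint
    "b": {"C1": 1, "C2": 0},  # one violation of the high-ranked constraint
}

def ot_winner():
    # Strict domination: compare violation vectors lexicographically.
    return min(candidates, key=lambda k: [candidates[k][c] for c in constraints])

def hg_winner():
    # Least total weighted penalty wins.
    return min(candidates,
               key=lambda k: sum(weights[c] * candidates[k][c] for c in constraints))
```

Under OT the top-ranked C1 decides alone, so "a" wins; under HG the three C2 violations gang up (penalty 3.0 versus 2.0), so "b" wins. It is exactly this gap that makes the reduction of HG algorithms to OT algorithms a nontrivial result.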


2009 ◽  
Vol 40 (4) ◽  
pp. 667-686 ◽  
Author(s):  
Paul Boersma

This article shows that Error-Driven Constraint Demotion (EDCD), an error-driven learning algorithm proposed by Tesar (1995) for Prince and Smolensky's (1993/2004) version of Optimality Theory, can fail to converge to a correct totally ranked hierarchy of constraints, unlike the earlier non-error-driven learning algorithms proposed by Tesar and Smolensky (1993). The cause of the problem is found in Tesar's use of “mark-pooling ties,” indicating that EDCD can be repaired by assuming Anttila's (1997) “permuting ties” instead. Proofs show, and simulations confirm, that totally ranked hierarchies can indeed be found by both this repaired version of EDCD and Boersma's (1998) Minimal Gradual Learning Algorithm.


2005 ◽  
Vol 38 ◽  
pp. 187
Author(s):  
Jason Mattausch

The purpose of this dissertation is to defend the idea that the empirical responsibilities of binding theory can be handled in a more psychologically and historically realistic way when assigned to the field of pragmatics. In particular, I wish to show that Optimality Theory (OT) (Prince & Smolensky, 1993), the stochastic OT and Gradual Learning Algorithm of Boersma (1998), the Recoverability OT of Wilson (2001) and Buchwald et al. (2002), and the bidirectional OT of Blutner (2000b) and Bidirectional Gradual Learning Algorithm of Jäger (2003a) can all participate in a formal framework in which one can spell out and justify the idea that the distributional behavior of bound pronouns and reflexives is a pragmatic phenomenon.


Author(s):  
Vsevolod Kapatsinski

Russian velar palatalization changes velars into alveopalatals before certain suffixes, including the stem extension -i and the diminutive suffixes -ok and -ek/ik. While velar palatalization always applies before the relevant suffixes in the established lexicon, it often fails with nonce loanwords before -i and -ik but not before -ok or -ek. This is shown to be predicted by three models: the Minimal Generalization Learner (MGL), a model of rule induction and weighting developed by Albright and Hayes (Cognition 90: 119–161, 2003); a novel version of Network Theory (Bybee, Morphology: A study of the relation between meaning and form, John Benjamins, 1985; Phonology and language use, Cambridge University Press, 2001), which uses competing unconditional product-oriented schemas weighted by type frequency and paradigm uniformity constraints; and stochastic Optimality Theory with language-specific constraints learned using the Gradual Learning Algorithm (GLA; Boersma, Proceedings of the Institute of Phonetic Sciences of the University of Amsterdam 21: 43–58, 1997). The successful models are shown to predict that a morphophonological rule will fail if the triggering suffix comes to attach to inputs that are not eligible to undergo the rule. This prediction is confirmed in an artificial grammar learning experiment. Under either model, the choice between generalizations or output forms is shown to be stochastic, which requires retrieving known word-forms from the lexicon as wholes, rather than generating them through the grammar. Furthermore, MGL and GLA are shown to succeed only if the suffix and the stem shape are chosen simultaneously, as opposed to the suffix being chosen first and then triggering (or failing to trigger) a stem change.
In addition, the GLA is shown to require output-output faithfulness to be ranked above markedness at the beginning of learning (Hayes, Phonological acquisition in Optimality Theory: the early stages, Cambridge University Press, 2004) to account for the present data.


Phonology ◽  
2020 ◽  
Vol 37 (3) ◽  
pp. 383-418
Author(s):  
Shigeto Kawahara

An experiment showed that Japanese speakers’ judgement of Pokémons’ evolution status on the basis of nonce names is affected both by mora count and by the presence of a voiced obstruent. The effects of mora count are a case of counting cumulativity, and the interaction between the two factors a case of ganging-up cumulativity. Together, the patterns result in what Hayes (2020) calls ‘wug-shaped curves’, a quantitative signature predicted by MaxEnt. I show in this paper that the experimental results can indeed be successfully modelled with MaxEnt, and also that Stochastic Optimality Theory faces an interesting set of challenges. The study was inspired by a proposal made within formal phonology, and reveals important previously understudied aspects of sound symbolism. In addition, it demonstrates how cumulativity is manifested in linguistic patterns. The work here shows that formal phonology and research on sound symbolism can be mutually beneficial.
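The MaxEnt grammar referred to above assigns each candidate a probability proportional to the exponential of its (negative) weighted violation sum, which is what yields smooth cumulative effects. A generic sketch, with invented constraint names, weights and violation counts (none taken from the article):

```python
import math

# Generic MaxEnt grammar: P(candidate) ∝ exp(-Σ w_i · violations_i).
# Weights and violation profiles below are purely illustrative.

weights = {"*VoicedObs": 2.0, "*LongName": 1.0}

def maxent_probs(candidates):
    """Map {candidate: {constraint: violations}} to a probability distribution."""
    harmony = {k: -sum(weights[c] * n for c, n in v.items())
               for k, v in candidates.items()}
    z = sum(math.exp(h) for h in harmony.values())
    return {k: math.exp(h) / z for k, h in harmony.items()}

probs = maxent_probs({
    "evolved":   {"*VoicedObs": 0, "*LongName": 0},
    "unevolved": {"*VoicedObs": 1, "*LongName": 1},
})
# Each additional violated constraint multiplies the odds down by exp(weight),
# so violations gang up multiplicatively in the odds: cumulativity falls out
# of the model rather than being stipulated.
```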


Author(s):  
S N Huang ◽  
K K Tan ◽  
T H Lee

A novel iterative learning controller for linear time-varying systems is developed. The learning law is derived on the basis of a quadratic criterion. This control scheme does not include package information. The advantage of the proposed learning law is that convergence is guaranteed without the need for an empirical choice of parameters. Furthermore, the tracking error on the final iteration will be a class K function of the bounds on the uncertainties. Finally, simulation results show that the proposed controller achieves good setpoint tracking performance.
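Iterative learning control of the kind described above refines the input signal from one trial to the next using the previous trial's tracking error. The sketch below is a generic P-type update u_{k+1} = u_k + L·e_k on a trivially simple scalar plant, not the quadratic-criterion law of the article; plant, gains and trial counts are invented:

```python
# Minimal iterative-learning-control sketch on a scalar, static plant y = b·u.
# Generic P-type law u_{k+1} = u_k + L·e_k (NOT the article's learning law).

N = 20                  # samples per trial
ref = [1.0] * N         # setpoint to track on every iteration
b = 0.5                 # plant input gain
L = 1.0                 # learning gain; converges since |1 - L*b| < 1

u = [0.0] * N
for iteration in range(30):
    y = [b * ui for ui in u]                    # run one trial
    e = [r - yi for r, yi in zip(ref, y)]       # tracking error of this trial
    u = [ui + L * ei for ui, ei in zip(u, e)]   # learn from the error

max_error = max(abs(r - b * ui) for r, ui in zip(ref, u))
```

Here the per-sample error contracts by the factor |1 − L·b| = 0.5 on every iteration, so after 30 trials the tracking error is negligible; guaranteeing such contraction without hand-tuning L is exactly the kind of property the article establishes for its quadratic-criterion law.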


2011 ◽  
Vol 121-126 ◽  
pp. 4239-4243 ◽  
Author(s):  
Du Jou Huang ◽  
Yu Ju Chen ◽  
Huang Chu Huang ◽  
Yu An Lin ◽  
Rey Chue Hwang

Chromatic aberration estimation for touch panel (TP) film using neural networks is presented in this paper. Neural networks trained with the error back-propagation (BP) learning algorithm were used to capture the complex relationship between chromatic aberration, i.e., L*a*b* values, and the relevant parameters of the TP decoration film. The aim is to develop an artificial intelligence (AI) estimator, based on a neural model, for estimating this physical property of TP film. The simulation results show that the estimates of the chromatic aberration of TP film are very accurate. In other words, such an AI estimator is promising and has commercial potential.
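The estimator described above is a standard feed-forward network fitted by error back-propagation. A tiny from-scratch sketch of that training scheme on synthetic "process parameter → colour value" data follows; the network size, learning rate and data are invented, and the target here is a simple synthetic function rather than real L*a*b* measurements:

```python
import math, random

# Tiny one-hidden-layer regression network trained with error back-propagation.
# All data and hyperparameters are invented for illustration.

random.seed(0)
n_in, n_hid = 2, 4
w1 = [[random.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hid)]
w2 = [random.uniform(-0.5, 0.5) for _ in range(n_hid)]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(x):
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w1]
    return h, sum(w * hi for w, hi in zip(w2, h))  # linear output unit

def train_step(x, target, lr=0.1):
    h, y = forward(x)
    err = y - target                                 # dE/dy for E = err²/2
    for j in range(n_hid):
        delta_h = err * w2[j] * h[j] * (1 - h[j])    # error propagated back
        for i in range(n_in):
            w1[j][i] -= lr * delta_h * x[i]          # hidden-layer update
        w2[j] -= lr * err * h[j]                     # output-layer update

# Synthetic samples: a smooth function of two "process parameters".
data = [([0.0, 0.0], 0.0), ([0.0, 1.0], 0.5),
        ([1.0, 0.0], 0.3), ([1.0, 1.0], 0.8)]
for _ in range(5000):
    for x, t in data:
        train_step(x, t)

mse = sum((forward(x)[1] - t) ** 2 for x, t in data) / len(data)
```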


Author(s):  
Caroline R. Wiltshire

This study uses data from Indian English as a second language, spoken by speakers of five first languages, to illustrate and evaluate the role of the emergence of the unmarked (TETU) in phonological theory. The analysis focusses on word-final consonant devoicing and cluster reduction, for which the five Indian first languages have various constraints, while Indian English is relatively unrestricted. Variation in L2 Indian Englishes results from both transfer of L1 phonotactics and the emergence of the unmarked, accounted for within Optimality Theory. The use of a learning algorithm also allows us to test the relative importance of markedness and frequency and to evaluate the relative markedness of various clusters. Thus, data from Indian Englishes provides insight into the form and function of markedness constraints, as well as the mechanisms of Second Language Acquisition (SLA).

