probabilistic grammars
Recently Published Documents


TOTAL DOCUMENTS

43
(FIVE YEARS 2)

H-INDEX

10
(FIVE YEARS 0)

2021 ◽  
Vol 224 ◽  
pp. 107077
Author(s):  
Jure Brence ◽  
Ljupčo Todorovski ◽  
Sašo Džeroski

2019 ◽  
Vol 24 (2) ◽  
pp. 413-440 ◽  
Author(s):  
IVÁN TAMAREDO ◽  
MELANIE RÖTHLISBERGER ◽  
JASON GRAFMILLER ◽  
BENEDIKT HELLER

Szmrecsanyi et al. (2016) define probabilistic indigenization as the process whereby probabilistic constraints shape variation patterns in different ways, which eventually leads to more heterogeneity in the constraints governing syntactic variation across different varieties of English. The present study extends our knowledge of the heterogeneity of probabilistic grammars by sketching a corpus-based variationist method for calculating the similarity between varieties, thereby drawing inspiration from the comparative sociolinguistics literature. Based on linguistic material from the International Corpus of English, we ascertain the degree of regional variability of five probabilistic constraints on the genitive, dative, particle placement and subject pronoun omission alternations across three varieties of English, namely British, Indian and Singapore English. Our results indicate that, of the four alternations under study, the genitive alternation is the most homogeneous one from a regional perspective, followed – in increasing order of heterogeneity – by the subject pronoun omission, dative and particle placement alternations. On the basis of these findings, we evaluate claims in the literature according to which the extent of probabilistic indigenization is proportional to the lexical specificity of the syntactic phenomenon under study, a hypothesis that is borne out by our data.


Author(s):  
Aleksei Nazarov

This paper proposes a novel method of inferring diacritics for representing between-word variation (exceptionality) in Optimality Theoretic (OT) grammars (e.g., Pater 2000, 2010) that makes it possible to infer such diacritics in the face of within-word variation. Existing methods of inferring diacritics in OT (Pater 2010, Becker 2009, Coetzee 2009) are based in categorical grammar learning (Tesar 1995), which makes them unable to handle within-word variation. Existing methods of inferring probabilistic OT grammars (e.g., Boersma 1998) handle within-word variation well, but have no provision to distinguish exceptional from non-exceptional words, and are incompatible with the main idea in Pater (2010). I show that this latter idea can be made compatible with probabilistic grammars based on a case study from Hebrew (Temkin-Martínez 2010), so that both within- and between-word variation can be learned.


Author(s):  
Benedikt Szmrecsanyi

The paper surveys overlap between corpus linguistics and variationist sociolinguistics. Corpus linguistics is customarily defined as a methodology that bases claims about language on usage patterns in collections of naturalistic, authentic speech or text. Because this is what is typically done in variationist sociolinguistics work, I argue that variationist sociolinguists are by definition corpus linguists, though of course the reverse is not true: the variationist method entails more than merely analyzing usage data, and not all corpus analysts are interested in variation. But that being said, a considerable and arguably increasing number of corpus linguists not formally trained in variationist sociolinguistics are explicitly concerned with variation and engage in what I call corpus-based variationist linguistics (CVL). I first discuss what unites or divides work in CVL and in variationist sociolinguistics. In a plea to cross subdisciplinary boundaries, I subsequently identify three research areas where variationist sociolinguists may draw inspiration from work in CVL: conducting multi-variable research, paying more attention to probabilistic grammars, and taking more seriously the register-sensitivity of variation patterns.


2015 ◽  
pp. 157-189 ◽  
Author(s):  
Samer Abdallah ◽  
Nicolas Gold ◽  
Alan Marsden
