Text types in Brazilian Portuguese: a multidimensional perspective

Corpora ◽  
2017 ◽  
Vol 12 (3) ◽  
pp. 483-515 ◽  
Author(s):  
Tony Berber Sardinha

This paper presents a new typology of texts for Brazilian Portuguese, based on a thorough description of the linguistic characteristics of 960 texts in a 5.6 million-word corpus (Berber Sardinha et al., 2014). The typology follows the Multi-dimensional framework proposed by Biber (1989), which defines text types as linguistic constructs derived from dimensions of variation, or co-occurring sets of linguistic characteristics that underlie register variation in a particular language or language variety. The text types were identified following a cluster analysis that took as input the dimension scores for each text on each of the six dimensions of variation. The clusters were interpreted as nine text types, each representing a typical textual configuration found in Brazilian Portuguese.
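As a rough illustration of the clustering step described in this abstract, the sketch below groups texts by their scores on six dimensions. The abstract does not name the clustering algorithm, so the use of k-means here is an assumption, and the function name `kmeans`, the toy scores and the choice of two clusters are all hypothetical.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain k-means (an assumed algorithm; the abstract does not specify one)."""
    # Deterministic start: the first k points serve as the initial centroids.
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each text joins the nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical dimension scores: one row per text, one column per dimension.
scores = [
    [2.0, 1.5, 0.1, -0.3, 0.2, 0.0],
    [-1.8, -1.2, 0.0, 0.4, -0.1, 0.1],
    [2.2, 1.4, -0.1, -0.2, 0.3, 0.1],
    [-2.1, -1.0, 0.2, 0.5, 0.0, -0.1],
]
labels, centroids = kmeans(scores, k=2)
```

Each resulting cluster would then be read against its mean dimension scores to propose a text type, which is the interpretive step the abstract refers to.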

Corpora ◽  
2014 ◽  
Vol 9 (2) ◽  
pp. 239-271 ◽  
Author(s):  
Tony Berber Sardinha ◽  
Carlos Kauffmann ◽  
Cristina Mayer Acunzo

In this paper, we present a Multi-Dimensional analysis of Brazilian Portuguese, based on a large, diverse corpus comprising forty-eight different spoken and written registers. Previous research in MD analysis includes multi-register investigations of a range of languages, including English, Spanish, Somali and Korean, among others. At the same time, a large body of literature on text varieties in Brazilian Portuguese exists, but previous research focusses on specific aspects of one, or at the most, a few varieties at a time and, therefore, does not present a comprehensive picture of register use in the linguistic community of Brazilian Portuguese speakers. In this study, we attempt to fill this gap by employing the MD framework, enabling researchers to account for a large number of different registers, based on a wide repertory of linguistic features. The analysis revealed six dimensions of variation, which are presented, illustrated and discussed here.


2021 ◽  
Author(s):  
Peter Collins ◽  
Minna Korhonen ◽  
Haidee Kotze ◽  
Adam Smith ◽  
Xinyue Yao

Abstract A number of studies have found that grammatical differences across registers are more extensive than those across dialects. However, there is a paucity of research examining intervarietal register change, exploring how registers change differently over time in different regional varieties. The present study addresses this diachronic deficit, focusing on grammatical developments – from the early 20th to the early 21st century – in corpora representing three written registers and two speech-based registers in Australian, British and American English. We conducted a factor analysis on 68 lexicogrammatical features to identify six dimensions of register variation, and subsequently investigated the diachronic change of the five registers across these dimensions. We interpret our findings in terms of the differential effects of broad social changes on individual registers, in light of existing findings on trends of change in different registers and varieties.


2017 ◽  
Vol 6 (2) ◽  
pp. 319 ◽  
Author(s):  
Angvarrah Lieungnapar ◽  
Richard Watson Todd ◽  
Wannapa Trakulkasemsuk

In most current work on genre, a set of genre categories needs to be predetermined. However, there are some cases where such predetermined genres cannot be clearly identified. Popular science, for instance, is a broad register carrying several specific purposes within it, suggesting that there are several genres of popular science, but it is unclear what these genres are. This paper introduces a linguistic approach to reveal hidden genres. For 600 written popular science texts from a variety of sources and disciplines, linguistic features were analysed using a range of computer programs, and a cluster analysis was conducted. The analysis produced four clusters with shared linguistic features, representing text types. The association of these text types with key features, functional relations, dominant sources, and prototypical members of each cluster helps us to induce genres on the basis of communicative purposes, a traditional criterion in identifying genres. Whether the produced text types are equivalent to genres was evaluated with a test set of data. The proposed approach achieves more than 70% accuracy. The approach appears applicable for identifying genres of popular science and has pedagogical implications.


2017 ◽  
Vol 22 (2) ◽  
pp. 153-186 ◽  
Author(s):  
Paul Thompson ◽  
Susan Hunston ◽  
Akira Murakami ◽  
Dominik Vajn

Abstract Multi-Dimensional Analysis (MDA) has been widely used to explore register variation. This paper reports on a project using MDA to explore the features of an interdisciplinary academic domain. Six dimensions of variation are identified in a corpus of 11,000 journal articles in environmental studies. We then focus on articles in one interdisciplinary journal, Global Environmental Change (GEC). It is expected that these articles will diverge sufficiently to produce differences that are analogous to register differences. Instead of identifying these “registers” on external criteria, we use the dimensional profiles of individual texts to identify ‘constellations’ of texts sharing combinations of features. Six such constellations are derived, consisting of texts with commonalities in their approaches to research: the development of predictive models; quantitative research; discussions of theory and policy; and human-environment studies focusing on individual voices. The identification of these constellations could not have been achieved through an a priori categorisation of texts.


Author(s):  
Tony Berber Sardinha ◽  
Marcia Veirano Pinto

Abstract This paper presents the first entirely linguistic typology of contemporary American television, derived from a multi-dimensional (MD) analysis of the USTV corpus. The USTV corpus comprises 930 texts from 191 different TV programs, classified into 31 different registers (including nine telecinematic ones: drama series, miniseries, movies, sitcoms, soap operas, general animation, children’s animation, short-feature animation, and children’s and teens’ shows). The linguistic typology we present in this study is based on the linguistic characteristics present in the individual programs, with no a priori textual categorizations. A cluster analysis grouped the individual programs into clusters that shared similar dimensional profiles. The resulting typology comprises nine different text types – namely Presentation of information, Opinion and discussion, Analysis and debate, Description, Interactive recount, Engaging demonstration, Playful discourse, Simplified interaction, and Simulated conversation. The paper discusses and illustrates each text type and considers how telecinematic discourse relates to each of them.


1995 ◽  
Vol 3 (2) ◽  
pp. 139-150 ◽  
Author(s):  
Janis Wiley Driscoll

Abstract A questionnaire was used to assess people's attitudes toward 33 species of animals on six dimensions (useful-useless, smart-stupid, responsive-unresponsive, lovable-unlovable, safe-dangerous, and important-unimportant). A cluster analysis resulted in five groups of animals with similar ratings on these dimensions. Respondents were also asked about their attitudes toward hunting, fishing, and medical, scientific and product-testing research using animals.


2015 ◽  
Vol 13 (3) ◽  
pp. 266-291 ◽  
Author(s):  
Łukasz Grabowski

Focusing on the exploration of intra-disciplinary register variation in the pharmaceutical domain, this corpus-driven study attempts to describe the use, composition and discourse functions of phrase frames, that is, contiguous sequences of words identical except for one (Fletcher, 2002-2007), found in samples of four English pharmaceutical text types, namely patient information leaflets, summaries of product characteristics, clinical trial protocols and chapters/sections from academic textbooks on pharmacology. The study deals with a specific sub-type of phrase frames, that is, 4-word units with a variable slot in the medial position, e.g. be * with caution, to take * medicine. The results showed, among other things, that the use and discourse functions of phrase frames vary across pharmaceutical text types, that the correlation between the frequency of phrase frames and their pattern variability may depend on a register or genre, and that it is justified to treat the discourse functions of phrase frames as distinct from those of their textual variants.
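The definition of a phrase frame given in this abstract can be made concrete with a small sketch. It collects 4-word sequences that differ only in one medial word and records the words that fill the slot. The function names `phrase_frames` and `productive_frames`, the `min_fillers` threshold and the toy sentence are hypothetical, and the study's own extraction tools and cut-offs are not described here.

```python
from collections import defaultdict


def phrase_frames(tokens, n=4):
    """Map each frame (an n-gram with one medial slot) to its observed fillers."""
    frames = defaultdict(set)
    for i in range(len(tokens) - n + 1):
        gram = list(tokens[i:i + n])
        # Medial slots only: the first and last words stay fixed.
        for slot in range(1, n - 1):
            frame = tuple(gram[:slot] + ["*"] + gram[slot + 1:])
            frames[frame].add(gram[slot])
    return frames


def productive_frames(tokens, n=4, min_fillers=2):
    """Keep frames whose slot is filled by at least min_fillers distinct words."""
    return {
        frame: sorted(fillers)
        for frame, fillers in phrase_frames(tokens, n).items()
        if len(fillers) >= min_fillers
    }


# Toy input, invented for illustration only.
tokens = "to be used with caution and to be taken with caution".split()
result = productive_frames(tokens)
```

On this toy input, `result` contains the frame `be * with caution` with the fillers `taken` and `used`, mirroring the example in the abstract.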


2018 ◽  
Vol 23 (2) ◽  
pp. 125-157 ◽  
Author(s):  
Tony Berber Sardinha

Abstract This paper presents a study that sought to identify the dimensions of variation underlying a corpus of Internet texts, using Biber’s (1988) multi-dimensional (MD) analysis framework. The corpus was compiled following the method proposed by Biber (1993), according to which the size of each register subcorpus should be determined based on the linguistic variation across the texts. The corpus was tagged using the Biber Tagger and the features were counted and submitted to a factor analysis, which suggested three factors. The factors were interpreted as three dimensions of variation: involved, interactive discourse versus informational focus; expression of stance: interactional evidentiality; and expression of stance: interactional affect. The amount of register variation captured by the register distinctions on the dimensions ranged from 8.7% to 57.1%. Dimension 1 corroborates the oral/involved versus literate/informational distinction defined in previous MD studies of non-Internet registers, whereas Dimensions 2 and 3 highlight the important role played by stance in social media.

