From student hard drive to web corpus (part 1): the design, compilation and genre classification of the Michigan Corpus of Upper-level Student Papers (MICUSP)

Corpora ◽  
2011 ◽  
Vol 6 (2) ◽  
pp. 159-177 ◽  
Author(s):  
Ute Römer ◽  
Matthew Brook O'Donnell

In this paper, we provide a detailed account of the steps that were central to designing and compiling the Michigan Corpus of Upper-level Student Papers (MICUSP). MICUSP is a new collection of 829 papers (around 2.6 million words) written by University of Michigan students in their final undergraduate year or in their first three years of graduate education. The papers come from sixteen disciplines, ranging from Humanities and Arts to Physical Sciences, and represent a range of different text types. In this paper, we offer an overview of the design of MICUSP, the online submission process used to collect papers, and the text-type classification of the papers.

2006 ◽  
Vol 1 (9) ◽  
pp. 601
Author(s):  
Tonia J. Buchholz ◽  
Bruce Palfey ◽  
Anna K. Mapp ◽  
Gary D. Glick

2013 ◽  
Vol 51 (6) ◽  
pp. 3328-3335 ◽  
Author(s):  
Scott Havens ◽  
Hans-Peter Marshall ◽  
Christine Pielmeier ◽  
Kelly Elder

2019 ◽  
Author(s):  
Marion Poupard ◽  
Paul Best ◽  
Jan Schlüter ◽  
Helena Symonds ◽  
Paul Spong ◽  
...  

Killer whales (Orcinus orca) can produce 3 types of signals: clicks, whistles and vocalizations. This study focuses on Orca vocalizations from northern Vancouver Island (Hanson Island), where the NGO Orcalab developed a multi-hydrophone recording station to study Orcas. The acoustic station is composed of 5 hydrophones and extends over 50 km² of ocean. Since 2015, we have been continuously streaming the hydrophone signals to our laboratory in Toulon, France, yielding nearly 50 TB of synchronous multichannel recordings. In previous work, we trained a Convolutional Neural Network (CNN) to detect Orca vocalizations, using transfer learning from a bird activity dataset. Here, for each detected vocalization, we estimate the pitch contour (fundamental frequency). Finally, we cluster vocalizations by features describing the pitch contour. While preliminary, our results demonstrate a possible route towards automatic Orca call type classification. Furthermore, they can be linked to the presence of particular Orca pods in the area according to the classification of their call types. A large-scale call type classification would allow new insights into the phonotactics and ethoacoustics of endangered Orca populations in the face of increasing anthropic pressure.


2019 ◽  
Vol 18 (2) ◽  
pp. 66-72
Author(s):  
Abhijit Bhowmik ◽  
AZM Ehtesham Chowdhury

The need for autonomous indexing tools that provide expressive and efficient means of describing musical media content is well recognized. Music genre classification systems are important for managing and using music databases. This research paper proposes an enhanced method for automatically classifying music into different genres using a machine learning approach, and presents the insights and results from applying the proposed scheme to a large set of Bangla music content. Bangla is a South-East Asian language rich with a variety of music genres developed over many centuries. Building upon musical feature extraction and decision-making techniques, we propose new features and procedures to achieve enhanced accuracy. We demonstrate the efficacy of the proposed method by extracting features from a dataset of hundreds of Bangla music pieces and testing the automatic classification decisions. To the best of our knowledge, this is the first development of an automated classification technique applied specifically to Bangla music, while the superior accuracy of the method makes it universally applicable.


Author(s):  
A. T. Anisimova

The article introduces the computer game as an emerging field in translation studies. The development and expansion of the global interactive entertainment industry demands proficient, high-quality video game translation, as the international market for video products is dominated by American and Japanese producers. The author discusses the issues of video game translation within the conceptual field of localization, as a video game is not only an audiovisual product but also a software product. The concept of translation and translator’s competence is about to leave the traditional equivalency paradigm and needs the application of other dimensions. The article discusses the genre classification of video games, the characteristics and difficulties of RPG translation, and the translation of various simulators. The author analyses the most popular translation strategies used by modern translators of multimedia products: foreignization – keeping a “foreign flavor” of the text; domestication – adapting texts to the particular features and standards of the target culture; no translation strategy – leaving the original titles, names and cultural references untranslated. The dominant translation strategy influences the localization strategy and others.


Author(s):  
Anne Scott Sørensen

In this paper, I will document the use of Facebook in a Danish context, taking a mediatisation perspective focused on the network sociality in question (Jensen, 2009; Tække, 2010a/b) and the communication (Miller, 2008) of social media. This discussion is based on a qualitative study from 2010, consisting of participants recruited from a survey study. The study explores three dilemmas resulting from network media’s communicative paradox, involving the premises of self-representation, use of status updates, and social regulation. These dilemmas are contextualised by recent theories of genre and speech-acts (Miller, 2004; Butler, 2005) as well as by existing studies of related issues, such as the composition of personal networks (friend lists) and the degree to which personal profiles are open and accessible (privacy). While the study generally confirms recent research in these fields, such research has not previously been documented (or refined) in a Danish context. The paper’s most important contributions, however, consist of its identification of the three communicative dilemmas, its tentative genre classification of the status update, and its discussion of implicit social regulation and ethics, which have not previously been considered.

