Actors (Automated Content Analysis)

Author(s):  
Valerie Hase

Actors in coverage might be individuals, groups, or organizations that are discussed, described, or quoted in the news.

The datasets referred to in the table are as follows. Benoit and Matsuo (2020) use fictional sentences (N = 5) to demonstrate how named entities and noun phrases can be identified automatically. Lind and Meltzer (2020) demonstrate the use of organic dictionaries to identify actors in German newspaper articles (2013-2017, N = 348,785). Puschmann (2019) uses four data sets, namely tweets (2016, N = 18,826), German newspaper articles (2011-2016, N = 377), Swiss newspaper articles (2007-2012, N = 21,280), and debate transcripts (1970-2017, N = 7,897), to extract nouns and named entities from text. Lastly, Wiedemann and Niekler (2017) extract proper nouns from State of the Union speeches (1790-2017, N = 233).

Field of application/theoretical foundation: Related to theories of “Agenda Setting” and “Framing”, analyses might ask how much weight is given to a specific actor, how these actors are evaluated, and which perspectives and frames they bring into the discussion, and how prominently.

References/combination with other methods of data collection: Studies often use both manual and automated content analysis to identify actors in text. Combining the two can help to extend the lists of actors that can be found and to validate automated analyses. For example, Lind and Meltzer (2020) combine manual coding and dictionaries to identify the salience of women in the news.

Table 1. Measurement of “Actors” using automated content analysis.

Author(s) | Sample | Procedure | Formal validity check with manual coding as benchmark* | Code
Benoit & Matsuo (2020) | Fictional sentences | Part-of-speech tagging; syntactic parsing | Not reported | https://cran.r-project.org/web/packages/spacyr/vignettes/using_spacyr.html
Lind & Meltzer (2020) | Newspapers | Dictionary approach | Reported | https://osf.io/yqbcj/?view_only=369e2004172b43bb91a39b536970e50b
Puschmann (2019) | (a) Tweets (b) German newspaper articles (c) Swiss newspaper articles (d) United Nations General Debate transcripts | Part-of-speech tagging; syntactic parsing | Not reported | http://inhaltsanalyse-mit-r.de/ner.html
Wiedemann & Niekler (2017) | State of the Union speeches | Part-of-speech tagging | Not reported | https://tm4ss.github.io/docs/Tutorial_8_NER_POS.html

*Please note that many of the sources listed here are tutorials on how to conduct automated analyses and therefore do not focus on validating results. Readers should read this column simply as an indication of which sources to consult if they are interested in the validation of results.

References

Benoit, K., & Matsuo, A. (2020). A Guide to Using spacyr. Retrieved from https://cran.r-project.org/web/packages/spacyr/vignettes/using_spacyr.html
Lind, F., & Meltzer, C. E. (2020). Now you see me, now you don’t: Applying automated content analysis to track migrant women’s salience in German news. Feminist Media Studies, 1–18.
Puschmann, C. (2019). Automatisierte Inhaltsanalyse mit R. Retrieved from http://inhaltsanalyse-mit-r.de/index.html
Wiedemann, G., & Niekler, A. (2017). Hands-on: A five day text mining course for humanists and social scientists in R. Proceedings of the 1st Workshop Teaching NLP for Digital Humanities (Teach4DH@GSCL 2017), Berlin. Retrieved from https://tm4ss.github.io/docs/index.html
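The dictionary approach listed in the table can be illustrated with a minimal Python sketch. It is not the code of Lind and Meltzer (2020) or any other cited source: the dictionary, its categories, and the function names `count_actors` and `actor_salience` are hypothetical and chosen only for illustration. The sketch counts how often terms from each actor category occur in a text and expresses that count as a share of all tokens.

```python
import re
from collections import Counter

# Hypothetical actor dictionary (category -> search terms).
# These terms are illustrative assumptions, not taken from the cited studies.
ACTOR_DICTIONARY = {
    "government": ["government", "minister", "chancellor"],
    "police": ["police", "officer"],
}


def tokenize(text):
    """Lowercase a text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def count_actors(text, dictionary=ACTOR_DICTIONARY):
    """Count how often each actor category is mentioned in a text."""
    token_counts = Counter(tokenize(text))
    return {
        actor: sum(token_counts[term] for term in terms)
        for actor, terms in dictionary.items()
    }


def actor_salience(text, dictionary=ACTOR_DICTIONARY):
    """Return each category's mentions as a share of all tokens in the text."""
    n_tokens = len(tokenize(text))
    counts = count_actors(text, dictionary)
    return {
        actor: (count / n_tokens if n_tokens else 0.0)
        for actor, count in counts.items()
    }


example = "The minister said the police would act. Police declined to comment."
counts = count_actors(example)
salience = actor_salience(example)
print(counts)
print(salience)
```

A real dictionary would need multi-word expressions, inflected forms, and validation against manually coded texts, none of which this sketch covers.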

Author(s):  
Valerie Hase

Sentiment/tone captures how issues or specific actors are described in coverage. Many analyses differentiate between negative, neutral/balanced, and positive sentiment/tone as broad categories, but analyses might also measure more granular types of sentiment/tone, such as expressions of incivility, fear, or happiness. Analyses can detect sentiment/tone in full texts (e.g., general sentiment in financial news) or concerning specific issues or actors (e.g., sentiment towards the stock market in financial news).

The datasets referred to in the table are as follows. Puschmann (2019) uses four data sets to demonstrate how sentiment/tone may be analyzed by the computer. Using Sherlock Holmes stories (19th century, N = 12), tweets (2016, N = 18,826), Swiss newspaper articles (2007-2012, N = 21,280), and debate transcripts (2013-2017, N = 205,584), he illustrates how dictionaries may be applied to this task. Rauh (2018) uses three data sets to validate his organic German-language dictionary for sentiment/tone. His data consist of sentences from German parliament speeches (1991-2013, N = 1,500), German-language quasi-sentences from German, Austrian, and Swiss party manifestos (1998-2013, N = 14,008), and newspaper, journal, and news wire articles (2011-2012, N = 4,038). Silge and Robinson (2020) use six Jane Austen novels to demonstrate how dictionaries may be used for sentiment analysis. Van Atteveldt and Welbers (2020) use State of the Union speeches (1789-2017, N = 58) for the same purpose. The same authors (van Atteveldt & Welbers, 2019) show, based on a dataset of N = 2,000 movie reviews, how supervised machine learning can be used for sentiment analysis as well. In their Quanteda tutorials, Watanabe and Müller (2019) demonstrate the use of dictionaries and supervised machine learning for sentiment analysis on UK newspaper articles (2012-2016, N = 6,000) as well as the same set of movie reviews (N = 2,000). Lastly, Wiedemann and Niekler (2017) use State of the Union speeches (1790-2017, N = 233) to demonstrate how sentiment/tone can be coded automatically via a dictionary approach.

Field of application/theoretical foundation: Related to theories of “Framing” and “Bias” in coverage, many analyses are concerned with the way the news evaluates and interprets specific issues and actors.

References/combination with other methods of data collection: Manual coding is needed for many automated analyses, including those concerned with sentiment. For example, studies use manual content analysis to develop dictionaries, to create training sets on which algorithms used for automated classification are trained, or to validate the results of automated analyses (Song et al., 2020).

Table 1. Measurement of “Sentiment/Tone” using automated content analysis.

Author(s) | Sample | Procedure | Formal validity check with manual coding as benchmark* | Code
Puschmann (2019) | (a) Sherlock Holmes stories (b) Tweets (c) Swiss newspaper articles (d) German Parliament transcripts | Dictionary approach | Not reported | http://inhaltsanalyse-mit-r.de/sentiment.html
Rauh (2018) | (a) Bundestag speeches (b) Quasi-sentences from German, Austrian and Swiss party manifestos (c) Newspapers, journals, agency reports | Dictionary approach | Reported | https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/BKBXWD
Silge & Robinson (2020) | Books by Jane Austen | Dictionary approach | Not reported | https://www.tidytextmining.com/sentiment.html
van Atteveldt & Welbers (2020) | State of the Union speeches | Dictionary approach | Reported | https://github.com/ccs-amsterdam/r-course-material/blob/master/tutorials/sentiment_analysis.md
van Atteveldt & Welbers (2019) | Movie reviews | Supervised machine learning approach | Reported | https://github.com/ccs-amsterdam/r-course-material/blob/master/tutorials/r_text_ml.md
Watanabe & Müller (2019) | Newspaper articles | Dictionary approach | Not reported | https://tutorials.quanteda.io/advanced-operations/targeted-dictionary-analysis/
Watanabe & Müller (2019) | Movie reviews | Supervised machine learning approach | Reported | https://tutorials.quanteda.io/machine-learning/nb/
Wiedemann & Niekler (2017) | State of the Union speeches | Dictionary approach | Not reported | https://tm4ss.github.io/docs/Tutorial_3_Frequency.html

*Please note that many of the sources listed here are tutorials on how to conduct automated analyses and therefore do not focus on validating results. Readers should read this column simply as an indication of which sources to consult if they are interested in the validation of results.

References

Puschmann, C. (2019). Automatisierte Inhaltsanalyse mit R. Retrieved from http://inhaltsanalyse-mit-r.de/index.html
Rauh, C. (2018). Validating a sentiment dictionary for German political language – A workbench note. Journal of Information Technology & Politics, 15(4), 319–343. doi:10.1080/19331681.2018.1485608
Silge, J., & Robinson, D. (2020). Text mining with R. A tidy approach. Retrieved from https://www.tidytextmining.com/
Song, H., Tolochko, P., Eberl, J.-M., Eisele, O., Greussing, E., Heidenreich, T., Lind, F., Galyga, S., & Boomgaarden, H. G. (2020). In validations we trust? The impact of imperfect human annotations as a gold standard on the quality of validation of automated content analysis. Political Communication, 37(4), 550–572.
van Atteveldt, W., & Welbers, K. (2019). Supervised Text Classification. Retrieved from https://github.com/ccs-amsterdam/r-course-material/blob/master/tutorials/r_text_ml.md
van Atteveldt, W., & Welbers, K. (2020). Supervised Sentiment Analysis in R. Retrieved from https://github.com/ccs-amsterdam/r-course-material/blob/master/tutorials/sentiment_analysis.md
Watanabe, K., & Müller, S. (2019). Quanteda tutorials. Retrieved from https://tutorials.quanteda.io/
Wiedemann, G., & Niekler, A. (2017). Hands-on: A five day text mining course for humanists and social scientists in R. Proceedings of the 1st Workshop Teaching NLP for Digital Humanities (Teach4DH@GSCL 2017), Berlin. Retrieved from https://tm4ss.github.io/docs/index.html
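To make the dictionary approach and the validity check against manual coding more concrete, the following Python sketch scores texts with a tiny sentiment dictionary and compares the automated labels with manual codes. Everything in it is an assumption made for illustration: the word lists, the example texts, the manual codes, and the function names do not come from any of the sources above, and real dictionaries contain thousands of entries.

```python
import re

# Hypothetical mini-dictionary, for illustration only.
POSITIVE = {"good", "great", "gain", "success"}
NEGATIVE = {"bad", "crisis", "loss", "failure"}


def sentiment_score(text):
    """Return (positive hits - negative hits) divided by the number of tokens."""
    tokens = re.findall(r"[a-z]+", text.lower())
    if not tokens:
        return 0.0
    pos = sum(token in POSITIVE for token in tokens)
    neg = sum(token in NEGATIVE for token in tokens)
    return (pos - neg) / len(tokens)


def classify(text):
    """Map the score to a negative, neutral, or positive label."""
    score = sentiment_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def precision_recall(predicted, manual, label):
    """Precision and recall for one label, with manual codes as the benchmark."""
    pairs = list(zip(predicted, manual))
    tp = sum(p == label and m == label for p, m in pairs)
    fp = sum(p == label and m != label for p, m in pairs)
    fn = sum(p != label and m == label for p, m in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


texts = [
    "The merger was a great success",
    "Markets fear a crisis and a loss",
    "The committee met on Tuesday",
    "Not a good year for the company",  # negation: the dictionary misreads this
]
manual = ["positive", "negative", "neutral", "negative"]  # hypothetical manual codes
predicted = [classify(text) for text in texts]

precision_neg, recall_neg = precision_recall(predicted, manual, "negative")
print(predicted)
print(precision_neg, recall_neg)
```

The fourth text shows a typical weakness of plain word counting: negation flips the meaning, the dictionary still counts "good" as positive, and recall for the negative class drops accordingly.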


Author(s):  
Valerie Hase

Frames describe the way issues are presented, i.e., which aspects are made salient when communicating about these issues.

Field of application/theoretical foundation: The concept of frames is directly based on the theory of “Framing”. However, many studies using automated content analysis lack a clear theoretical definition of what constitutes a frame. As an exception, Walter and Ophir (2019) use automated content analysis to explore issue and strategy frames as defined by Cappella and Jamieson (1997). Vu and Lynn (2020) refer to Entman’s (1991) understanding of frames.

The datasets referred to in the table are as follows. Van der Meer et al. (2019) use a dataset consisting of Dutch newspaper articles (1991-2015, N = 9,443) and LDA topic modeling in combination with k-means clustering to identify frames. Walter and Ophir (2019) use three different datasets and a combination of topic modeling, network analysis, and community detection algorithms to analyze frames. Their datasets consist of political newspaper articles and wire service coverage (N = 8,337), newspaper articles on foreign nations (2010-2015, N = 18,216), and health-related newspaper coverage (2009-2016, N = 5,005). Lastly, Vu and Lynn (2020) analyze frames in newspaper coverage of the Rohingya crisis (2017-2018, N = 747).

References/combination with other methods of data collection: While most approaches rely only on automated data collection and analyses, some combine automated and manual coding. For example, Vu and Lynn (2020) propose combining semantic networks and manual coding to identify frames.

Table 1. Measurement of “Frames” using automated content analysis.

Author(s) | Sample | Procedure | Formal validity check with manual coding as benchmark* | Code
Vu & Lynn (2020) | Newspaper articles | Semantic networks; manual coding | Reported | Not available
van der Meer et al. (2019) | Newspaper articles | LDA topic modeling; k-means clustering | Not reported | Not available
Walter & Ophir (2019) | (a) U.S. newspapers and wire service articles (b) Newspaper articles (c) Newspaper articles | LDA topic modeling; network analysis; community detection algorithms | Not reported | https://github.com/DrorWalt/ANTMN

*Please note that many of the sources listed here are tutorials on how to conduct automated analyses and therefore do not focus on validating results. Readers should read this column simply as an indication of which sources to consult if they are interested in the validation of results.

References

Cappella, J. N., & Jamieson, K. H. (1997). Spiral of cynicism: The press and the public good. New York: Oxford University Press.
Entman, R. M. (1991). Framing U.S. coverage of international news: Contrasts in narratives of the KAL and Iran Air incidents. Journal of Communication, 41(4), 6–27.
van der Meer, T. G. L. A., Kroon, A. C., Verhoeven, P., & Jonkman, J. (2019). Mediatization and the disproportionate attention to negative news: The case of airplane crashes. Journalism Studies, 20(6), 783–803.
Vu, H. T., & Lynn, N. (2020). When the news takes sides: Automated framing analysis of news coverage of the Rohingya crisis by the elite press from three countries. Journalism Studies. Online first publication. doi:10.1080/1461670X.2020.1745665
Walter, D., & Ophir, Y. (2019). News frame analysis: An inductive mixed-method computational approach. Communication Methods and Measures, 13(4), 248–266.
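The clustering step in the "LDA topic modeling; k-means clustering" procedure can be sketched in plain Python. This is not the code of van der Meer et al. (2019), for which no code is available. The document-topic proportions below are invented numbers standing in for the output of a fitted LDA model, and `k_means` is a hypothetical helper with fixed starting centroids so that the result is reproducible. The idea is that documents with similar topic mixtures end up in the same cluster, and each cluster is then interpreted as a candidate frame.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def k_means(points, initial_centroids, max_iter=100):
    """Plain k-means: assign points to the nearest centroid, then update centroids."""
    centroids = [list(c) for c in initial_centroids]
    assignments = None
    for _ in range(max_iter):
        new_assignments = [
            min(range(len(centroids)),
                key=lambda j: squared_distance(point, centroids[j]))
            for point in points
        ]
        if new_assignments == assignments:
            break  # assignments are stable, so the algorithm has converged
        assignments = new_assignments
        for j in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = mean_vector(members)
    return assignments, centroids


# Invented document-topic proportions (rows: documents, columns: topics).
# In practice these would come from a fitted LDA model.
doc_topics = [
    [0.8, 0.1, 0.1],
    [0.7, 0.2, 0.1],
    [0.1, 0.1, 0.8],
    [0.2, 0.1, 0.7],
]

assignments, centroids = k_means(
    doc_topics, initial_centroids=[doc_topics[0], doc_topics[2]]
)
print(assignments)
print(centroids)
```

In applied work, the number of clusters and the starting centroids are themselves analytic choices that need justification, and the resulting clusters still require qualitative interpretation before they can be called frames.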


Author(s):  
Nindian Puspa Dewi
Ubaidi Ubaidi

POS tagging is the foundation for developing text processing for a language. In this study, we examine how the use of a lexicon and morphological changes in words affect the choice of the correct tag for a word. Rules based on word morphology, such as prefixes, suffixes, and infixes, are commonly called lexical rules. This study applies lexical rules learned with the Brill Tagger algorithm. Madurese is a regional language spoken on Madura Island and several other islands in East Java. The object of this study is Madurese, which has far more affixation variants than Indonesian. In this study, the lexicon is used not only to look up Madurese root words but also as one stage of POS tagging. Experiments using the lexicon reached an accuracy of 86.61%, whereas accuracy without the lexicon was only 28.95%. This indicates that the lexicon strongly influences POS tagging.
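The interplay between lexicon lookup and affix-based lexical rules can be sketched as follows. This is a simplified illustration, not the study's implementation: a Brill tagger learns its rules from annotated data, whereas the rules here are written by hand, and the lexicon, rules, and test sentence are hypothetical English toy examples rather than Madurese data. The sketch tags each word by lexicon lookup first, falls back to affix rules, and compares accuracy with and without the lexicon.

```python
# Toy lexicon and hand-written affix rules (hypothetical; English is used
# only for readability and does not represent the study's Madurese data).
LEXICON = {"the": "DET", "dog": "NOUN", "run": "VERB", "quick": "ADJ"}

# Each rule is (affix position, affix, tag); rules are tried in order.
LEXICAL_RULES = [
    ("suffix", "ly", "ADV"),
    ("suffix", "ing", "VERB"),
    ("prefix", "un", "ADJ"),
]
DEFAULT_TAG = "NOUN"  # assigned when neither the lexicon nor a rule applies


def tag_word(word, use_lexicon=True):
    """Tag a word by lexicon lookup, then affix rules, then the default tag."""
    w = word.lower()
    if use_lexicon and w in LEXICON:
        return LEXICON[w]
    for position, affix, tag in LEXICAL_RULES:
        if position == "suffix" and w.endswith(affix):
            return tag
        if position == "prefix" and w.startswith(affix):
            return tag
    return DEFAULT_TAG


def accuracy(words, gold_tags, use_lexicon):
    """Share of words whose predicted tag matches the gold-standard tag."""
    predicted = [tag_word(w, use_lexicon) for w in words]
    correct = sum(p == g for p, g in zip(predicted, gold_tags))
    return correct / len(gold_tags)


words = ["the", "quick", "dog", "run", "quickly"]
gold_tags = ["DET", "ADJ", "NOUN", "VERB", "ADV"]

with_lexicon = accuracy(words, gold_tags, use_lexicon=True)
without_lexicon = accuracy(words, gold_tags, use_lexicon=False)
print(with_lexicon, without_lexicon)
```

On this toy input the gap between the two settings only mirrors the direction of the reported result; the figures are artifacts of the invented example and say nothing about Madurese.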


2021
Vol 184
pp. 148-155
Author(s):
Abdul Munem Nerabie
Manar AlKhatib
Sujith Samuel Mathew
May El Barachi
Farhad Oroumchian
