NASca and NASes: Two Monolingual Pre-Trained Models for Abstractive Summarization in Catalan and Spanish

2021 ◽  
Vol 11 (21) ◽  
pp. 9872
Author(s):  
Vicent Ahuir ◽  
Lluís-F. Hurtado ◽  
José Ángel González ◽  
Encarna Segarra

Most of the models proposed in the literature for abstractive summarization are suitable for English but not for other languages. Multilingual models were introduced to address that language constraint, but although they are more broadly applicable than monolingual models, their performance is typically lower, especially for minority languages like Catalan. In this paper, we present a monolingual model for abstractive summarization of textual content in the Catalan language. The model is a Transformer encoder-decoder which is pretrained and fine-tuned specifically for the Catalan language using a corpus of newspaper articles. In the pretraining phase, we introduced several self-supervised tasks to specialize the model on the summarization task and to increase the abstractivity of the generated summaries. To study the performance of our proposal in languages with higher resources than Catalan, we replicate the model and the experimentation for the Spanish language. The usual evaluation metrics, not only the widely used ROUGE measure but also more semantic ones such as BERTScore, do not allow the abstractivity of the generated summaries to be evaluated correctly. In this work, we also present a new metric, called content reordering, to evaluate one of the most common characteristics of abstractive summaries, the rearrangement of the original content. We carried out exhaustive experiments to compare the performance of the monolingual models proposed in this work with two of the most widely used multilingual models in text summarization, mBART and mT5. The experimental results support the quality of our monolingual models, especially considering that the multilingual models were pretrained with many more resources than those used in our models. Likewise, the results show that the pretraining tasks helped to increase the degree of abstractivity of the generated summaries. To our knowledge, this is the first work that explores a monolingual approach for abstractive summarization both in Catalan and Spanish.

Information ◽  
2020 ◽  
Vol 11 (2) ◽  
pp. 78 ◽  
Author(s):  
Tulu Tilahun Hailu ◽  
Junqing Yu ◽  
Tessfu Geteye Fantaye

Text summarization is the process of producing a concise version of a text (a summary) from one or more information sources. If the generated summary preserves the meaning of the original text, it helps users make fast and effective decisions. However, how much of the source text's meaning is preserved is becoming harder to evaluate. The most commonly used automatic evaluation metrics, such as Recall-Oriented Understudy for Gisting Evaluation (ROUGE), rely strictly on overlapping n-gram units between reference and candidate summaries, which makes them unsuitable for measuring the quality of abstractive summaries. Another major challenge in evaluating text summarization systems is the lack of consistent ideal reference summaries. Studies show that human summarizers can produce variable reference summaries of the same source, which can significantly affect the automatic evaluation scores of summarization systems. Humans are biased by circumstances when producing a summary; even the same person may produce substantially different summaries of the same source at different times. This paper proposes a word-embedding-based automatic text summarization and evaluation framework, which can determine the salient top-n sentences of a source text to serve as a reference summary and evaluate the quality of system summaries against it. Extensive experimental results demonstrate that the proposed framework is effective and outperforms several baseline methods with regard to both text summarization systems and automatic evaluation metrics when tested on a publicly available dataset.
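The n-gram overlap criticized in this abstract can be illustrated with a minimal sketch of ROUGE-N recall. This is not the framework the paper proposes, nor the official ROUGE toolkit: the function names `ngrams` and `rouge_n_recall`, the whitespace tokenization, and the example sentences are all assumptions made for illustration.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count every contiguous n-gram in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(reference, candidate, n=1):
    """Fraction of reference n-grams that also occur in the candidate.

    Simplified: lowercased whitespace tokens, no stemming or stopword
    handling, and a single reference summary.
    """
    ref = ngrams(reference.lower().split(), n)
    cand = ngrams(candidate.lower().split(), n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    # Clipped overlap: each reference n-gram is matched at most as often
    # as it appears in the candidate.
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total


# Hypothetical example sentences (not taken from the paper).
reference = "the cat sat on the mat"
extractive = "the cat sat on the mat"
abstractive = "a feline rested on a rug"

print(rouge_n_recall(reference, extractive))        # exact copy of the reference
print(rouge_n_recall(reference, abstractive))       # paraphrase, unigrams
print(rouge_n_recall(reference, abstractive, n=2))  # paraphrase, bigrams
```

In this toy case the paraphrase shares only the word "on" with the reference, so it scores far lower than the verbatim copy even though it conveys roughly the same meaning. That is the limitation for abstractive summaries that the abstract describes.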


Author(s):  
Dolores Tierney

Guillermo del Toro (b. 1964) is an Oscar-winning Mexican director, screenwriter, producer, novelist, film scholar, curator, and nonfiction writer who works internationally on English-language and Spanish-language projects in Mexico, New Zealand, Spain, and the United States and across a number of different media, including film, television, animation, and novels. Although he has worked in multiple genres, including horror (Mimic (1997), Blade II (2002), Crimson Peak (2015)), action/fantasy (Hellboy (2004), Hellboy II: The Golden Army (2008)), science fiction (Pacific Rim (2013)), and hybrids of these and other genres (The Shape of Water (2017)), he is best known for the gothic sensibility of many of his projects (Cronos (1993), The Devil’s Backbone (2001), Pan’s Labyrinth (2006), Crimson Peak (2015)). Relatedly, Del Toro’s Cronos and his subsequent films, including those he has produced, have contributed greatly to the rehabilitation of the horror and fantasy genres from the cultural disreputability they suffered from the 1960s to the early 1990s and have also facilitated more horror production in Mexico. In addition to the gothic quality of his work, Del Toro’s auteur status is often traced through the recurring imagery, themes, and monsters that appear across his oeuvre and through his recurring preoccupations with the contiguity of real and fantasy worlds and with ghosts as manifestations of the (historical and political) past.
Although Del Toro has made and been involved in the production of some notable franchise films in recent years, directing Blade II, Hellboy, and Hellboy II: The Golden Army and receiving a screenwriting credit for The Hobbit: An Unexpected Journey (2012), The Hobbit: The Desolation of Smaug (2013), and The Hobbit: The Battle of the Five Armies (2014), he has also turned down several opportunities to work on franchise films in the Narnia and Harry Potter series (passing on directing Harry Potter and the Prisoner of Azkaban but suggesting his compatriot Alfonso Cuarón for the job instead) and left the production of The Hobbit films after work on the scripts. He also received a writing credit on Troy Nixey’s Don’t Be Afraid of the Dark (2010).


Automatic text summarization is a technique for generating a short and accurate summary of a longer text document. Text summarization can be classified by the number of input documents (single-document and multi-document summarization) and by the characteristics of the summary generated (extractive and abstractive summarization). Multi-document summarization is an automatic process of creating a relevant, informative, and concise summary from a cluster of related documents. This paper presents a detailed survey of the existing literature on the various approaches to text summarization. A few of the most popular approaches, such as graph-based, cluster-based, and deep-learning-based summarization techniques, are discussed along with the evaluation metrics, which can provide insight for future researchers.


Author(s):  
Shawna Holmes

This paper examines the changes to procurement for school food environments in Canada as a response to changes to nutrition regulations at the provincial level. Interviews with those working in school food environments across Canada revealed how changes to the nutrition requirements of foods and beverages sold in schools presented opportunities to not only improve the nutrient content of the items made available in school food environments, but also to include local producers and/or school gardens in procuring for the school food environment. At the same time, some schools struggle to procure nutritionally compliant foods due to increased costs associated with transporting produce to rural, remote, or northern communities as well as logistic difficulties like spoilage. Although the nutrition regulations have facilitated improvements to food environments in some schools, others require more support to improve the overall nutritional quality of the foods and beverages available to students at school.


2018 ◽  
Vol 1 (1) ◽  
pp. 32-41 ◽  
Author(s):  
Abdulmalik Usman ◽  
Dahiru Musa Abdullahi

The paper investigates the level of productive vocabulary knowledge of ESL learners, their writing quality, and the relationship between vocabulary knowledge and writing quality. 150 final-year students of English language at a university in Nigeria were randomly selected as respondents. The respondents were asked to write an essay of 300 words within one hour. The essays were typed into Cobb’s (2002) Vocab Profiler to analyze the respondents’ Lexical Frequency Profile. The essays were also assessed by independent examiners using a standard rubric. The findings reveal that the level of productive vocabulary knowledge of the respondents is limited. The writing quality of the majority of the respondents is fair, and there is a significant correlation between vocabulary knowledge and the writing quality of the subjects. The researchers posit that productive vocabulary is a predictor of writing quality and recommend various techniques through which the teaching and learning of vocabulary can be improved.


2020 ◽  
Vol 47 (1) ◽  
pp. 89-95 ◽  
Author(s):  
Garry D. Carnegie

ABSTRACT This response to the recent contribution by Matthews (2019) entitled “The Past, Present, and Future of Accounting History” specifically deals with the issues associated with concentrating on counting publication numbers in examining the state of a scholarly research field at the start of the 2020s. It outlines several pitfalls with the narrowly focused publications count analysis, in selected English language journals only, as provided by Matthews. The commentary is based on three key arguments: (1) accounting history research and publication is far more than a “numbers game”; (2) trends in the quality of the research undertaken and published are paramount; and (3) international publication and accumulated knowledge in accounting history are indeed more than a collection of English language publications. The author seeks to contribute to discussion and debate between accounting historians and other researchers for the benefit and development of the international accounting history community and global society.


2021 ◽  
Vol 18 (1) ◽  
Author(s):  
Maurizio Nicola D’Alterio ◽  
Stefania Saponara ◽  
Mirian Agus ◽  
Antonio Simone Laganà ◽  
Marco Noventa ◽  
...  

Abstract: Endometriosis impairs the quality of life (QoL) of many women, including their social relationships, daily activity, productivity at work, and family planning. The aim of this review was to determine the instruments used to examine QoL in previous clinical studies of endometriosis and to evaluate the effect of medical and surgical interventions for endometriosis on QoL. We conducted a systematic search and review of studies published between January 2010 and December 2020 using MEDLINE. Search terms included “endometriosis” and “quality of life.” We only selected studies that used a standardized questionnaire to evaluate QoL before and after medical or surgical interventions. Only articles in the English language were examined. The initial search identified 720 results. After excluding duplicates and applying inclusion criteria, 37 studies were selected for analysis. We found that the two scales most frequently used to measure QoL were the Short Form-36 health survey questionnaire (SF-36) and the Endometriosis Health Profile-30 (EHP-30). Many medical and surgical treatments demonstrated comparable benefits in pain control and QoL improvement. There is no clear answer as to what is the best treatment for improving QoL because each therapy must be personalized for the patient and depends on the woman’s goals. In conclusion, women must be informed about endometriosis and given easily accessible information to improve treatment adherence and their QoL.


2021 ◽  
Vol 14 (1) ◽  
pp. 205979912098776
Author(s):  
Joseph Da Silva

Interviews are an established research method across multiple disciplines. Such interviews are typically transcribed orthographically in order to facilitate analysis. Many novice qualitative researchers’ experiences of manual transcription are that it is tedious and time-consuming, although it is generally accepted within much of the literature that quality of analysis is improved through researchers performing this task themselves. This is despite the potential for the exhausting nature of bulk transcription to conversely have a negative impact upon quality. Other researchers have explored the use of automated methods to ease the task of transcription, more recently using cloud-computing services, but such services present challenges to ensuring confidentiality and privacy of data. In the field of cyber-security, these are particularly concerning; however, any researcher dealing with confidential participant speech should also be uneasy with third-party access to such data. As a result, researchers, particularly early-career researchers and students, may find themselves with no option other than manual transcription. This article presents a secure and effective alternative, building on prior work published in this journal, to present a method that significantly reduced, by more than half, interview transcription time for the researcher yet maintained security of audio data. It presents a comparison between this method and a fully manual method, drawing on data from 10 interviews conducted as part of my doctoral research. The method presented requires an investment in specific equipment which currently only supports the English language.


2020 ◽  
Vol 79 (Suppl 1) ◽  
pp. 1133.1-1133
Author(s):  
S. Elangovan ◽  
Y. H. Kwan ◽  
W. Fong

Background: Spondyloarthritis (SpA) is a family of chronic inflammatory disorders. Social media, such as YouTube, is a popular online platform where patients often visit for information. However, the validity of the content uploaded onto YouTube is not known.
Objectives: This study aimed to evaluate the content, reliability and quality of the most viewed English-language YouTube videos on SpA.
Methods: Keywords “spondyloarthritis”, “spondyloarthropathy” and “ankylosing spondylitis” were searched on YouTube on October 7th, 2019. The top 270 videos were screened. Videos were excluded if they were irrelevant, in non-English language or if they had no audio. Total number of views, duration on YouTube (days), video length, upload date, number of likes, dislikes, subscribers and comments were recorded for videos. A modified 5-point DISCERN tool [1] and the 5-point Global Quality Scale (GQS) score [2] were used to assess the reliability and quality of the videos, with higher scores indicating greater reliability and quality respectively.
Results: Two hundred of 270 videos were included in the final analysis [61.5% from healthcare professionals, 37.0% from patients, 1.5% from news channels]. Of the 200 videos, 15 were uploaded within the last year and 112 in the last five years. 120 (60%) were categorized as useful information (Group 1), 6 (3%) as misleading information (Group 2), 52 (26%) as useful patient opinion (Group 3) and 22 (11%) as misleading patient opinion (Group 4). Useful videos were mainly from healthcare professionals or patients (86%). Useful videos (Group 1 and 3) had higher median (IQR) number of subscribers [2700 (14700) vs 211 (457), p < 0.01], reliability scores [3 (1) vs 2 (1), p < 0.01] and GQS scores [3 (1) vs. 2 (1), p < 0.001] compared to misleading videos (Group 2 and 4), respectively. Videos uploaded by healthcare professionals tended to have more useful information [94% (116 of 123) vs. 66% (49 of 74), p < 0.001] and had higher median (IQR) reliability scores [3 (1) vs 2 (1), p < 0.001] and GQS scores [3 (2) vs 2 (1), p < 0.001] compared to patient uploaded videos respectively. Of the 5 (out of 123) videos from healthcare professionals that had misleading information, it was because of outdated information on diagnosis (3 videos) and treatment (5 videos) of SpA. Of the 22 videos that had misleading patient opinion, 9 (41%) wrongly described the clinical features for SpA and 14 (64%) portrayed the current evidence based treatment options as ineffective and described alternative treatment plans (i.e. diet restrictions, complementary and alternative medicine).
Conclusion: The majority of English language YouTube videos have useful information on the topic of SpA, however, 31% of patient opinions have inaccurate information on the clinical features and treatment options, and viewers need to be cognisant of these “fake news”.
References:
[1] Charnock D, Shepperd S, Needham G, Gann R (1999) DISCERN: an instrument for judging the quality of written consumer health information on treatment choices. J Epidemiol Community Health 53(2):105-111
[2] Bernard A, Langille M, Hughes S, Rose C, Leddin D, Veldhuyzen van Zanten S (2007) A systematic review of patient inflammatory bowel disease information resources on the World Wide Web. Am J Gastroenterol 102(9):2070-2077
Disclosure of Interests: Sakktivel Elangovan: None declared, Yu Heng Kwan: None declared, Warren Fong Consultant of: Abbvie, Janssen, Novartis, Speakers bureau: Abbvie, Janssen, Novartis


2018 ◽  
Vol 19 (2) ◽  
pp. 195-209 ◽  
Author(s):  
Zachary W. Taylor

This study examines first-year undergraduate admissions materials from 325 bachelor-degree granting U.S. institutions, closely analyzing the English-language readability and Spanish-language readability and translation of these materials. Via Yosso’s linguistic capital, the results reveal 4.9% of first-year undergraduate admissions materials had been translated into Spanish, 4% of institutional admissions websites embed translation widgets, and the average readability of English-language content is above the 13th-grade reading level. Implications for research and practice are discussed.

