Dpaper: An Authoring Tool for Extractable Digital Papers

2017 ◽  
Vol 1 (1) ◽  
pp. 86-97
Author(s):  
Xiaoqiu Le ◽  
Chenyu Mao ◽  
Yuanbiao He ◽  
Changlei Fu ◽  
Liyuan Xu

Abstract
Purpose: To develop a structured, rich-media digital paper authoring tool with an object-based model that enables interactive, playable, and convertible functions.
Design/methodology/approach: We propose Dpaper to organize the content (text, data, rich media, etc.) of dissertation papers as XML and HTML5 files by means of digital objects and digital templates.
Findings: Dpaper provides a structured-paper editorial platform for PhD authors to organize research materials and to generate various digital paper objects that are playable and reusable. The PhD papers are represented as Web pages and structured XML files, which are marked with semantic tags.
Research limitations: The proposed tool provides access to only a limited number of digital objects. For instance, it cannot create equations and graphs, and typesetting is not yet as flexible as in MS Word.
Practical implications: Dpaper is designed to break through the unstructured content organization of traditional papers, making a paper available not only for reading but also for exploitation as data, so that the document becomes extractable and reusable. As a result, Dpaper can make the digital publishing of dissertation texts more flexible and efficient, and their data more accessible.
Originality/value: Dpaper solves the challenge of making a paper structured and object-based at the authoring stage, and has practical value for semantic publishing.
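The object-based XML organization described in the abstract can be illustrated with a small sketch. The element names, attributes, and content below are hypothetical examples, not Dpaper's actual schema:

```python
import xml.etree.ElementTree as ET

# Illustrative only: one way a "digital object" with semantic tags might
# be serialized as XML. Element and attribute names are hypothetical.
paper = ET.Element("paper")
obj = ET.SubElement(paper, "object", type="table", id="tbl-1")
ET.SubElement(obj, "caption").text = "Sample sizes per cohort"
ET.SubElement(obj, "data", format="csv").text = "cohort,n\nA,120\nB,98"

xml = ET.tostring(paper, encoding="unicode")
print(xml)
```

Because the table's data travels with the object as structured content rather than as a rendered image, a downstream consumer could extract and reuse it directly.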

Author(s):  
Byung-Kwon Park ◽  
Il-Yeol Song

As the amount of data inside and outside an enterprise grows rapidly, data can be classified into two categories, structured and unstructured, and seamlessly analyzing both is essential for total business intelligence. In particular, since most business data are unstructured text documents, including Web pages on the Internet, a Text OLAP solution is needed to perform multidimensional analysis of text documents in the same way as structured relational data. We first survey representative works that demonstrate how text mining and information retrieval, the major technologies for handling text data, can be applied to multidimensional analysis of text documents. We then survey representative works that demonstrate how unstructured text documents and structured relational data can be associated and consolidated to obtain total business intelligence. Finally, we present a future business intelligence platform architecture and related research topics. We expect the proposed heterogeneous business intelligence architecture, which integrates information retrieval, text mining, information extraction, and relational OLAP technologies, to provide a better platform toward total business intelligence.
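The core Text OLAP idea, aggregating text measures along dimensions the way relational OLAP aggregates numeric measures, can be sketched as a toy "text cube". The documents, dimension names, and term-count measure below are illustrative assumptions, not taken from any of the surveyed systems:

```python
from collections import Counter, defaultdict

# Hypothetical mini "text cube": each document carries dimension
# attributes (region, year) plus free text; the measure per cell is a
# term-frequency Counter, which can be rolled up along dimensions.
docs = [
    {"region": "EU", "year": 2019, "text": "supply chain risk"},
    {"region": "EU", "year": 2020, "text": "supply chain delay"},
    {"region": "US", "year": 2020, "text": "retail demand risk"},
]

def build_cube(docs, dims):
    cube = defaultdict(Counter)
    for d in docs:
        cell = tuple(d[k] for k in dims)   # coordinates in the cuboid
        cube[cell].update(d["text"].split())
    return cube

# Base cuboid on (region, year); roll-up to (region,) by dropping year.
base = build_cube(docs, ("region", "year"))
rollup = build_cube(docs, ("region",))
print(rollup[("EU",)].most_common(2))
```

A real Text OLAP system would replace the raw term counts with richer text measures (top-k keywords, topic distributions), but the cell-and-rollup structure is the same.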


Author(s):  
Masaomi Kimura

Text mining has been growing, mainly due to the need to extract useful information from vast amounts of textual data. Our target here is text data: a collection of freely described responses to questionnaires. Unlike research papers, newspaper articles, call-center logs, and web pages, which are the usual targets of text mining analysis, freely described questionnaire responses have specific characteristics: each piece of data consists of a small number of short sentences, while the wide variety of content precludes the application of clustering algorithms to classify them. In this paper, we propose a method to extract opinions expressed by multiple respondents, based on the modification relationships within each sentence of the freely described data. Applications of our method are also presented after the introduction of our approach.
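The core of the approach, surfacing opinions shared by several respondents from modification relationships, can be sketched minimally if one assumes each response has already been parsed into (modifier, head) pairs by some dependency parser. The pairs and the support threshold below are hypothetical illustrations, not the paper's data:

```python
from collections import Counter

# Hypothetical sketch: each free-text response is represented by its
# (modifier, head) modification pairs; opinions voiced by multiple
# respondents surface as pairs whose respondent count meets a threshold.
responses = [
    [("slow", "delivery"), ("friendly", "staff")],
    [("slow", "delivery"), ("high", "price")],
    [("friendly", "staff"), ("slow", "delivery")],
]

def shared_opinions(parsed, min_support=2):
    # set(resp) so each respondent contributes a pair at most once
    counts = Counter(pair for resp in parsed for pair in set(resp))
    return {pair: n for pair, n in counts.items() if n >= min_support}

print(shared_opinions(responses))
```

Counting per-respondent (rather than per-mention) keeps one verbose respondent from dominating the extracted opinions.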


2018 ◽  
Vol 13 (1) ◽  
pp. 183-194 ◽  
Author(s):  
Megan Senseney ◽  
Eleanor Dickson ◽  
Beth Namachchivaya ◽  
Bertram Ludäscher

Text data mining and analysis has emerged as a viable research method for scholars, following the growth of mass digitization, digital publishing, and scholarly interest in data re-use. Yet the texts that comprise datasets for analysis are frequently protected by copyright or other intellectual property rights that limit their access and use. This article discusses the role of libraries at the intersection of data mining and intellectual property, asserting that academic libraries are vital partners in enabling scholars to effectively incorporate text data mining into their research. We report on activities leading up to an IMLS-funded National Forum of stakeholders and discuss preliminary findings from a systematic literature review, as well as initial results of interviews with forum stakeholders. Emerging themes suggest the need for a multi-pronged distributed approach that includes a public campaign for building awareness and advocacy, development of best practice guides for library support services and training, and international efforts toward data standardization and copyright harmonization.


2020 ◽  
Vol 70 (2) ◽  
pp. 283-289
Author(s):  
D.R. Rakhimova ◽  
A.R. Satybaldiev

This work is devoted to the creation of a system for the automatic collection and processing of open data in Kazakh from Internet resources, and has practical significance for text collection and analysis tasks. The introduction substantiates the relevance of the chosen topic, reviews existing approaches, and formulates the objectives of the study. We consider the problem of collecting and performing primary processing of text data, with subsequent analysis. Data collection is the priority, since open data from Internet resources are not structured and need to be processed. The authors present a system for processing web pages of Kazakh-language portals and demonstrate its practical application to real data from open resources. An approach to indexing documents using features is presented. The system will help structure open data from Internet resources and analyze the collected data. Practical results are presented.
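The collect-clean-index pipeline described above can be sketched with only the Python standard library. The page contents and the token-level inverted index below are illustrative assumptions; a real system would add crawling, encoding detection, and Kazakh-language filtering:

```python
from collections import defaultdict
from html.parser import HTMLParser

# Minimal sketch: strip markup from fetched pages, then build an
# inverted index mapping each token ("feature") to the documents
# containing it. Page contents here are invented examples.
class TextExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        if data.strip():
            self.chunks.append(data.strip())

def extract_text(html):
    parser = TextExtractor()
    parser.feed(html)
    return " ".join(parser.chunks)

def index_documents(pages):
    index = defaultdict(set)   # token -> set of document ids
    for doc_id, html in pages.items():
        for token in extract_text(html).lower().split():
            index[token].add(doc_id)
    return index

pages = {
    "p1": "<p>Ашық деректер порталы</p>",
    "p2": "<p>деректер жинағы</p>",
}
idx = index_documents(pages)
print(sorted(idx["деректер"]))
```

With such an index, finding all collected pages mentioning a given word is a single dictionary lookup.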


Author(s):  
Shigeru Ikuta ◽  
Satsuki Yamashita ◽  
Hayato Higo ◽  
Jinko Tomiyama ◽  
Noriko Saotome ◽  
...  

Original teaching materials with dot codes, which can be linked to multimedia such as audio, movies, web pages, HTML files, and PowerPoint files, were created for use with students with disabilities. Hand-crafted original teaching materials can easily be created by the users themselves—for example, by schoolteachers—with newly developed and easy-to-handle software. A maximum of four multimedia files can be linked to each Post-It sticker icon and/or dot codes overlaid using specially designed software (GM Authoring Tool), and such multimedia files are replayed with a specially designed sound pen (G-Speak) and scanner pen (G-Pen Blue) with Bluetooth functionality, simply by touching the pen to the Post-It sticker icon and/or the dot codes on the printed document. Many activities using dot code materials have been conducted successfully, especially at special needs schools. Basic information on the creation of these materials—and on their use in schools—is presented in this chapter.


Author(s):  
Yiming Wang ◽  
Ximing Li ◽  
Jihong Ouyang

Neural topic modeling provides a flexible, efficient, and powerful way to extract topic representations from text documents. Unfortunately, most existing models cannot handle text data with network links, such as web pages with hyperlinks and scientific papers with citations. To handle this kind of data, we develop a novel neural topic model, namely the Layer-Assisted Neural Topic Model (LANTM), which can be interpreted from the perspective of variational auto-encoders. Our major motivation is to enhance topic representation encoding by using not only the text contents but also the assisting network links. Specifically, LANTM encodes the texts and network links into topic representations with an augmented network containing graph convolutional modules, and decodes them by maximizing the likelihood of the generative process. Neural variational inference is adopted for efficient inference. Experimental results validate that LANTM significantly outperforms existing models on topic quality, text classification, and link prediction.
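The graph-convolutional part of such an encoder can be sketched with the standard propagation rule H = ReLU(D^-1/2 (A+I) D^-1/2 X W), which mixes each document's bag-of-words features with those of its linked neighbours before any topic encoder sees them. This is an illustrative single step only, not the authors' LANTM implementation; all sizes and weights below are arbitrary:

```python
import numpy as np

# One graph convolution step over document features.
rng = np.random.default_rng(0)
X = rng.random((4, 6))            # 4 documents, 6-term vocabulary
A = np.array([[0, 1, 0, 0],       # link structure (e.g., citations)
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
W = rng.random((6, 3))            # projection to a 3-dim hidden space

A_hat = A + np.eye(4)             # add self-loops so a document keeps
d = A_hat.sum(axis=1)             # its own features in the mix
D_inv_sqrt = np.diag(1.0 / np.sqrt(d))

# Symmetrically normalized propagation followed by a ReLU nonlinearity.
H = np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ X @ W, 0.0)
print(H.shape)
```

In a VAE-style topic model, a hidden representation like H would then be mapped to the parameters of the variational posterior over topic proportions.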


2021 ◽  
Author(s):  
Jeremy M. Littler

Xstreamulator is a .NET-based webcasting application that utilizes the Microsoft Windows Media Server to broadcast classroom lectures and events. Uniquely, the application supports the synchronized delivery of captured bitmap content (slides), which is displayed in an ASP/HTML-based cross-browser viewing environment. At present, Xstreamulator supports bitmap slide capturing from PowerPoint presentations, computer desktops, images, web pages, and external VGA sources. Additional capture capabilities are currently in development. Although Xstreamulator has been used extensively for live webcasting, it can also be employed to record webcasts for distribution through on-demand delivery or removable media. In contrast to commercial solutions, Xstreamulator's live webcasting functionality is not constrained to traditional academic settings (i.e., classrooms). Indeed, many instructors at Ryerson University have successfully employed Xstreamulator to webcast lectures from their office or home. In addition, Xstreamulator has been employed effectively in the delivery of events, lectures, symposiums, and conferences. Xstreamulator has from the outset been designed to operate reliably in diverse hardware environments. For example, the application can be installed on personal computers, classroom presentation systems, or portable encoding "stations". Thus, by leveraging the existing computer infrastructure at Ryerson University, it has been possible to avoid acquiring costly commercial webcasting systems. Xstreamulator's comprehensive content delivery approach and hardware neutrality have addressed the entire range of webcast requirements within the University environment in a very cost-effective and scalable manner. Xstreamulator's development process has been driven by the philosophy of participatory design (PD).
Students, faculty, and staff at Ryerson University have generously donated their time to test Xstreamulator prototypes, and have contributed significantly to the evolution of the application's user interface and functionality. The Xstreamulator project therefore demonstrates the significant advantages of implementing participatory design goals in the development of rich-media webcasting solutions. Indeed, while the technological achievements of the project are noteworthy, they could only have been achieved in an environment that fostered collaboration at all levels. The development of an in-house webcasting solution requires a commitment of development personnel and technical resources. However, the cost of providing these in-house resources will be offset by reduced webcasting costs over the long term. Additionally, applications like Xstreamulator can be rapidly employed to generate webcasting revenue from university events (e.g., conferences). In summary, as the use of Xstreamulator at Ryerson University has eliminated a dependence on commercial solutions, it has been possible to re-assign these cost savings to the design of some of the most powerful event webcasting systems in North America.


2021 ◽  
Vol 3 (1) ◽  
pp. 120-131
Author(s):  
Luis Antonio Tavares ◽  
Matheus Carvalho Meira ◽  
Sérgio Ferreira do Amaral

Abstract: This paper presents an extension of the mind map pedagogical tool, a conception in which the mind map becomes interactive and dynamic. We take advantage of all the mind map's learning potential and benefits, and add new ones in proposing the interactive mind map tool. We develop a model in which the teacher has an authoring tool for creating a mind map with its elements, relationships, and interactive content related to each map element. The proposed tool is rich in media, as it incorporates different types of media, allowing it to reach students with different learning profiles and needs. Furthermore, the technological aspect brings the school closer to the student's reality.
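A possible data model for such an interactive mind map, with media attachments and child nodes per element, might look as follows. The class names, fields, and example content are hypothetical, not the authors' actual design:

```python
from dataclasses import dataclass, field

# Hypothetical model: each mind-map node carries a label, optional media
# attachments (the "interactive content"), and child nodes.
@dataclass
class Node:
    label: str
    media: list = field(default_factory=list)     # e.g., video/audio files
    children: list = field(default_factory=list)  # sub-topics

root = Node("Photosynthesis", media=["intro.mp4"])
root.children.append(Node("Light reactions", media=["diagram.png"]))
root.children.append(Node("Calvin cycle"))

def count_nodes(node):
    # Total elements in the map, counting the node itself.
    return 1 + sum(count_nodes(c) for c in node.children)

print(count_nodes(root))
```

An authoring tool could serialize such a tree for rendering, attaching each node's media to its on-screen element.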

