Character-Level Interaction in Computer-Assisted Transcription of Text Images

Author(s):  
Veronica Romero ◽  
Alejandro H. Toselli ◽  
Enrique Vidal
Author(s):  
Alejandro Héctor Toselli ◽  
Enrique Vidal ◽  
Francisco Casacuberta

2020 ◽  
Vol 65 (1) ◽  
pp. 37-52
Author(s):  
Adinel C. Dincă ◽  
Emil Ștețco

"The objective of the present paper is to introduce to a wider audience, at a very early stage of development, the initial results of a Romanian joint initiative of AI software engineers and palaeographers in an experimental project aiming to assist and improve the transcription of medieval texts with AI software solutions uniquely designed and trained for the task. Our description starts by summarizing the previous attempts and the mixed results achieved so far in e-palaeography, a continuously growing field of combined scholarship at an international level. The second part of the study describes the specific project, developed by Zetta Cloud, with the aim of demonstrating that, by applying state-of-the-art AI Computer Vision algorithms, it is possible to automatically binarize and segment text images with the final aim of intelligently extracting the content from a sample set of medieval handwritten text pages. Keywords: Middle Ages, Latin writing, palaeography, Artificial Intelligence, Computer Vision, automatic transcription."
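The binarization step mentioned in the abstract (separating dark ink from a lighter parchment background) is commonly done with a global threshold such as Otsu's method. The sketch below is illustrative only, assuming NumPy; the function names are ours and do not come from the Zetta Cloud project.

```python
import numpy as np

def otsu_threshold(img):
    """Return the grey level maximizing between-class variance (Otsu's method)."""
    hist = np.bincount(img.ravel(), minlength=256).astype(float)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                    # cumulative class probability
    mu = np.cumsum(prob * np.arange(256))      # cumulative class mean
    mu_t = mu[-1]                              # global mean
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_t * omega - mu) ** 2 / (omega * (1.0 - omega))
    sigma_b = np.nan_to_num(sigma_b)           # undefined at omega = 0 or 1
    return int(np.argmax(sigma_b))

def binarize(img):
    """Binarize a greyscale page image: ink -> 0, background -> 255."""
    t = otsu_threshold(img)
    return (img > t).astype(np.uint8) * 255
```

On a bimodal page image (dark ink, light background) the chosen threshold falls between the two modes, so the foreground text can then be passed on to a segmentation stage.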


Author(s):  
DANIEL MARTÍN-ALBO ◽  
VERÓNICA ROMERO ◽  
ALEJANDRO H. TOSELLI ◽  
ENRIQUE VIDAL

Currently, automatic handwriting recognition systems are ineffective on unconstrained handwritten documents. Therefore, to obtain perfect transcriptions, heavy human intervention is required to validate and correct the results of such systems. Given that this post-editing process is inefficient and uncomfortable, a multimodal interactive approach has been proposed in previous works, which aims at obtaining correct transcriptions with the minimum human effort. In this approach, the user interacts with the system by means of an e-pen and/or more traditional methods such as keyboard or mouse. This user feedback allows the system to improve its accuracy, while multimodality increases system ergonomics and user acceptability. Until now, multimodal interaction has been considered only at the whole-word level. In this work, multimodal interaction at the character level is studied, which may lead to more effective interactivity, since it is faster and easier to write only one character rather than a whole word. Here we study this kind of fine-grained multimodal interaction and present developments that allow taking advantage of interaction-derived context to significantly improve feedback decoding accuracy. Empirical tests on three cursive handwriting tasks suggest that, despite losing the deterministic accuracy of traditional peripherals, this approach can save significant amounts of user effort with respect to fully manual transcription as well as to noninteractive post-editing correction.
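The interaction protocol described in the abstract (the user validates a prefix and corrects a single character, after which the system re-decodes the rest of the line conditioned on that prefix) can be sketched as follows. This is a deliberately simplified illustration: the n-best-list representation and all function names are our own assumptions, standing in for the lattice-based search used in the actual systems.

```python
def first_error(hypothesis, reference):
    """Index of the first character where the two strings differ."""
    for i, (h, r) in enumerate(zip(hypothesis, reference)):
        if h != r:
            return i
    return min(len(hypothesis), len(reference))  # one string is a prefix of the other

def best_with_prefix(nbest, prefix):
    """Highest-scoring hypothesis consistent with the user-validated prefix."""
    for hyp, _score in sorted(nbest, key=lambda x: -x[1]):
        if hyp.startswith(prefix):
            return hyp
    return prefix  # no hypothesis left: keep what the user has validated

def interactive_transcription(nbest, reference):
    """Simulate the protocol; return the final text and the corrections needed."""
    prefix, corrections = "", 0
    while True:
        hyp = best_with_prefix(nbest, prefix)
        if hyp == reference:
            return hyp, corrections
        i = first_error(hyp, reference)
        if i >= len(reference):
            # hypothesis only extends past the reference: one deletion fixes it
            return reference, corrections + 1
        prefix = reference[: i + 1]  # validated prefix plus one corrected character
        corrections += 1
```

Because a single corrected character constrains the whole validated prefix, one keystroke can flip the system to a hypothesis that fixes the remainder of the line, which is the source of the effort savings the abstract reports.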

