A Compression-Based Toolkit for Modelling and Processing Natural Language Text

William Teahan

doi:10.3390/info9120294

Information ◽

10.3390/info9120294 ◽

2018 ◽

Vol 9 (12) ◽

pp. 294 ◽

Cited By ~ 3

Author(s):

William Teahan

Keyword(s):

Natural Language ◽

Model Building ◽

Channel Model ◽

Text Processing ◽

Training Model ◽

Second Phase ◽

Two Phase ◽

Natural Language Text ◽

Language Text

A novel compression-based toolkit for modelling and processing natural language text is described. The design of the toolkit adopts an encoding perspective: applications are considered to be problems in searching for the best encoding of different transformations of the source text into the target text. This paper describes a two-phase 'noiseless channel model' architecture that underpins the toolkit, which models text processing as lossless communication down a noise-free channel. The transformation and encoding performed in the first phase must be both lossless and reversible. The role of the second phase, verification and decoding, is to verify that the target text produced by the application has been communicated correctly. This paper argues that this encoding approach has several advantages over the decoding approach of the standard noisy channel model. The concepts abstracted by the toolkit's design are explained together with details of the library calls. Pseudo-code is also given for a number of algorithms for the applications that the toolkit implements, including encoding, decoding, classification, training (model building), parallel sentence alignment, word segmentation and language segmentation. Some experimental results, implementation details, memory usage and execution speeds are also discussed for these applications.
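The two-phase idea in the abstract above can be illustrated with a minimal Python sketch of word segmentation. This is not the toolkit's API or its algorithm: the function names (`train_word_model`, `segment`, `verify`), the unigram word model, and the flat 16-bit-per-character penalty for unseen words are all assumptions chosen for illustration, standing in for the compression models the paper describes. Phase one searches over space insertions (a lossless, reversible transformation) for the target with the shortest code length; phase two checks that reversing the transformation recovers the source exactly.

```python
import math


def train_word_model(words):
    """Build a code-length function (in bits) from unigram word counts.

    Hypothetical stand-in for a real compression model.
    """
    counts = {}
    for w in words:
        counts[w] = counts.get(w, 0) + 1
    total = sum(counts.values())
    vocab = len(counts)

    def code_length(word):
        if word in counts:
            # Seen words: cost derived from a smoothed relative frequency.
            return -math.log2(counts[word] / (total + vocab))
        # Unseen words: assumed flat penalty of 16 bits per character.
        return 16.0 * len(word)

    return code_length


def segment(source, code_length, max_word_len=12):
    """Phase 1: find the space insertion with the smallest total code length."""
    n = len(source)
    best = [0.0] + [math.inf] * n   # best[i] = cheapest encoding of source[:i]
    back = [0] * (n + 1)            # back[i] = start of the last word ending at i
    for end in range(1, n + 1):
        for start in range(max(0, end - max_word_len), end):
            cost = best[start] + code_length(source[start:end])
            if cost < best[end]:
                best[end] = cost
                back[end] = start
    words = []
    end = n
    while end > 0:
        start = back[end]
        words.append(source[start:end])
        end = start
    return " ".join(reversed(words)), best[n]


def verify(source, target):
    """Phase 2: undo the transformation and check the source is recovered."""
    return target.replace(" ", "") == source


model = train_word_model("the cat sat on the mat the dog sat".split())
target, bits = segment("thecatsat", model)
print(target, round(bits, 2), verify("thecatsat", target))
```

The verification step assumes the source contains no spaces of its own, so that removing the inserted spaces is an exact inverse of the transformation.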

Natural language text processing and the maximal join operator

Lecture Notes in Computer Science - Conceptual Structures: Knowledge Representation as Interlingua ◽

10.1007/3-540-61534-2_6 ◽

1996 ◽

pp. 100-114

Author(s):

Heike Petermann

Keyword(s):

Natural Language ◽

Text Processing ◽

Natural Language Text ◽

Language Text

Automatic argumentation mining and the role of stance and sentiment

Journal of Argumentation in Context ◽

10.1075/jaic.00006.ste ◽

2020 ◽

Vol 9 (1) ◽

pp. 19-41

Author(s):

Manfred Stede

Keyword(s):

Natural Language ◽

Sentiment Analysis ◽

Computational Linguistics ◽

Current Practice ◽

Structural Components ◽

Short Introduction ◽

Natural Language Text ◽

Argumentation Mining ◽

Language Text

Argumentation mining is a subfield of Computational Linguistics that aims (primarily) at automatically finding arguments and their structural components in natural language text. We provide a short introduction to this field, intended for an audience with a limited computational background. After explaining the subtasks involved in this problem of deriving the structure of arguments, we describe two other applications that are popular in computational linguistics: sentiment analysis and stance detection. From the linguistic viewpoint, they concern the semantics of evaluation in language. In the final part of the paper, we briefly examine the roles that these two tasks play in argumentation mining, both in current practice and in possible future systems.

A common architecture to encourage reuse of natural language/text processing tools

Proceedings of 8th Knowledge-Based Software Engineering Conference ◽

10.1109/kbse.1993.341193 ◽

2002 ◽

Cited By ~ 1

Author(s):

T. MacMillan ◽

E. Lusher ◽

M. Farinacci ◽

S. Laskowski ◽

L. Seligman ◽

...

Keyword(s):

Natural Language ◽

Text Processing ◽

Natural Language Text ◽

Common Architecture ◽

Language Text

A Wide-Reflective-Equilibrium Conception of Reconstructive Formalization

History of Philosophy and Logical Analysis ◽

10.30965/26664275-01701007 ◽

2014 ◽

Vol 17 (1) ◽

pp. 130-151

Author(s):

Winfried Löffler

Keyword(s):

Natural Language ◽

Reflective Equilibrium ◽

Wide Reflective Equilibrium ◽

Natural Language Text ◽

Logical Formalization ◽

Language Text ◽

Relevant Factors

I propose that a logical formalization of a natural language text (especially an argument) may be regarded as adequate if the following three groups of beliefs can be integrated into a wide reflective equilibrium: (1) our initial, spontaneous beliefs about the structure and logical quality of the text; (2) our beliefs about its structure and logical quality as reflected in the proposed formalization, and (3) our background beliefs about the original text’s author, his thought and other contextually relevant factors. Unlike a good part of the literature, I stress the indispensable role of initial beliefs in achieving such a wide reflective equilibrium. In the final sections I show that my approach does not succumb to undue subjectivism or the mere perpetuation of prejudice. The examples I use to illustrate my claims are chiefly taken from Anselm’s Proslogion 2–3 and the various attempts to formalize these texts.

The experience of developing a large-scale natural language text processing system

10.3115/974235.974271 ◽

1988 ◽

Cited By ~ 14

Author(s):

Stephen D. Richardson ◽

Lisa C. Braden-Harder

Keyword(s):

Natural Language ◽

Large Scale ◽

Text Processing ◽

Processing System ◽

Natural Language Text ◽

Language Text

Morality Classification in Natural Language Text

IEEE Transactions on Affective Computing ◽

10.1109/taffc.2020.3034050 ◽

2020 ◽

pp. 1-1

Author(s):

Matheus C. Pavan ◽

Vitor G. Santos ◽

Alex G. J. Lan ◽

Joao Martins ◽

Wesley Ramos Santos ◽

...

Keyword(s):

Natural Language ◽

Natural Language Text ◽

Language Text

Accurate fact harvesting from natural language text in wikipedia with Lector

Proceedings of the 19th International Workshop on Web and Databases - WebDB '16 ◽

10.1145/2932194.2932203 ◽

2016 ◽

Cited By ~ 2

Author(s):

Matteo Cannaviccio ◽

Denilson Barbosa ◽

Paolo Merialdo

Keyword(s):

Natural Language ◽

Natural Language Text ◽

Language Text

Generation of Natural Language Text using Perspective Descriptor in Frames

IETE Journal of Research ◽

10.1080/03772063.2001.11416202 ◽

2001 ◽

Vol 47 (1-2) ◽

pp. 43-57

Author(s):

G V Uma ◽

T V Geetha

Keyword(s):

Natural Language ◽

Natural Language Text ◽

Language Text

Word-based self-indexes for natural language text

ACM Transactions on Information Systems ◽

10.1145/2094072.2094073 ◽

2012 ◽

Vol 30 (1) ◽

pp. 1-34 ◽

Cited By ~ 27

Author(s):

Antonio Fariña ◽

Nieves R. Brisaboa ◽

Gonzalo Navarro ◽

Francisco Claude ◽

Ángeles S. Places ◽

...

Keyword(s):

Natural Language ◽

Natural Language Text ◽

Language Text

Document representation in natural language text retrieval

Proceedings of the workshop on Human Language Technology - HLT '94 ◽

10.3115/1075812.1075896 ◽

1994 ◽

Cited By ~ 1

Author(s):

Tomek Strzalkowski

Keyword(s):

Natural Language ◽

Text Retrieval ◽

Document Representation ◽

Natural Language Text ◽

Language Text
