Syntactic analysis of natural language using linguistic rules and corpus-based patterns

1994 ◽  
Author(s):  
Pasi Tapanainen ◽  
Timo Järvinen


2018 ◽  
pp. 35-38
Author(s):  
O. Hyryn

The article deals with natural language processing of English sentences. It describes the problems that may arise during processing, connected with graphic, semantic, and syntactic ambiguity. It explains how these problems were solved before automatic syntactic analysis was applied, and how such methods can help in developing new analysis algorithms. The analysis focuses on the issues underlying natural language processing, in particular parsing: the analysis of sentences according to their structure, content, and meaning, which aims to examine the grammatical structure of a sentence, divide it into constituent components, and define the links between them.


2021 ◽  
Vol 12 (5-2021) ◽  
pp. 50-56
Author(s):  
Boris M. Pileckiy

This paper describes one possible implementation option for recognizing spatial data in natural language texts. The proposed option is based on lexico-syntactic analysis of the texts, which requires the use of special grammars and dictionaries. Spatial data recognition is carried out for subsequent geocoding and visualization, and its practical implementation uses a freely distributed software tool. The paper also considers some applications of spatial data and gives preliminary results of spatial data recognition.
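The lexico-syntactic approach described above can be illustrated with a minimal sketch: a spatial preposition pattern combined with a gazetteer lookup. The patterns and place names below are invented for the example and are not the paper's actual grammars or dictionaries.

```python
import re

# Hypothetical gazetteer standing in for the paper's special dictionaries.
GAZETTEER = {"Moscow", "Apatity", "Murmansk"}

# Lexico-syntactic pattern: a spatial preposition followed by a
# capitalized candidate toponym.
SPATIAL_PATTERN = re.compile(r"\b(in|near|north of|south of|at)\s+([A-Z][a-z]+)")

def recognize_spatial(text):
    """Return (relation, place) pairs whose place occurs in the gazetteer."""
    hits = []
    for relation, place in SPATIAL_PATTERN.findall(text):
        if place in GAZETTEER:
            hits.append((relation, place))
    return hits

print(recognize_spatial("The station near Apatity lies north of Murmansk."))
```

Each recognized pair could then be passed to a geocoder to obtain coordinates for visualization.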


2021 ◽  
Author(s):  
Carolinne Roque e Faria ◽  
Cinthyan Renata Sachs Camerlengo de Barb

Technology is becoming notably popular among agribusiness producers and is advancing across all agricultural areas. One of the difficulties in this context is handling data in natural language to solve problems in the field of agriculture. In order to build up dialogues and support rich queries, the present work uses Natural Language Processing (NLP) techniques to develop an automatic and effective computer system that interacts with the user and assists in identifying pests and diseases in soybean farming, with knowledge stored in a database repository so as to provide accurate diagnoses, simplifying the work of agricultural professionals and of those who deal with large amounts of information in this area. Information on 108 pests and 19 diseases that damage Brazilian soybean was collected from Brazilian bibliographic manuals with the purpose of optimizing the data and improving production. The spaCy library was used for the syntactic analysis stage of NLP, which made it possible to pre-process the texts, recognize named entities, calculate similarity between words, and perform dependency parsing, and which also supported the development requirements of the CAROLINA tool (Robotized Agronomic Conversation in Natural Language) using the language of the agricultural domain.
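The pipeline steps named above (tokenization, named-entity recognition, word similarity) can be sketched without spaCy itself; the library-free stand-ins below use a hypothetical pest gazetteer and a character-bigram similarity in place of spaCy's statistical NER and word vectors.

```python
# Hypothetical pest gazetteer standing in for statistical NER.
PEST_GAZETTEER = {"lagarta-da-soja", "percevejo-marrom"}

def tokenize(text):
    return text.lower().replace(".", "").split()

def find_entities(tokens):
    """Gazetteer lookup standing in for spaCy's named-entity recognizer."""
    return [t for t in tokens if t in PEST_GAZETTEER]

def bigrams(word):
    return {word[i:i + 2] for i in range(len(word) - 1)}

def similarity(a, b):
    """Dice coefficient over character bigrams (a stand-in for vectors)."""
    ba, bb = bigrams(a), bigrams(b)
    return 2 * len(ba & bb) / (len(ba) + len(bb))

tokens = tokenize("A lagarta-da-soja ataca as folhas.")
print(find_entities(tokens))
print(round(similarity("soja", "sojas"), 2))
```

A production system would of course use spaCy's trained models for these steps, as the paper does.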


Author(s):  
John Carroll

This article introduces the concepts and techniques of natural language (NL) parsing, which means using a grammar to assign a syntactic analysis to a string of words or to a lattice of word hypotheses output by a speech recognizer. The level of detail required depends on the language processing task being performed and on the particular approach to the task being pursued. The article first describes approaches that produce ‘shallow’ analyses, and outlines approaches to parsing that analyse the input in terms of labelled dependencies between words. Producing hierarchical phrase structure requires grammars that have at least context-free (CF) power, and the CF algorithms widely used in NL parsing are described. To support detailed semantic interpretation, more powerful grammar formalisms are required, but these are usually parsed using extensions of CF parsing algorithms; the article accordingly describes unification-based parsing. Finally, it discusses three important issues that must be tackled in real-world applications of parsing: evaluation of parser accuracy, parser efficiency, and measurement of grammar/parser coverage.
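One of the classic CF algorithms surveyed in such treatments is CKY chart parsing. The recognizer below is a minimal sketch over a toy grammar in Chomsky normal form; the grammar and sentence are invented for the example.

```python
from itertools import product

# Toy grammar in Chomsky normal form: each rule is A -> B C.
GRAMMAR = {
    ("NP", "VP"): "S",
    ("Det", "N"): "NP",
    ("V", "NP"): "VP",
}
LEXICON = {"the": "Det", "dog": "N", "cat": "N", "saw": "V"}

def cky_recognize(words):
    """Return True if the toy grammar derives the word sequence."""
    n = len(words)
    chart = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, w in enumerate(words):
        chart[i][i + 1].add(LEXICON[w])
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for b, c in product(chart[i][k], chart[k][j]):
                    if (b, c) in GRAMMAR:
                        chart[i][j].add(GRAMMAR[(b, c)])
    return "S" in chart[0][n]

print(cky_recognize("the dog saw the cat".split()))  # True
```

The cubic-time chart is what makes CKY attractive for the lattice inputs mentioned above: each cell covers a span, regardless of how many hypotheses fill it.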


1992 ◽  
Vol 01 (02) ◽  
pp. 229-277 ◽  
Author(s):  
MICHAEL MCCORD ◽  
ARENDSE BERNTH ◽  
SHALOM LAPPIN ◽  
WLODEK ZADROZNY

This paper contains brief descriptions of the latest form of Slot Grammar and four natural language processing systems developed in this framework. Slot Grammar is a lexicalist, dependency-oriented grammatical system, based on the systematic expression of linguistic rules and data in terms of slots (essentially grammatical relations) and slot frames. The exposition focuses on the kinds of analysis structures produced by the Slot Grammar parser. These structures offer convenient input to post-syntactic processing (in particular to the applications dealt with in the paper); they contain in a single structure a useful combination of surface structure and logical form. The four applications discussed are: (1) An anaphora resolution system dealing with both NP anaphora and VP anaphora (and combinations of the two). (2) A meaning postulate based inference system for natural language, in which inference is done directly with Slot Grammar analysis structures. (3) A new transfer system for the machine translation system LMT, based on a new representation for Slot Grammar analyses which allows more convenient tree exploration. (4) A parser of "constructions", viewed as an extension of the core grammar allowing one to handle some linguistic phenomena that are often labeled "extragrammatical", and to assign a semantics to them.
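The idea of a single structure combining surface order and grammatical relations can be sketched as follows. The node and slot names are illustrative assumptions, not the actual Slot Grammar data structures.

```python
from dataclasses import dataclass, field

# Illustrative Slot-Grammar-style analysis node: a head word with a slot
# frame whose fillers are themselves nodes, so one structure carries both
# grammatical relations and, via word positions, surface order.
@dataclass
class Node:
    word: str
    position: int                               # 1-based surface position
    slots: dict = field(default_factory=dict)   # slot name -> filler Node

    def fill(self, slot, node):
        self.slots[slot] = node
        return self

    def relations(self):
        """Flatten to (head, slot, filler) triples for post-syntactic use."""
        triples = []
        for slot, filler in self.slots.items():
            triples.append((self.word, slot, filler.word))
            triples.extend(filler.relations())
        return triples

gave = Node("gave", 2)
gave.fill("subj", Node("Mary", 1))
gave.fill("iobj", Node("John", 3))
gave.fill("obj", Node("book", 5).fill("ndet", Node("a", 4)))
print(gave.relations())
```

Post-syntactic components such as the anaphora resolver or the transfer system can then consume either view of the same structure.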


2020 ◽  
pp. 41-45
Author(s):  
O. Hyryn

The article proceeds from the intended use of parsing for automatic information search, question answering, logical inference, authorship verification, text authenticity verification, grammar checking, natural language synthesis, and related tasks such as ungrammatical speech analysis, morphological class definition, and anaphora resolution. The study covers the challenges of natural language processing, namely of an English sentence. The article describes formal and linguistic problems that may arise during the process, connected with graphic, semantic, and syntactic ambiguity. It explains how these problems were solved before automatic syntactic analysis was applied, and how such methods can be helpful in developing new analysis algorithms today. The analysis focuses on the issues underlying natural language processing, in particular parsing: the analysis of sentences according to their structure, content, and meaning, which aims to examine the grammatical structure of a sentence, divide it into constituent components, and define the links between them. The analysis identifies a number of linguistic issues whose treatment will contribute to an improved model of automatic syntactic analysis: lexical and grammatical synonymy and homonymy, hypo- and hyperonymy, lexical and semantic fields, anaphora resolution, ellipsis, inversion, etc. The scope of natural language processing reveals clear directions for the improvement of parsing models; such improvement will in turn expand the scope, and improve the results, of areas that already employ automatic parsing. At the same time, established achievements in vocabulary and morphology processing should not be neglected while improving automatic syntactic analysis mechanisms for natural languages.
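Among the linguistic issues listed, anaphora resolution lends itself to a toy illustration: pick the nearest preceding noun that agrees with the pronoun in gender and number. The tiny hand-tagged agreement lexicon below is an assumption for the example, not the article's method.

```python
# Gender/number features for pronouns and a few nouns (illustrative only).
AGREEMENT = {
    "he": ("masc", "sg"), "she": ("fem", "sg"),
    "it": ("neut", "sg"), "they": (None, "pl"),
}
NOUNS = {
    "john": ("masc", "sg"), "mary": ("fem", "sg"),
    "report": ("neut", "sg"), "editors": (None, "pl"),
}

def resolve(tokens, pronoun_index):
    """Recency-based resolution: nearest preceding agreeing noun."""
    gender, number = AGREEMENT[tokens[pronoun_index].lower()]
    for i in range(pronoun_index - 1, -1, -1):
        features = NOUNS.get(tokens[i].lower())
        if features and features[1] == number and \
                (gender is None or features[0] == gender):
            return tokens[i]
    return None

tokens = "John sent Mary the report before she read it".split()
print(resolve(tokens, 6))  # antecedent of "she"
print(resolve(tokens, 8))  # antecedent of "it"
```

Real resolvers add syntactic constraints from the parse (binding, salience), which is precisely why the article ties anaphora to improved parsing models.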


1975 ◽  
Vol 30 ◽  
pp. 1-24
Author(s):  
I. Batoni ◽  
R. Henning ◽  
H. Lehmann ◽  
B. Schirmer ◽  
M. Zoeppritz

Abstract LIANA is a question answering system written in PL/I. The program takes German natural language input and, through morphological, syntactic, and semantic analysis, creates a representation of the text, which is stored and can be accessed for retrieval purposes. All individuals (objects) mentioned in a sentence are found and stored; in continuous text, information about individuals can therefore be accumulated successively. LIANA uses the programming concept of the Boston Syntax Analyzer, so the output of syntactic analysis is a tree structure, simulated through pointers which connect the nodes of the tree. Each node is associated with a feature table which is operated on by the semantic interpretation. Node and feature handling is facilitated by a set of macros for adding, erasing, and checking features and for copying, deleting, and inserting nodes.
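The pointer-connected tree with per-node feature tables can be sketched as below, with plain functions standing in for the PL/I macros; all names are illustrative assumptions.

```python
# Sketch of the LIANA-style parse representation: nodes linked by
# pointers, each carrying a feature table for semantic interpretation.
class TreeNode:
    def __init__(self, label):
        self.label = label
        self.children = []   # pointer-connected daughter nodes
        self.features = {}   # feature table

# Macro-like helpers for feature and node handling.
def add_feature(node, name, value=True):
    node.features[name] = value

def erase_feature(node, name):
    node.features.pop(name, None)

def has_feature(node, name):
    return name in node.features

def insert_node(parent, child, index=None):
    parent.children.insert(
        len(parent.children) if index is None else index, child)

np = TreeNode("NP")
insert_node(np, TreeNode("DET"))
insert_node(np, TreeNode("N"))
add_feature(np, "definite")
print([c.label for c in np.children], has_feature(np, "definite"))
```

The semantic interpreter then walks the tree, reading and rewriting feature tables node by node.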


Author(s):  
A. Egemen Yilmaz ◽  
I. Berk Yilmaz

Requirement analysis is the first and a crucial step in software development processes. Stating requirements clearly not only eases the subsequent steps in the process but also reduces the number of potential errors. In this chapter, techniques for improving requirements expressed in natural language are revisited. These techniques check requirement quality attributes via lexical and syntactic analysis methods, sometimes with generic and sometimes with domain- and application-specific knowledge bases.
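A minimal lexical check of this kind flags known ambiguity indicators in a requirement sentence. The word list below is a generic illustrative assumption, not the chapter's knowledge bases.

```python
import re

# Generic (illustrative) list of ambiguity indicators in requirements.
AMBIGUOUS = {"appropriate", "adequate", "etc", "and/or", "user-friendly",
             "fast"}

def flag_ambiguity(requirement):
    """Return the sorted set of ambiguous words found in the requirement."""
    words = re.findall(r"[a-z/-]+", requirement.lower())
    return sorted(w for w in set(words) if w in AMBIGUOUS)

print(flag_ambiguity("The system shall respond fast and provide "
                     "appropriate error messages."))
```

A domain-specific knowledge base would extend the list with terms that are vague in one application but precise in another.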


Author(s):  
JungHo Jeon ◽  
Xin Xu ◽  
Yuxi Zhang ◽  
Liu Yang ◽  
Hubo Cai

Construction inspection is an essential component of the quality assurance programs of state transportation agencies (STAs), and the guidelines for this process reside in lengthy textual specifications. In current practice, engineers and inspectors must manually go through these documents to plan, conduct, and document their inspections, which is time-consuming, highly subjective, inconsistent, and prone to error. A promising alternative to this manual process is the application of natural language processing (NLP) techniques (e.g., text parsing, sentence classification, and syntactic analysis) to automatically extract construction inspection requirements from textual documents and present them as straightforward check questions. This paper introduces an NLP-based method that: 1) extracts individual sentences from the construction specification; 2) preprocesses the resulting sentences; 3) applies Word2Vec and GloVe algorithms to extract vector features; 4) uses a convolutional neural network (CNN) and a recurrent neural network (RNN) to classify sentences; and 5) converts the requirement sentences into check questions via syntactic analysis. The overall methodology was assessed using the Indiana Department of Transportation (DOT) specification as a test case. Our results revealed that the CNN + GloVe combination led to the highest accuracy, at 91.9%, and the lowest loss, at 11.7%. To further validate its use across STAs nationwide, we applied it to the construction specification of the South Carolina DOT as a second test case, where our average accuracy was 92.6%.
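Step 5, turning a requirement sentence into a check question, can be illustrated with a toy syntactic rearrangement of "shall" sentences. The single pattern below is an illustrative assumption, not the paper's actual transformation rules.

```python
import re

# Toy rule: "The X shall Y." -> "Did the X Y?"
SHALL = re.compile(r"^(The .+?) shall (.+?)\.?$", re.IGNORECASE)

def to_check_question(sentence):
    """Convert a simple 'shall' requirement into a check question."""
    m = SHALL.match(sentence.strip())
    if not m:
        return None
    subject, predicate = m.groups()
    return f"Did {subject[0].lower() + subject[1:]} {predicate}?"

print(to_check_question(
    "The contractor shall compact the subgrade to 95 percent density."))
```

The paper's method derives such transformations from a full syntactic analysis rather than a single pattern, which is what lets it cover the varied sentence forms found in DOT specifications.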

