Improving Statistical Linguistic Algorithms for Parsing Mathematics

Parsing multi-ordered grammars with the Gray algorithm

10.7287/peerj.preprints.27465v2 ◽

2019 ◽

Author(s):

Nick Papoulias

Keyword(s):

Programming Languages ◽

Language Processing ◽

Language Design ◽

Formal Specifications ◽

Problem Statement ◽

Language Constructs ◽

Chart Parsing ◽

Parsing Algorithm ◽

Definition Of ◽

Context Free

Background. Context-free grammars (CFGs) and Parsing Expression Grammars (PEGs) are the two main formalisms used by formal specifications and parsing frameworks to describe programming languages. They differ mainly in the definition of the choice operator, which describes language alternatives. CFGs support non-deterministic (i.e., unordered) choice, where all alternatives are explored equally. PEGs support deterministic (i.e., ordered) choice, where alternatives are explored in strict succession. In practice, the two formalisms are used through concrete classes of parsing algorithms (such as left-to-right, rightmost-derivation (LR) parsing for CFGs and Packrat parsing for PEGs) that follow the semantics of the formal operators. Problem Statement. Neither the two formalisms nor the accompanying algorithms are sufficient for a complete description of common cases arising in language design. To properly handle ambiguity, recursion, precedence, or associativity, parsing frameworks either introduce implementation-specific directives or ask users to refactor their grammars to fit the particular combination of framework, algorithm, and formalism. This introduces significant complexity even in simple cases and results in incompatible grammar specifications. Our Proposal. We introduce Multi-Ordered Grammars (MOGs) as an alternative to the CFG and PEG formalisms. MOGs aim for a better exploration of ambiguity, ordering, recursion, and associativity during language design. This is achieved by (a) allowing both deterministic and non-deterministic choices to co-exist, and (b) introducing a form of recursive and scoped ordering. The formalism is accompanied by a new parsing algorithm (Gray) that extends chart parsing (normally used for Natural Language Processing) with the proposed MOG operators. Results. We conduct two case studies to assess the expressiveness of MOGs compared to CFGs and PEGs. The first consists of two idealized examples from the literature (an expression grammar and a simple procedural language). The second examines a real-world case (the entire Smalltalk grammar and eleven new Smalltalk extensions), probing the complexities of practical needs. We show that, in comparison, MOGs are able to reduce complexity and naturally express language constructs without resorting to implementation-specific directives. Conclusion. We conclude that combining deterministic and non-deterministic choices in a single grammar specification is not only possible but also beneficial. Moreover, augmented by operators for recursive and scoped ordering, the resulting multi-ordered formalism presents a viable alternative to both CFGs and PEGs. Concrete implementations of MOGs can be constructed by extending chart parsing with MOG operators for recursive and scoped ordering.
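
The difference between unordered (CFG-style) and ordered (PEG-style) choice described in the abstract can be illustrated with a minimal sketch. This is not the Gray algorithm and does not model the MOG operators; the combinator names (`lit`, `seq`, `unordered`, `ordered`, `accepts`) are hypothetical and chosen only for this illustration.

```python
from typing import Callable, Set

# A parser maps (text, start position) to the set of positions
# at which a match starting there can end.
Parser = Callable[[str, int], Set[int]]

def lit(s: str) -> Parser:
    """Match the literal string s."""
    def parse(text: str, pos: int) -> Set[int]:
        return {pos + len(s)} if text.startswith(s, pos) else set()
    return parse

def seq(p: Parser, q: Parser) -> Parser:
    """Match p, then q from every position where p could end."""
    def parse(text: str, pos: int) -> Set[int]:
        return {end for mid in p(text, pos) for end in q(text, mid)}
    return parse

def unordered(*alts: Parser) -> Parser:
    """CFG-style choice: explore every alternative and keep all results."""
    def parse(text: str, pos: int) -> Set[int]:
        ends: Set[int] = set()
        for alt in alts:
            ends |= alt(text, pos)
        return ends
    return parse

def ordered(*alts: Parser) -> Parser:
    """PEG-style choice: commit to the first alternative that succeeds."""
    def parse(text: str, pos: int) -> Set[int]:
        for alt in alts:
            ends = alt(text, pos)
            if ends:
                return ends  # later alternatives are never tried
        return set()
    return parse

def accepts(p: Parser, text: str) -> bool:
    """True if p can consume the whole input."""
    return len(text) in p(text, 0)

# Same rule, two choice semantics: ("a" | "ab") followed by "c".
cfg_rule = seq(unordered(lit("a"), lit("ab")), lit("c"))
peg_rule = seq(ordered(lit("a"), lit("ab")), lit("c"))

print(accepts(cfg_rule, "abc"))  # True: the "ab" alternative is explored
print(accepts(peg_rule, "abc"))  # False: "a" succeeds first, then "c" fails on "b"
```

In the ordered version, the alternative `"ab"` is unreachable once `"a"` matches, which is the kind of behavior that typically forces grammar authors to reorder or refactor rules.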


The design of Maple's sum-of-products and POLY data structures for representing mathematical objects

10.7287/peerj.preprints.504v1 ◽

2014 ◽

Author(s):

Michael M Monagan

Keyword(s):

Data Structure ◽

Data Structures ◽

Computer Algebra Systems ◽

Improve Performance ◽

Mathematical Expressions ◽

Sum Of Products ◽

Pros And Cons ◽

Work Done ◽

Purpose Computer ◽

Special Purpose Computer

The principal data structure in Maple used to represent polynomials and general mathematical expressions involving functions such as sqrt(x), sin x, exp(2x), y'(x), etc., is known to the Maple developers as the sum-of-products data structure. Gaston Gonnet, as the primary author of the Maple kernel, designed and implemented this data structure in the early 1980s. As part of the process of simplifying a mathematical formula, he represented every Maple object and every sub-object uniquely in memory. This makes testing for equality, which is used in many operations, very fast. In this article, on the occasion of Gaston's retirement, we present details of his design, its pros and cons, and the changes we have made to it over the years. One of the cons is that the sum-of-products data structure is not nearly as efficient for multiplying multivariate polynomials as the data structures of special-purpose computer algebra systems. We describe the new data structure, called POLY, which we added to Maple 17 (released 2013) to improve performance for polynomials in Maple, and recent work done for Maple 18 (released 2014).
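
The idea of storing every object and sub-object uniquely in memory, so that equality becomes a pointer comparison, can be sketched as follows. This is a generic hash-consing illustration, not Maple's actual kernel code; the names `Expr`, `make`, and `_table` are hypothetical, and the sketch does not canonicalize operand order as a real simplifier would.

```python
from typing import Dict, Tuple

class Expr:
    """An immutable expression node: an operator tag plus its operands."""
    __slots__ = ("op", "args")

    def __init__(self, op: str, args: tuple):
        self.op = op
        self.args = args

# Global table of all nodes created so far (hypothetical stand-in
# for a kernel-level table of unique objects).
_table: Dict[Tuple, Expr] = {}

def make(op: str, *args) -> Expr:
    """Return the unique node for (op, args), creating it only if new."""
    # Children that are Expr nodes are already unique, so their identity
    # is enough to key on; plain values (names, numbers) are keyed by value.
    key = (op, tuple(("ref", id(a)) if isinstance(a, Expr) else ("lit", a)
                     for a in args))
    node = _table.get(key)
    if node is None:
        node = Expr(op, args)
        _table[key] = node  # the table keeps the node alive, so id() stays valid
    return node

# Build x + 1 twice, independently.
a = make("+", make("sym", "x"), make("int", 1))
b = make("+", make("sym", "x"), make("int", 1))

# Structural equality reduces to an identity check, with no tree traversal.
print(a is b)  # True
```

The trade-off is that every construction pays for a table lookup, in exchange for constant-time equality tests and shared storage of common subexpressions.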


Sample size determinations for stepped-wedge clinical trials from a three-level data hierarchy perspective

Statistical Methods in Medical Research ◽

10.1177/0962280216632564 ◽

2016 ◽

Vol 27 (2) ◽

pp. 480-489 ◽

Cited By ~ 5

Author(s):

Moonseong Heo ◽

Namhee Kim ◽

Michael L Rinke ◽

Judith Wylie-Rosett

Keyword(s):

Statistical Model ◽

Sample Size ◽

Power Function ◽

Data Structures ◽

Statistical Models ◽

Simulation Studies ◽

Stepped Wedge ◽

Power Functions ◽

Intervention Effect ◽

Level Data

Stepped-wedge (SW) designs have been steadily implemented in a variety of trials. An SW design typically assumes a three-level hierarchical data structure in which participants are nested within times or periods, which are in turn nested within clusters. Therefore, statistical models for the analysis of SW trial data need to consider two correlations: the first-level and second-level correlations. Existing power functions and sample size determination formulas have been derived from statistical models for two-level data structures. Consequently, the second-level correlation has not been incorporated in conventional power analyses. In this paper, we derive a closed-form explicit power function based on a statistical model for three-level continuous outcome data. The power function is based on a pooled overall estimate of stratified cluster-specific estimates of an intervention effect. The sampling distribution of the pooled estimate is derived by applying a fixed-effect meta-analytic approach. Simulation studies verified that the derived power function is unbiased and is applicable to varying numbers of participants per period per cluster. In addition, when data structures are assumed to have two levels, we compare three types of power functions by conducting additional simulation studies under a two-level statistical model. In this case, the power function based on the sampling distribution of a marginal, as opposed to pooled, estimate of the intervention effect performed best. Extensions of the power functions to binary outcomes are also suggested.
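
The fixed-effect meta-analytic pooling mentioned in the abstract can be sketched generically as inverse-variance weighting followed by a normal-approximation power calculation. This is the textbook form of that approach, not the paper's closed-form three-level power function; the function names and the input values below are hypothetical, and the cluster-specific variances are taken as given rather than derived from the first-level and second-level correlations.

```python
from statistics import NormalDist
from typing import Sequence, Tuple

def pool_fixed_effect(estimates: Sequence[float],
                      variances: Sequence[float]) -> Tuple[float, float]:
    """Inverse-variance (fixed-effect) pooling of cluster-specific estimates.

    Returns the pooled estimate and its variance.
    """
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, estimates)) / total
    return pooled, 1.0 / total

def approx_power(effect: float, pooled_variance: float,
                 alpha: float = 0.05) -> float:
    """Approximate power of a two-sided z-test for the pooled estimate.

    Uses the usual one-tail approximation, ignoring the negligible
    probability of rejecting in the wrong direction.
    """
    z = NormalDist()
    critical = z.inv_cdf(1.0 - alpha / 2.0)
    return z.cdf(abs(effect) / pooled_variance ** 0.5 - critical)

# Hypothetical cluster-specific effect estimates and their variances.
pooled, var = pool_fixed_effect([1.0, 3.0], [1.0, 1.0])
print(pooled, var)  # 2.0 0.5
print(approx_power(effect=0.5, pooled_variance=var))
```

With equal variances the pooled estimate is the simple mean, and the pooled variance shrinks in proportion to the number of clusters, which is what drives power upward as clusters are added.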


The design of Maple's sum-of-products and POLY data structures for representing mathematical objects

10.7287/peerj.preprints.504 ◽

2014 ◽

Author(s):

Michael M Monagan

Keyword(s):

Data Structure ◽

Data Structures ◽

Computer Algebra Systems ◽

Improve Performance ◽

Mathematical Expressions ◽

Sum Of Products ◽

Pros And Cons ◽

Work Done ◽

Purpose Computer ◽

Special Purpose Computer


Parsing multi-ordered grammars with the Gray algorithm

10.7287/peerj.preprints.27465v1 ◽

2019 ◽

Author(s):

Nick Papoulias

Keyword(s):

Programming Languages ◽

Language Processing ◽

Language Design ◽

Formal Specifications ◽

Problem Statement ◽

Language Constructs ◽

Chart Parsing ◽

Parsing Algorithm ◽

Definition Of ◽

Context Free


Dependency Chart Parsing Algorithm Based on Ternary-Span Combination

IEICE Transactions on Information and Systems ◽

10.1587/transinf.e96.d.93 ◽

2013 ◽

Vol E96.D (1) ◽

pp. 93-101

Author(s):

Meixun JIN ◽

Yong-Hun LEE ◽

Jong-Hyeok LEE

Keyword(s):

Chart Parsing ◽

Parsing Algorithm


A chart-parsing algorithm for efficient semantic analysis

10.3115/1072228.1072251 ◽

2002 ◽

Author(s):

Pascal Vaillant

Keyword(s):

Semantic Analysis ◽

Chart Parsing ◽

Parsing Algorithm


SPARQA: Skeleton-Based Semantic Parsing for Complex Questions over Knowledge Bases

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i05.6426 ◽

2020 ◽

Vol 34 (05) ◽

pp. 8952-8959

Author(s):

Yawei Sun ◽

Lingling Zhang ◽

Gong Cheng ◽

Yuzhong Qu

Keyword(s):

Knowledge Base ◽

Level Structure ◽

Knowledge Bases ◽

Coarse Grained ◽

Semantic Parsing ◽

Fine Grained ◽

Sentence Level ◽

Natural Language Question ◽

Parsing Algorithm ◽

High Level

Semantic parsing transforms a natural language question into a formal query over a knowledge base. Many existing methods rely on syntactic parsing, such as dependency parsing. However, the accuracy of producing such expressive formalisms is unsatisfactory on long, complex questions. In this paper, we propose a novel skeleton grammar to represent the high-level structure of a complex question. This dedicated coarse-grained formalism, together with a BERT-based parsing algorithm, helps to improve the accuracy of the downstream fine-grained semantic parsing. In addition, to align the structure of a question with the structure of a knowledge base, our multi-strategy method combines sentence-level and word-level semantics. Our approach shows promising performance on several datasets.


An augmented chart parsing algorithm integrating unification grammar and Markov language model for continuous speech recognition

10.1109/icassp.1990.115780 ◽

2002 ◽

Cited By ~ 1

Author(s):

L.-F. Chien ◽

K.J. Chen ◽

L.-S. Lee

Keyword(s):

Speech Recognition ◽

Language Model ◽

Continuous Speech ◽

Continuous Speech Recognition ◽

Chart Parsing ◽

Parsing Algorithm ◽

Unification Grammar


Parsing multi-ordered grammars with the Gray algorithm

10.7287/peerj.preprints.27465 ◽

2019 ◽

Author(s):

Nick Papoulias

Keyword(s):

Programming Languages ◽

Language Processing ◽

Language Design ◽

Formal Specifications ◽

Problem Statement ◽

Language Constructs ◽

Chart Parsing ◽

Parsing Algorithm ◽

Definition Of ◽

Context Free
