Computación y Sistemas

On-line version ISSN 2007-9737; Print version ISSN 1405-5546

Comp. y Sist. vol.22 n.4 Ciudad de México Oct./Dec. 2018  Epub Feb 10, 2021

https://doi.org/10.13053/cys-22-4-3072 

Thematic section

Computational Linguistics

Context-Free Grammars Including Left Recursion using Recursive miniKanren

Hirotaka Niitsuma¹ *

¹ Okayama University, Graduate School of Natural Science and Technology, Okayama, Japan


Abstract:

Recursive miniKanren is a logic programming language that can handle infinitely recursive data structures and is a subset of the Scheme language. Using recursive miniKanren, we define a pattern match macro that accepts the same syntax as the match macro of Scheme. The macro makes it possible to search for sub-lists matching a given pattern in only a few lines of code. Using this property, we introduce techniques for writing context-free grammars with our match macro. Unlike dedicated parsing tools, our technique can combine the logical relations of miniKanren with a context-free grammar. We show that such logical relations resolve the ambiguity of a grammar.

Keywords: Context-free grammars; left recursion; recursive miniKanren

1 Introduction

miniKanren [5] is one of the major relational logic programming languages; Prolog is another well-known one. A typical Prolog implementation consists of thousands of lines of C code, whereas the main advantage of miniKanren is that it consists of fewer than 1000 lines of Scheme code. This makes miniKanren easy to modify, and many dialects of miniKanren have been developed [3, 4, 7].

Since miniKanren is not only a subset of Prolog but also a subset of the Scheme language, it has Lisp-like properties. For example, miniKanren can use the pattern match macros of Scheme [8]. However, the pattern match macro of miniKanren cannot use a pattern containing “...”, which denotes a sequence. This research introduces an extension of the match macro that allows patterns containing “...”. We call the extended match macro matchee.

Because of its simplicity, miniKanren cannot handle structures containing infinite recursion. For example, miniKanren cannot handle a Scheme datum such as #0=(#0# 3). Such recursions appear in many applications, e.g. type inference for delayed streams, Fourier analysis in signal processing, and context-free grammars with left recursion. This research shows that such infinite recursions can be handled by modifying some functions of miniKanren. We call the modified miniKanren and the original miniKanren recursive miniKanren¹ and normal miniKanren², respectively.

2 Notation

The following abbreviations are used in this paper. The symbol ≡ stands for == in our program code, and the symbol ⊳ stands for ==>. Table 1 lists the abbreviations.

Table 1. Notation

abbreviation       program code
≡                  ==
⊳                  ==>
run*               run*
−1 −2 −3 ⋯         _.0 _.1 _.2 ...
match^e            matche
match^ee           matchee

3 Recursive miniKanren

Consider the following execution result of normal miniKanren:

(run* (q) (≡ q `(,q 3)))

> ()

where > denotes the execution result. Normal miniKanren concludes that no value satisfies the recursive relation

q = (q 3)

However, this recursive relation is satisfied by the following circular list:

q = #0=(#0# 3)

Expanding this recursive relation gives

q = ( ( ( ( ( . . . ) 3 ) 3 ) 3 ) 3 ).

We introduce the symbol ⊳ to represent such infinite recursion. The expression

( ⊳ x y)

represents the expression y. The symbol ⊳ annotates that the expression y has a subexpression x with a self-recursive structure. The “left-hand side” x must be a single logical variable. We do not consider the case where x is an expression containing multiple logical variables, since this research does not address such complicated infinite structures. In our experiments, however, we did not find a relation that this notation could not represent. Combinations of this single-variable notation seem to be sufficient for many cases. Note that this notation is almost the same as the #0# notation for circular lists.

Let us show an example usage of the symbol ⊳. The expression

( ⊳ z ( 1 z 2 ) )

represents

( 1 ( 1 ( 1 ( . . . ) 2 ) 2 ) 2 )

where z is a logical variable. The symbol ⊳ annotates that the subexpression z is self-recursive.
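
The expansion of a ⊳ annotation can be imitated with a few lines of Python. This is only an illustrative sketch, not part of recursive miniKanren: nested tuples stand in for Scheme lists, a string stands in for the logical variable, and the names substitute and unfold are hypothetical. The part left unexpanded is written as the string "...".

```python
def substitute(term, var, value):
    """Replace every occurrence of `var` inside a nested tuple by `value`."""
    if term == var:
        return value
    if isinstance(term, tuple):
        return tuple(substitute(item, var, value) for item in term)
    return term


def unfold(var, body, depth):
    """Expand the annotation (⊳ var body) `depth` times.

    The part that is left unexpanded is written as the string "...".
    """
    if depth == 0:
        return "..."
    return substitute(body, var, unfold(var, body, depth - 1))


# (⊳ z (1 z 2)) unfolded three levels deep
print(unfold("z", (1, "z", 2), 3))
```

Increasing depth only adds further nesting; the finite annotation itself stays the same, which is the point of the notation.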

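This research also considers expressions with multiple annotations using the symbol ⊳. As another example, the expression

( ( ⊳ u ( ( ⊳ v ( 1 . v ) ) u) ) )

represents a nested structure in which v stands for the infinite list ( 1 1 1 . . . ) and u stands for a two-element list whose first element is that list and whose second element is u itself,

where u and v are logical variables. Using multiple ⊳ symbols makes it possible to represent nested recursive structures.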

Recursive miniKanren has a mechanism for finding the self-recursive structures that the symbol ⊳ can represent. The same query gives the following execution result in recursive miniKanren:

(run* (q) (≡ q `(,q 3)))

> ( ⊳ −0 ( −0 3 ) )

where −0 is a logical variable. Expanding this self-recursive structure gives

q = −0 = ( ( ( ( ( . . . ) 3 ) 3 ) 3 ) 3 )

Comparing this result with the expression using #0#, recursive miniKanren can be regarded as an automatic detector of circular lists. We call a self-recursive relation expressed with ⊳ a circular-like relation (CLR).

3.1 Extended Triangular Substitutions

Triangular substitution [1] is a fundamental mechanism in logic programming. Extending triangular substitution makes it possible to find CLRs.

Normal miniKanren has a preprocessing step, occurs-check, which excludes infinitely recursive relations before the main process of triangular substitution. Algorithm 1 shows the details of occurs-check. The function occurs-check(x v s) checks whether any subtree of the expression tree v is self-recursive. When a self-recursive subtree is found, normal miniKanren regards the current relation as invalid, whereas recursive miniKanren attaches the annotation ⊳ to the self-recursive part. Algorithm 2 shows this difference between normal miniKanren and recursive miniKanren. Infinite recursions are the exceptions raised by the occurs-check function, and Algorithm 2 shows that ⊳ can handle any such exception.

Algorithm 1. occurs-check(x v s) 

Algorithm 2. normal and recursive miniKanren difference 

Triangular substitution also involves other processes that traverse all subtrees of a given expression. Traversing subtrees that include the infinite annotation ⊳ is complicated and may loop forever. To avoid infinite loops, recursive miniKanren uses the on-trees macro [6] for traversing all subtrees.
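
The difference between the two behaviours can be sketched in Python. This is a minimal analogue under stated assumptions, not the code of Algorithms 1 and 2: the substitution is a dictionary, tuples stand in for lists, and the names Var, Rec, walk, extend and the recursive flag are hypothetical. The annotation is stored as an opaque Rec object so that later traversals do not descend into it, which is a simplification of the traversal problem mentioned above.

```python
from dataclasses import dataclass


class Var:
    """A logical variable, compared by identity."""
    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return self.name


@dataclass(frozen=True)
class Rec:
    """Stand-in for the annotation (⊳ var body); assumed representation."""
    var: Var
    body: object


def walk(term, subst):
    """Follow variable bindings in a triangular substitution."""
    while isinstance(term, Var) and term in subst:
        term = subst[term]
    return term


def occurs_check(x, v, subst):
    """Return True if variable x occurs anywhere inside term v."""
    v = walk(v, subst)
    if isinstance(v, Var):
        return v is x
    if isinstance(v, tuple):
        return any(occurs_check(x, item, subst) for item in v)
    return False


def extend(x, v, subst, recursive):
    """Bind x to v; a self-reference fails or is annotated, depending on mode."""
    if occurs_check(x, v, subst):
        if not recursive:
            return None          # normal behaviour: the relation is invalid
        v = Rec(x, v)            # recursive behaviour: keep it, annotated
    return {**subst, x: v}


def unify(u, v, subst, recursive=False):
    u, v = walk(u, subst), walk(v, subst)
    if isinstance(u, Var) and u is v:
        return subst
    if isinstance(u, Var):
        return extend(u, v, subst, recursive)
    if isinstance(v, Var):
        return extend(v, u, subst, recursive)
    if isinstance(u, tuple) and isinstance(v, tuple) and len(u) == len(v):
        for a, b in zip(u, v):
            subst = unify(a, b, subst, recursive)
            if subst is None:
                return None
        return subst
    return subst if u == v else None


q = Var("q")
print(unify(q, (q, 3), {}))                  # occurs-check rejects q = (q 3)
print(unify(q, (q, 3), {}, recursive=True))  # q is bound to an annotated term
```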

4 matchee Macro

The matche macro [8] is a pattern match macro that allows the same patterns as the match macro of Scheme to be written inside miniKanren. However, the matche macro cannot describe an iterative pattern containing “...”. The matchee macro³ is an extension of the matche macro that allows iterative patterns. This macro uses “__” instead of “...” to describe iterative patterns. Let us show an example usage of this macro:

In the above example, the match pattern describes an iteration of the pattern (,a (2 ,b)). In this case, the logical variable a is matched to (1,10,100) and the logical variable b is matched to (3,30,300). As this example shows, the matchee macro can describe complicated iterative patterns by using “__”.
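
The effect of such an iterated pattern can be imitated outside miniKanren. The following Python sketch is not the matchee macro; it only mimics what an iteration of the pattern (,a (2 ,b)) does on one input. The input list is an assumption inferred from the bindings reported above, and the helper name match_repeated is hypothetical.

```python
def match_repeated(items):
    """Match every element against the shape (a (2 b)) and collect a and b.

    Returns a pair of lists (a_values, b_values), or None if any element
    does not have the expected shape.
    """
    a_values, b_values = [], []
    for item in items:
        if not (isinstance(item, tuple) and len(item) == 2):
            return None
        a, rest = item
        if not (isinstance(rest, tuple) and len(rest) == 2 and rest[0] == 2):
            return None
        a_values.append(a)
        b_values.append(rest[1])
    return a_values, b_values


# Assumed input; each element has the shape (a (2 b)).
data = ((1, (2, 3)), (10, (2, 30)), (100, (2, 300)))
print(match_repeated(data))
```

Unlike this one-directional function, a relational pattern can also be run with unknowns on the input side; the sketch covers only the matching direction.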

matchee is especially useful for finding certain patterns in given list data. Let us consider the following example.

In this example, all possible sub-lists that match the given pattern are enumerated. The match pattern describes all possible ways of dividing the given input '(1 2 3) into two sub-lists. The possible divisions are

() and (1 2 3), (1) and (2 3), (1 2) and (3), (1 2 3) and (). The macro successfully finds all of them. This example shows that the match pattern (,x ___ . ,r) can be used to search all possible sub-lists. This technique is useful for searching sub-sentences that match given patterns, as in a context-free grammar.
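
The enumeration of divisions can be written directly as a generator. This Python sketch only reproduces the four divisions listed above; it is not the matchee macro, and the name splits is hypothetical.

```python
def splits(items):
    """Yield every division of `items` into a prefix and the remaining suffix."""
    for k in range(len(items) + 1):
        yield items[:k], items[k:]


for prefix, rest in splits((1, 2, 3)):
    print(prefix, "and", rest)
```

A grammar rule with two right-hand-side symbols corresponds to trying each of these divisions and parsing the two parts separately.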

The matchee macro can be used in both normal miniKanren and recursive miniKanren. It can also be used in place of the original matche macro [8]: matchee gives the same results as the original matche macro, as in the following example.

5 Context-Free Grammar with Left Recursion

Let us consider the following example sentence [2] for context-free grammar analysis.

“I shot an elephant in my pajamas.”

This sentence can be analyzed by using the following context-free grammar.

S → NP VP
PP → P NP
NP → Det N
NP → Det N PP
VP → V NP
VP → VP PP
N → 'elephant' | 'pajamas'
V → 'shot'
P → 'in'

Here, the nonterminal S stands for sentence, NP for noun phrase, VP for verb phrase, Det for determiner, PP for prepositional phrase, N for noun, V for verb, and P for preposition. Note that the rule VP → VP PP is left recursive: its right-hand side begins with the nonterminal being defined. Usually, left recursion causes an infinite loop when analyzing a sentence. To avoid the infinite loop, a grammar including left recursion requires special treatment. For example, preprocessing that eliminates the left recursion can avoid the infinite loop [9].

The example sentence can be analyzed in two ways [2], as shown in Figure 1.

Fig. 1. Structures of the example sentence 

To find both results, we need an algorithm with a backtracking mechanism based on top-down search.

Recursive miniKanren can deal with left recursion without special treatment, and it can find both results with its backtracking mechanism. Using the matchee macro, the grammar including left recursion can be written as follows.

Note that the matchee macro allows the grammar rules to be written down almost directly as match patterns.

The grammar rule can be applied to a sentence as follows.

Both desired results are successfully extracted. Here, remove-duplicates is required because this simple implementation extracts the same result more than once. However, this is better than missing possible results. Recall that the principal advantage of miniKanren is that it consists of under 1000 lines of Scheme code. Recursive miniKanren also consists of under 1000 lines of Scheme code. An implementation using more than 1000 lines of Scheme code could make remove-duplicates unnecessary.

The matchee macro can also be used in normal miniKanren, and the above code runs with normal miniKanren. However, normal miniKanren cannot find these parse results because of the left recursion.
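
For comparison, the two analyses can also be enumerated with an ordinary top-down search in Python. This sketch does not use the mechanism of recursive miniKanren: it sidesteps left recursion by requiring every grammar symbol to cover at least one word, so each recursive call works on a strictly shorter span. The rules for Det and for NP → 'I' are not listed in the grammar above; they are assumptions taken from the example in [2]. The names GRAMMAR and parse_all are hypothetical.

```python
from functools import lru_cache

# Symbols that are not keys of the dictionary are treated as words.
GRAMMAR = {
    "S": [("NP", "VP")],
    "PP": [("P", "NP")],
    "NP": [("Det", "N"), ("Det", "N", "PP"), ("I",)],   # NP -> 'I' assumed
    "VP": [("V", "NP"), ("VP", "PP")],                   # VP -> VP PP is left recursive
    "Det": [("an",), ("my",)],                           # assumed
    "N": [("elephant",), ("pajamas",)],
    "V": [("shot",)],
    "P": [("in",)],
}


def parse_all(grammar, tokens, start):
    """Return every parse tree of `tokens` rooted at `start`."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def trees(symbol, i, j):
        """All trees for `symbol` covering exactly tokens[i:j]."""
        if symbol not in grammar:                        # a word
            return (symbol,) if j == i + 1 and tokens[i] == symbol else ()
        found = []
        for rhs in grammar[symbol]:
            for children in expand(rhs, i, j):
                found.append((symbol,) + children)
        return tuple(found)

    def expand(rhs, i, j):
        """Yield child-tree tuples for `rhs`; each symbol covers >= 1 word."""
        if len(rhs) == 1:
            for tree in trees(rhs[0], i, j):
                yield (tree,)
            return
        for k in range(i + 1, j - len(rhs) + 2):
            for first in trees(rhs[0], i, k):
                for rest in expand(rhs[1:], k, j):
                    yield (first,) + rest

    return list(trees(start, 0, len(tokens)))


parses = parse_all(GRAMMAR, "I shot an elephant in my pajamas".split(), "S")
for tree in parses:
    print(tree)
```

The span restriction works only because this grammar has no empty rules and no cycles of single-symbol rules; it is a swapped-in technique for illustration, whereas the approach in the text needs no such restriction on the search.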

5.1 Adding Logical Relation

Recursive miniKanren is not a dedicated parsing tool but a programming language, so various relations can be added as statements of the language. Let us consider adding the following rule: a single NP cannot contain “elephant” and “pajamas” simultaneously. This rule can be described by adding a few lines to the match pattern in the cfg function as follows.

Using this rule successfully excludes the parse result that represents “an elephant wearing pajamas”. Here, excludee is a macro that excludes the special case given in its first argument from the current results:

containo is a function that decides whether the list tree given in the second argument contains the element given in the first argument:

As in the above example, miniKanren can describe various conditions together with a context-free grammar. The example code for this context-free grammar is in our repository⁴.
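
The effect of the added rule can be illustrated on the two parse trees written as nested tuples. This Python sketch filters finished trees after the fact, whereas the text adds the condition inside the grammar itself; it is not the excludee macro or the containo relation. The names contains, subtrees and violates are hypothetical, and the tree literals follow the two analyses of the example sentence from [2].

```python
def contains(item, tree):
    """True if `item` occurs anywhere inside the nested tuple `tree`."""
    if tree == item:
        return True
    return isinstance(tree, tuple) and any(contains(item, sub) for sub in tree)


def subtrees(tree):
    """Yield `tree` and all of its nested subtrees (words are skipped)."""
    if isinstance(tree, tuple):
        yield tree
        for child in tree[1:]:
            yield from subtrees(child)


def violates(tree):
    """True if a single NP contains both 'elephant' and 'pajamas'."""
    return any(
        t[0] == "NP" and contains("elephant", t) and contains("pajamas", t)
        for t in subtrees(tree)
    )


PP = ("PP", ("P", "in"), ("NP", ("Det", "my"), ("N", "pajamas")))
# PP attached to the noun phrase: the elephant is in the pajamas.
np_attach = ("S", ("NP", "I"),
             ("VP", ("V", "shot"), ("NP", ("Det", "an"), ("N", "elephant"), PP)))
# PP attached to the verb phrase: the shooting happened in pajamas.
vp_attach = ("S", ("NP", "I"),
             ("VP", ("VP", ("V", "shot"), ("NP", ("Det", "an"), ("N", "elephant"))), PP))

kept = [tree for tree in (np_attach, vp_attach) if not violates(tree)]
print(kept)
```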

5.2 Expanding Context-Free Grammar

Our proposed technique can be used not only to parse sentences but also to generate them. With the grammar rule cfg described with the matchee macro, simply running the following search program generates possible sentences that satisfy the grammar rules.

5.3 Indirect Left Recursion

Our proposed technique also works for indirect left recursion. Let us consider the following grammar including indirect left recursion⁵.

A → C d
B → C e
C → A | B | f

This grammar rule can be described as the following match pattern.

The parse result for '(f d e) is given by running the following code:

(remove-duplicates (run* (q) (cfg '(f d e) q)))

> ( (B (C (A (C f) d)) e) (C (B (C (A (C f) d)) e)) )

This result shows that our technique works for indirect left recursion. Note that run* is used for this parsing procedure. A run* expression causes an infinite loop when it is executed on an infinitely recursive structure. However, recursive miniKanren finds and removes the indirect infinite recursive structure automatically, so this code finishes in finite time.
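
The two trees reported above can be cross-checked with a small bottom-up chart parser in Python. This is a different technique from the one in the text: it builds trees for short spans first and repeats until nothing new appears, so left recursion never causes a loop. It assumes the grammar has no empty rules and no cycles of single-symbol rules. The names GRAMMAR, cover and parse_chart are hypothetical.

```python
GRAMMAR = {
    "A": [("C", "d")],
    "B": [("C", "e")],
    "C": [("A",), ("B",), ("f",)],
}


def cover(chart, rhs, i, j):
    """Yield child-tree tuples for `rhs` covering exactly positions i..j."""
    if not rhs:
        if i == j:
            yield ()
        return
    for k in range(i + 1, j + 1):
        # Copy the bucket so that additions made meanwhile do not disturb the loop.
        for first in list(chart.get((rhs[0], i, k), ())):
            for rest in cover(chart, rhs[1:], k, j):
                yield (first,) + rest


def parse_chart(grammar, tokens):
    """Map (symbol, start, end) to the set of trees found for that span."""
    n = len(tokens)
    chart = {(tok, i, i + 1): {tok} for i, tok in enumerate(tokens)}
    changed = True
    while changed:                      # repeat until a fixed point is reached
        changed = False
        for head, alternatives in grammar.items():
            for rhs in alternatives:
                for i in range(n):
                    for j in range(i + 1, n + 1):
                        for children in cover(chart, rhs, i, j):
                            tree = (head,) + children
                            bucket = chart.setdefault((head, i, j), set())
                            if tree not in bucket:
                                bucket.add(tree)
                                changed = True
    return chart


chart = parse_chart(GRAMMAR, ["f", "d", "e"])
for symbol in GRAMMAR:
    print(symbol, chart.get((symbol, 0, 3), set()))
```

Reading off the full-span entries for every nonterminal mirrors the fact that the query above returns analyses rooted at both B and C.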

6 Conclusion

The matchee macro can describe a context-free grammar simply by writing down the grammar rules as match patterns. Recursive miniKanren can handle left recursion in the grammar without special treatment. This technique makes it easy to combine various logical statements with a context-free grammar.

References

1.  Baader, F., & Snyder, W. (1999). Unification theory.

2.  Bird, S., Klein, E., & Loper, E. (2009). Natural Language Processing with Python. O’Reilly Media.

3.  Byrd, W. E. (2010). Relational programming in miniKanren: Techniques, applications, and implementations. Ph.D. thesis, Indiana University.

4.  Byrd, W. E., Holk, E., & Friedman, D. P. (2012). miniKanren, live and untagged: Quine generation via relational interpreters. Proceedings of the 2012 Workshop on Scheme and Functional Programming.

5.  Friedman, D. P., Byrd, W. E., & Kiselyov, O. (2005). The Reasoned Schemer. MIT Press, Cambridge, MA.

6.  Graham, P. (1993). On LISP: Advanced Techniques for Common LISP. Prentice Hall.

7.  Hemann, J., & Friedman, D. P. (2013). microKanren: A minimal functional core for relational programming. Proceedings of the 2013 Workshop on Scheme and Functional Programming.

8.  Keep, A. W., Adams, M. D., Kuper, L., Byrd, W. E., & Friedman, D. P. (2009). A pattern matcher for miniKanren or how to get into trouble with CPS macros. Scheme ’09: Proceedings of the 2009 Scheme and Functional Programming Workshop, number CPSLO-CSC-09-03 in California Polytechnic State University Technical Report, pp. 37-45.

9.  Moore, R. C. (2000). Removing left recursion from context-free grammars. Proceedings of the 1st North American Chapter of the Association for Computational Linguistics Conference, NAACL 2000, Association for Computational Linguistics, Stroudsburg, PA, USA, pp. 249-255.

Received: December 14, 2017; Accepted: February 15, 2018

* Corresponding author: Hirotaka Niitsuma, e-mail: niitsuma@de.cs.okayama-u.ac.jp

This is an open-access article distributed under the terms of the Creative Commons Attribution License.