
Computación y Sistemas

On-line ISSN 2007-9737, print ISSN 1405-5546

Comp. y Sist. vol. 22, no. 1, Ciudad de México, Jan./Mar. 2018

https://doi.org/10.13053/cys-22-1-2778 

Articles of the Thematic Issue

Automatic Theorem Proving for Natural Logic: A Case Study on Textual Entailment

Jesús Lavalle 1, 2, *

Manuel Montes 1

Héctor Jiménez 3

Luis Villaseñor 1

Beatriz Beltrán 2

1 Instituto Nacional de Astrofísica, Óptica y Electrónica, Coordinación de Ciencias Computacionales, Santa María Tonantzintla, Mexico. mmontesg@ccc.inaoep.mx, villasen@ccc.inaoep.mx

2 Benemérita Universidad Autónoma de Puebla, Facultad de Ciencias de la Computación, Puebla, Mexico. bbeltran@cs.buap.mx

3 Universidad Autónoma Metropolitana, Unidad Cuajimalpa, División de Ciencias de la Comunicación y Diseño, Departamento de Tecnologías de la Información, Ciudad de México, Mexico. hjimenez@correo.cua.uam.mx


Abstract:

Recognizing Textual Entailment (RTE) is a Natural Language Processing task. It is very important in tasks such as Semantic Search and Text Summarization. There are many approaches to RTE, for example, methods based on machine learning, linear programming, probabilistic calculus, optimization, and logic. Unfortunately, none of them can explain why an entailment holds. With Natural Logic, we can reason from the syntactic structure of a natural language expression plus very little semantic information. This paper presents an Automatic Theorem Prover for Natural Logic that makes it possible to know precisely which relationships are needed to reach the entailment in a class of natural language expressions.

Keywords: Textual entailment; automatic theorem proving; natural logic

1 Introduction

The main objective of Automatic Theorem Proving is, given an expression of some logical system, to have a computer program decide whether that expression follows from a set of axioms and inference rules.

There are many procedures to reach this goal, for example, Resolution, Semantic Tableaux, Hilbert Systems, Natural Deduction, Davis-Putnam, and Sequent Calculus.

This work is in the line of Sequent Calculus, in the sense that after applying an inference rule the size of the original expression decreases; this is called the subformula property.

Intuitively, it means that the truth of an expression depends only on its constituent elements. Of course, the type of axioms and inference rules change from one system to another.

For example, in AB grammars, the words of an expression in English can be considered as axioms; if, after applying modus ponens to them, we get an expression of type t, it means that the expression in English is a sentence.

A collateral objective in Automatic Theorem Proving is to say why an expression does not follow from the axioms. This is called the explanatory power of the Automatic Theorem Prover. We are going to use tools of this kind to develop an automatic theorem prover for Natural Logic with emphasis on textual entailment.

Recognizing Textual Entailment is essential for other Natural Language Processing tasks such as Semantic Search, Question Answering, Text Summarization, and Information Extraction. Different methods have been used to solve the RTE problem [7]; those methods are based on machine learning, linear programming, probabilistic calculus, optimization, and logic.

The logical methods used in RTE, although they use an inference mechanism, decide on the entailment through machine learning algorithms or some kind of optimization. In such a situation, it is not possible to know which relationships among the subexpressions of the text and the hypothesis are preventing the entailment.

Natural Logic was developed to reason in natural language without having to use any kind of logical form [22, 18]. Natural Logic only uses lexical, syntactic, and basic semantic information of a language. It can be viewed as the combination of a kind of Categorial Grammar, with modus ponens as the unique inference rule, and reasoning with polarity.

In this paper, we present an Automatic Theorem Prover for Natural Logic. Its main features are that it can make entailments on more than one subexpression, and that it finds precisely the subexpressions that prevent the entailment.

Section 2 briefly explains four approaches to RTE that use some kind of inference mechanism, and one that is based on Natural Logic. We deal with Natural Logic in section 3; section 4 is devoted to constructing the algorithms needed for the proof theory of an extension of AB grammars. Section 5 contains an adaptation of the algorithm of van Benthem to compute polarity in AB grammars, and develops an Automatic Theorem Prover for Natural Logic. Later, section 6 shows some examples of the Automatic Theorem Prover. Finally, section 7 gives our conclusions and directions for future work.

2 Approaches based on an Inference Mechanism

The methods discussed in this section use some kind of inference mechanism to recognize textual entailment, except for that of MacCartney and Manning, which is included because it is based on Natural Logic.

2.1 COGEX

The system of Hodges et al. [9] transforms the input text and hypothesis into logical forms. The transformation process includes part-of-speech tagging, parse tree generation, word sense disambiguation and semantic relations detection.

In order to use the logic prover COGEX, a list of clauses called the "set of support" is required; it is used to begin the search for inferences. Another list, called the usable list, contains clauses used by COGEX to produce inferences. The axioms encode knowledge of the world, linguistic rewriting rules, and synsets of WordNet.

The clauses in the set of support are weighted; a clause with a lesser weight is preferred to participate in the search. The negated hypothesis (COGEX proves by refutation) is added to the set of support with the largest weight; this guarantees that the hypothesis will be the last clause used in the search.

If a refutation is found, the prover ends; if there is no refutation, the predicate arguments are relaxed. If despite argument relaxation a refutation is not found, predicates are dropped from the negated hypothesis until a refutation is found.

When a refutation is found, a score for it is computed, beginning with a perfect score and subtracting points for axioms used, arguments relaxed, and predicates dropped.

If the score for a refutation is greater than a threshold, then the entailment is considered true; otherwise it is considered false.

2.2 OTTER

In the proposal of Akhmatova [2], the meaning of a sentence is represented by the set of atomic propositions contained in it; the sentences are then compared by means of their associated propositions.

A syntax-driven semantic analysis is used to get the atomic propositions associated with a sentence. The output of the parser is used as input for the semantic analyser; from the output of the analyser, the representation of the sentence in first order logic, which is called the logic formula, can be derived.

For Akhmatova, there are many ways to describe meaning through logical form, but they are rigid and hard to produce. Because of that, a simplified representation is proposed.

The simplified representation is built from three types of objects, Subj(x), Obj(x), and Pred(x); a meaning-attaching element, iq(x, <meaning of x>); and two variants of relationships, attr(x,y) and prep(x,y).

Later, using WordNet, a relatedness score between words is computed from the paths between the senses of the words; the longer the path, the lower the relatedness. This score, together with knowledge rules, is given to the automatic theorem prover OTTER.

If for every proposition p_h^i in the hypothesis sentence there is a proposition p_t^j in the text sentence such that p_t^j ⇒ p_h^i, then the entailment holds; otherwise the entailment does not hold.

2.3 Abduction

Raina et al. [17] begin by constructing a syntactic dependency graph using a parser; hand-written rules are used to find the heads of all nodes in the parse tree. The relations represented in the dependency graph are translated into a logical formula representation. Each node in the graph is converted into a logical term and assigned a unique constant.

Later, abductive theorem proving is realized by the resolution method, where each abductive assumption and its degree of plausibility is quantified as a nonnegative cost using the assumption cost model. The objective is to find the proof of minimum cost, which is chosen automatically by a machine learning algorithm.

2.4 Vampire and Paradox

The approach of Bos and Markert [4, 5] is based on what they call shallow semantic analysis and deep semantic analysis.

Four features are obtained from the shallow semantic analysis: the overlap between words in the text and the hypothesis, the length of the text, the length of the hypothesis, and the relative length of the hypothesis with respect to the text.

To achieve the deep semantic analysis, they use a robust wide-coverage parser, which produces proof trees of Combinatory Categorial Grammar [19]. Afterwards, the proof trees are used to build discourse representation structures, these are the semantic representations from Discourse Representation Theory. Later, the semantic representations are translated into first order logic expressions.

The model checker Paradox and the automatic theorem prover Vampire are used to prove whether or not the text implies the hypothesis. Bos and Markert take two features from the automatic theorem prover, and six from the model checker.

A decision tree is trained with the twelve features, and it is used to decide if the text implies the hypothesis.

2.5 Natural Logic

MacCartney and Manning [13, 12] use Natural Logic to avoid logical forms; their system is called NatLog. They begin with linguistic pre-processing: the text and the hypothesis are parsed with the Stanford parser, the main purpose of this step being monotonicity marking; nevertheless, they do not use polarity (see section 3) as an inference mechanism.

The second step consists of an alignment between the text and the hypothesis, alignments are represented by sequences of atomic edits over words.

Finally, taking as features the monotonicity information and the sequences of edits, a decision tree is trained.

2.6 Brief Analysis of the Methods

As can be seen in Table 1, the methods based on some inference mechanism use first-order logic (FOL) to represent the text and the hypothesis.

Table 1. Summary of the main characteristics of some logical approaches (BK = background knowledge)

First author  Inference Mechanism   Logic  BK       Challenge  Decide by
Hodges        COGEX                 FOL    WordNet  RTE-2      Optimization
Akhmatova     OTTER                 FOL    WordNet  RTE-1      Unclear
Raina         Abduction             FOL    -        RTE-1      Machine Learning
Bos           Vampire and Paradox   FOL    WordNet  RTE-1      Machine Learning
MacCartney    -                     -      WordNet  RTE-3      Machine Learning

The decision mechanism followed by the system of Akhmatova is unclear to us: a relatedness score is computed, but its role in the decision process is never mentioned.

The other methods use a decision process separate from the inference mechanism, as has been explained, which hides why the entailment was not carried out.

3 Natural Logic

Sánchez, in his Ph.D. dissertation [18], formalizes the ideas of van Benthem about Natural Logic and monotonic reasoning [20, 21]. Even though the origins of Natural Logic go back to Aristotle [22, 11], the central idea in the program of Natural Logic of van Benthem et al. is that natural language, besides communicating ideas, serves to reason without having to use formal systems such as predicate calculus or higher-order logics. The idea is to use the syntactic structure of a sentence, semantic properties of its lexical constituents, and a functor constructor.

According to Icard and Moss [10], van Benthem [20] and Sánchez [18] define proof systems to reason about entailment using monotonicity in higher-order languages.

Both van Benthem and Sánchez use, for the syntactic analysis of a sentence, a version of categorial grammars called the calculus of Ajdukiewicz [1, 3, 14]. It is based on the basic types e (for entities) and t (for truth values); more complex types of the categorial language are constructed recursively by the creation of functors. Formally:

Definition 3.1. The calculus of Ajdukiewicz. The categorial language ℒ of the calculus of Ajdukiewicz is given by:

  1. e and t belong to ℒ;

  2. If α and β belong to ℒ, then (α, β) also belongs to ℒ.

The unique inference rule in the calculus of Ajdukiewicz takes the form:

  (α, β)    α
  -----------        (1)
       β

and it does not matter whether the type α appears on the left or on the right of the functor (α, β).

It is assumed that each word in the lexicon has a type, for example: common nouns have type (e,t), transitive verbs have type (e,(e,t)), intransitive verbs have type (e,t), adjectives and adverbs have type ((e,t),(e,t)), noun phrases have type ((e,t),t), and determiners have type ((e,t),((e,t),t)).

Hence, to know whether a sentence is well formed, inference rule (1) is used to build a proof tree; if its root is t, then the sentence is well formed, otherwise the sentence is ill-formed.
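To make rule (1) concrete, the following is a minimal sketch in Standard ML (the language of the prototype described in section 6) of the categorial language of Definition 3.1 and its unique inference rule. It is an illustration only; the datatype and function names are ours, not those of the prototype's actual code.

  datatype ty = E | T | F of ty * ty   (* e, t, and the functor (alpha, beta) *)

  (* reduce (a, b): if one of a, b is a functor (alpha, beta) and the other
     is its argument alpha, return SOME beta; otherwise NONE.  The argument
     may sit on either side of the functor, as rule (1) demands. *)
  fun reduce (F (a, b), x) =
        if a = x then SOME b
        else (case x of
                  F (a', b') => if a' = F (a, b) then SOME b' else NONE
                | _ => NONE)
    | reduce (x, F (a, b)) = if a = x then SOME b else NONE
    | reduce _ = NONE

  (* With ran : F (E, T) and "An elk" : F (F (E, T), T),
     reduce (F (F (E, T), T), F (E, T)) evaluates to SOME T,
     so "An elk ran" is a well-formed sentence. *)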

Nevertheless, as there are words that play different roles (for example, white could either be an adjective, or a noun, or a verb), if a sentence contains words of this kind, the algorithm that constructs the proof tree for such a sentence would have to try with the different types of each word until the type t has been derived.

We have a proof tree in Figure 1 for the sentence Dobie didn’t bring every ball. In this figure, it is worth noting that: the type of every has its first argument on the right; the type of every ball has its argument on the left; the type of Dobie has its argument on the right, and the type t has been derived from both of them, indicating that the expression in natural language is a well formed sentence.

Fig. 1 Proof tree for the sentence Dobie didn’t bring every ball 

The first semantic element of Natural Logic is that each type denotes a set: D_e denotes the set of entities, D_t denotes the set {0, 1}, and D_(α,β) denotes the set whose elements are functions from D_α to D_β. Also, the following partial order relations are defined on each type.

Definition 3.2. Partial order relations. Partial order relations on the denotations of types can be defined in the following way:

  1. If d, d′ ∈ D_e, then d ≤_e d′ if and only if d = d′;

  2. If d, d′ ∈ D_t, then d ≤_t d′ if and only if d = 0 or d′ = 1;

  3. If d, d′ ∈ D_(α,β), then d ≤_(α,β) d′ if and only if for all x, d(x) ≤_β d′(x).
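For a finite fragment, these clauses can be sketched directly; the following assumes a domain given as an explicit list, and the names leqT and leqFun are ours:

  (* d <=_t d' iff d = 0 or d' = 1; with booleans this is (not d) orelse d'. *)
  fun leqT (d, d') = not d orelse d'

  (* Pointwise lifting (clause 3): f <=_(alpha,beta) g iff f x <=_beta g x
     for every x in the finite domain dom, where leqB decides <=_beta. *)
  fun leqFun leqB dom (f, g) = List.all (fn x => leqB (f x, g x)) dom

  (* Example: on D_(t,t), negation is not below the identity, since the
     comparison fails at x = false. *)
  val _ = leqFun leqT [false, true] (not, fn x => x)   (* = false *)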

The second semantic element of Natural Logic is that, working with a proof system based on functors and taking into account a partial order relation on each type, it is possible to define static characteristics of functors: a functor can be upward monotone, downward monotone, or non-monotone, according to the following terms [8]:

Definition 3.3. Monotonicity. A function d ∈ D_(α,β) is:

  1. upward monotone (+d) if and only if for all x, y: x ≤_α y implies that d(x) ≤_β d(y);

  2. downward monotone (−d) if and only if for all x, y: x ≤_α y implies that d(y) ≤_β d(x);

  3. non-monotone (·d) if and only if it is neither upward monotone nor downward monotone.

As it has been stated, inference rule (1) must be applied to construct a proof tree; therefore a functor node and an argument node are required. The resulting node will serve as either the functor node or the argument node to construct the following level of the proof tree. In this way, the construction of a proof tree is done by composing functors: if an upward (downward) monotone functor α is the argument of a functor β, then the static characteristic of α can change, depending on the static characteristic of β. Table 2 [10] shows the result of composing upward monotone (+), downward monotone (−), and non-monotone (·) functors.

Table 2. Result of the composition of upward monotone (+), downward monotone (−), and non-monotone (·) functors

  ∘ | + | − | ·
  --+---+---+---
  + | + | − | ·
  − | − | + | ·
  · | · | · | ·

From Table 2 we can infer that the composition of m upward and downward monotone functors is upward monotone if the number of downward monotone functors is even, and downward monotone otherwise; if one of the functors in the composition is non-monotone, then the whole composition is non-monotone. The composition of functors is called polarity; it is said that polarity is positive (+) when the composition is upward monotone, negative (−) when the composition is downward monotone, and neutral (·) when the composition is non-monotone. Hence, polarity is a dynamic characteristic of some functors, given by the position of the functors in the composition.
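Table 2 can be read as a binary operation on monotonicity marks; a small sketch, with a datatype of our choosing that is reused in the sketches below:

  datatype mono = Up | Down | Non   (* +, -, and . *)

  (* Composition of monotonicity marks, exactly Table 2: non-monotone
     absorbs everything, Down flips, Up preserves. *)
  fun compose (Non, _)     = Non
    | compose (_, Non)     = Non
    | compose (Up, m)      = m
    | compose (Down, Up)   = Down
    | compose (Down, Down) = Up

  (* For instance, an even number of downward marks composes to Up:
     compose (Down, Down) = Up. *)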

In terms of the proof tree of a sentence, a functor node that is upward monotone will have positive polarity if it is the argument of a composition where an even number of downward monotone functors are involved, otherwise it will have negative polarity.

On the same terms, a functor node that is downward monotone will have positive polarity if it is the argument of a composition where an odd number of downward monotone functors are involved, else it will have negative polarity.

Once the polarity of a node i in the proof tree of the sentence S is known, the subtree whose root is node i can be replaced by a greater one (in the sense of Definition 3.2) if node i has positive polarity, giving as a result a sentence S′; in a dual way, the subtree whose root is node i can be replaced by a lesser one (in the sense of Definition 3.2) if node i has negative polarity, giving as a result a sentence S′. In both cases, it is said that S implies S′. Sánchez [18] proved the soundness of this implication with respect to the semantics associated with the proof system. This is the way of reasoning in Natural Logic.

In order to compute the polarity of the elements of a sentence, the following clause is added to Definition 3.1:

If (α, β) is in ℒ, then (α+, β) and (α−, β) are also in ℒ.

This clause does not belong to the calculus of Ajdukiewicz; it has been included to mark either the upward monotonicity of a functor from α to β, (α+, β), or the downward monotonicity of a functor from α to β, (α−, β).

Thus, it is assumed that the lexicon contains the monotonicity information that some functors require.

As an example, we have the proof tree in Figure 2a with lexical monotonicity marking for the sentence Dobie didn’t bring every ball.

Fig. 2 The two steps to compute polarity with the algorithm of van Benthem, for the sentence Dobie didn’t bring every ball 

Once the proof tree has lexical monotonicity marks, the algorithm of van Benthem begins by marking the root of the proof tree with polarity +; then, if the functor in turn is upward monotone, the polarity mark is propagated. If the functor is downward monotone, then the polarity mark of the argument is reversed, because a downward monotone functor reverses the order relation on the elements of its domain. Figure 2b exemplifies Algorithm 3.1 for the sentence Dobie didn't bring every ball.

Algorithm 3.1. van Benthem's polarity algorithm.

  1. Label the root with +.

  2. Propagate notations up the tree:

    (a) If a node of type β is labeled l and its children are of type (α+, β) and α, then both children are labeled l.

    (b) If a node of type β is labeled l and its children are of type (α−, β) and α, then the former child is labeled l and the latter child is labeled l̄, the flipped version of l.

To know when a sentence N′ is entailed from a sentence N, first we have to define what a subexpression of a natural language expression is.

Definition 3.4. Subexpression. Let N = w1 w2 … wn, n ≥ 1, be a natural language expression, where each word wi : αi. It is said that M is a subexpression of N if and only if one of the following clauses holds:

  1. M = wi, 1 ≤ i ≤ n;

  2. M = wi M′, 1 ≤ i ≤ n−1, where M′ is a subexpression of N, and (wi ∈ D_(α,β) and M′ ∈ D_α) or (wi ∈ D_α and M′ ∈ D_(α,β)) holds;

  3. M = M′ wi, 2 ≤ i ≤ n, where M′ is a subexpression of N, and (wi ∈ D_(α,β) and M′ ∈ D_α) or (wi ∈ D_α and M′ ∈ D_(α,β)) holds;

  4. M = M′ M″, where M′ and M″ are subexpressions of N, and (M′ ∈ D_(α,β) and M″ ∈ D_α) or (M′ ∈ D_α and M″ ∈ D_(α,β)) holds.

Example 3.1. Let N = Dobie brought every ball. According to clause 1 of Definition 3.4, Dobie, brought, every, and ball are subexpressions of N. Looking at Figure 3, every ball is also a subexpression by clause 2 of Definition 3.4, because every : ((((t/e)/e) \+ (t/e)) −/ (t/e)) and ball : (t/e); by the same clause, brought every ball and Dobie brought every ball are also subexpressions.

Fig. 3 Proof tree for the sentence Dobie brought every ball 

When we say that N (M) is a natural language expression, we also mean that it has M as subexpression.

Definition 3.5. Entailment on the same subexpression. Let N(M) and N′(M′) be two natural language expressions with M, M′ : α and M ≠ M′. We define that N(M) entails N′(M′) on the same subexpression (symbolically N(M) ⊢ N′(M′)) in the following way:

N(M) ⊢ N′(M′) if and only if

  M has positive polarity and M ≤_α M′, or
  M has negative polarity and M′ ≤_α M.

4 Automatic Theorem Proving for an Extension of AB Grammars

We are going to extend a version of categorial grammars called AB grammars [14]. In this extension, types are constructed by L ::= P | (L s/ L) | (L \s L), where P is the set of primitive types (in our case e and t), and functors are constructed using the operators s/ and \s, with s ∈ {+, −, ·}. These operators distinguish whether the argument of a functor is on the right or on the left, respectively, and also mark whether the functor is upward monotone (+), downward monotone (−), or non-monotone (·).

As it has been stated, to prove that a natural language expression is well formed (that it is a sentence), a proof tree with root t has to be constructed using the inference rules for syntactic categories:

  X s/Y    Y              Y    Y \s X
  -----------     and     -----------  ;
       X                       X

an example is the proof tree in Figure 4.

Fig. 4 Proof tree for the sentence Dobie didn’t bring every ball, in an extension of AB grammars 

A natural language expression nlexp is represented by the list [w1:X1, …, wn:Xn], where wi is an English word in nlexp and Xi is its type, 1 ≤ i ≤ n. As an example, the natural language expression Dobie didn't bring every ball is represented by the list [Dobie:(t +/ (t/e)), didn't:((t/e) −/ (t/e)), bring:((t/e) +/ e), every:((((t/e)/e) \+ (t/e)) −/ (t/e)), ball:(t/e)].
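As an illustration, this representation can be sketched as follows, reusing the mono datatype above; the constructor names are ours, and rendering the functors whose marks are not at issue as upward monotone is an assumption made only for this sketch:

  datatype cat = E | T
               | RS of cat * mono * cat   (* X s/ Y : argument Y on the right *)
               | LS of cat * mono * cat   (* Y \s X : argument Y on the left  *)

  val te = RS (T, Up, E)   (* the type (t/e) *)

  (* Dobie didn't bring every ball, with the types listed above. *)
  val nlexp =
      [ ("Dobie",  RS (T, Up, te)),                      (* (t +/ (t/e))     *)
        ("didn't", RS (te, Down, te)),                   (* ((t/e) -/ (t/e)) *)
        ("bring",  RS (te, Up, E)),                      (* ((t/e) +/ e)     *)
        ("every",  RS (LS (RS (te, Up, E), Up, te), Down, te)),
                                       (* ((((t/e)/e) \+ (t/e)) -/ (t/e)) *)
        ("ball",   te) ]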

As is known, there are natural language expressions that admit more than one syntactic analysis. Moreover, sometimes it is possible to use the inference rules on more than one pair of adjacent words.

Hence, we need to look for those pairs of words that may form the initial subtrees of possible proof trees. Also, for each initial subtree we need to know the list of pairs w:X on the left and on the right of the subtree; these three elements will be called an environment.

For example, for the proof tree in Figure 4 the unique initial subtree is the one that combines every : ((((t/e)/e) \+ (t/e)) −/ (t/e)) with ball : (t/e) into ((t/e)/e) \+ (t/e), and its lists of pairs are [Dobie:(t +/ (t/e)), didn't:((t/e) −/ (t/e)), bring:((t/e) +/ e)] on the left and [] on the right. The algorithm BFSTL returns a list of environments of the form [([w1:X1, …, wi−1:Xi−1], the subtree combining wi:Xi and wi+1:Xi+1, [wi+2:Xi+2, …, wn:Xn]), …], with one environment for each adjacent pair of words that an inference rule can combine.

The algorithm BFSTL has three input parameters: left, the pairs wj:Xj, 1 ≤ j < i, already processed; current, the pairs wk:Xk, i ≤ k ≤ n, which have not been processed; and envs, the list of environments found so far.

In general (line 10), if some inference rule can be applied to Xi and Xi+1 (line 11), then a recursive call is performed appending (⊎) left and the list [wi:Xi], stating that [wi+1:Xi+1, …, wn:Xn] is the new current, and appending envs and the list containing the environment found, [(left, the subtree that combines wi:Xi and wi+1:Xi+1 into X, [wi+2:Xi+2, …, wn:Xn])] (lines 12-14).

If no rule can be applied (lines 15 and 16), a recursive call is made indicating that a word has been processed, and leaving envs without change.

If no environment was found (lines 2 and 6) then the NoFirstSubTree exception is raised (lines 3 and 7). If there are no more elements to process (line 4) or there is only one element (line 8), then the work has been done and the list of environments envs is returned (lines 5 and 9).
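The following condensed sketch conveys the idea of BFSTL over the representation above (the full listing with the line numbers cited here is not reproduced; the tree datatype and the helper app are ours):

  (* The two inference rules of the extension, as a partial operation on
     adjacent categories: X s/Y combines with a Y on its right, and a Y
     combines with a Y \s X on its right. *)
  fun app (a, b) =
      (case a of
           RS (x, _, y) => if y = b then SOME x else appL (a, b)
         | _ => appL (a, b))
  and appL (a, b) =
      (case b of
           LS (y, _, x) => if y = a then SOME x else NONE
         | _ => NONE)

  datatype tree = Leaf of string * cat
                | Node of cat * tree * tree

  exception NoFirstSubTree

  (* bfstl left current envs: scan the adjacent pairs of current, recording
     an environment (left context, initial subtree, right context) whenever
     one of the rules applies; raise NoFirstSubTree when none was found. *)
  fun bfstl _ [] [] = raise NoFirstSubTree
    | bfstl _ [_] [] = raise NoFirstSubTree
    | bfstl _ [] envs = envs
    | bfstl _ [_] envs = envs
    | bfstl left ((w1 as (_, c1)) :: (rest as ((w2 as (_, c2)) :: rest'))) envs =
        (case app (c1, c2) of
             SOME x => bfstl (left @ [w1]) rest
                             (envs @ [(left, Node (x, Leaf w1, Leaf w2), rest')])
           | NONE => bfstl (left @ [w1]) rest envs)

  (* bfstl [] nlexp [] finds the unique initial subtree of Figure 4,
     the one combining every with ball. *)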

The algorithm BAWPT builds a proof tree from a list of environments. If the list of environments is not empty (lines 4 and 5), then the algorithm BAPT is called with the elements of the first environment in the list, and the type of the root of the first subtree (line 6).

If the algorithm BAPT could not build a proof tree, then a recursive call is performed with the rest of the list of environments (lines 8 and 9). If the algorithm BAPT built a proof tree, this is returned (line 10). If the environment list is empty, then it was not possible to build a proof tree and the exception NoProofTree is raised (lines 2 and 3).

The algorithm EXTRACTYPE merely returns the root of a unary tree (lines 2 and 3), or the root of a binary tree (lines 4 and 5). This is used in line 6 of the algorithm BAWPT.

The purpose of the algorithm BAPT is to build a proof tree. It takes four arguments: the list left of pairs w:X on the left of the proof subtree proofSubTree, the proof subtree proofSubTree already built, the list right of pairs w:X on the right of the proof subtree proofSubTree, and the type X of the root of the proof subtree proofSubTree.

If left and right are empty, then a proof tree has been constructed, and there is nothing more to process (lines 2 and 3).

If left is not empty but right is, then to construct a new proof subtree it is necessary that the root X of proofSubTree can combine with the type Xi of the last element of left; this is possible when X = Xi \s X′ is the functor and Xi is the argument, or when X is the argument and Xi = X′ s/ X is the functor. If that is the case, then a recursive call is performed pointing out that: wi:Xi has been processed, the new proof subtree combining wi:Xi with proofSubTree has been constructed, right is still empty, and the root of the new proof subtree is X′ (lines 4-7).

If it was not possible to build a new proof subtree, then the empty tree is returned (lines 8, 13 and 23), so that the algorithm BAWPT tries to build a proof tree with the remaining environments.

If left is empty but right is not, then to construct a new proof subtree it is necessary that the root X of proofSubTree can combine with the type Xj of the first element of right; this is possible when X = X′ s/ Xj is the functor and Xj is the argument, or when X is the argument and Xj = X \s X′ is the functor. If that is the case, then a recursive call is performed pointing out that: left is still empty, the new proof subtree combining proofSubTree with wj:Xj has been constructed, wj:Xj has been processed, and the root of the new proof subtree is X′ (lines 9-12).

If neither left nor right is empty, then the root X of proofSubTree can combine with the type Xi of the last element of left (lines 17-20), or with the type Xj of the first element of right (lines 21-24), just as in the respective previous cases.

Finally, the algorithm BUILDPROOFTREE passes the appropriate initial values to BFSTL in order to get a list of environments, this list is passed to BAWPT, which constructs a proof tree.
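A condensed sketch of BAPT in the same style; here NONE plays the role of the empty tree, and extracType mirrors EXTRACTYPE:

  (* extracType: the category at the root of a (unary or binary) tree. *)
  fun extracType (Leaf (_, c)) = c
    | extracType (Node (c, _, _)) = c

  (* bapt (left, t, right, x): grow the proof subtree t, whose root
     category is x, by combining it with the nearest word on the left or
     on the right; NONE tells BAWPT to try the next environment. *)
  fun bapt ([], t, [], _) = SOME t
    | bapt (left, t, right, x) =
        let
          fun tryLeft [] = NONE
            | tryLeft l =
                let val w as (_, ci) = List.last l
                in case app (ci, x) of
                       SOME x' => bapt (List.take (l, length l - 1),
                                        Node (x', Leaf w, t), right, x')
                     | NONE => NONE
                end
          fun tryRight [] = NONE
            | tryRight ((w as (_, cj)) :: rest) =
                (case app (x, cj) of
                     SOME x' => bapt (left, Node (x', t, Leaf w), rest, x')
                   | NONE => NONE)
        in
          case tryLeft left of
              SOME t' => SOME t'
            | NONE => tryRight right
        end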

5 Automatic Theorem Proving for Natural Logic

To compute polarity in our extension to AB grammars, the algorithm of van Benthem is adapted as follows.

Algorithm 5.1. van Benthem's polarity algorithm adapted to an extension of AB grammars.

  1. Label the root with +.

  2. Propagate notations up the tree:

    (a) If a node of type X is labeled l and its children are of type (X s/ Y) and Y, then the former child is labeled l and the latter child is labeled l ∘ s, the composition of l and s according to Table 2.

    (b) If a node of type X is labeled l and its children are of type Y and (Y \s X), then the former child is labeled l ∘ s and the latter child is labeled l.

As an example, we have the proof tree of Figure 5a.

Fig. 5 Proof trees with polarity marks using the adapted algorithm of van Benthem in an extension of AB grammars 

The algorithm POLALG encodes the algorithm of van Benthem more precisely. It returns a proof tree with polarity marks. Its arguments are: the polarity label l for the root of tree, and the proof tree tree.

If the current tree is a unary tree, then it returns a unary tree marking the root with l (lines 2 and 3).

If the current tree is a binary tree and the functor is the left subtree, then it recursively propagates the polarity l on the left subtree lptp, and recursively propagates the polarity l ∘ s on the right subtree rptp. Finally, it returns a binary tree marking the root with l, having lptp and rptp as left and right subtrees, respectively (lines 4-7).

If the current tree is a binary tree and the functor is the right subtree, then it recursively propagates the polarity l ∘ s on the left subtree lptp, and recursively propagates the polarity l on the right subtree rptp. Finally, it returns a binary tree marking the root with l, having lptp and rptp as left and right subtrees, respectively (lines 8-11).

The algorithm POLARITY returns a proof tree with polarity marks, it receives as argument the representation of a natural language expression nlexp, as it was discussed in section 4. POLARITY calls POLALG with the mark of positive polarity and the proof tree for nlexp.
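A sketch of POLALG over the tree datatype above, reusing compose (Table 2) and extracType; the polarity-marked tree datatype is ours:

  datatype ptree = PLeaf of string * cat * mono
                 | PNode of cat * mono * ptree * ptree

  (* polAlg l t: propagate the polarity label l from the root of t towards
     the leaves; the functor child keeps l, and the argument child gets
     compose (l, s) for the functor's mark s (Algorithm 5.1). *)
  fun polAlg l (Leaf (w, c)) = PLeaf (w, c, l)
    | polAlg l (Node (c, lt, rt)) =
        (case extracType lt of
             RS (x, s, y) =>
               if x = c andalso y = extracType rt   (* rule (a): functor on the left *)
               then PNode (c, l, polAlg l lt, polAlg (compose (l, s)) rt)
               else funOnRight (l, c, lt, rt)
           | _ => funOnRight (l, c, lt, rt))
  and funOnRight (l, c, lt, rt) =
        (case extracType rt of
             LS (y, s, x) =>
               if x = c andalso y = extracType lt   (* rule (b): functor on the right *)
               then PNode (c, l, polAlg (compose (l, s)) lt, polAlg l rt)
               else raise Fail "not a proof tree"
           | _ => raise Fail "not a proof tree")

  (* POLARITY: step 1 of the algorithm labels the root with +. *)
  fun polarity t = polAlg Up t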

Example 5.1. A proof tree with polarity marks for the natural language expression An elk ran is shown in Figure 5b.

If N = An elk ran, M = elk, and M′ = mammal, then we can say that An elk ran ⊢ A mammal ran, because elk has positive polarity and elk ≤_(t/e) mammal, since elk is a hyponym of mammal.

Example 5.2. If N = An elk ran, M = elk, and M′ = animal, we can say that An elk ran ⊢ An animal ran, because elk has positive polarity and elk ≤_(t/e) animal, since elk is a hyponym of animal.

Finally, we want to define an automatic theorem prover; to achieve this, we need to chain entailments on more than one subexpression, which is done in the following way:

Definition 5.1. Entailment. Let N and N′ be two natural language expressions with N ≠ N′. We define that N entails N′ (symbolically N ⊨ N′) as follows:

N ⊨ N′ if and only if

  N = N(M) and N(M) ⊢ N′(M′), or
  N = N(M), N(M) ⊢ N″(M′) and N″ ⊨ N′.

Now, we define the algorithms ENTAILS, ENTAILSALL, and ENTAILSONE.

The algorithm ENTAILS has as input the natural language expressions N and N′, and it returns the result of the algorithm ENTAILSALL. The purpose of ENTAILS is to guarantee that ENTAILSALL receives, in the set DifSub, all the pairs (M, M′), where M and M′ are respectively subexpressions of N and N′ that make N and N′ differ. ENTAILSALL also receives true the first time it is called.

ENTAILSALL tries to find counterexamples to the entailment of two natural language expressions. The algorithm ENTAILSALL codes Definition 5.1; it takes two parameters as input: DifSub, containing the pairs of subexpressions (M, M′) that make N and N′ different, and the variable flag, which records (line 15) whether a counterexample falsifying the entailment has been found.

As is implicit in Definition 3.5, a natural language expression N(M) entails N′(M′) if they vary in subexpressions M and M′ of the same type, and M ≤ M′ (or M ≥ M′) according to their polarity.

Hence, the main purpose of ENTAILSALL is to process a pair (M, M′) from DifSub with the same type (lines 3 and 4); if ENTAILSONE fails (line 5) and M has the same polarity as M′ (line 6), it means that a counterexample has been found, and it writes the cause of the failure (lines 9, 11, 13, 14); then it calls itself removing the pair (M, M′) from DifSub and assigning false to flag to record that a counterexample was found (line 15).

If ENTAILSONE does not fail, then a recursive call is performed (line 16) removing the pair (M, M′) from DifSub and assigning true to flag to record that a counterexample was not found.

If subexpressions M and M′ have different types (line 17), then it is indicated that the subexpressions do not have the same syntactic structure, because Natural Logic cannot reason with this kind of expression; in this case the algorithm finishes, returning false (line 18).

When each pair of DifSub has been processed, the result of ENTAILSALL has to do with whether or not counterexamples have been found (lines 1 and 2).

The algorithm ENTAILSONE codes Definition 3.5 almost directly; it takes subexpressions M and M′ and analyses whether they meet the order relation according to their polarity.
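The core check of ENTAILSONE can be sketched by abstracting the order relation ≤_α into an oracle leq, which in practice is answered by the user or by a lexical resource, as discussed below; all names here are ours:

  (* entailsOne leq ((m, pol), m'): may the subexpression m, with polarity
     pol, be replaced by m' (Definition 3.5)?  leq (a, b) decides
     a <=_alpha b, e.g. by asking whether a is a hyponym of b. *)
  fun entailsOne leq ((m, pol), m') =
      case pol of
          Up   => leq (m, m')   (* positive polarity: replace by a greater term *)
        | Down => leq (m', m)   (* negative polarity: replace by a lesser term  *)
        | Non  => false         (* neutral polarity licenses no replacement     *)

  (* Example 5.1 revisited: with leq answering hyponymy so that
     leq ("elk", "mammal") = true, entailsOne leq (("elk", Up), "mammal")
     evaluates to true. *)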

To implement an automatic theorem prover, lexicons having pairs word:type are needed, but, as far as we know, there are no such lexicons. Another possibility is to have a Part of Speech (POS) tagger that associates each word with its proper type.

The C&C tools [6] have a POS tagger, but it uses the inference rules of Combinatory Categorial Grammar, and there is no algorithm to compute polarity for this kind of grammar; actually, the known algorithms to compute polarity only work with categorial grammars that have the two rules given in section 4 as their unique inference rules.

There are no partially ordered domains as assumed in section 3; therefore it is not possible to check whether M ≤_α M′ directly, but it is possible to take advantage of tools such as WordNet [16], BabelNet [15], etc., to find synonyms, hyponyms, hypernyms, meronyms, and troponyms.

6 Examples

At this time, we have a prototype that constructs possible counterexamples for a text-hypothesis pair of natural language expressions. It is implemented in Moscow ML version 2.10. Because of what has been discussed previously, the prototype asks the user about the veracity of the relationships constructed in the entailment process.

Example 6.1. An elk ran ⊨ An animal moved

This exemplifies that a very specific statement can be generalized to the extreme that it loses information. Nevertheless, the entailment is true.

Example 6.2. Dobie brought every ball ⊨ Dobie brought every black ball

In this case, "every" is downward monotone on its first argument; therefore it sets the polarity of ball to negative. Hence ball can be replaced with black ball, which is a lesser expression. The entailment is true.

Example 6.3. Dobie didn't bring every ball ⊭ Dobie didn't bring every black ball

In this case the verb is negated, so it changes the polarity of the following constituents. Hence, the prototype asks whether ball is a kind of black ball, i.e., whether ball is lesser than black ball. Hence the entailment is false.

Example 6.4. Don't dig your grave with your own knife ⊨ Don't trench your grave with your own penknife. For this example, refer to Figure 6.

Fig. 6 Proof tree with polarity marks for the sentence Don’t dig your grave with your own knife, in this tree p=(t/e)  

If WordNet is consulted, we find that trench is a direct troponym of dig, and that penknife is a hyponym of knife. The entailment is true.

Example 6.5. Don't dig your grave with your own knife ⊭ Don't trench your hole with your own penknife

penknife is a hyponym of knife and trench is a direct troponym of dig, but hole is a hypernym of grave, not a hyponym. The entailment is false.

7 Conclusions and Future Work

We have developed an Automatic Theorem Prover for Natural Logic to Recognize Textual Entailment; this includes algorithms to: construct proof trees as the syntactic part of Natural Logic; compute polarity as the base of reasoning in Natural Logic; and look for subexpressions that falsify the entailment process.

The main advantage of the Automatic Theorem Prover is that it provides the list of counterexamples (pairs of subexpressions of the same type) that do not allow the entailment between two natural language expressions. As a consequence, the scope of Natural Logic in Recognizing Textual Entailment is restricted to pairs of expressions having the same syntactic structure.

As future work, in order to widen the scope of Natural Logic to Recognize Textual Entailment, it is desirable to be able to compare subexpressions of similar types; for example, the type of nouns is similar to the type of noun phrases.

Other points on the agenda for future work are: to construct lexicons where the words are associated with their types, to define an algorithm to compute polarity for Combinatory Categorial Grammars, and to build interfaces to take advantage of resources such as WordNet, and BabelNet.


Acknowledgements

This paper was partly supported by PRODEP-SEP under grant PROMEP/103.5/13/5618, by Benemérita Universidad Autónoma de Puebla under grant BUAP-803, and by CONACYT under the Thematic Networks program (Language Technologies Thematic Network projects 260178 and 271622).

We wish to thank anonymous reviewers for their comments, which helped improve the present manuscript.

References

1. Ajdukiewicz, K. (1978). Syntactic connexion (1936). In Giedymin, J., editor, The Scientific World-Perspective and Other Essays, 1931-1963. Springer Netherlands, Dordrecht, pp. 118-139.

2. Akhmatova, E. (2005). Textual entailment resolution via atomic propositions. Proceedings of the First PASCAL Challenges Workshop on Recognizing Textual Entailment.

3. Bach, E. (1988). Categorial grammars as theories of language. In Oehrle, R. T., Bach, E., & Wheeler, D., editors, Categorial Grammars and Natural Language Structures. Springer Netherlands, Dordrecht, pp. 17-34.

4. Bos, J., & Markert, K. (2006). Recognising textual entailment with robust logical inference. MLCW 2005, volume LNAI 3944, pp. 404-426.

5. Bos, J., & Markert, K. (2006). When logical inference helps determining textual entailment (and when it doesn't). Proceedings of the Second Challenge Workshop on Recognizing Textual Entailment, PASCAL.

6. Clark, S., & Curran, J. R. (2004). Parsing the WSJ using CCG and log-linear models. Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics, ACL, Association for Computational Linguistics, Stroudsburg, PA, USA.

7. Dagan, I., Roth, D., Sammons, M., & Zanzotto, F. M. (2013). Recognizing Textual Entailment: Models and Applications. Synthesis Lectures on Human Language Technologies. Morgan & Claypool Publishers.

8. Dowty, D. (1994). The role of negative polarity and concord marking in natural language reasoning. Proceedings of the 4th Conference on Semantics and Theoretical Linguistics, Cornell University, CLC Publications, Rochester, NY.

9. Hodges, D., Clark, C., Fowler, A., & Moldovan, D. (2006). Applying COGEX to recognize textual entailment. In Quiñonero-Candela, J., Dagan, I., Magnini, B., & d'Alché-Buc, F., editors, Machine Learning Challenges. Evaluating Predictive Uncertainty, Visual Object Classification, and Recognising Tectual Entailment, volume 3944 of Lecture Notes in Computer Science. Springer Berlin Heidelberg, pp. 427-448.

10. Icard, T. F., III, & Moss, L. (2014). Recent progress on monotonicity. Linguistic Issues in Language Technology, Vol. 9, Perspectives on Semantic Representations for Textual Inference, pp. 167-194.

11. Karttunen, L. (2015). From natural logic to natural reasoning. In Gelbukh, A., editor, Computational Linguistics and Intelligent Text Processing, volume 9041 of Lecture Notes in Computer Science. Springer International Publishing, pp. 295-309.

12. MacCartney, B. (2009). Natural Language Inference. Ph.D. thesis, Stanford University.

13. MacCartney, B., & Manning, C. D. (2007). Natural logic for textual inference. Proceedings of the ACL-PASCAL Workshop on Textual Entailment and Paraphrasing, Association for Computational Linguistics, Prague, pp. 193-200.

14. Moot, R., & Retoré, C. (2012). The Logic of Categorial Grammars: A Deductive Account of Natural Language Syntax and Semantics. Springer.

15. Navigli, R., & Ponzetto, S. (2012). BabelNet: The automatic construction, evaluation and application of a wide-coverage multilingual semantic network. Artificial Intelligence, Vol. 193, pp. 217-250.

16. Princeton University (2010). "About WordNet". http://wordnet.princeton.edu.

17. Raina, R., Ng, A. Y., & Manning, C. D. (2005). Robust textual inference via learning and abductive reasoning. In Veloso, M. M., & Kambhampati, S., editors, AAAI, AAAI Press / The MIT Press, pp. 1099-1105.

18. Sánchez-Valencia, V. (1991). Studies on Natural Logic and Categorial Grammar. Ph.D. thesis, Universiteit van Amsterdam.

19. Steedman, M., & Baldridge, J. (2011). Combinatory categorial grammar. In Borsley, R., & Borjars, K., editors, Non-Transformational Syntax: Formal and Explicit Models of Grammar. Wiley-Blackwell.

20. van Benthem, J. (1986). Essays in Logical Semantics, volume 29 of Studies in Linguistics and Philosophy. Reidel, Dordrecht.

21. van Benthem, J. (1991). Language in Action: Categories, Lambdas, and Dynamic Logic, volume 130 of Studies in Logic. Elsevier, Amsterdam.

22. van Benthem, J. (2007). A brief history of natural logic. Technical report. Available at: https://www.illc.uva.nl/Research/Publications/Reports/PP-2008-05.text.pdf.

Received: August 06, 2016; Accepted: October 15, 2016

* Corresponding author: Jesús Lavalle, e-mail: jlavalle@ccc.inaoep.mx

This is an open-access article distributed under the terms of the Creative Commons Attribution License.