Linguistic Analysis for Question Answering

Dick Crouch
Palo Alto Research Center

Abstract

At PARC we are building a system that maps from language to logic (knowledge representation, KR) and back as part of a larger question-answering system. The abstraction provided by mapping to KR allows questions to be matched with answers that differ significantly at the string level and even at the deep syntactic level. A simple example is the regularization of active-passive pairs:

 IBM bought Lucent. / Lucent was bought by IBM.

A more interesting example involves synonyms and hypernyms:

 IBM bought Lucent. / IBM purchased Lucent. / IBM acquired Lucent.
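
As a rough illustration of the synonym and hypernym case, the sketch below uses a tiny hand-written lexicon. The tables `CANONICAL` and `HYPERNYM` and the functions `canon`, `generalizations`, and `supports` are hypothetical names invented for this example; they are not part of the PARC system, which would draw on a much larger lexical resource.

```python
# Hypothetical toy lexicon, invented for illustration only.
CANONICAL = {"purchase": "buy"}   # synonym -> canonical lemma
HYPERNYM = {"buy": "acquire"}     # lemma -> more general lemma


def canon(lemma):
    """Map a lemma to its canonical synonym, if the toy lexicon has one."""
    return CANONICAL.get(lemma, lemma)


def generalizations(lemma):
    """Yield the canonical lemma and every more general lemma above it."""
    current = canon(lemma)
    seen = set()
    while current is not None and current not in seen:
        seen.add(current)
        yield current
        parent = HYPERNYM.get(current)
        current = canon(parent) if parent is not None else None


def supports(fact_pred, question_pred):
    """True if a fact stated with fact_pred answers a question using question_pred."""
    return canon(question_pred) in generalizations(fact_pred)


print(supports("buy", "purchase"))      # synonyms match in both directions
print(supports("purchase", "acquire"))  # a specific fact supports a general question
print(supports("acquire", "buy"))       # a general fact does not support a specific one
```

Note the asymmetry in this sketch: buying entails acquiring, but acquiring does not entail buying, so hypernym matching only runs from the more specific fact to the more general question.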

Of particular importance for accurate question answering is the representation of factivity:

 IBM bought Lucent. / IBM did not buy Lucent.
 IBM managed to buy Lucent. / IBM failed to buy Lucent.
 We know that IBM bought Lucent. / We said that IBM bought Lucent.
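
One simplified way to think about these contrasts is to track whether the embedded claim "IBM bought Lucent" comes out true, false, or undetermined under each embedding. The sketch below does this with a small signature table. The table `SIGNATURES` and the function `embedded_polarity` are assumptions made for this example, and the abstract does not say how the PARC system actually encodes this information.

```python
# Hypothetical signature table. Each operator maps the polarity of its own
# context (+1 positive, -1 negative) to the status of the clause it embeds:
# +1 = asserted true, -1 = asserted false, 0 = left undetermined.
SIGNATURES = {
    "not":    {+1: -1, -1: +1},  # negation flips polarity
    "manage": {+1: +1, -1: -1},  # managed to X -> X; did not manage to X -> not X
    "fail":   {+1: -1, -1: +1},  # failed to X -> not X; did not fail to X -> X
    "know":   {+1: +1, -1: +1},  # factive: X holds under either polarity
    "say":    {+1: 0, -1: 0},    # non-factive: X is neither confirmed nor denied
}


def embedded_polarity(operators):
    """Status of the innermost clause under operators listed outermost first."""
    polarity = +1
    for op in operators:
        polarity = SIGNATURES[op][polarity]
        if polarity == 0:
            return 0  # once undetermined, nothing deeper is committed to
    return polarity


print(embedded_polarity([]))               # IBM bought Lucent.
print(embedded_polarity(["not"]))          # IBM did not buy Lucent.
print(embedded_polarity(["fail"]))         # IBM failed to buy Lucent.
print(embedded_polarity(["not", "know"]))  # We do not know that IBM bought Lucent.
print(embedded_polarity(["say"]))          # We said that IBM bought Lucent.
```

Under these assumptions, a question such as "Did IBM buy Lucent?" would be answered "yes" only from sentences whose embedded clause comes out as +1, "no" from those that come out as -1, and left unanswered for 0.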

In the parsing direction, the system works by mapping strings into dependency structures using a large-scale, robust English LFG grammar. These dependency structures are passed through an ordered set of rewriting rules to produce a flattened, skolemized semantic representation and, from that, an abstract knowledge representation.
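
The idea of ordered rewriting over flat structures can be sketched in a few lines. The example below is a toy under stated assumptions: the fact tuples, the relation names (`subj`, `obj`, `obl_agent`, `agent`, `theme`), and the two rules are invented for illustration, the input is hand-built rather than produced by an LFG parser, and skolemization is left out.

```python
def depassivize(facts):
    """Rule 1: undo the passive so later rules see only active-style relations."""
    passive_nodes = {f[1] for f in facts if f[0] == "passive"}
    out = set()
    for f in facts:
        if f[0] == "passive":
            continue  # the passive marker is consumed by this rule
        if f[1] in passive_nodes and f[0] == "subj":
            out.add(("obj",) + f[1:])
        elif f[1] in passive_nodes and f[0] == "obl_agent":
            out.add(("subj",) + f[1:])
        else:
            out.add(f)
    return out


def assign_roles(facts):
    """Rule 2: map grammatical functions to toy semantic roles."""
    rename = {"subj": "agent", "obj": "theme"}
    return {(rename.get(f[0], f[0]),) + f[1:] for f in facts}


# Order matters: roles are assigned only after the passive has been undone.
RULES = [depassivize, assign_roles]


def rewrite(facts):
    """Apply each rule once, in order, to a flat set of facts."""
    for rule in RULES:
        facts = rule(facts)
    return facts


# Hand-built stand-ins for the parses of the active-passive pair.
active_parse = {
    ("pred", "e1", "buy"),
    ("subj", "e1", "IBM"),
    ("obj", "e1", "Lucent"),
}
passive_parse = {
    ("pred", "e1", "buy"),
    ("passive", "e1"),
    ("subj", "e1", "Lucent"),
    ("obl_agent", "e1", "IBM"),
}

print(rewrite(active_parse) == rewrite(passive_parse))
```

Because both parses rewrite to the same flat set of facts, a match between "IBM bought Lucent" and "Lucent was bought by IBM" reduces to set equality in this sketch. Swapping the order of the two rules would label Lucent as the agent in the passive case, which is why the rules are applied in a fixed sequence.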

From the abstract KR, one can proceed in one of two ways: (1) map the abstract KR into a concrete knowledge representation, such as CycL, and use the machinery afforded by that KR to do full-blown reasoning for question answering, perhaps with arbitrary amounts of world knowledge; or (2) attempt a simpler form of matching on the abstract KR itself.

The talk will outline the system and its motivations, and discuss some of the many areas where further work is required.