Summary of "Understanding Natural Language"

Article by Barr and Feigenbaum

Summary by Sunil Vemuri for CS 523, March 11, 1996

(updated to include information from other sources June 7, 1996)


Overview

Applications

Machine Translation

Started in 1946, based on discussions between Warren Weaver and A. Donald Booth about the similarity of code breaking (e.g., Enigma codes) and translation.

Early research speculated that computers could be used for machine translation: "translation from one language to another."

Much of this work done in the 50s and 60s.

Proposed the notion of an interlingua - an intermediate "universal language" that all humans share. If such an interlingua exists, machine translation could be done through it.

The problem turned out to be much harder than expected because one cannot simply translate the words of one sentence into another language; one must understand the meaning of the sentence.

Systems:

Current work at ISI/USC:
Bar-Hillel's example showed some of the difficulty in translation:
The pen is in the box.
The box is in the pen.
Choosing the correct sense of "pen" (writing instrument in the first sentence, enclosure in the second) requires world knowledge, not just dictionary lookup.

Machine translation research largely died in 1966 with the ALPAC report:
"There has been no machine translation of general scientific text, and none is in immediate prospect"

The area was somewhat revitalized with the introduction of transformational grammars and better knowledge representations, which led to better inferencing capabilities (e.g., an interlingua).


Grammars

Formal Grammars

Stuff covered in undergrad automata class.

formal language, context-free grammar, context-sensitive grammar

Chomsky hierarchy:
0: unrestricted grammar (Turing-recognizable languages)
1: context-sensitive grammar
2: context-free grammar
3: regular grammar

For regular and context-free grammars, there are practical parsing algorithms to determine whether or not a given string is an element of the language, and if so, to assign to it a syntactic structure in the form of a derivation tree.
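
As an illustrative sketch (not from the article), here is a minimal CYK recognizer for a toy context-free grammar in Chomsky normal form; the grammar, lexicon, and sentence are invented for the example:

    grammar = {                     # binary rules: (B, C) -> A
        ("NP", "VP"): "S",
        ("Det", "N"): "NP",
        ("V", "NP"): "VP",
    }
    lexicon = {                     # unary rules: word -> A
        "the": "Det", "box": "N", "pen": "N", "contains": "V",
    }

    def cyk_recognize(words):
        n = len(words)
        # table[i][j] = set of nonterminals deriving words[i..j]
        table = [[set() for _ in range(n)] for _ in range(n)]
        for i, w in enumerate(words):
            table[i][i].add(lexicon[w])          # assumes known vocabulary
        for span in range(2, n + 1):
            for i in range(n - span + 1):
                j = i + span - 1
                for k in range(i, j):            # split point
                    for b in table[i][k]:
                        for c in table[k + 1][j]:
                            if (b, c) in grammar:
                                table[i][j].add(grammar[(b, c)])
        return "S" in table[0][n - 1]

    print(cyk_recognize("the box contains the pen".split()))   # True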

Natural languages are not generally context-free.

Transformational Grammars

The combination of a grammar and transformational rules. Proposed by Chomsky in "Syntactic Structures" (see pp. 246-247 for an example).

Generative Grammar (definition)
Context-sensitive (phrase-structure) grammars are inadequate for English. This is why transformations were introduced.
Katz-Postal hypothesis - the application of transformational rules to deep structures must preserve meaning.

Chomsky proposed:

Systemic Grammar

A theory that studies linguistic structure as it relates to the function or use of language, often termed pragmatics. (Halliday)

Functions of a language

Categories of Systemic Grammar
  1. The units of the language (sentence, clause, group, word, morpheme)
  2. The structure of the units
  3. Classification of the units (verbal, nominal, adverbial)
  4. The system (list of choices representing options to the speaker)

Case Grammars

Similar to the notion of case in traditional grammars. In English, case forms are mainly personal pronouns (I, my, me) and possessive endings ("'s").

Revision of the framework of transformational grammars.

Noun phrases and verb phrases are associated with each other in a particular relationship

Proposes a small number of fixed cases (agent, counter-agent, object, result, instrument, source, goal, experiencer).

"The Case for Case" What can case analysis provide answers for. (Fillmore, 1968)

  1. Which noun phrase is the subject
  2. Different cases may not be conjoined
  3. Buy/sell and teach/learn: same meaning, different case frames

Case frames can be represented as semantic nets.
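
For illustration (the sentence and slot names are my own, not the article's), a case frame can be sketched as a small semantic net of labeled slots:

    # Hypothetical case frame for "John opened the door with the key",
    # represented as labeled slots (a tiny semantic net).
    case_frame = {
        "predicate": "open",
        "agent": "John",        # the instigator of the action
        "object": "door",       # the thing acted upon
        "instrument": "key",    # the means used
    }
    # "The key opened the door" fills the same frame minus the agent,
    # illustrating how case analysis collapses structurally distinct
    # sentences with the same underlying meaning.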

Systems using case at the deepest level may represent the meaning of sentences in a way that collapses structurally distinct but semantically identical phrases.

Other Grammars (not in paper)

Augmented Phrase Structure Grammars (APSG)
Definite Clause Grammars (DCG)
Unification Grammar

Parsing

Issues in parser design
Parsing strategies
Actual parsing systems

Augmented Transition Network (ATN)

Recursive Transition Networks (RTN)
Augmented Transition Network [Woods]
One can essentially think of an ATN as an RTN with a frame attached to each arc. The frame can contain any information that would be helpful for parsing (slots on the frame = registers), for example, partially formed derivation trees. One can also have procedural attachments within the frame to compute the values of slots/registers.

Part of the advantage of ATNs is that they allow the grammar designer to maintain a simple grammar structure and delegate some of the parsing details (e.g., matching noun and verb number) to the procedural attachments. One can encode parsing details in the grammar itself, as is done with definite clause grammars, but the grammar becomes quite cumbersome and awkward, and its generally simple structure is lost.
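
A minimal sketch of the idea (my own illustration, not Woods' implementation): an arc's test consults a register to enforce noun-verb number agreement while the grammar itself stays simple:

    # Sketch of an ATN fragment: arcs carry tests and actions, and
    # registers hold features such as grammatical number.
    # The lexicon and grammar are invented for the example.
    lexicon = {
        "the":  ("Det", None),
        "dog":  ("N", "sg"), "dogs": ("N", "pl"),
        "barks": ("V", "sg"), "bark": ("V", "pl"),
    }

    def parse(words):
        registers, state = {}, "S"
        for w in words:
            cat, num = lexicon[w]
            if state == "S" and cat == "Det":
                state = "NP"
            elif state == "NP" and cat == "N":
                registers["number"] = num        # action: set a register
                state = "VP"
            elif state == "VP" and cat == "V":
                if num != registers["number"]:   # test: agreement check
                    return False
                state = "DONE"
            else:
                return False
        return state == "DONE"

    print(parse("the dogs bark".split()))   # True
    print(parse("the dog bark".split()))    # False (number mismatch)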

Another major advantage of ATNs is that they are quite efficient.

Several textbooks (Ginsberg; Luger & Stubblefield) discuss ATNs in detail.

The General Syntactic Processor

- Kaplan, 1973

Charts
Grammatical Rules
Control Structure of GSP
GSP is similar to an ATN, with extensions:
  1. GSP uses a chart
  2. The grammar is encoded as a chart too
  3. Processing can be suspended and resumed
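
For illustration (my own sketch, not Kaplan's actual data structures), a chart can be viewed as a set of edges recording the constituents found over spans of the input:

    # Minimal illustration of a chart: edges record constituents found
    # over spans of the input, so partial results are never recomputed.
    # Positions fall between words: 0 the 1 pen 2 ...
    chart = set()
    chart.add((0, 1, "Det"))   # "the" spans positions 0-1
    chart.add((1, 2, "N"))     # "pen" spans positions 1-2
    chart.add((0, 2, "NP"))    # "the pen", built from the two edges above
    # A suspended process can later resume and add more edges to the
    # same chart, which is one of GSP's extensions over a plain ATN.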


Semantic Analysis

The general idea of semantic interpretation is to take NL utterances and map them onto some representation of their meaning or an intermediate analysis.
Some of the problems in semantic analysis include ambiguity and underdeterminism. Lexical ambiguity refers to words and phrases conveying more than one sense. Scope ambiguity refers to how to bind quantifiers ("all", "every", "not", "and", "or", "most"). Underdeterminism refers to cases where all the information has not been specified explicitly, but may be suggested implicitly.
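
As an illustration (my own example, not from the article), "Every student read a book" shows scope ambiguity between two quantifier orderings:

    forall x (student(x) -> exists y (book(y) and read(x, y)))   [each student read some, possibly different, book]
    exists y (book(y) and forall x (student(x) -> read(x, y)))   [one particular book was read by every student]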

Conceptual Dependency theory (Schank) is one of the more famous examples of semantic representations.

Compositional Semantics - the semantics of any phrase is a function of the semantics of its subphrases. Compositional semantics allows us to handle an infinite language with a finite (and often small) set of rules.
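
As a minimal sketch of compositionality (a toy example of my own, not from the article), each grammar rule computes a phrase's meaning by applying the meanings of its subphrases:

    # Toy compositional semantics: each phrase's meaning is a function
    # of the meanings of its subphrases (lexicon invented for example).
    sem = {
        "John": "John",
        "Mary": "Mary",
        "loves": lambda obj: lambda subj: f"loves({subj}, {obj})",
    }

    # Rule VP -> V NP: apply the verb's meaning to the object's meaning.
    vp = sem["loves"](sem["Mary"])
    # Rule S -> NP VP: apply the VP's meaning to the subject's meaning.
    print(vp(sem["John"]))   # loves(John, Mary)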

Semantic interpretation alone cannot be certain of the right interpretation of a phrase or sentence, so we divide the work: semantic interpretation is responsible for combining meanings compositionally to get a set of possible interpretations, and disambiguation is responsible for choosing the best one. This disambiguation is the job of the pragmatic analyzer.

In semantic interpretation, we have a quasi-logical form that we wish to convert to a logical form. Once in a logical form (e.g., FOPL), we get the semantics of logic for free, which allows us to draw deductions, inferences, etc. LUNAR was the first system to attempt this; it had an extended notational variant of FOPL.

Ambiguity

Disambiguation typically requires information about the world, the situation, and the time and place, as well as information about the speaker's and hearer's intentions, desires, and beliefs. To formalize this, we use a combination of four models for disambiguation.
Probabilistic context-free grammar: each context-free production rule has a probability associated with it.
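
A sketch of the idea (the rules and probabilities below are invented for illustration): the probability of a parse is the product of the probabilities of the rules used in its derivation:

    # Illustrative PCFG fragment: probabilities of rules sharing a
    # left-hand side sum to 1 (numbers invented for the example).
    rules = {
        ("S",  ("NP", "VP")): 1.0,
        ("NP", ("Det", "N")): 0.6,
        ("NP", ("Name",)):    0.4,
        ("VP", ("V", "NP")):  1.0,
    }

    def parse_probability(derivation):
        # Probability of a parse = product of its rule probabilities.
        p = 1.0
        for rule in derivation:
            p *= rules[rule]
        return p

    # Derivation for "Name V Det N" (e.g., "John ate the pie"):
    d = [("S", ("NP", "VP")), ("NP", ("Name",)),
         ("VP", ("V", "NP")), ("NP", ("Det", "N"))]
    print(parse_probability(d))   # 0.24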

Discourse Interpretation

When people use language, they write or speak not in single isolated utterances, but in extended sequences of them. Natural languages take advantage of this by providing ways for one utterance to use all aspects of the context of previous utterances to augment what is explicitly said in the utterance itself. Context is determined by a combination of previous utterances in the discourse.

Early work assumed one could find a pronoun's referent by linearly searching back within the text. This was shown to be false: it works in many cases, but not in general. The structure of the discourse is the primary factor. Also, the referenced entity might not be mentioned explicitly; commonsense knowledge is needed (e.g., implicit in "John ate the pie" is that he put the food in his mouth). Conceptual Dependency theory attempts to alleviate this by representing implicit entities explicitly.

Anaphora - the occurrence of phrases referring to objects that have been mentioned previously.

Focusing of attention. Immediate focusing operates at the level of an utterance while global focusing operates at the discourse segment level.

Language and Intention

The ways in which the beliefs and intentions of speakers and writers shape the utterances they produce and, conversely, the ways in which listeners' and readers' beliefs about a speaker's or writer's beliefs and intentions affect how those utterances are understood.

Under one view, Speech Acts (requests, informings, promises, etc.) [Searle, 1969] and the intentions behind them, rather than phrases or sentences, are the basic building blocks of communication.

The system must be able to reason about the connection between knowledge and action.

Research in this area has focused on use of general AI planning techniques in order to plan speech acts and recognize the intended actions another agent has performed. There has been work on using STRIPS as the basis for a system that could plan individual requesting and informing actions [Cohen & Perrault].
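
A sketch of the flavor of such an operator (my own rendering with invented predicate names, not Cohen & Perrault's exact formalism):

    # Illustrative STRIPS-style operator for an INFORM speech act,
    # loosely in the spirit of Cohen & Perrault (names are invented).
    inform = {
        "name": "INFORM(speaker, hearer, p)",
        "preconditions": ["believes(speaker, p)",
                          "wants(speaker, knows(hearer, p))"],
        "effects": ["believes(hearer, believes(speaker, p))"],
    }
    # A planner can chain such operators to plan speech acts, and a
    # recognizer can run them in reverse to infer a speaker's plan.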

Flipping the problem around, Perrault worked on ways of applying plan-recognition techniques to discourse analysis in NLP. In a similar light, the notion of a script in Conceptual Dependency theory [Schank & Abelson] was used in PAM [Wilensky, 1983] and found useful in discourse analysis.

Generation

Content determination

Text planning

Explanation Generation (Gruber, Gautier, Lester, Mallory)


NLP Systems

Two classes of NLP systems include interactive systems and text processing systems. Interactive systems are NL systems whose primary mode of interaction is through natural language. Text processing systems are ones in which NL texts are the primary objects of interest.

Issues in design of an NLP System:


SHRDLU

Winograd, 1972 (Ph.D. thesis).
To understand language, a program must deal in an integrated way with syntax, semantics, and reasoning.
The basic viewpoint guiding its implementation was that meaning (of words, phrases, and sentences) can be embodied in procedural structures and that language is a way of activating appropriate procedures within the hearer.

MARGIE

Conceptual Dependency (CD)

MARGIE

SAM and PAM

SAM

PAM

BASEBALL

One of the first Q/A systems. The domain covered one season of statistics in American League baseball. It analyzed queries, created frame-like representations of them, and used the structured representation of a query to find the information in the database. Queries were in natural language, but not all types of queries were allowed.
Example good queries include:
Transportability, habitability, extensibility, and speed were not of concern. Major contribution of BASEBALL was that it was an existence proof that such systems were possible.

LUNAR

The domain was analyses of rock samples from the Apollo 11 moon landing mission. A question-answering system, it was important in terms of dealing with proper scoping of quantifiers and its well-defined procedural semantics. SHRDLU took a similar approach of attaching a procedure to each predicate.

ELIZA

A natural language interface simulating a psychiatrist. It had pattern-matching rules that triggered on keywords found in the user's dialogue, and it used literal text from the user's dialogue to reformulate questions. There was no understanding of what was being said; it merely spit back the questions that seemed most relevant to the last user input. [Weizenbaum, 1966]
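
A minimal sketch in the spirit of ELIZA's keyword rules (the patterns below are invented, not Weizenbaum's actual script):

    import re

    # Keyword-triggered patterns plus pronoun "reflection", in the
    # spirit of ELIZA (rules invented for illustration).
    reflections = {"i": "you", "my": "your", "am": "are", "me": "you"}
    rules = [
        (r"i am (.*)",   "Why do you say you are {0}?"),
        (r"i feel (.*)", "Tell me more about feeling {0}."),
        (r"my (.*)",     "Why do you mention your {0}?"),
    ]

    def reflect(text):
        return " ".join(reflections.get(w, w) for w in text.split())

    def respond(utterance):
        u = utterance.lower().strip(".!?")
        for pattern, template in rules:
            m = re.match(pattern, u)
            if m:
                return template.format(reflect(m.group(1)))
        return "Please go on."    # default when no keyword matches

    print(respond("I am worried about my exam."))
    # -> Why do you say you are worried about your exam?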

PARRY

Attempted to embody a theory of paranoia in a system. It used a natural language interface and pattern-matching techniques similar to those found in ELIZA. Apparently, PARRY passed some form of modified Turing Test. It was designed to maximize habitability and speed; veracity was also important.

LADDER

A practical NL database Q/A system. Its goal was to provide users with a transparent way of gaining access to information distributed over various databases.

SOPHIE

A prototype instructional system aimed at teaching students procedural knowledge and reasoning strategies. SOPHIE was fast and habitable at the expense of modularity.


Questions

How are NLP and Vision similar?

What are some approaches to syntactic analysis?

What are some approaches to semantic analysis?

What are some approaches to pragmatic analysis?

What are some approaches to discourse analysis?

Definitions

References



Grosz, Sparck Jones & Webber (eds.), Readings in Natural Language Processing
Feigenbaum & Feldman (eds.), Computers and Thought
Russell & Norvig, Artificial Intelligence: A Modern Approach
Barr & Feigenbaum, "Understanding Natural Language"