AI Qual Summary: Planning

Planning
Patrick Doyle	AI Qual Summary	April 30, 1997

Table of Contents

Definitions
General Paradigms
- Planning as Search
- Complexity and Limits of Planning
Early Planning Systems (Pre-GPS)
GPS
Logic Planners
- Green's Formulation
- Kowalski's Formulation
Non-Hierarchical Linear Planners
Hierarchical Linear Planners
- ABSTRIPS
Non-Hierarchical Non-Linear Planners
- SIPE
- TWEAK
Hierarchical Non-Linear Planners
Opportunistic Planning
- OPM
Other Approaches and Issues
Extentions and State of the Art
Timeline and Accomplishments
Sources

Definitions

problem solving

The process of developing a sequence of actions to solve a goal. This encompasses all goal-directed AI programs; it is a very general concept.

planning

Deciding upon a course of action before acting. A plan is a representation of a course of action. A finished plan is a linear or partially ordered sequence of operators. Planning is a problem solving technique. Planning is reasoning about future events in order to verify the existence of a reasonable series of actions to take in order to accomplish a goal. There are three major benefits of planning: reducing search, resolving goal conflicts, and providing a basis for error recovery.

anytime algorithm

An algorithm that can be interrupted at any time during its operation and asked for a solution. The longer the algorithm operates, the better its answer.

hierarchical plan

A plan that describe how to take actions in levels of increasing refinement and specificity (e.g., "Do something" becomes the more specific "Go to work," "Do work," "Go home.") Most plans are hierarchical in nature.

nonhierarchical planning

This is roughly the intuitive idea of planning; it consists of finding a sequence of operators to achieve each of the goals. The problem is that the planner does not distinguish between important goals and less critical ones, and so it can waste considerable amounts of time finding solutions to non-critical parts of a plan, only to find it can't solve a critical part. Examples include STRIPS, HACKER, WARPLAN.

hierarchical planning

Planning that use a hierarchy of abstractions of a plan to solve the problem. The ground plan is the form that lists executable operations; levels above increase in abstraction (and simplicity). The goal of hierarchical planners is to simplify the search and reasoning process by finding vague solutions at levels where the details are not computationally overwhelming, and then refine them. Both non-hierarchical and hierarchical planners generate hierarchical plans, but only hierarchical planners use a hierarchy representation.

There are two ways to form hierarchies. Plan abstraction means using higher-level concepts as operators (e.g., in NOAH, NONLIN, SIPE). Space abstraction (e.g. in ABSTRIPS) forms equivalence classes of states by ignoring less-critical preconditions and subgoals.

least commitment approach

A constructive approach in which no commitment is made unless it is necessary to do so. Decisions are made only when the planner can determine they will not interfere with past or future decisions. The goal is to avoid backtracking as much as possible. This approach is used in nonlinear planners (e.g., NOAH, MOLGEN, SNLP).

nonlinear planning

Least commitment with respect to time. If the resulting plan is not linear, either linearization for a single agent, or execution by multiple agents can be used. Ordering decisions are only made when necessary. Partial plans are partial orders on operators, which are left unordered until a conflict detected, at which point constraints are added.

nonlinear planners

NOAH [Sacerdoti, 1977], NONLIN [Tate, 1977], DEVISER [Vere, 1983], SIPE [Wilkins, 1984], TWEAK [Chapman, 1984].

opportunistic planning

Real-time planning in which operators are invoked when they are useful in the current state. Explored in the OPM system by Hayes-Roth.

STRIPS assumption

A way of coping with the frame problem. Every satisfied formula not explicitly deleted by an operator continues to hold after the operator is executed. The extended STRIPS assumption allows for the possibility that formulae not in the delete list may become false, if it would be inconsistent to assume they are true.

goal

Usually specified as one or a set of world states. There are three kinds of goals: maintaining some condition, preventing some condition from occurring, or sequencing activities.

independence problem

When multiple agents are involved, we neither want to specify all external events that might happen, nor do we want a solution that assumes that anything unspecified did not happen.

situated activity

Concerned with not only how to do something in an environment, but how to make sure that the choice of action is actually implemented. Such systems assume what is known is far less than what is true.

embedded systems

Systems situated in the world that must survive given real-time constraints on reasoning and action.

linearity assumption

All subgoals are independent of one another, and thus can sequentially be achieved in an arbitrary order.

strong linearity assumption

Not only can a plan be totally ordered, but if operator S achieves goal g and T achieves h, and S is before T, then all the steps that achieve preconditions of S are before all the steps that achieve preconditions of T.

Sussman anomaly

A famous instance of subgoal interaction, where the solutions to two separate subgoals must be interleaved in order to solve them both. No sequential ordering of the subgoal solutions will work.

Specifically, the problem is as follows. In the Blocks World, the initial state consists of C on A on the table, and B on the table. The goal state is A on B on C.

interaction

Term describing how subgoals can interfere with one another. There are three kinds [Latombe]:

weak interaction The goals must be achieved in a certain order in order not to interfere with one another.
strong interaction (ex. Sussman anomaly) No ordering of the subgoals will achieve the goal, but after they are executed sequentially in some order, there is an additional plan that will then achieve the goal. (There is some argument about the validity of this definition -- POD)
very strong interaction No ordering works. Example: There are two rooms, right and left. The right room contains a mailbox, and the left contains a robot with a key. The goal is to put the key in the mailbox and lock the door of the left room from the inside. If the key is put in the mailbox, the door cannot be locked; if the door is locked, the robot cannot reach the mailbox.

abstraction

A way of reducing the complexity of planning by disregarding inessential features.

reflection

The distinction between acting and thinking about acting. Planning is reflection about acting; meta-planning is reflection about planning.

robustness

A robust plan is one that can deal with unexpected changes that may arise during execution; robust plans try to avoid taking courses of action that offer few alternatives if they fail. (Example of scheduling tight-connection in flight.)

reactive systems

Systems which favor responding to immediate percepts rather than planning actions in advance. Such systems use the world "as its own best model" (Brooks) rather than maintaining internal models. The argument for reactive systems is that planning in an unpredictable and complex domains has limited value.

fluent

A function on world states. Properties of the world can be described via fluents that indicate which elements of the world state possess the property. Propositional fluents are fluents whose value in a state is true or false.

deferred planning

A system that interleaves planning and execution may choose to defer planning of some part of a problem until a later time.

conjunctive goal

A goal that has more than condition to be satisfied. Conjunctive goals may require that their solutions be executed in a certain order, or may even require that they be interleaved. This problem is called the interacting subgoals problem.

interacting subgoals

A problem that arises when the solution to one subproblem interferes with the solution of another (e.g., putting up the drywall and then trying to install the wiring). Early planners (such as HACKER, INTERPLAN) dealt with this by creating complete linear plans and then "fixing" them; later planners ordered operations partially and only linearized when necessary.

goal regression

The idea is that, for some goal G we want to achieve, we can't guarantee that it will be true after operation O. Instead, we should find some new goal G', such that if G' is true before O, then G is true after O. Finding the new goal G' is regressing goal G through O. Introduced by Waldinger, 1975.

STRIPS operator

A STRIPS operator consists of an operator name together with a prerequisite list (or precondition list), an add list, and a delete list. The elements of these lists are all proposition expressions; the precondition list was a conjunction of positive literals, while the add and delete lists were conjunctions of positive or negative literals.

causal link

A triple where P is a proposition symbol, w is a step name that has P as a prerequisite, and s is a step name that has P in its add list. Causal links are written s -P-> w.

threat

A step that interferes with a causal link; more formally, a step name v is called a threat to a causal link s -P-> w if v is a step name, other than s and w, that either adds or deletes P. (This definition due to McAllester and Rosenblitt; Tate's earlier definition only considers steps that delete propositions.)

lifting

A concept invented by J. A. Robinson [1965] used in the development of resolution theorem proving. Any application of lifting is associated with a lifting lemma which states that, for every possible computation involving ground expressions, there is a lifted computation involving variables such that the ground computation is a substitution instance of the lifted computation.

downward solution property

This property means that if p is an abstract solution to a problem, then there is a primitive solution of which p is an abstraction.

upward solution property

If an abstract plan is inconsistent, then there is no primitive solution of which it is an abstraction. (We can prune away all descendents of an inconsistent abstract plan.) One way of guaranteeing this property is the unique main subaction condition.

unique main subaction condition

In a hierarchical planner, there is one step of a decomposed plan to which all of the preconditions and effects of the abstract operator are attached. This guarantees the upward solution property.

conditional planning

Also called contingency planning. A kind of planning that deals with incomplete information about the world by constructing a plan that accounts for each possible contingency that may arise. The plan is directed via the agent's sensory input.

coercion

When the state of the world is unknown, it can be coerced into a known state by performing some action (if you don't know what color a chair is, and you spray paint it blue, you now know it's blue).

General Paradigms

Most classical planners made the following assumptions:

The agent's action changes the world from one static state to another.
Effects of actions are predictable.
The world changes only as the result of the agent's actions.

Note that all of these assumptions have given way in recent planning systems.

There are five basic approaches to planning:

non-hierarchical
hierarchical
case-based
opportunistic
partial order planning

Planning as Search

There are two basic ways of viewing planning as search. The first is to look at planning as an operation of searching a space of world states, in which transitions from one state to another are actions the agent can perform.

The second and more general view is one in which the space consists of partial plans, and each node is a partially completed plan. Transitions in this space are accomplished by adding actions to plans. The first view can be seen as a particular instance of this view, in which a transition always adds an action to the beginning or end of a plan.

Complexity and Limits of Planning

Theorems are taken from Chapman's paper on TWEAK.

Theorem 1 (First Undecidability) Any Turing machine with its input can be encoded as a planning problem in the TWEAK representation. Therefore, planning is undecidable, and no upper bound can be put on the amount of time required to solve a problem.

(Note that this theorem relies upon an infinite initial state. If the initial state is restricted to be finite, planning may be decidable.)

Theorem 2 (Intractability) The problem of determining whether a proposition is necessarily true in a nonlinear plan whose action representation is sufficiently strong to represent conditional actions, dependency of effects on input situations, or derived side effects is NP-hard.

Theorem 3 (Second Undecidability) Planning is undecidable even with a finite initial situation if the action representation is extended to represent actions whose effects are a function of their situation.

In extended representations of action, linear planning does not require exponential amounts of work to verify the modal truth criterion, but there will be exponential amounts of backtracking.

Early Planning Systems (Pre-GPS)

Logic Theorist

[Newell, Shaw, and Simon, 1956] One of the first programs to use heuristics, sacrificing completeness for efficiency. The problem domain was theorems in propositional calculus (the system proved 38 of the first 52 theorems in Russell and Whitehead's Principia).

The Logic Theorist operates by backward reasoning from the theorem to the axioms and proved theorems. It was seriously limited by its heuristics and could not prove p v !!!p, for example.

Gelerntner's Geometry Theorem-Proving Machine

[Gelerntner, 1959] Written at IBM. The first program that could handle conjunctive subgoals. Used symmetry in reasoning, and would reuse introduced points to build new constructions.

New subgoals were checked for consistency with the existing diagram; if false in the diagram it could not be a theorem and was pruned in the search. The diagram and rules for avoiding circular reasoning eliminated about 99% of subgoals.

SAINT and SIN: Symbolic Integration

[Slagle, 1961] SAINT -- Symbolic Automatic INTegrator.

[Moses, 1967] SIN -- Symbolic INtegrator. SIN was "capable of solving integration problems as difficult as ones found in the largest tables."

GPS

[Newell, Shaw, Simon, 1957-1969] The General Problem Solver (GPS) was designed to solve problems requiring human intelligence, as well as to develop a theory of how humans solve problems. Although it was meant to be general it solved problems only in highly-structured domains (propositional calculus proofs, puzzles, symbolic integration, etc.).

GPS was the first planner to distinguish between general problem-solving knowledge and domain knowledge (hence the name). It introduced means-end analysis in its search. This approach tried to find a difference between the current object and the goal, and then used a lookup table to invoke an action that would minimize that difference. If the action could not be applied to the current object, it recursively tried to change the object into one that was appropriate.

Logic Planners

Logic planning systems are based on Robinson's method of refutation resolution in theorem proving.

Green's Formulation

[Green, 1969] Green was attempting to solve robot planning problems. His approach was to formulate the problem so that a theorem prover could solve it.

Green introduced a "state" or "situation" variable into each predicate. The goal condition then consisted of a formula with an existentially quantified state. The system's job was to prove that the goal condition was satisfiable; that is, there existed some state in which the condition was true.

A special function do(action, state) denoted the function mapping a state into the state resulting from acting.

Green's formulation also used frame assertion to state what did not change as the result of an action. A serious drawback of the system was that a frame assertion was required for every relation unaffected by an action.

An example of a rule in Green's formulation:


Clear(x, s) ^ Clear(z, s) ^ On(x, z, s) ^ Different(x, z)
-> [Clear(x, do[trans(x, y, z), s]) ^
    Clear(y, do[trans(x, y, z), s]) ^
    On(x, z, do[trans(x, y, z), s])]

Green implemented a system called QA3 that employed this formulation.

Kowalski's Formulation

An attempt to simplify the frame assertions of Green's formulation, Kowalski's approach introduced the Holds literal. This allowed the results of each action to be framed by a single axiom, e.g.:

Holds(v, s) ^ Different(v, clear(z)) ^ Different(v, On(x, y)) -> PACT(trans(x, y, z), s).

Non-Hierarchical Linear Planners

The earliest planning systems were non-hierarchical. This means that they made no distinction between more important (such as "go to the supermarket") and less important ("open the door") parts of the plan, so they could be bogged down dealing with unimportant details.

STRIPS

[Fikes and Nilsson, 1971] Devised as an improvement upon situation calculus and Green's formalism. STRIPS represents the world as a set of formulae in first-order logic. Each state in the search space consists of a world model and set of goals to be achieved. A resolution-based theorem prover used means-end analysis to search; the theorem prover was also used to verify operator preconditions and establishing the validity of the goal formula. A separate system, PLANEX, supervised execution.

Goals are maintained in a stack. Whenever the top goal on the stack matches the current state description, it is eliminated, and any match substitution in the goal is propagated down the stack. Otherwise, if the top goal is a compound goal, each component literal is added as a separate goal above it on the stack. If solving one component goal undoes an already solved component of the same compound goal, the undone goal is reconsidered and solved again if needed.

If the top goal on the stack is an unsolved single-literal goal, STRIPS finds a production that has that literal in its ADD list, and replaces the goal with the instantiation of that production, placing the production's preconditions above it on the stack.

If the top goal on the stack is actually a rule, then all of the preconditions of the rule have been met, and the

STRIPS had a mechanism for learning, which consisted of generalizing found plans and storing them. This was essentially a form of explanation-based learning in which constants were changed to variables whenever possible. The resulting plan was called a MACROP. Plan generalization reduced planning time by a considerable amount (2/3).

STRIPS can't handle the problem of switching two registers. Note that the technical formulation of STRIPS can handle the Sussman anomaly by reconsidering the subgoals of the final goal, and executing additional actions to reach it. It simply won't do it optimally.

Planning in standard STRIPS is PSPACE-complete. [Canny, 1985]

An example of a STRIPS rule:

     UNSTACK(x, y) has

     Precondition HandEmpty, Clear(x), On(x, y), x != y
     Delete-List (same as Precondition)
     Add-List Holding(x), Clear(y)

(The precondition is sometimes different from the delete list; for example, if we can only move red blocks, we don't delete the precondition that they're red.)

(What are the exact requirements on the expressivity of STRIPS operators?)

RSTRIPS

[Nilsson?] A modification of STRIPS that uses goal regression to deal with goal interaction problems. (It isn't clear whether this was an implemented system, or just one that Nilsson describes in his book.)

In RSTRIPS, when a conjunctive goal is added to the stack, and the individual components of the goal are added above it, they are all bound together with vertical parentheses. When goals are achieved within a vertical parenthesis, or rules would be invoked, they are not removed from the stack, but instead are protected. Protection means that RSTRIPS will ensure that no subsequent action within the vertical parenthesis undoes these achieved goals. When the bottom of the stack is reached, the actual sequence of operators to execute are exactly those in still in the goal stack.

If there is a protection violation within a parenthesis, RSTRIPS first checks whether any subsequent actions within the parenthesis will reachieve the violated goal, since the goal only has to be true at the end of the parenthesis. This constitutes a temporary violation of the goal and is acceptable.

If no subsequent action will reachieve the violated goal, the protection violating component of the parenthesis (the goal whose solution would violate the achieved goal) is regressed back through the productions in the parenthesis unitl it has been regressed through the production that achieved the protected subgoal. Then RSTRIPS attempts to solve the regressed version of the protection violating component at that point in the plan.

HACKER

[Sussman, 1973] HACKER was meant as a model of skill acquisition. A skill is a set of procedures each of which solves some problem in its domain. If a procedure does not exist to solve the problem, a new one must be created.

This approach held that achieving one subgoal and preventing another being accomplished is such a common problem, a system for skill aquisition should be able to debug the procedures it designs. If a subgoal interaction problem arises, it attempts to reorder the initial goals to find a solution.

Compared to PLANNER, which backtracks, HACKER instead designs plans and then corrects errors using debugging experts. It is not complete and solutions are not always optimal, an example being its solution of the Sussman anomaly.

WARPLAN

[David Warren, 1973] A case-based reasoning system that used a "skeleton model" similar to one later proposed by Waldinger. It uses promotion and chronological backtracking to solve the Sussman anomaly optimally. Notable because it was written in about 100 lines of PROLOG, making it far smaller than other planners of the time.

CONNIVER

[Fahlman, 1974] Used influential actions as secondary relations. Secondary relationships are defined in terms of primary relationships; they can be deduced from them and added only as needed.

Example: two blocks touching.

INTERPLAN

[Austin Tate, 1974] INTERPLAN uses a declarative tick list to make it easy to detect protection violations, and determine how to recover from them. Like HACKER, it creates complete plans and then repairs subgoal interference.

Like HACKER, INTERPLAN will reorder subgoals at single level of the hierarchy if there is a protection violation. However, if that does not solve the problem, it will promote a subgoal to a higher level of the hierarchy, above its superordinate goal and possibly other goals as well. Thus it considers a larger space of goal orderings than HACKER, but it can then optimize plans HACKER could not.

Waldinger's Planner

[Waldinger, 1975] This planner introduced goal regression, the basic idea of which is, rather than trying to guarantee goal G holds after operator O, instead find a goal G' such that if G' holds before O, then G holds after O.

The basic planning algorithm is to achieve the first of the conjunctive subgoals of the problem, and then regress the other subgoals through the solution to a point where their achievement will not violate those already achieved.

The main advantage of this approach is that it is constructive; plan steps can be added one at a time. The planner does not check whether regression results in redundant operations, so it can generate sub-optimal plans.

Example of failure: When A is on B, and C is adjacent on the table, and the goal is A on B on C. If the system starts by protecting "A on B," it can't remove A to achieve the other subgoals.

Such violations are allowable if a contract is made to re-achieve the protected relations. This can generate inefficient plans, of course.

Hierarchical Linear Planners

Hierarchical planners create vague plans, and refine the details incrementally until a fully specified plan is constructed.

The earliest hierarchical planner was GPS; it planned in an abstraction space in which all logical connectives were represented by a single symbol.

Hierarchical planners hopefully possess both the upward solution property and the downward solution property. At least one is required for guaranteed pruning.

Example: If we want to take a trip to Europe, we shouldn't begin our plan by worrying about how to get to the airport. Clearly finding a flight to Europe is the most important thing to consider first.

ABSTRIPS

[Sacerdoti, 1974] An improvement on STRIPS intended to deal with the combinatorial explosion problem.

ABSTRIPS is like STRIPS except that it plans in a hierarchy of abstraction spaces. The different levels of abstraction uses the same operators and objects, but certain preconditions are judged as more critical than others (it's more important to book the flight than figure out how to get to the airport). Thus more operators are considered in each successive refinement. This permits early recognition of dead-ends and saves pointless search.

Preconditions are partially ordered by a human being. ABSTRIPS then modifies the order as follows. All preconditions whose truth values can't be changed by an operator are given maximum criticality. For others, if a short plan can be found to achieve it (assuming previous preconditions are true) it is treated as a detail and the criticality given in the partial order is used. If no such plan can be found it is given a criticality just above the highest in the partial order. Planners that take this approach are now commonly said to use approximation hierarchies as opposed to abstraction hierarchies.

One problem with ABSTRIPS is that backtracking provides no information to higher levels about why the plan failed.

Non-Hierarchical Non-Linear Planners

SIPE

[David Wilkins, 1983] Used MOLGEN-like constraints and a new technique for detecting clobbering; a common precondition is a resource, which is a binary variable that must be available to be used, and which an operator changes to some other value during use. If two unordered steps try to use the same resource, they clobber each other. SIPE applies promotion in these cases, though it uses only heuristic means of detecting resource clobbering.

SIPE also allows derived effects and dependency of effects upon the input situation, though this treatment is incomplete and not generally correct. SIPE has the ability to handle reasoning with resources as well.

Planning domains for SIPE have included aircraft carrier operations, multi-story building construction, and beer factory job-shop scheduling.

TWEAK

[Chapman, 1987] A non-linear planner that uses constraint posting as its approach to problem solving. It is capable of solving any nonlinear planning problem.

Constraint posting is the process of defining an object (such as a plan) by incrementally specifying partial constraints it must fit. The search space is pruned as constraints are added, until all remaining alternatives satisfy the constraints. This approach minimizes backtracking.

The planning procedure consists of repeatedly choosing a goal and then making a plan to achieve it. It uses the modal truth criterion to do this; the criterion shows all the ways a proposition could be nessarily true; the procedure chooses one of them and modifies the plan accordingly.

If a set of constraints is inconsistent, TWEAK backtracks. The number of completions of a plan is exponential in its size, so computing whether something is possible or necessary is extremely expensive. The heart of TWEAK is a polynomial-time algorithm (polynomial in the number of steps in the plan) that computes possible and necessary properties of an incomplete plan (see Modal Truth Criterion, below).

TWEAK always has an incomplete plan while working on a problem; this is a partial specification of a plan that may solve the problem. Since it could be solved in many ways it represents a class of complete solutions. A complete plan is a total order on some finite set of plan steps.

Kinds of Constraints

temporal constraint -- A requirement that one step come before another. The set of temporal constraints is just a partial order on the steps in the plan.
codesignation constraints -- Roughly, bounding variables, or requiring them to be disjoint or equal. Formally, an equivalence relation on variables and constants. In a complete plan, all variables in pre- or postconditions must be constrained to codesignate with a specific constant.

Actions TWEAK Takes

step addition -- Creating new steps in the plan.
promotion -- Posting a constraint that a clobberer come after the situation it clobbers.
demotion -- The opposite of promotion. A white knight are constrained to be after the clobberer and before the situation it clobbers.
declobbering -- Placing a (possibly new) step between two old steps so that the new step reasserts some precondition that had been clobbered.
simple establishment -- Assigning a value to a variable to ensure the preconditions of some step.
separation -- Preventing the assignment of certain values to a variable.

The Modal Truth Criterion

Modal Truth Criterion A proposition p is necessarily true in a situation s iff two conditions hold: there is a situation t equal or necessarily previous to s in which p is necessarily asserted; and for every step C possibly before s and every proposition q possibly codesignating with p which C denies, there is a step W necessarily between C and s which asserts r, a proposition such that r and p codesignate whenever p and q codesignate.

(The criterion for possible truth is analagous, with the modalities switched -- read "possible" for "necessary" and vice versa.)

This criterion is the heart of the correctness of TWEAK. Intuitively, C is a clobberer and W is a white knight.

Top-Level Control Structure

TWEAK uses dependency-directed breadth-first search.

Actions in TWEAK

TWEAK's representation of actions is limited. It does not allow indirect side-effects; this is required for the truth criterion to succeed. Linear planners can use more powerful representations but they will probably exponentially increase the search. Another limitation is that all postconditions must be explicitly represented. TWEAK's propositions are unchanged by regression because of their simplicity so RSTRIP's problems don't appear.

Constraints on TWEAK

TWEAK cannot restrict a variable's range to a finite set. This is because constraint computations then become NP-complete (graph coloring) and the truth criterion fails.

Correctness/Completeness of TWEAK

Theorem 4 (Correctness/Completeness) If TWEAK terminates claiming a solution, the plan it produces does in fact solve the problem. If TWEAK returns signalling failure or does not halt, no solution exists.

Hierarchical Non-Linear Planners

NOAH

[Sacerdoti, 1975] NOAH (Nets Of Action Hierarchies) is a consultant system that advises a human amateur in a repair task. An advantage of its hierarchical nature is that it can specify abstract plans for trained experts (e.g., bolt the bracket to the frame) but could provide more explicit details for a novice.

NOAH's hierarchy is formed by abstractions of problem-solving operators. It initially plans with generalized operators, and then refines them.

NOAH uses procedural nets to represent procedural domain knowledge and declarative problem solving knowledge. (Procedural nets look rather like partially-ordered plans -- POD) There is a net for each level of abstraction in the plan.

At each level in the plan, a table of multiple effects (TOME) summarizes all the propositions asserted or denied by more than one node in the net. Interaction between subgoals could potentially occur if a proposition's value is changed by more than one node.

NOAH uses a set of procedures called critics that watch the TOME for effects of actions that would jeopardize a plan. Critics detect and fix interaction problems, eliminate redundancies, and so on. Specific critics include resolve conflict (by reordering), eliminate redundant preconditions (same operator appears twice in the plan), use existing objects (instantiate variables to help).

NOAH is least commitment in the sense that it doesn't order subgoals until it is necessary, thus avoiding interacting subgoals when possible.

NONLIN

[Austin Tate, 1977] An improved NOAH that operated in the domain of electric turbine overhaul. It added backtracking and plan-modification operators to NOAH. It uses heuristics to guide search, and later was improved to use dependency-directed backtracking.

MOLGEN

[Stefik, 1980] MOLGEN is a knowledge-driven program that helps geneticists plan molecular genetics experiments. It extends hierarchical planning work by introducing a layered control structure that also does meta-planning. Plans are constructed in one layer, while decisions about the plan design are made in a higher layer, and strategies for controlling design decisions are at the highest level.

A key idea in MOLGEN is to represent subproblem interactions explicitly and declaratively, so the system can reason about them and use them to guide planning. MOLGEN calls interaction restrictions constraints.

MOLGEN uses three kinds of operations on constraints. Constraint formulation identifies interactions between goal solutions. Constraint propagation creates new constraints from old ones, to help refine abstract parts of a plan. Lastly, constraint satisfaction tries to replace objects in order to satisfy existing constraints (replace bacterium with e. coli.).

Control in MOLGEN switches between three layers. The lowest is the planning space, which contains hierarchies of operations and objects used in gene-splicing experiments (e.g., bacteria, drugs, etc.) and generalizations of these things (e.g., genes, organisms, sorting operations). MOLGEN initially plans using the abstract entities from the hierarchy, and as it refines its plan it uses the constraints to instantiate the elements.

The middle layer of planning is called the design space. At this level MOLGEN reasons about plans using the objects and operators in the planning space, just as the planning space reasons about those things directly.

The top layer of planning is the strategy space. This contains four general operators that dictate strategy. These are FOCUS and RESUME, which propose new planning steps and reactivate old ones "put on hold," and GUESS and UNDO, which make heuristic planning decisions when there is inadequate information to focus or resume. UNDO is a backtracking operator that undoes overconstraining decisions. The strategy space prefers least-commitment to heuristics.

Of the three layers, only the planning layer is domain-specific.

Systematic Nonlinear Planning

[McAllester and Rosenbiltt, 1990 (?)] The Systematic Nonlinear planning algorithm is a sound, complete, and systematic nonlinear planning system. Systematicity is the property that no plan or partial plan is examined more than once.

This approach defines a nonlinear plan as symbol table (mapping from a finite set of step names to operators), a set of causal links, and a set of safety conditions (orderings on plan steps).

This system is based upon a simplified, ground version of the NONLIN system, together with a general lifting procedure. The procedure, roughly, is to add steps to a nonlinear, incomplete plan one at a time, that satisfy preconditions of other steps, adding safety conditions to prevent threats as necessary.

Lifting can be used to transform the ground procedure to deal with operator schemas and plan schemas involving variables. This is done simply by creating new instances of operator schemas during the process of adding a step to the incomplete plan.

(It's not entirely clear that they describe a hierarchical planner. Nothing to this effect is stated or implied by the paper, but it is based somewhat upon NONLIN, which certainly is hierarchical. I see nothing to prevent it from being used that way in any case. -- POD)

This algorithm is implemented in the SNLP (Systematic NonLinear Planner) system of Soderland and Weld [1991].

Case-Based Planners

Case-based planning systems rely upon skeletal plans that are already available. The premise is that experience is required to formulate new kinds of plans and debug old ones. There are very few completely new plan outlines, but many new instantiations of plan types. Planning consists of using and debugging existing plans, and storing the results of creating new, useful plans.

cases + transformations = plans + refinement strategies

MOLGEN / Skeletal Plans

[Peter Friedland, 1979] Skeletal plans are specified at many levels of generality; most plans for experiments are not invented from scratch, but instantiated from skeletal plans. Friedland's MOLGEN used this approach, differing from Stefik's.

Opportunistic Planning

[Hayes-Roth and Hayes-Roth] An approach based upon human planning. This approach uses a blackboard control structure on which suggestions are made for plan steps by specialists. Ordering is developed piecewise, and a plan "grows out" of concrete clusters of operators. Such opportunistic planning builds islands in plan space and attempts to connect them, while hierarchical planners try to develop an entire plan at once and then refine it.

Opportunistic planners essentially takes a bottom-up approach to planning, where hierarchical planners take a top-down approach.

OPM

[Hayes-Roth and Hayes-Roth, 1979] Journey-planning system that does opportunistic search on the HERESAY-II blackboard system.

Other Approaches and Issues

Resource-Sensitive Planning

These are systems which either reason about resources consumed as part of the execution of the plan, or systems that are aware that the activity of planning itself comes at a cost.

Interval Logic

[James Allen, 1983] A system for representing actions that take time to execute. If no interval is instantaneous, there are 13 possible relations between two intervals: before, meets, overlaps, starts, during, finishes, is equal to, plus the symmetric operators.

Reference intervals can be used to speed reasoning. Certain interval nodes can be singled out as references. To determine the possible labels on a link between two interval nodes, we look for paths between the nodes such that all links save the first and last must connect reference node pairs.

(This section needs a more detailed explanation of the point.)

DEVISER

[Steve Vere, 1983] Performs Voyager spacecraft mission sequencing, in which actions are temporal intervals with numerical endpoints. DEVISER was the first system that dealt with planning in continuous time.

PRS

[Georgeff and Lansky, 1987] PRS (Procedural Reasoning System) uses metareasoning to recognize problems that may require additional planning; the system is based on temporal logic.

Using Measures in Planning

One way to handle fluid resources (such as money or water) in planning is to use numeric-valued measures. A measure is an amount of something, such as money or volume. Expressions that describe the amount, such as GasLevel, are called measure fluents.

Planners that use measures usually require that they be declared initially, along with range bounds. Some measures are controlled by the environment (such as the price of gas) while others are resources that can be produced and consumed.

To use measures, inequality tests can be placed in the preconditions of operators, and changes can be described in the conditions.

Many modern systems (e.g., SIPE, O-PLAN, DEVISER) have mechanisms for handling measure, but there is still no clean formulation of a theory of resource allocation in planning.

Triangle Tables

[Fikes and Nilsson] Introduced in STRIPS. Triangle tables serve two functions. The first is as a stored MACROP that can later be used to solve a similar problem. The second is to respond (albeit in a limited way) to unexpected effects of actions in the world.

A triangle table is a lower-diagonal matrix whose columns are labeled by actions and numbers, ranging from 0 to n, where n is the size of the plan. The rows are numbered from the top from 1 to n+1.

The entries in cell (i, j) are the literals added to the state by the j-th rule, that survive as preconditions of the i-th rule. The 0th column contains literals in the initial state description that are preconditions of the i-th rule. The entries in the bottom row are those added by various rules that are components of the goal.

The entries in the row to the left of the i-th rule are precisely the preconditions of that rule. The entries in the column below the i-th rule are precisely the ADD formula literals of the rule that are needed by subsequent rules (or are components of the goal).

Kernels

The i-th kernel is the intersection of the i-th row and all rows below it with all the columns to the left of the i-th column. The entries in this kernel are precisely the conditions that a state description must meet in order to apply to sequence of rules in it to achieve the goal.

This property of the table is useful for execution monitoring. At each step, the system can look for the highest-numbered matching kernel. If an unexpected benefit places it closer to the goal, it will only execute the appropriate remaining actions. If there is a setback, it will re-execute necessary actions. To find the appropriate kernel, the system checks them in turn beginning with the last row in the table and working backward.

Generalizing Plans

When a plan is generalized, some constants may have to have specific values. The general idea is to "lift" the triangle table to its most general form by replacing every constant in the clauses of the left-most column with new parameters. The remainder is then filled in with appropriate ADD clauses. This lifted table is too general, so we redo the precondition proof of each operator using the support clauses in the lifted table.

Problems with overgeneralization include the fact that the same constant may be generalized to different values, and thus place an object in different rooms.

Universal and Reactive Planning

A universal plan [Schoppers, 1987] is one that generates appropriate behavior in unpredictable environments. Only a goal condtion is specified for all starting states. It always selects an action that would move the current situation toward the goal in a cooperative world. TMSs are used to encode world descriptions. As might be expected, universal plans are limited to small problem spaces. (Universal plans are in fact equivalent to policies in MDPs.)

Reactive approaches include Brooks's subsumption architecture, Rosenschein and Kaelbling's situated automata (REX and the compiler to circults, GAPPS), and Nils Nilsson's action nets.

Conditional Planning

Similar to universal planning, conditional planning deals with incomplete information by devising a plan that accounts for each possible contingency that may arise. Such plans are then directed at execution by the agent's sensors.

In conditional planning, there are sensing actions that will allow the agent to determine the truth or falsity of certain propositions. When executed, the plan will branch on the result of the sensing action.

During the planning process, conditional steps are added that involve sensing actions. These steps have conditional links to other steps whose preconditions they determine.

The other addition to traditional planners is context, which is associated with each step in the plan. Context is the set of conditionals that must be true in order for the step to be executed -- effectively it is a "branch" of the plan. Whenever conditional steps are added that can give rise to multiple contexts, new finish states are added to account for those contexts as well. Context also offers an additional way to deal with threats; threats can be conditioned so that their contexts are incompatible with those of the steps they threaten.

Uncertainty

Uncertain worlds complicate the planning problem, since it isn't generally possible to guarantee the effects of an action, or that the world's state won't change independently of any action. Traditional planning systems (such as STRIPS) rely largely upon predictable worlds, while more reactive systems (replanning, deferred planning, game playing, subsumption architecture) sacrifice planning ability in exchange for better response to a dynamic environment.

Issues in making this tradeoff include:

reactivity -- requires many sensors, and demands a short response time
predictability -- allows the system to make longer, more efficient plans
high-level planning -- demands that the environment be more predictable (not sure I buy this; a hierarchical planner should be better suited to handle dynamism, shouldn't it? -- POD)
correctness vs. robustness -- guaranteed productive actions as opposed to the capacity to deal with uncertainty; uncertainty can be reduced by taking actions that force the world into a known state (e.g., shaking the table to knock all blocks down)

Learning

Learning is becoming an important topic in planning and I know almost nothing about it. STRIPS [Fikes and Nilsson] could learn by generalizing triangle tables into MACROPs to resolve similar problems in the future. PRODIGY [Minton, 1985] used explanation-based learning techniques to generalize its plans as well. CHEF [Hammond, 1986] learns in a case-based planning environment.

Using TMSs in Planning Systems

When a planner is designing a plan to execute in a real environment, it makes assumptions about the conditions in that environment during the plan's execution. The choices it makes during planning can also be regarded as assumptions.

A TMS can be used to record these assumptions. Failure of the plan during execution can be seen as an undesirable property of the situation, and the TMS can be used to analyze which of the environmental or action assumptions of the TMS are invalid.

It is important to be careful about application of TMSs. If, for example, we use transitivity to prove that X is left of Z because X is to the left of Y which is to the left of Z, a TMS will retract this result when Y is moved.

Extentions and State of the Art

Planning today is engaged in trying to deal with these five main areas (it seems to me):

hierarchical planning
complex conditions -- preconditions depending upon state, effects that are universal, widespread, changing state, etc.
time -- any representation; representations over intervals, numerical or non-numerical
resources -- limits on consumption of fixed resources; distribution and generation of renewable ones
uncertainty -- unpredictability in the world and in the agent's actions

The main issue in dealing with these problems is to expand the power of the operator representations. Russell and Norvig describe a planner, POP-DUNC, that adds conditional effects, negated goals, disjunctive preconditions and universally quantified preconditions and effects. (Note that universal quantification is actually limited to a shorthand for enumeration, since the world is presumed to be finite and static.) Adding all of these is reasonably straightforward; however, adding disjunctive effects is much more complex, because it makes the environment a nondeterministic one.

Recent theoretical systems: CNLP [Peot and Smith, 1992] handles conditional planning; C-BURIDAN [Draper et al., 1994] handles conditional planning for actions that have probabilistic outcomes, relating to MDPs.

Some current systems in use: OPTIMUM-AIV helps in assembly, integration, and verificiation of spacecraft for ESA; quick replanning is a critical element of the system. It is based on O-PLAN.

PlanERS-1, another O-PLAN planner, schedules observations for the European Earth Resource Observation satellite.

The Hubble Space Telescope uses long-term (SPIKE) and short-term (SPSS) planners to decide how to assign jobs and how to carry them out.

Timeline and Accomplishments

Date	Planner	Authors	Significant Contributions
1956	Logic Theorist	Newell, Shaw, Simon	first serious use of heuristics; backward search
1959	Gelerntner's Geometry Theorem-Proving Machine	Gelerntner	first system to handle conjunctive subgoals
1957-1969	GPS	Newell, Shaw, Simon	introduced means-end analysis; separating domain knowledge from general search knowledge
1969	QA3	Green	planned via resolution theorem proving; introduced situations in propositions; required use of frame axioms
1971	STRIPS	Fikes and Nilsson	introduced STRIPS operators; triangle tables to learn MACROPs and cope with uncertainty
1973	HACKER	Sussman	idea of using debugging experts to revise a completed plan; introduces notion of protection; reorders goals to cope with protection violations
1973	WARPLAN	Warren	introduced (?) the idea of promotion; also an early use of the idea of skeleton plans; written in PROLOG
1974	INTERPLAN	Tate	like HACKER, but promotes subgoals if reordering fails
1974	ABSTRIPS	Sacerdoti	clearly introduces concept of abstraction hierarchy, criticality of operators
1975	Waldinger's Planner	Waldinger	introduced goal regression
1975	NOAH	Sacerdoti	introduced nonlinear planning; uses critics to fix/improve plans
1977	NONLIN	Tate	improved NOAH with dependency-directed backtracking, plan modification operators
1979	OPM	Hayes-Roth and Hayes-Roth	used blackboard structure and island-driven planning approach
1980	MOLGEN	Stefik	introduced layers of planning control: strategy, design, planning
1983	SIPE	Wilkins	concept of "resources" in preconditions; first to deal with replanning
1987	TWEAK	Chapman	provably correct and complete; introduces constraint posting, modal truth criterion
1991	SNLP	Soderland and Weld	sound, complete, systematic nonlinear planner; uses concept of lifting; basis of modern nonlinear planning systems
1991	O-PLAN	Currie and Tate	can handle time, resources, hierarchical planning, uses heuristics in search, justifies reasons for each choice made
1992	UCPOP	Penberthy and Weld	hierarchical, nonlinear, multiagent, conditional effects, et al.

Sources

Ronny Kohavi, AI Qual Note #14: Planning. Feigenbaum and Cohen, Handbook of Artificial Intelligence, Vol. 3, Chap. 15. Nilsson, Principles of Artificial Intelligence, Chapters 7-8. McAllester and Rosenblitt, Systematic Nonlinear Planning.

Russell and Norvig, Artificial Intelligence: A Modern Approach, Chapters 11-13.

See the AI Qual Reading List for further details on these sources.

Patrick Doyle pdoyle@cs.stanford.edu