Planning
Patrick Doyle, AI Qual Summary, April 30, 1997
The canonical illustration, the Sussman anomaly, is as follows. In the Blocks World, the initial state consists of C on A on the table, and B on the table. The goal state is A on B on C.
Most classical planners made the following assumptions:
Note that all of these assumptions have given way in recent planning systems.
There are five basic approaches to planning:
There are two basic ways of viewing planning as search. The first is to look at planning as an operation of searching a space of world states, in which transitions from one state to another are actions the agent can perform.
The second and more general view is one in which the space consists of partial plans, and each node is a partially completed plan. Transitions in this space are accomplished by adding actions to plans. The first view can be seen as a particular instance of this view, in which a transition always adds an action to the beginning or end of a plan.
Theorems are taken from Chapman's paper on TWEAK.
Theorem 1 (First Undecidability) Any Turing machine with its input can be encoded as a planning problem in the TWEAK representation. Therefore, planning is undecidable, and no upper bound can be put on the amount of time required to solve a problem.
(Note that this theorem relies upon an infinite initial state. If the initial state is restricted to be finite, planning may be decidable.)
Theorem 2 (Intractability) The problem of determining whether a proposition is necessarily true in a nonlinear plan whose action representation is sufficiently strong to represent conditional actions, dependency of effects on input situations, or derived side effects is NP-hard.
Theorem 3 (Second Undecidability) Planning is undecidable even with a finite initial situation if the action representation is extended to represent actions whose effects are a function of their situation.
In extended representations of action, linear planning does not
require exponential amounts of work to verify the modal truth
criterion, but there will be exponential amounts of backtracking.
[Newell, Shaw, and Simon, 1956] One of the first programs to use heuristics, sacrificing completeness for efficiency. The problem domain was theorems in propositional calculus (the system proved 38 of the first 52 theorems in Russell and Whitehead's Principia).
The Logic Theorist operates by backward reasoning from the theorem toward the axioms and previously proved theorems. It was seriously limited by its heuristics and could not prove p ∨ ¬¬¬p, for example.
[Gelernter, 1959] Written at IBM. The first program that could handle conjunctive subgoals. It used symmetry in reasoning and would reuse previously introduced points to build new constructions.
New subgoals were checked for consistency with the existing diagram; if false in the diagram it could not be a theorem and was pruned in the search. The diagram and rules for avoiding circular reasoning eliminated about 99% of subgoals.
[Slagle, 1961] SAINT -- Symbolic Automatic INTegrator.
[Moses, 1967] SIN -- Symbolic INtegrator. SIN was "capable of
solving integration problems as difficult as ones found in the largest
tables."
[Newell, Shaw, Simon, 1957-1969] The General Problem Solver (GPS) was designed to solve problems requiring human intelligence, as well as to develop a theory of how humans solve problems. Although it was meant to be general it solved problems only in highly-structured domains (propositional calculus proofs, puzzles, symbolic integration, etc.).
GPS was the first planner to distinguish between general
problem-solving knowledge and domain knowledge (hence the name). It
introduced means-end analysis in its search. This approach
tried to find a difference between the current object and the goal,
and then used a lookup table to invoke an action that would minimize
that difference. If the action could not be applied to the current
object, it recursively tried to change the object into one that
was appropriate.
Logic planning systems are based on Robinson's resolution refutation method of theorem proving.
[Green, 1969] Green was attempting to solve robot planning problems. His approach was to formulate the problem so that a theorem prover could solve it.
Green introduced a "state" or "situation" variable into each predicate. The goal condition then consisted of a formula with an existentially quantified state. The system's job was to prove that the goal condition was satisfiable; that is, there existed some state in which the condition was true.
A special function do(action, state) denotes the state resulting from performing the action in the given state.
Green's formulation also used frame assertions to state what did not change as the result of an action. A serious drawback of the system was that a frame assertion was required for every relation unaffected by an action.
An example of a rule in Green's formulation:
Clear(x, s) ∧ Clear(z, s) ∧ On(x, y, s) ∧ Different(x, z) → [Clear(x, do(trans(x, y, z), s)) ∧ Clear(y, do(trans(x, y, z), s)) ∧ On(x, z, do(trans(x, y, z), s))]
Here trans(x, y, z) denotes moving x from y onto z.
Green implemented a system called QA3 that employed this formulation.
Kowalski's approach, an attempt to simplify the frame assertions of Green's formulation, introduced the Holds literal. This allowed the results of each action to be framed by a single axiom, e.g.:
Holds(v, s) ∧ Different(v, Clear(z)) ∧ Different(v, On(x, y)) → Holds(v, do(trans(x, y, z), s))
The earliest planning systems were non-hierarchical. This means that they made no distinction between more important (such as "go to the supermarket") and less important ("open the door") parts of the plan, so they could be bogged down dealing with unimportant details.
[Fikes and Nilsson, 1971] Devised as an improvement upon situation calculus and Green's formalism. STRIPS represents the world as a set of formulae in first-order logic. Each state in the search space consists of a world model and a set of goals to be achieved. Search was guided by means-end analysis; a resolution-based theorem prover verified operator preconditions and established the validity of the goal formula. A separate system, PLANEX, supervised execution.
Goals are maintained in a stack. Whenever the top goal on the stack matches the current state description, it is eliminated, and any match substitution in the goal is propagated down the stack. Otherwise, if the top goal is a compound goal, each component literal is added as a separate goal above it on the stack. If solving one component goal undoes an already solved component of the same compound goal, the undone goal is reconsidered and solved again if needed.
If the top goal on the stack is an unsolved single-literal goal, STRIPS finds a production that has that literal in its ADD list, and replaces the goal with the instantiation of that production, placing the production's preconditions above it on the stack.
If the top goal on the stack is actually a rule, then all of the preconditions of the rule have been met, and the rule is applied: its delete list is removed from the world model, its add list is asserted, and the action is appended to the plan.
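To make the stack discipline concrete, here is a minimal sketch of the regime just described, under simplifying assumptions: states are sets of ground literals, the theorem prover is replaced by set membership, and variable matching is omitted. The operator shown is the UNSTACK example given below; names are illustrative, not the original system's.

```python
# Hedged sketch of STRIPS-style goal-stack planning. States are
# frozensets of ground literals; the real system's theorem prover is
# replaced by set membership, and match substitution is omitted.

from dataclasses import dataclass

@dataclass(frozen=True)
class Operator:
    name: str
    preconds: frozenset
    add: frozenset
    delete: frozenset

def goal_stack_plan(state, goals, operators, limit=100):
    """Return a list of operator names achieving `goals`, or None."""
    stack = [("conj", goals)]
    plan = []
    while stack and limit > 0:
        limit -= 1
        kind, item = stack.pop()
        if kind == "conj":
            if item <= state:
                continue                     # whole conjunction holds
            stack.append(("conj", item))     # re-check after components
            for lit in item:
                stack.append(("lit", lit))   # solve components separately
        elif kind == "lit":
            if item in state:
                continue
            op = next((o for o in operators if item in o.add), None)
            if op is None:
                return None                  # no rule adds this literal
            stack.append(("apply", op))      # rule sits below its preconds
            stack.append(("conj", op.preconds))
        else:                                # "apply": preconditions all met,
            state = (state - item.delete) | item.add   # update world model
            plan.append(item.name)           # append the action to the plan
    return plan if not stack else None

ops = [Operator("UNSTACK(C,A)",
                frozenset({"HandEmpty", "Clear(C)", "On(C,A)"}),
                frozenset({"Holding(C)", "Clear(A)"}),
                frozenset({"HandEmpty", "Clear(C)", "On(C,A)"}))]
print(goal_stack_plan(frozenset({"HandEmpty", "Clear(C)", "On(C,A)"}),
                      frozenset({"Holding(C)"}), ops))   # ['UNSTACK(C,A)']
```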
STRIPS had a mechanism for learning, which consisted of generalizing found plans and storing them. This was essentially a form of explanation-based learning in which constants were changed to variables whenever possible. The resulting plan was called a MACROP. Plan generalization reduced planning time considerably (by about two-thirds).
STRIPS can't handle the problem of switching two registers. Note that the technical formulation of STRIPS can handle the Sussman anomaly by reconsidering the subgoals of the final goal, and executing additional actions to reach it. It simply won't do it optimally.
Planning in standard STRIPS is PSPACE-complete. [Canny, 1985]
An example of a STRIPS rule:
UNSTACK(x, y):
    Preconditions: HandEmpty, Clear(x), On(x, y), x != y
    Delete list: (same as the preconditions)
    Add list: Holding(x), Clear(y)
(The precondition is sometimes different from the delete list; for example, if we can only move red blocks, we don't delete the precondition that they're red.)
(What are the exact requirements on the expressivity of STRIPS operators?)
[Nilsson?] A modification of STRIPS that uses goal regression to deal with goal interaction problems. (It isn't clear whether this was an implemented system, or just one that Nilsson describes in his book.)
In RSTRIPS, when a conjunctive goal is added to the stack and its individual components are added above it, they are all bound together with vertical parentheses. When goals within a vertical parenthesis are achieved, or rules are invoked, they are not removed from the stack but are instead protected. Protection means that RSTRIPS will ensure that no subsequent action within the vertical parenthesis undoes these achieved goals. When the bottom of the stack is reached, the sequence of operators to execute is exactly the sequence still in the goal stack.
If there is a protection violation within a parenthesis, RSTRIPS first checks whether any subsequent actions within the parenthesis will reachieve the violated goal, since the goal only has to be true at the end of the parenthesis. This constitutes a temporary violation of the goal and is acceptable.
If no subsequent action will reachieve the violated goal, the protection-violating component of the parenthesis (the goal whose solution would violate the achieved goal) is regressed back through the productions in the parenthesis until it has been regressed through the production that achieved the protected subgoal. RSTRIPS then attempts to solve the regressed version of the protection-violating component at that point in the plan.
[Sussman, 1973] HACKER was meant as a model of skill acquisition. A skill is a set of procedures each of which solves some problem in its domain. If a procedure does not exist to solve the problem, a new one must be created.
This approach held that achieving one subgoal while undoing another is so common a problem that a system for skill acquisition should be able to debug the procedures it designs. If a subgoal interaction problem arises, HACKER attempts to reorder the initial goals to find a solution.
Compared to PLANNER, which backtracks, HACKER instead designs plans and then corrects errors using debugging experts. It is not complete, and its solutions are not always optimal; an example is its solution of the Sussman anomaly.
[David Warren, 1973] A case-based reasoning system that used a "skeleton model" similar to one later proposed by Waldinger. It uses promotion and chronological backtracking to solve the Sussman anomaly optimally. Notable because it was written in about 100 lines of PROLOG, making it far smaller than other planners of the time.
[Fahlman, 1974] Introduced the use of secondary relations. Secondary relationships are defined in terms of primary relationships; they can be deduced from them and added only as needed.
Example: two blocks touching.
[Austin Tate, 1974] INTERPLAN uses a declarative tick list to make it easy to detect protection violations, and determine how to recover from them. Like HACKER, it creates complete plans and then repairs subgoal interference.
Like HACKER, INTERPLAN will reorder subgoals at a single level of the hierarchy if there is a protection violation. However, if that does not solve the problem, it will promote a subgoal to a higher level of the hierarchy, above its superordinate goal and possibly other goals as well. It thus considers a larger space of goal orderings than HACKER, and it can optimize plans HACKER could not.
[Waldinger, 1975] This planner introduced goal regression, the basic idea of which is, rather than trying to guarantee goal G holds after operator O, instead find a goal G' such that if G' holds before O, then G holds after O.
The basic planning algorithm is to achieve the first of the conjunctive subgoals of the problem, and then regress the other subgoals through the solution to a point where their achievement will not violate those already achieved.
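For STRIPS-style operators the regression step itself is mechanical. A sketch, reusing the Operator representation from the goal-stack example above and assuming add and delete lists are disjoint sets of ground literals:

```python
# Regress a conjunctive goal through a STRIPS-style operator:
# G' = (G - Add(O)) | Pre(O), undefined if O deletes part of G.

def regress(goal, op):
    """Weakest precondition G' such that running `op` in any state
    satisfying G' leaves `goal` true; None if `op` clobbers it."""
    if goal & op.delete:
        return None                      # op undoes part of the goal
    return (goal - op.add) | op.preconds
```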
The main advantage of this approach is that it is constructive; plan steps can be added one at a time. The planner does not check whether regression results in redundant operations, so it can generate sub-optimal plans.
Example of failure: A is on B, C is beside them on the table, and the goal is A on B on C. If the system starts by protecting "A on B," it cannot remove A to achieve the other subgoals.
Such violations are allowable if a contract is made to
re-achieve the protected relations. This can generate inefficient
plans, of course.
Hierarchical planners create vague plans, and refine the details incrementally until a fully specified plan is constructed.
The earliest hierarchical planner was GPS; it planned in an abstraction space in which all logical connectives were represented by a single symbol.
Ideally, hierarchical planners possess both the upward solution property and the downward solution property; at least one is required for guaranteed pruning.
Example: If we want to take a trip to Europe, we shouldn't begin our plan by worrying about how to get to the airport. Clearly finding a flight to Europe is the most important thing to consider first.
[Sacerdoti, 1974] An improvement on STRIPS intended to deal with the combinatorial explosion problem.
ABSTRIPS is like STRIPS except that it plans in a hierarchy of abstraction spaces. The different levels of abstraction use the same operators and objects, but certain preconditions are judged more critical than others (it's more important to book the flight than to figure out how to get to the airport). More preconditions are thus considered in each successive refinement. This permits early recognition of dead ends and saves pointless search.
Preconditions are partially ordered by a human being. ABSTRIPS then modifies the order as follows. All preconditions whose truth values can't be changed by an operator are given maximum criticality. For others, if a short plan can be found to achieve it (assuming previous preconditions are true) it is treated as a detail and the criticality given in the partial order is used. If no such plan can be found it is given a criticality just above the highest in the partial order. Planners that take this approach are now commonly said to use approximation hierarchies as opposed to abstraction hierarchies.
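A sketch of that criticality-assignment procedure; the static-precondition test and the "short plan exists" test are left as stand-in callables, and all names here are hypothetical rather than ABSTRIPS's own:

```python
# Hedged sketch of ABSTRIPS-style criticality assignment. `partial_rank`
# is the human-supplied partial order (higher = more critical);
# `is_static` and `has_short_plan` stand in for the checks the real
# system performed with its operators and planner.

def assign_criticalities(preconds, partial_rank, is_static, has_short_plan):
    max_rank = max(partial_rank.values())
    crit = {}
    for p in sorted(preconds, key=lambda q: -partial_rank[q]):
        if is_static(p):                   # no operator can change it
            crit[p] = max_rank + 2         # maximum criticality
        elif has_short_plan(p, crit):      # a mere detail
            crit[p] = partial_rank[p]      # keep the human-given rank
        else:                              # hard to achieve
            crit[p] = max_rank + 1         # just above the partial order
    return crit
```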
One problem with ABSTRIPS is that backtracking provides no
information to higher levels about why the plan failed.
[David Wilkins, 1983] Used MOLGEN-like constraints and a new technique for detecting clobbering; a common precondition is a resource, which is a binary variable that must be available to be used, and which an operator changes to some other value during use. If two unordered steps try to use the same resource, they clobber each other. SIPE applies promotion in these cases, though it uses only heuristic means of detecting resource clobbering.
SIPE also allows derived effects and dependency of effects upon the input situation, though this treatment is incomplete and not generally correct. SIPE has the ability to handle reasoning with resources as well.
Planning domains for SIPE have included aircraft carrier operations, multi-story building construction, and beer factory job-shop scheduling.
[Chapman, 1987] A nonlinear planner that uses constraint posting as its approach to problem solving. It is provably able to solve any solvable problem expressible in its action representation.
Constraint posting is the process of defining an object (such as a plan) by incrementally specifying partial constraints it must fit. The search space is pruned as constraints are added, until all remaining alternatives satisfy the constraints. This approach minimizes backtracking.
The planning procedure consists of repeatedly choosing a goal and then making a plan to achieve it. It uses the modal truth criterion to do this; the criterion exhibits all the ways a proposition could be necessarily true, and the procedure chooses one of them and modifies the plan accordingly.
If a set of constraints is inconsistent, TWEAK backtracks. The number of completions of a plan is exponential in its size, so computing whether something is possible or necessary is extremely expensive. The heart of TWEAK is a polynomial-time algorithm (polynomial in the number of steps in the plan) that computes possible and necessary properties of an incomplete plan (see Modal Truth Criterion, below).
TWEAK always has an incomplete plan while working on a problem; this is a partial specification of a plan that may solve the problem. Since an incomplete plan can be completed in many ways, it represents a class of complete solutions. A complete plan is a total order on some finite set of plan steps.
Modal Truth Criterion A proposition p is necessarily true in a situation s iff two conditions hold: there is a situation t equal or necessarily previous to s in which p is necessarily asserted; and for every step C possibly before s and every proposition q possibly codesignating with p which C denies, there is a step W necessarily between C and s which asserts r, a proposition such that r and p codesignate whenever p and q codesignate.
(The criterion for possible truth is analogous, with the modalities switched -- read "possible" for "necessary" and vice versa.)
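One way to transcribe the criterion above into a formula (a paraphrase, not Chapman's exact notation): □ marks "necessarily", ◇ "possibly", ≈ codesignation, and ⪯ temporal precedence.

```latex
\Box\,\mathrm{true}(p, s) \iff
  \exists t.\; \Box(t \preceq s) \wedge \Box\,\mathrm{asserts}(t, p)
  \;\wedge\;
  \forall C, q.\;
    \bigl[\Diamond(C \prec s) \wedge \mathrm{denies}(C, q)
          \wedge \Diamond(p \approx q)\bigr] \Rightarrow
    \exists W, r.\; \Box(C \prec W \preceq s) \wedge \mathrm{asserts}(W, r)
          \wedge \bigl[(p \approx q) \Rightarrow (r \approx p)\bigr]
```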
This criterion is the heart of the correctness of TWEAK. Intuitively, C is a clobberer and W is a white knight.
TWEAK uses dependency-directed breadth-first search.
TWEAK's representation of actions is limited. It does not allow indirect side effects; this restriction is required for the truth criterion to hold. Linear planners can use more powerful representations, but these will probably increase the search exponentially. Another limitation is that all postconditions must be explicitly represented. Because of their simplicity, TWEAK's propositions are unchanged by regression, so RSTRIPS's problems do not appear.
TWEAK cannot restrict a variable's range to a finite set. This is because constraint computations then become NP-complete (graph coloring) and the truth criterion fails.
Theorem 4 (Correctness/Completeness) If TWEAK terminates
claiming a solution, the plan it produces does in fact solve the
problem. If TWEAK returns signalling failure or does not
halt, no solution exists.
[Sacerdoti, 1975] NOAH (Nets Of Action Hierarchies) is a consultant system that advises a human amateur in a repair task. An advantage of its hierarchical nature is that it can specify abstract plans for trained experts (e.g., bolt the bracket to the frame) but could provide more explicit details for a novice.
NOAH's hierarchy is formed by abstractions of problem-solving operators. It initially plans with generalized operators, and then refines them.
NOAH uses procedural nets to represent procedural domain knowledge and declarative problem solving knowledge. (Procedural nets look rather like partially-ordered plans -- POD) There is a net for each level of abstraction in the plan.
At each level in the plan, a table of multiple effects (TOME) summarizes all the propositions asserted or denied by more than one node in the net. Interaction between subgoals could potentially occur if a proposition's value is changed by more than one node.
NOAH uses a set of procedures called critics that watch the TOME for effects of actions that would jeopardize a plan. Critics detect and fix interaction problems, eliminate redundancies, and so on. Specific critics include resolve conflict (by reordering), eliminate redundant preconditions (same operator appears twice in the plan), use existing objects (instantiate variables to help).
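A small sketch of how a TOME can flag candidate interactions; the node names and literals are illustrative, not NOAH's actual data structures:

```python
# Hypothetical TOME sketch: map each proposition to the plan nodes
# that assert ("+") or deny ("-") it, then keep only propositions
# whose value is changed by more than one node, since only those can
# give rise to subgoal interactions.

from collections import defaultdict

def build_tome(nodes):
    """`nodes` maps node name -> (asserted set, denied set)."""
    tome = defaultdict(list)
    for name, (asserts, denies) in nodes.items():
        for p in asserts:
            tome[p].append((name, "+"))
        for p in denies:
            tome[p].append((name, "-"))
    return {p: entries for p, entries in tome.items() if len(entries) > 1}

nodes = {"put-B-on-C": ({"On(B,C)"}, {"Clear(C)"}),
         "put-A-on-B": ({"On(A,B)"}, {"Clear(B)"}),
         "clear-B":    ({"Clear(B)"}, set())}
print(build_tome(nodes))   # Clear(B) touched by two nodes: candidate conflict
```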
NOAH is least commitment in the sense that it doesn't order subgoals until it is necessary, thus avoiding interacting subgoals when possible.
[Austin Tate, 1977] An improved NOAH that operated in the domain of electric turbine overhaul. It added backtracking and plan-modification operators to NOAH. It uses heuristics to guide search, and later was improved to use dependency-directed backtracking.
[Stefik, 1980] MOLGEN is a knowledge-driven program that helps geneticists plan molecular genetics experiments. It extends hierarchical planning work by introducing a layered control structure that also does meta-planning. Plans are constructed in one layer, while decisions about the plan design are made in a higher layer, and strategies for controlling design decisions are at the highest level.
A key idea in MOLGEN is to represent subproblem interactions explicitly and declaratively, so the system can reason about them and use them to guide planning. MOLGEN calls interaction restrictions constraints.
MOLGEN uses three kinds of operations on constraints. Constraint formulation identifies interactions between goal solutions. Constraint propagation creates new constraints from old ones, to help refine abstract parts of a plan. Lastly, constraint satisfaction tries to substitute objects in order to satisfy existing constraints (e.g., replacing a generic bacterium with E. coli).
Control in MOLGEN switches between three layers. The lowest is the planning space, which contains hierarchies of operations and objects used in gene-splicing experiments (e.g., bacteria, drugs, etc.) and generalizations of these things (e.g., genes, organisms, sorting operations). MOLGEN initially plans using the abstract entities from the hierarchy, and as it refines its plan it uses the constraints to instantiate the elements.
The middle layer of planning is called the design space. At this level MOLGEN reasons about plans using the objects and operators in the planning space, just as the planning space reasons about those things directly.
The top layer of planning is the strategy space. This contains four general operators that dictate strategy. These are FOCUS and RESUME, which propose new planning steps and reactivate old ones "put on hold," and GUESS and UNDO, which make heuristic planning decisions when there is inadequate information to focus or resume. UNDO is a backtracking operator that undoes overconstraining decisions. The strategy space prefers least-commitment to heuristics.
Of the three layers, only the planning layer is domain-specific.
[McAllester and Rosenblitt, 1990 (?)] The Systematic Nonlinear Planning algorithm is sound, complete, and systematic. Systematicity is the property that no plan or partial plan is examined more than once.
This approach defines a nonlinear plan as a symbol table (a mapping from a finite set of step names to operators), a set of causal links, and a set of safety conditions (orderings on plan steps).
This system is based upon a simplified, ground version of the NONLIN system, together with a general lifting procedure. Roughly, the procedure adds steps that satisfy preconditions of other steps to an incomplete nonlinear plan, one at a time, adding safety conditions as necessary to prevent threats.
Lifting can be used to transform the ground procedure to deal with operator schemas and plan schemas involving variables. This is done simply by creating new instances of operator schemas during the process of adding a step to the incomplete plan.
(It's not entirely clear that they describe a hierarchical planner. Nothing to this effect is stated or implied by the paper, but it is based somewhat upon NONLIN, which certainly is hierarchical. I see nothing to prevent it from being used that way in any case. -- POD)
This algorithm is implemented in the SNLP (Systematic NonLinear Planner) system of Soderland and Weld [1991].
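A sketch of the causal-link bookkeeping this family of planners shares. For systematicity, a step threatens a link if it could add or delete the link's proposition while possibly falling between producer and consumer; the names and the ordering oracle below are assumptions, not SNLP's actual interface.

```python
# Hedged sketch of causal links and threat detection, SNLP-style.

from dataclasses import dataclass

@dataclass(frozen=True)
class Link:
    producer: str      # step that achieves the proposition
    prop: str          # the protected proposition
    consumer: str      # step whose precondition it satisfies

def threats(links, steps, effects, possibly_between):
    """Yield (step, link) pairs where `step` endangers `link`.
    `effects[s]` is the set of propositions s adds or deletes;
    `possibly_between(s, a, b)` is the partial-order oracle."""
    for link in links:
        for s in steps:
            if s in (link.producer, link.consumer):
                continue
            if link.prop in effects[s] and \
               possibly_between(s, link.producer, link.consumer):
                yield s, link
```

Each detected threat is then resolved by demotion (ordering the step before the producer), promotion (ordering it after the consumer), or, in the lifted version, separation (a constraint keeping the offending variables from codesignating).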
Case-based planning systems rely upon skeletal plans that are already available. The premise is that experience is required to formulate new kinds of plans and debug old ones. There are very few completely new plan outlines, but many new instantiations of plan types. Planning consists of using and debugging existing plans, and storing the results of creating new, useful plans.
[Peter Friedland, 1979] Skeletal plans are specified at many levels
of generality; most plans for experiments are not invented from
scratch, but instantiated from skeletal plans. Friedland's MOLGEN
used this approach, differing from Stefik's.
[Hayes-Roth and Hayes-Roth] An approach based upon human planning. This approach uses a blackboard control structure on which suggestions are made for plan steps by specialists. Ordering is developed piecewise, and a plan "grows out" of concrete clusters of operators. Such opportunistic planning builds islands in plan space and attempts to connect them, while hierarchical planners try to develop an entire plan at once and then refine it.
Opportunistic planners essentially take a bottom-up approach to planning, where hierarchical planners take a top-down approach.
[Hayes-Roth and Hayes-Roth, 1979] A journey-planning system that does opportunistic search on the HEARSAY-II blackboard architecture.
These are systems which either reason about resources consumed as part of the execution of the plan, or systems that are aware that the activity of planning itself comes at a cost.
[James Allen, 1983] A system for representing actions that take time to execute. If no interval is instantaneous, there are 13 possible relations between two intervals: before, meets, overlaps, starts, during, finishes, and equals, plus the inverses of the first six (equality is its own inverse).
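Allen's algebra is really about qualitative constraint networks, but the 13 relations themselves are easy to see with numeric endpoints; a hedged illustration:

```python
# Classify the Allen relation between two intervals with numeric
# endpoints (illustrative only; the real system reasons over
# qualitative constraint networks). The seven "forward" relations are
# tested directly; swapping the arguments yields the six inverses.

def allen_relation(a, b):
    (a1, a2), (b1, b2) = a, b            # require a1 < a2 and b1 < b2
    if a2 < b1:  return "before"
    if a2 == b1: return "meets"
    if a1 < b1 < a2 < b2: return "overlaps"
    if a1 == b1 and a2 < b2: return "starts"
    if b1 < a1 and a2 < b2: return "during"
    if b1 < a1 and a2 == b2: return "finishes"
    if (a1, a2) == (b1, b2): return "equals"
    return "inverse:" + allen_relation(b, a)   # one of the six inverses

assert allen_relation((0, 1), (2, 3)) == "before"
assert allen_relation((0, 2), (1, 3)) == "overlaps"
assert allen_relation((0, 4), (1, 2)) == "inverse:during"   # i.e., contains
```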
Reference intervals can be used to speed reasoning. Certain interval nodes can be singled out as references. To determine the possible labels on a link between two interval nodes, we look for paths between the nodes such that all links save the first and last must connect reference node pairs.
(This section needs a more detailed explanation of the point.)
[Steve Vere, 1983] Performs Voyager spacecraft mission sequencing, in which actions are temporal intervals with numerical endpoints. DEVISER was the first system that dealt with planning in continuous time.
[Georgeff and Lansky, 1987] PRS (Procedural Reasoning System) uses metareasoning to recognize problems that may require additional planning; the system is based on temporal logic.
One way to handle fluid resources (such as money or water) in planning is to use numeric-valued measures. A measure is an amount of something, such as money or volume. Expressions that describe the amount, such as GasLevel, are called measure fluents.
Planners that use measures usually require that they be declared initially, along with range bounds. Some measures are controlled by the environment (such as the price of gas) while others are resources that can be produced and consumed.
To use measures, inequality tests can be placed in the preconditions of operators, and changes to measures can be described in their effects.
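A minimal sketch of that idea, with illustrative names (GasLevel, drive) rather than any particular system's syntax:

```python
# Hypothetical measure-fluent sketch: an operator whose precondition
# is an inequality test over a numeric quantity and whose effect
# consumes that quantity.

def drive(state, distance, mpg=20.0):
    """Apply a 'drive' step if enough gas remains; None otherwise."""
    needed = distance / mpg
    if state["GasLevel"] < needed:       # inequality precondition
        return None
    new = dict(state)
    new["GasLevel"] -= needed            # consuming the resource
    new["Location"] = state["Location"] + distance
    return new

s = {"GasLevel": 5.0, "Location": 0}
print(drive(s, 60))   # {'GasLevel': 2.0, 'Location': 60}
```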
Many modern systems (e.g., SIPE, O-PLAN, DEVISER) have mechanisms for handling measures, but there is still no clean formulation of a theory of resource allocation in planning.
[Fikes and Nilsson] Introduced in STRIPS. Triangle tables serve two functions. The first is as a stored MACROP that can later be used to solve a similar problem. The second is to respond (albeit in a limited way) to unexpected effects of actions in the world.
A triangle table is a lower-diagonal matrix whose columns, numbered from 0 to n (where n is the size of the plan), are labeled by the plan's actions; the rows are numbered from the top from 1 to n+1.
The entries in cell (i, j) are the literals added to the state by the j-th rule that survive as preconditions of the i-th rule. The 0th column contains literals from the initial state description that are preconditions of the i-th rule. The entries in the bottom row are those added by the various rules that are components of the goal.
The entries in the row to the left of the i-th rule are precisely the preconditions of that rule. The entries in the column below the i-th rule are precisely the ADD formula literals of the rule that are needed by subsequent rules (or are components of the goal).
The i-th kernel is the intersection of the i-th row and all rows below it with all the columns to the left of the i-th column. The entries in this kernel are precisely the conditions that a state description must satisfy in order to apply the sequence of rules from the i-th onward and achieve the goal.
This property of the table is useful for execution monitoring. At each step, the system can look for the highest-numbered matching kernel. If an unexpected benefit places it closer to the goal, it will only execute the appropriate remaining actions. If there is a setback, it will re-execute necessary actions. To find the appropriate kernel, the system checks them in turn beginning with the last row in the table and working backward.
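A sketch of kernel matching as just described, assuming the table is a mapping from (row, column) pairs to sets of literals (an illustration, not the STRIPS data structure):

```python
# Hedged sketch of triangle-table execution monitoring. `table` maps
# (row, col) to a set of literals, rows 1..n+1, columns 0..n, as
# described above. The i-th kernel collects the cells in rows >= i,
# columns < i; the monitor resumes from the highest-numbered kernel
# the current state satisfies.

def kernel(table, i, n):
    cells = set()
    for row in range(i, n + 2):          # row i and all rows below it
        for col in range(i):             # columns to the left of column i
            cells |= table.get((row, col), set())
    return cells

def next_step(table, state, n):
    """Index of the next rule to run, 'done', or 'replan'."""
    for i in range(n + 1, 0, -1):        # highest-numbered kernel first
        if kernel(table, i, n) <= state:
            return "done" if i == n + 1 else i
    return "replan"                      # not even kernel 1 matches
```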
When a plan is generalized, some constants may have to have specific values. The general idea is to "lift" the triangle table to its most general form by replacing every constant in the clauses of the left-most column with new parameters. The remainder is then filled in with appropriate ADD clauses. This lifted table is too general, so we redo the precondition proof of each operator using the support clauses in the lifted table.
Problems with overgeneralization include the fact that the same constant may be generalized to different values, and thus place an object in different rooms.
A universal plan [Schoppers, 1987] is one that generates appropriate behavior in unpredictable environments. Only a goal condition is specified; the plan covers all starting states. It always selects an action that would move the current situation toward the goal in a cooperative world. TMSs are used to encode world descriptions. As might be expected, universal plans are limited to small problem spaces. (Universal plans are in fact equivalent to policies in MDPs.)
Reactive approaches include Brooks's subsumption architecture, Rosenschein and Kaelbling's situated automata (REX and the compiler to circuits, GAPPS), and Nils Nilsson's action nets.
Similar to universal planning, conditional planning deals with incomplete information by devising a plan that accounts for each possible contingency that may arise. Execution of such plans is then directed by the agent's sensors.
In conditional planning, there are sensing actions that will allow the agent to determine the truth or falsity of certain propositions. When executed, the plan will branch on the result of the sensing action.
During the planning process, conditional steps are added that involve sensing actions. These steps have conditional links to other steps whose preconditions they determine.
The other addition to traditional planners is context, which is associated with each step in the plan. Context is the set of conditionals that must be true in order for the step to be executed -- effectively it is a "branch" of the plan. Whenever conditional steps are added that can give rise to multiple contexts, new finish states are added to account for those contexts as well. Context also offers an additional way to deal with threats; threats can be conditioned so that their contexts are incompatible with those of the steps they threaten.
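A minimal sketch of executing such a plan: sensing steps branch, so each branch runs only in its own context. The tuple encoding and the step names are illustrative assumptions, not any published planner's representation.

```python
# Hypothetical conditional-plan sketch: a plan is a tree in which
# ordinary steps chain to one successor and sensing steps branch on
# the observed truth value of a proposition.

def execute(plan, sense, act):
    node = plan
    while node is not None:
        if node[0] == "sense":
            _, prop, if_true, if_false = node
            node = if_true if sense(prop) else if_false   # branch on result
        else:
            _, action, rest = node
            act(action)                                   # ordinary step
            node = rest

# e.g., check the door before choosing how to enter (names illustrative):
plan = ("sense", "DoorOpen",
        ("do", "walk-in", None),
        ("do", "open-door", ("do", "walk-in", None)))
execute(plan, sense=lambda p: False, act=print)   # open-door, then walk-in
```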
Uncertain worlds complicate the planning problem, since it isn't generally possible to guarantee the effects of an action, or that the world's state won't change independently of any action. Traditional planning systems (such as STRIPS) rely largely upon predictable worlds, while more reactive systems (replanning, deferred planning, game playing, subsumption architecture) sacrifice planning ability in exchange for better response to a dynamic environment.
Issues in making this tradeoff include:
Learning is becoming an important topic in planning and I know almost nothing about it. STRIPS [Fikes and Nilsson] could learn by generalizing triangle tables into MACROPs to resolve similar problems in the future. PRODIGY [Minton, 1985] used explanation-based learning techniques to generalize its plans as well. CHEF [Hammond, 1986] learns in a case-based planning environment.
When a planner is designing a plan to execute in a real environment, it makes assumptions about the conditions in that environment during the plan's execution. The choices it makes during planning can also be regarded as assumptions.
A TMS can be used to record these assumptions. Failure of the plan during execution can be seen as an undesirable property of the situation, and the TMS can be used to analyze which of the recorded environmental or action assumptions are invalid.
It is important to be careful about application of TMSs. If, for
example, we use transitivity to prove that X is left of
Z because X is to the left of Y which is to
the left of Z, a TMS will retract this result when Y
is moved.
Planning today is engaged in trying to deal with these five main areas (it seems to me):
The main issue in dealing with these problems is to expand the power of the operator representations. Russell and Norvig describe a planner, POP-DUNC, that adds conditional effects, negated goals, disjunctive preconditions and universally quantified preconditions and effects. (Note that universal quantification is actually limited to a shorthand for enumeration, since the world is presumed to be finite and static.) Adding all of these is reasonably straightforward; however, adding disjunctive effects is much more complex, because it makes the environment a nondeterministic one.
Recent theoretical systems: CNLP [Peot and Smith, 1992] handles conditional planning; C-BURIDAN [Draper et al., 1994] handles conditional planning for actions that have probabilistic outcomes, relating to MDPs.
Some current systems in use: OPTIMUM-AIV helps in assembly, integration, and verification of spacecraft for ESA; quick replanning is a critical element of the system. It is based on O-PLAN.
PlanERS-1, another O-PLAN planner, schedules observations for the European Earth Resource Observation satellite.
The Hubble Space Telescope uses long-term (SPIKE) and short-term
(SPSS) planners to decide how to assign jobs and how to carry them out.
Date | Planner | Authors | Significant Contributions |
---|---|---|---|
1956 | Logic Theorist | Newell, Shaw, Simon | first serious use of heuristics; backward search |
1959 | Gelernter's Geometry Theorem-Proving Machine | Gelernter | first system to handle conjunctive subgoals |
1957-1969 | GPS | Newell, Shaw, Simon | introduced means-end analysis; separating domain knowledge from general search knowledge |
1969 | QA3 | Green | planned via resolution theorem proving; introduced situations in propositions; required use of frame axioms |
1971 | STRIPS | Fikes and Nilsson | introduced STRIPS operators; triangle tables to learn MACROPs and cope with uncertainty |
1973 | HACKER | Sussman | idea of using debugging experts to revise a completed plan; introduces notion of protection; reorders goals to cope with protection violations |
1973 | WARPLAN | Warren | introduced (?) the idea of promotion; also an early use of the idea of skeleton plans; written in PROLOG |
1974 | INTERPLAN | Tate | like HACKER, but promotes subgoals if reordering fails |
1974 | ABSTRIPS | Sacerdoti | clearly introduces concept of abstraction hierarchy, criticality of operators |
1975 | Waldinger's Planner | Waldinger | introduced goal regression |
1975 | NOAH | Sacerdoti | introduced nonlinear planning; uses critics to fix/improve plans |
1977 | NONLIN | Tate | improved NOAH with dependency-directed backtracking, plan modification operators |
1979 | OPM | Hayes-Roth and Hayes-Roth | used blackboard structure and island-driven planning approach |
1980 | MOLGEN | Stefik | introduced layers of planning control: strategy, design, planning |
1983 | SIPE | Wilkins | concept of "resources" in preconditions; first to deal with replanning |
1987 | TWEAK | Chapman | provably correct and complete; introduces constraint posting, modal truth criterion |
1991 | SNLP | Soderland and Weld | sound, complete, systematic nonlinear planner; uses concept of lifting; basis of modern nonlinear planning systems |
1991 | O-PLAN | Currie and Tate | can handle time, resources, hierarchical planning, uses heuristics in search, justifies reasons for each choice made |
1992 | UCPOP | Penberthy and Weld | hierarchical, nonlinear, multiagent, conditional effects, and more |
Ronny Kohavi, AI Qual Note #14: Planning.
Feigenbaum and Cohen, Handbook of Artificial Intelligence, Vol. 3, Chap. 15.
Nilsson, Principles of Artificial Intelligence, Chapters 7-8.
McAllester and Rosenblitt, Systematic Nonlinear Planning.
Russell and Norvig, Artificial Intelligence: A Modern Approach, Chapters 11-13.
See the AI Qual Reading List for further details on these sources.
Patrick Doyle <pdoyle@cs.stanford.edu>