AI Qual Summary: Robotics

Robotics
Patrick Doyle	AI Qual Summary	May 6, 1997

Table of Contents

Definitions
Motivations
Task Planning
Motion Planning
Grasp Planning
Computational Complexity
- Reducing Complexity
Robot Architectures
Sources

Definitions

robot: A versatile mechanical device equipped with actuators and sensors under the control of a computing system. Russell and Norvig define it as "an active, artificial agent whose environment is the physical world."
task planner: A program that converts task-level specifications into manipulator-level specifications. The task planner must have a description of the objects being manipulated, the task environment, the robot, and the initial and desired final states of the environment. The output should be a robot program that converts the initial state into the desired final state.
guarded motion: A motion of the form move along path until sensory-condition.
compliant motion: A motion of the form move along direction d with force = 0 perpendicularly to d. More intuitively, moving along a surface while maintaining a fixed pressure, such as when scraping paint off of a window with a razor.
effector: The bits the robot does stuff with. That is, arms, legs, hands, feet. An end-effector is a functional device attached to the end of a robot arm (e.g., grippers).
actuator: A device that converts software commands into physical motion, typically electric motors or hydraulic or pneumatic cylinders.
degree of freedom: A dimension along which the robot can move itself or some part of itself. Free objects in 3-space have 6 degrees of freedom, three for position and three for orientation.
sensors: Devices that monitor the environment. There are contact sensors (touch and force), and non-contact (e.g., sonar).
sonar: Sensing system that works by measuring the time of flight of a sound pulse to be generated, reach an object, and be reflected back to the sensor. Wide angle but reasonably accurate in depth (the wide angle is the disadvantage).
infrared: Very accurate angular resolution system but terrible in depth measurement.
recognizable set: An envelope of possible configurations a robot may be in at present; recognizable sets are to continuous domains what multiple state sets are to discrete ones. A recognizable set with respect to a sensor reading is the set of all world states in which the robot might be upon receiving that sensor reading.
landmark: An easily recognizable, unique element of the environment that the robot can use to get its bearings.

Motivations

The important incentives for building robots are social, replacing humans in undesirable or dangerous jobs, and economic, reducing the cost of manufacturing while improving its quality.

The real world has the following qualities, that any robot design must take into account:

inaccessible -- sensors are imperfect, and can only perceive local stimuli
nondeterministic -- the robot can never be certain an action will work as expected, since wheels slip, batteries run down, etc.
nonepisodic -- the effects of an action change over time, so robots should handle sequential decision problems and learning
dynamic -- a robot has to know when to think and when to act right away
continuous -- states and actions are drawn from a continuum of physical configurations and motions

In general, robots should have the following qualities:

high reliability -- if a robot fails, it should be able to recover or to call for help
high speed -- a robot should perform its functions as quickly as needed
programmability -- the robot should be flexible and easily adaptable to various tasks
low cost

The ultimate goal is to build autonomous robots that accept commands telling them what to do, without needing to specify exactly how.

By virtue of their versatility, robots can be difficult to program, especially for tasks requiring complex motions involving sensory feedback. In order to simplify programming, task-level languages exist that specify actions in terms of their effects on objects.

Example: pin A programmer should be able to specify that the robot should put a pin in a hole, without telling it what sequence of operators to use, or having to think about its sensory or motor operators.

Task planning is divided into three phases: modeling, task specification, and manipulator program synthesis.

There are three approaches to specifying the model state:

Using a CAD system to draw the positions of the objects in the desired configuration.
Using the robot itself to specify its configurations and to locate the object features.
Using symbolic spatial relationships between object features (such as (face1 against face2). This is the most common method, but must be converted into numerical form to be used.

One problem is that these configurations may overconstrain the state. Symmetry is an example; it does not matter what the orientation of a peg in a hole is. The final state may also not completely specify the operation; for example, it may not say how hard to tighten a bolt.

The three basic kinds of motions are free motion, guarded motion, and compliant motion.

An important part of robot program synthesis should be the inclusion of sensor tests for error detection.

Motion Planning

The fundamental problem in robotics is deciding what motions the robot should perform in order to achieve a goal arrangement of physical objects. This turns out to be an extremely hard problem.

Motion Planning Definitions

basic motion planning problem: Let A be a single rigid object (the robot) moving in a Euclidean space W, called the workspace, represented as Rⁿ (where n = 2 or 3). Let B₁, ..., B_q be fixed rigid objects distributed in W. These are called obstacles.

Assume that the geometry of A, and the geometries and locations of the B_i's are accurately known. Assume also that no kinematic constraints limit the motions of A (so that A is a free-flying object).

Given an initial position and orientation and a goal position and orientation of A in W, the problem is to generate a path t specifying a continuous sequence of positions and orientations of A avoiding contact with the B_i's.

(Basically, given a robot, a bunch of objects, a start state and a goal state, find a path for the robot to reach the goal state.)
configuration of object A: A specification of the position of every point in the object, relative to a fixed frame of reference. To specify the configuration of a rigid object A, it is enough to specify the position and orientation of the frame F_A with respect to F_W. The subset of W occupied by A at configuration q is denoted by A(q).
configuration space of object A: The space C of all configurations of A. The idea is to represent the robot as a point and thus reduce the motion planning problem to planning for a point.
dimension of C: The dimension of a configuration space is the number of independent parameters required to represent it as R^m. This is 3 for 2-D, and 6 for 3-D.
chart: A representation of a local portion of the configuration space. C can be decomposed into a finite union of slightly overlapping patches called charts, each represented as a copy of R^m.
distance between configurations: The distance between configurations q and q' should decrease and tend to zero when the regions A(q) and A(q') get closer and tend to coincide.
path: A path from a configuration q_init to configuration q_goal is a continuous map t : [0,1] -> C with t(0) = q_init and t(1) = q_goal.
free-flying object: An object for which, in the absence of any obstacles, any path is feasible.
C-obstacle: An obstacle mapped into configuration space. Every obstacle B_i is mapped to the following region in the workspace called the C-obstacle: CB_i = { q in C : A(q) intersected with B_i != empty set }.
C-obstacle region: The union of all the C-obstacles.
Free space: All of the configuration space less the C-obstacle region, called C_free. A configuration in C_free is called a free configuration and a free path is a path where t maps to C_free instead of to C. A semi-free path maps to the closure of C_free.
kinematics: Motions in the abstract, without reference to force or mass. Russell and Norvig define it as the study of the correspondence between the actuator motions in a mechanism and the resulting motion of its parts.
dynamics: Motions of material bodies under the action of forces.
force-compliant motion: Motions in which the robot may touch obstacle surfaces and slide along them. These are more accurate than position-controlled commands.
rotary motion: Rotation around a fixed hub.
prismatic motion: Linear movement, as with a piston in a cylinder.

Skeletonization (Roadmap) Methods

Skeletonization methods collapse the configuration space into a one-dimensional subset, or skeleton. They then require that paths lie along the skeleton. If the initial and goal points do not lie on the skeleton, short connecting paths are added to join them to the nearest points on the skeleton.

To be complete for motion planning, skeletonization methods must satisfy two properties:

If S is a skeleton of free space F, then S should have a single connected piece within each connected region of F.
For any point p in F, it should be "easy" to compute a path from p to the skeleton.

Roadmaps are a global approach to motion planning. It consists of modeling the connectivity of the robot's free space as a network of one-dimensional curves, called the roadmap, which lies in the free space or its closure. (NOTE that weird things happen at the boundaries depending on which option you choose.)

Once a roadmap R has been generated, it is used as a set of standardized paths. Path planning is then reduced to connecting the initial and goal configurations to points in R, and searching R for a path to connect those points.

The kinds of skeletonizations are visibility graphs, Voronoi diagrams and roadmaps, which consist of silhouette curves and linking curves. (This formulation due to Russell and Norvig; note that Latombe calls all of these things "roadmap methods.")

Visibility Graphs

The original roadmap method, visibility graphs apply to 2D configuration spaces with polygonal C-obstacle regions. The graph is nondirected and the nodes are the initial and goal configurations plus all the vertices of the C-obstacles. The edges are all the straight line segments connecting two nodes that do not intersect the interior of the C-obstacle region (all lines that don't go through obstacles). The graph can be searched for the shortest semi-free path; this path will always be polygonal. Visibility graphs are complete.

Voronoi diagram

A roadmap method based on retraction. A continuous function of C_free is defined onto a one-dimensional subset of itself. The Voronoi diagram consists of those curves in the space that are equidistant from two or more obstacles; these curves form the skeleton.

Put poorly in another way, he Voronoi diagram is the set of all free configurations whose minimal distance to the C-obstacle region is achieved with at least two points in the boundary of the region. This method yields free paths which tend to maximize the clearance between the robot and the obstacles.

When the C-obstacles are polygons, the Voronoi diagram consists of straight and parabolic segments. The initial and goal configurations are mapped to q_init and q_goal by drawing the line along which the distance to the boundary of the C-obstacles increases the fastest.

Cell Decomposition

A global approach to motion planning. The intuitive idea is to break the space down into a finite number of discrete chunks.

Cell decomposition breaks free space down into simple regions called cells such that any path between any two configurations in a cell can easily be generated. A non-directed graph representing the adjacency relations between cells is then constructed and searched. This is called the connectivity graph or adjacency graph. The outcome of the search is a sequence of cells called a channel. There are two cell decomposition methods:

approximate cell decomposition -- Cells have predefined shapes (such as triangles or trapezoids) whose union is strictly included in the free space. The boundary of a cell does not characterize a discontinuity of some sort and has no physical meaning. The quad-tree method recursively decomposes a rectangular cell into four smaller cells until the cells lie entirely in the free space or reach some resolution. These methods are sound but not incomplete.
exact cell decomposition -- The free space is decomposed into cells whose union is exactly the free space; this is a complete method. Cells in this form of decomposition are called cylinders.

Potential Fields

A local approach to motion planning. The goal configuration generates an attractive potential that pulls the robot toward it; C-obstacles generate repulsive potential that pushes the robot away. The negated gradient of the total potential is treated as an artificial force applied to the robot, considered the most promising direction.

Potential fields are very efficient but suffer from local minima. Two approaches overcome this:

design potential functions with no local minima other than the goal configuration
complement the basic potential field approach with mechanisms to escape from local minima

Landmark-Based Navigation

Landmark-based navigation assumes that the environment contains easily recognizable, unique landmarks. A landmark is modeled as a point surrounded by a field of influence. Within this field of influence, the robot knows its position exactly. Outside of all fields of influence, the robot has no direct position information.

Landmarks can be used to deal with uncertainty in motion so, for example, if a robot knows it is apt to move with slight uncertainty as it travels, it can aim for a central point in a field of influence, knowing that the actual cone it maps out will intersect with that field no matter how off its projections it is.

Configuration Space

For a robot with k degrees of freedom, the state or configuration of the robot can be described by k real values. These values can be considered as a point p in a k-dimensional configuration space of the robot.

Configuration space can be used to determine if there is a path by which a robot can move from one place to another. Real obstacles in the world are mapped to configuration space obstacles, and the remaining free space is all of configuration space except the part occupied by those obstacles.

Having made this mapping, suppose there are two points in configuration space. The robot can move between the two corresponding real world points exactly when there is a continuous path between them that lies entirely in configuration free space.

Generalized Configuration Space

The term generalized configuration space is used to describe systems in which other objects are included as part of the configuration. These may be movable, and their shapes may vary.

There are several ways of dealing with robot planning when there are several moving or movable objects. These are:

Partition the generalized configuration space into finitely many states. The planning problem then becomes a logical one, like the blocks world. No general method for partitioning space has yet been found.
Plan object motions first, and then motions for the robot.
Restrict object motions to simplify planning.

Extensions of the Basic Problem

Multiple Moving Objects

Obstacles may be moving. There may be multiple robots. The robots themselves may be articulated (i.e., made of rigid objects connected by joints).

The first extension, and possibly the second, requires time to be explicitly considered, which gives rise to configuration-time space in which a path must never go back along the time axis, and some constraints on the slope and curvature of the path are given due to constraints on velocity and acceleration.

The second and third extensions yield configuration spaces of arbitrarily large dimensions. Planning can be done in a composite configuration space which is the cross-product of the individual configuration spaces (this is called centralized planning), or another method called decoupled planning can be used to plan the motions more or less independently and interactions are only considered in the second phase of planning.

The decoupling method fails when interaction is critical -- in the example of two robots trying to switch places in a narrow corridor, for example. When dealing with articulated robots, the dimension of the configuration space grows as the number of joints.

Kinematic Constraints

A robot's motions may be restricted by kinematic constraints. There are two kinds:

holonomic constraints: A holonomic equality constraint is an equality relation among the parameters of the minimally-represented configuration space that can be solved for one of the parameters. Such a relation reduces the dimension of the actual configuration space of the robot by one. A set of k holonomic constraints reduces it by k. For example, a robot limited to rotating around a fixed axis has a configuration space of dimension 4 instead of 6 (since revolute joints impose 2 holonomic constraints).
nonholonomic constraints: A nonholonomic equality constraint is a non-integrable equation involving the configuration parameters and their derivatives (velocity parameters). Such a constraint does not reduce the dimension of the configuration space, but instead reduces the dimension of the space of possible differential motions.

For example, a car-like robot has 3 dimensions: two for translation and one for rotation. However, the velocity of R is required to point along the main axis of A. (This is written as - sin T dx + cos T dy = 0.)

The instantaneous motion of the car is determined by two parameters: the linear velocity along the main axis, and the steering angle. However, when the steering angle is non-zero, the robot changes orientation, and its linear velocity with it, allowing the robot's configuration to span a three-dimensional space. Restricting the steering angle to pi/2 restricts the set of possible differential motions without changing its dimension.

Nonholonomic constraints restrict the geometry of the feasible free paths between two configurations. They are much harder to deal with in a planner than holonomic constraints.

In general, a robot with nonholonomic constraints has fewer controllable degrees of freedom than it has actual degrees of freedom; these are equal in a holonomic robot.

Uncertainty

A robot may have little or no prior knowledge about its workspace. The more incomplete the knowledge, the less important the role of planning. A more typical situation is when there are errors in robot control and in the initial models, but these errors are contained within bounded regions.

Grasp Planning

Many typical robot operations require the robot to grasp an object. The rest of the operation is strongly influenced by choices made during grasping.

Grasping requires positioning the gripper on the object, which requires generating a path to this position. The grasp position must be accessible, stable, and robust enough to resist some external force. Sometimes a satisfactory position can only be reached by grasping an object, putting it down, and re-grasping it. The grasp planner must choose configurations so that the grasped objects are stable in the gripper, and it should also choose operations that reduce or at least do not increase the level of uncertainty in the configuration of the object.

The object to be grasped is the target object. The gripping surfaces are the surfaces on the robot used for grasping.

There are three principal considerations in gripping an object. They are:

safety -- the robot must be safe in the initial and final configurations
reachability -- the robot must be able to reach the initial grasping configuration and, with the object in hand, reach the final configuration
stability -- the grasp should be stable in the presence of forces exerted on the grasped object during transfer and parts-mating motions

Example: peg placement. By tilting a peg the robot can increase the likelihood that the initial approach conditions will have the peg part way in the hole. Other solutions are chamfers (a widening hole, producing the same effect as tilting), search along the edge, and biased search (introduce bias so that search can be done in at most one motion and not two, if the error direction is not known).

Computational Complexity

These theorems are taken from Latombe's book (see Sources).

Theorem 1 (Polyhedral bodies) Planning a free path for a robot made of an arbitrary number of polyhedral bodies connected together at some joint vertices, among a finite set of polyhedral obstacles, between any two given configurations is a PSPACE-hard problem.

Put another way, planning a free path in a configuration space of arbitrary dimension among fixed obstacles is PSPACE-hard.

Theorem 2 (Joints) In the absence of obstacles, deciding whether a planar linkage in some initial configuration can be moved so that a certain joint reaches a given point in the plane is PSPACE-hard.

Theorem 3 (Rectangular robots) Planning a path in the configuration space of a multi-bodied robot consisting of all rectangles is PSPACE-hard.

Theorem 4 (Planar arm) Consider a planar arm consisting of arbitrarily many links serially connected by revolute joints such that all the links are constrained to move in the plane. Planning a free path for such an arm among a finite number of polygonal obstacles, between any two given configurations is PSPACE-hard.

Theorem 5 (Upper bound for fixed dimension) A free path in a configuration space of any fixed dimension m, when the free space is a set defined by n polynomial constraints of maximal degree d, can be computed by an algorithm whose time complexity is exponential in m and polynomial in both n and d.

This theorem is based on reducing the planning problem to the problem of deciding the satisfiability of sentences in the first-order theory of the reals. The important observation is that the complexity of path planning increases exponentially with the dimension of the configuration space.

Theorem 6 (Velocity-bounding) Planning the motion of a rigid object translating without rotation in three dimensions among arbitrarily many moving obstacles that may both translate and rotate is a PSPACE-hard problem if the velocity modulus of the object is bounded, and an NP-hard problem otherwise.

Theorem 7 (Uncertainty) Planning compliant motions for a point in the presence of uncertainty, in a three-dimensional polyhedral configuration space with an arbitrarily large number of faces, is an NEXPTIME-hard problem.

Reducing Complexity

It is possible to approximate the real problem before giving it to the motion planner; it is reasonable to trade generality for improved time performance.

One way of reducing complexity is by reducing the dimension of the configuration space. This can be done by replacing the robot by the surface or volume it sweeps out when it moves along r independent axes. This corresponds to projecting the m-dimensional configuration space along r of its dimensions. The projected configuration space has only m - r dimensions. (As an example, imagine a robot with an extended arm as a disk; then you don't need to worry about rotations.)

Simplification heuristics make only local plans, by breaking the problem into subproblems. For example, many localities are stereotyped situations, such as moving through a door or turning in a corridor.

Robot Architectures

Brooks' Subsumption Architecture

This approach breaks the problem down into task-achieving behaviors (such as wandering, avoiding obstacles, or making maps) rather than decomposing it functionally (into sensing, planning, acting).

In subsumption architectures, levels of competence are stacked one on top of another, ranging from the lowest level (object avoidance) to higher levels for planning and map-making. Higher levels may interfere with lower levels, but each level's architecture is built, tested and completed before the next level is added. In this way the system is robust and incrementally more powerful.

Individual levels consist of augmented finite state machines connected by message-passing wires. Higher levels may inhibit signals on these wires, or replace them with their own signals; this is how they exercise control over more basic functions.

(For more information, see the AI Qual Summary on Agent Architectures.)

Shakey

Shakey was the forerunner of many intelligent robot projects. The Shakey system was controlled by PLANEX, which accepted goals from the user, called a STRIPS subsystem to generate plans, and then executed them via intermediate-level actions. These ILA's were translated into complex routines of low-level actions that had some error detection and correction capabilities.

After each action was executed, PLANES would execute the shortest plan subsequence that led to a goal and whose preconditions were satisfied. In this way, actions that failed would be retried and serendipity would lead to reduced effort. If no subsequence applied, PLANEX called STRIPS to make a new plan.

Situated Automata

An approach pioneered by Stan Rosenschein that begins with an explicit representation and reasoning system, but compiles it into a finite-state machine whose inputs come from the environment and whose outputs connect to effectors.

This compilation approach distinguishes between the use of explicit knowledge representation by the designers (the "grand strategy") and the use of explicit knowledge within the agent architecture (the "grand tactic"). Rosenschein's compiler generates FSMs that can be proved to correspond to logical propositions about the environment, provided the compiler has the correct initial state and physics.

The FLAKEY system at SRI used situated automata to navigate, run errands, and ask questions, and had no explicit representation.

Other Architectures

TCA [Simmons and Mitchell, 1989]. This architecture combines reactive systems with traditional AI planning. TCA is a distributed system with centralized control, designed to control autonomous robots for long periods in unstructured environments, such as the surface of Mars.

Robo-SOAR [Laird et al., 1980] is an extension of SOAR to a robot architecture.

PRS [Georgeff and Lansky, 1987] is a symbolic robot planning system that interleaves planning and execution. In PRS, goals represent robot behaviors rather than world states. PRS contains procedures for turning goals into subgoals or iterations of subgoals.

Sources

Ronny Kohavi, AI Qual Note #20: Robotics. Russell and Norvig, Artificial Intelligence: A Modern Approach, Chapter 25.

Latombe, Chapter 1.

See the AI Qual Reading List for further details on these sources.

Patrick Doyle pdoyle@cs.stanford.edu