Robotics | ||

Patrick Doyle | AI Qual Summary | May 6, 1997 |

- Definitions
- Motivations
- Task Planning
- Motion Planning
- Grasp Planning
- Computational Complexity
- Robot Architectures
- Sources

**robot**- A versatile mechanical device equipped with actuators and sensors under the control of a computing system. Russell and Norvig define it as "an active, artificial agent whose environment is the physical world."
**task planner**- A program that converts task-level specifications into manipulator-level specifications. The task planner must have a description of the objects being manipulated, the task environment, the robot, and the initial and desired final states of the environment. The output should be a robot program that converts the initial state into the desired final state.
**guarded motion**- A motion of the form
**move along**.*path*until*sensory-condition* **compliant motion**- A motion of the form
**move along direction**. More intuitively, moving along a surface while maintaining a fixed pressure, such as when scraping paint off of a window with a razor.*d*with force = 0 perpendicularly to*d* **effector**- The bits the robot does stuff with. That is, arms, legs, hands,
feet. An
*end-effector*is a functional device attached to the end of a robot arm (e.g., grippers). **actuator**- A device that converts software commands into physical motion, typically electric motors or hydraulic or pneumatic cylinders.
**degree of freedom**- A dimension along which the robot can move itself or some part of itself. Free objects in 3-space have 6 degrees of freedom, three for position and three for orientation.
**sensors**- Devices that monitor the environment. There are contact sensors (touch and force), and non-contact (e.g., sonar).
**sonar**- Sensing system that works by measuring the time of flight of a sound pulse to be generated, reach an object, and be reflected back to the sensor. Wide angle but reasonably accurate in depth (the wide angle is the disadvantage).
**infrared**- Very accurate angular resolution system but terrible in depth measurement.
**recognizable set**- An envelope of possible configurations a robot may be in at present; recognizable sets are to continuous domains what multiple state sets are to discrete ones. A recognizable set with respect to a sensor reading is the set of all world states in which the robot might be upon receiving that sensor reading.
**landmark**- An easily recognizable, unique element of the environment that the robot can use to get its bearings.

The important incentives for building robots are social, replacing humans in undesirable or dangerous jobs, and economic, reducing the cost of manufacturing while improving its quality.

The real world has the following qualities, that any robot design must take into account:

**inaccessible**-- sensors are imperfect, and can only perceive local stimuli**nondeterministic**-- the robot can never be certain an action will work as expected, since wheels slip, batteries run down, etc.**nonepisodic**-- the effects of an action change over time, so robots should handle sequential decision problems and learning**dynamic**-- a robot has to know when to think and when to act right away**continuous**-- states and actions are drawn from a continuum of physical configurations and motions

In general, robots should have the following qualities:

**high reliability**-- if a robot fails, it should be able to recover or to call for help**high speed**-- a robot should perform its functions as quickly as needed**programmability**-- the robot should be flexible and easily adaptable to various tasks**low cost**

The ultimate goal is to build *autonomous* robots that
accept commands telling them *what* to do, without needing to
specify exactly *how*.

By virtue of their versatility, robots can be difficult to program,
especially for tasks requiring complex motions involving sensory
feedback. In order to simplify programming, *task-level*
languages exist that specify actions in terms of their effects on
objects.

*Example: pin* A programmer should be able to specify that
the robot should put a pin in a hole, without telling it what sequence
of operators to use, or having to think about its sensory or motor
operators.

Task planning is divided into three phases: modeling, task specification, and manipulator program synthesis.

There are three approaches to specifying the model state:

- Using a CAD system to draw the positions of the objects in the desired configuration.
- Using the robot itself to specify its configurations and to locate the object features.
- Using symbolic spatial relationships between object features (such
as
`(face1 against face2)`. This is the most common method, but must be converted into numerical form to be used.

One problem is that these configurations may *overconstrain*
the state. Symmetry is an example; it does not matter what the
orientation of a peg in a hole is. The final state may also not
completely specify the operation; for example, it may not say how hard
to tighten a bolt.

The three basic kinds of motions are free motion, guarded motion, and compliant motion.

An important part of robot program synthesis should be the
inclusion of sensor tests for error detection.

The fundamental problem in robotics is deciding what motions the robot should perform in order to achieve a goal arrangement of physical objects. This turns out to be an extremely hard problem.

**basic motion planning problem**- Let
**A**be a single rigid object (the robot) moving in a Euclidean space**W**, called the*workspace*, represented as**R**(where n = 2 or 3). Let^{n}**B**, ...,_{1}**B**be fixed rigid objects distributed in_{q}**W**. These are called*obstacles*.

Assume that the geometry of**A**, and the geometries and locations of the**B**'s are accurately known. Assume also that no kinematic constraints limit the motions of_{i}**A**(so that**A**is a*free-flying object*).

Given an initial position and orientation and a goal position and orientation of**A**in**W**, the problem is to generate a path**t**specifying a continuous sequence of positions and orientations of**A**avoiding contact with the**B**'s._{i}

(Basically, given a robot, a bunch of objects, a start state and a goal state, find a path for the robot to reach the goal state.) **configuration of object A**- A specification of the position of every point in the object,
relative to a fixed frame of reference. To specify the configuration
of a rigid object
**A**, it is enough to specify the position and orientation of the frame**F**with respect to_{A}**F**. The subset of_{W}**W**occupied by**A**at configuration*q*is denoted by**A**(*q*). **configuration space of object A**- The space
**C**of all configurations of**A**. The idea is to represent the robot as a point and thus reduce the motion planning problem to planning for a point. **dimension of C**- The dimension of a configuration space is the number of
independent parameters required to represent it as
**R**. This is 3 for 2-D, and 6 for 3-D.^{m} **chart**- A representation of a local portion of the configuration space.
**C**can be decomposed into a finite union of slightly overlapping*patches*called*charts*, each represented as a copy of**R**.^{m} **distance between configurations**- The distance between configurations
*q*and*q'*should decrease and tend to zero when the regions**A**(*q*) and**A**(*q'*) get closer and tend to*coincide*. **path**- A path from a configuration
*q*to configuration_{init}*q*is a continuous map_{goal}**t**: [0,1] ->**C**with**t**(0) =*q*and_{init}**t**(1) =*q*._{goal} **free-flying object**- An object for which, in the absence of any obstacles, any path is feasible.
**C-obstacle**- An obstacle mapped into configuration space. Every obstacle
**B**is mapped to the following region in the workspace called the C-obstacle:_{i}**CB**= {_{i}*q*in**C**:**A**(*q*) intersected with**B**!= empty set }._{i} **C-obstacle region**- The union of all the C-obstacles.
**Free space**- All of the configuration space less the C-obstacle region, called
**C**. A configuration in_{free}**C**is called a_{free}*free configuration*and a*free path*is a path where**t**maps to**C**instead of to_{free}**C**. A*semi-free path*maps to the closure of**C**._{free} **kinematics**- Motions in the abstract, without reference to force or mass. Russell and Norvig define it as the study of the correspondence between the actuator motions in a mechanism and the resulting motion of its parts.
**dynamics**- Motions of material bodies under the action of forces.
**force-compliant motion**- Motions in which the robot may touch obstacle surfaces and slide along them. These are more accurate than position-controlled commands.
**rotary motion**- Rotation around a fixed hub.
**prismatic motion**- Linear movement, as with a piston in a cylinder.

Skeletonization methods collapse the configuration space into a
one-dimensional subset, or *skeleton*. They then require that
paths lie along the skeleton. If the initial and goal points do not
lie on the skeleton, short connecting paths are added to join them to
the nearest points on the skeleton.

To be complete for motion planning, skeletonization methods must satisfy two properties:

- If
**S**is a skeleton of free space**F**, then**S**should have a single connected piece within each connected region of**F**. - For any point
*p*in**F**, it should be "easy" to compute a path from*p*to the skeleton.

Roadmaps are a *global* approach to motion planning. It
consists of modeling the connectivity of the robot's free space as a
network of one-dimensional curves, called the *roadmap*, which
lies in the free space or its closure. (**NOTE** that weird things
happen at the boundaries depending on which option you choose.)

Once a roadmap **R** has been generated, it is used as a set of
standardized paths. Path planning is then reduced to connecting the
initial and goal configurations to points in **R**, and searching
**R** for a path to connect those points.

The kinds of skeletonizations are *visibility graphs*, *Voronoi
diagrams* and *roadmaps*, which consist of *silhouette
curves* and *linking curves*. (This formulation due to
Russell and Norvig; note that Latombe calls all of these things
"roadmap methods.")

The original roadmap method, visibility graphs apply to 2D configuration spaces with polygonal C-obstacle regions. The graph is nondirected and the nodes are the initial and goal configurations plus all the vertices of the C-obstacles. The edges are all the straight line segments connecting two nodes that do not intersect the interior of the C-obstacle region (all lines that don't go through obstacles). The graph can be searched for the shortest semi-free path; this path will always be polygonal. Visibility graphs are complete.

A roadmap method based on *retraction*. A continuous
function of **C _{free}** is defined onto a one-dimensional
subset of itself. The Voronoi diagram consists of those curves in the
space that are equidistant from two or more obstacles; these curves
form the skeleton.

Put poorly in another way, he Voronoi diagram is the set of all free configurations whose minimal distance to the C-obstacle region is achieved with at least two points in the boundary of the region. This method yields free paths which tend to maximize the clearance between the robot and the obstacles.

When the C-obstacles are polygons, the Voronoi diagram consists of
straight and *parabolic segments*. The initial and goal
configurations are mapped to *q _{init}* and

A global approach to motion planning. The intuitive idea is to break the space down into a finite number of discrete chunks.

Cell decomposition breaks free space down into simple regions
called *cells* such that any path between any two
configurations in a cell can easily be generated. A non-directed
graph representing the adjacency relations between cells is then
constructed and searched. This is called the *connectivity
graph* or *adjacency graph*. The outcome of the search is
a sequence of cells called a *channel*. There are two cell
decomposition methods:

**approximate cell decomposition**-- Cells have predefined shapes (such as triangles or trapezoids) whose union is strictly included in the free space. The boundary of a cell does not characterize a discontinuity of some sort and has no physical meaning. The*quad-tree*method recursively decomposes a rectangular cell into four smaller cells until the cells lie entirely in the free space or reach some resolution. These methods are*sound*but not*incomplete*.

**exact cell decomposition**-- The free space is decomposed into cells whose union is exactly the free space; this is a complete method. Cells in this form of decomposition are called*cylinders*.

A local approach to motion planning. The goal configuration generates an attractive potential that pulls the robot toward it; C-obstacles generate repulsive potential that pushes the robot away. The negated gradient of the total potential is treated as an artificial force applied to the robot, considered the most promising direction.

Potential fields are very efficient but suffer from local minima. Two approaches overcome this:

- design potential functions with no local minima other than the
goal configuration

- complement the basic potential field approach with mechanisms to escape from local minima

Landmark-based navigation assumes that the environment contains
easily recognizable, unique *landmarks*. A landmark is modeled
as a point surrounded by a *field of influence*. Within this
field of influence, the robot knows its position exactly. Outside of
all fields of influence, the robot has no direct position information.

Landmarks can be used to deal with uncertainty in motion so, for example, if a robot knows it is apt to move with slight uncertainty as it travels, it can aim for a central point in a field of influence, knowing that the actual cone it maps out will intersect with that field no matter how off its projections it is.

For a robot with *k* degrees of freedom, the state or
configuration of the robot can be described by *k* real values.
These values can be considered as a point *p* in a
*k*-dimensional *configuration space* of the robot.

Configuration space can be used to determine if there is a path
by which a robot can move from one place to another. Real obstacles
in the world are mapped to *configuration space obstacles*, and
the remaining *free space* is all of configuration space except
the part occupied by those obstacles.

Having made this mapping, suppose there are two points in configuration space. The robot can move between the two corresponding real world points exactly when there is a continuous path between them that lies entirely in configuration free space.

The term *generalized configuration space* is used to
describe systems in which other objects are included as part of the
configuration. These may be movable, and their shapes may vary.

There are several ways of dealing with robot planning when there are several moving or movable objects. These are:

- Partition the generalized configuration space into finitely many states. The planning problem then becomes a logical one, like the blocks world. No general method for partitioning space has yet been found.
- Plan object motions first, and then motions for the robot.
- Restrict object motions to simplify planning.

Obstacles may be moving. There may be multiple robots. The robots themselves may be articulated (i.e., made of rigid objects connected by joints).

The first extension, and possibly the second, requires time to be
explicitly considered, which gives rise to *configuration-time*
space in which a path must never go back along the time axis, and some
constraints on the slope and curvature of the path are given due to
constraints on velocity and acceleration.

The second and third extensions yield configuration spaces of
arbitrarily large dimensions. Planning can be done in a *composite
configuration space* which is the cross-product of the individual
configuration spaces (this is called *centralized planning*),
or another method called *decoupled planning* can be used to
plan the motions more or less independently and interactions are only
considered in the second phase of planning.

The decoupling method fails when interaction is critical -- in the example of two robots trying to switch places in a narrow corridor, for example. When dealing with articulated robots, the dimension of the configuration space grows as the number of joints.

A robot's motions may be restricted by kinematic constraints. There are two kinds:

**holonomic constraints**- A holonomic equality constraint is an equality relation among the
parameters of the minimally-represented configuration space that can
be solved for one of the parameters. Such a relation reduces the
dimension of the actual configuration space of the robot by one. A
set of
*k*holonomic constraints reduces it by*k*. For example, a robot limited to rotating around a fixed axis has a configuration space of dimension 4 instead of 6 (since revolute joints impose 2 holonomic constraints). **nonholonomic constraints**- A nonholonomic equality constraint is a non-integrable equation
involving the configuration parameters and their derivatives (velocity
parameters). Such a constraint does not reduce the dimension of the
configuration space, but instead reduces the dimension of the space of
possible differential motions.

For example, a car-like robot has 3 dimensions: two for translation and one for rotation. However, the velocity of**R**is required to point along the main axis of**A**. (This is written as*- sin T dx + cos T dy = 0*.)

The instantaneous motion of the car is determined by two parameters: the linear velocity along the main axis, and the steering angle. However, when the steering angle is non-zero, the robot changes orientation, and its linear velocity with it, allowing the robot's configuration to span a three-dimensional space. Restricting the steering angle to pi/2 restricts the set of possible differential motions without changing its dimension.

Nonholonomic constraints restrict the geometry of the feasible free paths between two configurations. They are much harder to deal with in a planner than holonomic constraints.

In general, a robot with nonholonomic constraints has fewer
*controllable* degrees of freedom than it has actual degrees of
freedom; these are equal in a holonomic robot.

A robot may have little or no prior knowledge about its workspace.
The more incomplete the knowledge, the less important the role of
planning. A more typical situation is when there are errors in robot
control and in the initial models, but these errors are contained
within bounded regions.

Many typical robot operations require the robot to grasp an object. The rest of the operation is strongly influenced by choices made during grasping.

Grasping requires positioning the gripper on the object, which requires generating a path to this position. The grasp position must be accessible, stable, and robust enough to resist some external force. Sometimes a satisfactory position can only be reached by grasping an object, putting it down, and re-grasping it. The grasp planner must choose configurations so that the grasped objects are stable in the gripper, and it should also choose operations that reduce or at least do not increase the level of uncertainty in the configuration of the object.

The object to be grasped is the *target object*. The
*gripping surfaces* are the surfaces on the robot used for
grasping.

There are three principal considerations in gripping an object. They are:

**safety**-- the robot must be safe in the initial and final configurations**reachability**-- the robot must be able to reach the initial grasping configuration and, with the object in hand, reach the final configuration**stability**-- the grasp should be stable in the presence of forces exerted on the grasped object during transfer and parts-mating motions

*Example: peg placement*. By tilting a peg the robot can
increase the likelihood that the initial approach conditions will have
the peg part way in the hole. Other solutions are chamfers (a widening
hole, producing the same effect as tilting), search along the edge,
and biased search (introduce bias so that search can be done in at
most one motion and not two, if the error direction is not known).

These theorems are taken from Latombe's book (see Sources).

**Theorem 1 (Polyhedral bodies)** *Planning a free path for a
robot made of an arbitrary number of polyhedral bodies connected
together at some joint vertices, among a finite set of polyhedral
obstacles, between any two given configurations is a PSPACE-hard
problem.*

Put another way, planning a free path in a configuration space of arbitrary dimension among fixed obstacles is PSPACE-hard.

**Theorem 2 (Joints)** *In the absence of obstacles, deciding
whether a planar linkage in some initial configuration can be moved so
that a certain joint reaches a given point in the plane is
PSPACE-hard.*

**Theorem 3 (Rectangular robots)** *Planning a path in the
configuration space of a multi-bodied robot consisting of all
rectangles is PSPACE-hard.*

**Theorem 4 (Planar arm)** *Consider a planar arm consisting
of arbitrarily many links serially connected by revolute joints such
that all the links are constrained to move in the plane. Planning a
free path for such an arm among a finite number of polygonal
obstacles, between any two given configurations is PSPACE-hard.*

**Theorem 5 (Upper bound for fixed dimension)** *A free path
in a configuration space of any fixed dimension m, when the free space
is a set defined by n polynomial constraints of maximal degree d, can
be computed by an algorithm whose time complexity is exponential in m
and polynomial in both n and d.*

This theorem is based on reducing the planning problem to the
problem of deciding the satisfiability of sentences in the first-order
theory of the reals. The important observation is that **the
complexity of path planning increases exponentially with the dimension
of the configuration space**.

**Theorem 6 (Velocity-bounding)** *Planning the motion of a
rigid object translating without rotation in three dimensions among
arbitrarily many moving obstacles that may both translate and rotate
is a PSPACE-hard problem if the velocity modulus of the object is
bounded, and an NP-hard problem otherwise.*

**Theorem 7 (Uncertainty)** *Planning compliant motions for a
point in the presence of uncertainty, in a three-dimensional
polyhedral configuration space with an arbitrarily large number of
faces, is an NEXPTIME-hard problem.*

It is possible to approximate the real problem before giving it to the motion planner; it is reasonable to trade generality for improved time performance.

One way of reducing complexity is by reducing the dimension of the
configuration space. This can be done by replacing the robot by the
surface or volume it sweeps out when it moves along *r*
independent axes. This corresponds to projecting the
*m*-dimensional configuration space along *r* of its
dimensions. The projected configuration space has only *m - r*
dimensions. (As an example, imagine a robot with an extended arm as a
disk; then you don't need to worry about rotations.)

Simplification heuristics make only local plans, by breaking the problem into subproblems. For example, many localities are stereotyped situations, such as moving through a door or turning in a corridor.

This approach breaks the problem down into *task-achieving
behaviors* (such as wandering, avoiding obstacles, or making maps)
rather than decomposing it functionally (into sensing, planning,
acting).

In subsumption architectures, *levels of competence* are
stacked one on top of another, ranging from the lowest level (object
avoidance) to higher levels for planning and map-making. Higher
levels may interfere with lower levels, but each level's architecture
is built, tested and completed before the next level is added. In
this way the system is robust and incrementally more powerful.

Individual levels consist of augmented finite state machines connected by message-passing wires. Higher levels may inhibit signals on these wires, or replace them with their own signals; this is how they exercise control over more basic functions.

(For more information, see the AI Qual Summary on *Agent
Architectures*.)

Shakey was the forerunner of many intelligent robot projects. The
Shakey system was controlled by PLANEX, which accepted goals from the
user, called a STRIPS subsystem to generate plans, and then executed
them via *intermediate-level actions*. These ILA's were
translated into complex routines of *low-level actions* that
had some error detection and correction capabilities.

After each action was executed, PLANES would execute the shortest plan subsequence that led to a goal and whose preconditions were satisfied. In this way, actions that failed would be retried and serendipity would lead to reduced effort. If no subsequence applied, PLANEX called STRIPS to make a new plan.

An approach pioneered by Stan Rosenschein that begins with an explicit representation and reasoning system, but compiles it into a finite-state machine whose inputs come from the environment and whose outputs connect to effectors.

This compilation approach distinguishes between the use of explicit
knowledge representation by the designers (the "grand strategy") and
the use of explicit knowledge within the agent architecture (the
"grand tactic"). Rosenschein's compiler generates FSMs that can be
*proved* to correspond to logical propositions about the
environment, provided the compiler has the correct initial state and
physics.

The FLAKEY system at SRI used situated automata to navigate, run errands, and ask questions, and had no explicit representation.

TCA [Simmons and Mitchell, 1989]. This architecture combines reactive systems with traditional AI planning. TCA is a distributed system with centralized control, designed to control autonomous robots for long periods in unstructured environments, such as the surface of Mars.

Robo-SOAR [Laird et al., 1980] is an extension of SOAR to a robot architecture.

PRS [Georgeff and Lansky, 1987] is a symbolic robot planning system
that interleaves planning and execution. In PRS, goals represent
robot behaviors rather than world states. PRS contains procedures for
turning goals into subgoals or iterations of subgoals.

Ronny Kohavi, *AI Qual Note #20: Robotics*.

Russell and Norvig, *Artificial Intelligence: A Modern
Approach*, Chapter 25.

`Latombe, Chapter 1.
`

See the AI Qual Reading List for further details on these sources.

Patrick Doyle | pdoyle@cs.stanford.edu |