All Hail Geometric Algebra!

Rotate a cube by 90 degrees around the \(x\)-axis, then rotate it by 90 degrees around the \(z\)-axis. This is equivalent to a single rotation. What is the axis and angle of this single rotation?

If we hold a physical cube in our hands, we can turn it a couple of times and intuitively find the answer. Is there a similarly straightforward mathematical method that rigorously solves this problem?

What if we generalize? That is, given any two rotations that fix the origin, what is the single rotation that is equivalent to their composition? Can we solve this just as easily?

Yes! With geometric algebra, we can solve the first problem comfortably with pen and paper, and apply the same method to its generalization.

We’ll take a whirlwind tour of A Survey of Geometric Algebra and Geometric Calculus by Alan Macdonald. See also:

We’ll explain some ideas in code, for which we need a few declarations. (The source of this page is literate Haskell.)

{-# LANGUAGE CPP #-}
{-# LANGUAGE OverloadedLists #-}
import Control.Monad
import Data.List
import Data.Map (Map)
import qualified Data.Map as M
#ifdef __HASTE__
import Data.IORef
import Haste.DOM
import Haste.Events
import Haste.Graphics.Canvas
import Haste.Graphics.AnimationFrame
#endif

A number is worth 1000 pictures

I was spellbound by Cartesian coordinates as a child. Humble numbers tame the wild world of geometry. Instead of dubiously hoping a hand-drawn picture represents the truth, we can solve geometry problems rigorously by mechanically applying the laws of algebra. (To be fair, it turns out Euclid’s Elements can be formalized without losing too much of its flavour.)

However, in practice, coordinates are unwieldy. For example, try finding the area of a triangle by determining its side lengths with Pythagoreas' Theorem, then plugging them into Heron’s formula. Or try finding the point of intersection between a line and a plane by solving simultaneous equations. (A computer can manage it, though little can be gleaned from the resulting proofs.)

Cartesian coordinates are the assembly language of geometry. Unimportant details clutter our reasoning. High-level ideas are difficult to express.

Nonetheless, coordinates suffice for solving our first problem in a rough manner: we take a cube centered at the origin and figure out where its corners go. We then determine the axis by finding which corners are fixed, and determine the angle by counting rotations until the cube returns to the original position. Of course, this method fails for arbitrary axes and angles.

I could not have invented vector algebra!

Years later, I learned about vectors, which simplified calculations. For example, the area of a triangle is basically a cross product. Though grateful for the labour saved, the mysterious ad hoc nature of these new tools bothered me. How did they figure it all out?

The dot product almost looks like it could be discovered algebraically: we might consider applying the multiplication from the direct sum of rings, that is, zipWith (*) v w. We may notice that the squared norm of a vector v is sum (zipWith (*) v v), which suggests we may eventually realize sum (zipWith (*) v w) has geometric significance.

However, the cross product is baffling. Without a superstructure such as quaternions (which themselves took some effort to discover), the only plausible route to discovering it is to derive the algebra from the geometry. Lagrange did this in 1773.

Even if we manage to derive them somehow, these products are pale shadows of the products of abstract algebra. The dot product is a scalar, a type different to the inputs. Calling it a product feels like an abuse of terminology. The cross product only works in precisely three dimensions, and although it satisfies some identities, it is far from elementary, For starters, it’s non-associative. Is a product with nice properties too much to ask?

And how about Plücker coordinates? They seem underappreciated even by those well-acquainted with cross products, likely because they are abstruser still.

Worst of all, despite all the above machinery, low-level details remain. We would like to write down a handful of symbols describing geometric concepts, then manipulate them with the laws of algebra to perform amazing feats. Vector algebra only partially succeeds.

If we’re skilled with matrices, we can solve our rotation problem by following along the proof of Euler’s rotation theorem. We multiply the matrices for the two given rotations to produce a single matrix, then find an eigenvector to determine the axis of rotation. As for the angle, we could study the effect of the matrix on a vector perpendicular to the axis.

This is disproportionately tedious. Why must we work so hard to handle a simple idea like rotation?

In contrast, geometric algebra gives a clean and simple mathematical framework for rotations, built from a clean and simple vector product. We really can scribble down a few symbols and apply the laws of algebra to do geometry.

Additionally, the weirdest parts of vector algebra, including the cross product and Plücker coordinates, arise naturally in geometric algebra.

Complex numbers simplify

Thanks to polar coordinates, complex numbers excel at describing 2D rotations. See also phasors for a completely different problem that also benefits from complex numbers and polar coordinates.

Complex numbers should be more widely used, even for discrete problems. For example, when navigating a 2D maze, we might need to rotate our orientation left or right by 90 degrees. If we index the array with Gaussian integers, 90-degree rotations are simply multiplications by \(\pm i\).

To go beyond 2D, we can augment complex numbers so they become quaternions, which excel at describing 3D rotations.

A contentious article advocates vectors over quaternions. While the vigorous prose is admirable, the argument is weak. The claim that quaternions are "very complicated" is false, but more risible is the howler: "But why has nobody looked beyond quaternions for a simpler solution until now?". In fact, a simpler solution was found long ago: geometric algebra.

The real problems with quaternions are that translations and rotations mix poorly, and that they limit us to 3D.

Geometric algebra is a common framework for vector algebra and quaternions. Despite its power, geometric algebra is simple, and rotations in any dimension closely resemble the elegant descriptions of 2D rotations with complex numbers and 3D rotations with quaternions.

Thus the conclusion of the article is right for the wrong reasons. We should skip learning quaternions, but not because vectors are better. Rather, it’s because we should study the more powerful geometric algebra, which teaches us both quaternions and vector algebra along the way.

Geometric algebra for (almost) free

We can trivially define a product with nice properties on any set: just take the free monoid, or as programmers call it, string concatenation. (Here, each character represents an element of the set and the empty string is the identity element.)

Similarly, we can define a vector product with nice properties thanks to free algebras. Again, we concatenate strings: given vectors \( \newcommand{\n}{\mathbf{n}} \newcommand{\u}{\mathbf{u}} \newcommand{\v}{\mathbf{v}} \newcommand{\w}{\mathbf{w}} \newcommand{\i}{\mathbf{i}} \newcommand{\e}{\mathbf{e}} \newcommand{\f}{\mathbf{f}} \newcommand{\vv}[1]{\mathbf{#1}} \u, \v, \w\), we can form products such as \(\vv{www}\), \(\vv{vuwvwuuvw}\), and so on. Then we mix in scalars, and add and multiply them together to get polynomial-like expressions, which we call multivectors. For example:

\[ (2 \vv{vuvw} + 1)(3\v - 2) = 6\vv{vuvwv} - 4\vv{vuvw} + 3\v - 2 \]

In Haskell:

type Multivector = Map String Double

freeMul :: Multivector -> Multivector -> Multivector
freeMul x y = M.filter (/= 0) $
  M.fromListWith (+) $ f <$> M.assocs x <*> M.assocs y where
    f (s, m) (t, n) = (s ++ t, m * n)

prop_exampleFreeMul = freeMul [("vuvw", 2), ("", 1)] [("v", 3), ("", -2)]
  == [("vuvwv", 6), ("vuvw", -4), ("v", 3), ("", -2)]

This product is nice by definition. There is an identity element. It’s associative. It’s distributive over vector addition. What a shame it’s also useless: how do we geometrically interpret \(\vv{uvwuuv}\)?

There’s no way. But by constraining our free algebra with one simple relation, we obtain the vaunted geometric algebra. We define the geometric product by subjecting the free product to the following constraint: for any vector \(\v\),

\[ \v\v = \v\cdot\v \]

We could generalize by replacing the dot product with any quadratic form, and by considering any vector space over any field, but we focus on the standard dot product on \(\mathbb{R}^n\), where the resulting geometric algebra is denoted \(\mathbb{G}^n\).

We could have instead written \(\v^2 = |\v|^2\), but this hides our need for the polarization identity:

\[ \u \cdot \v = \frac{1}{2} ((\u + \v) \cdot (\u + \v) - \u \cdot \u - \v \cdot \v) \]

From our constraint, this is:

\[ \u \cdot \v = \frac{1}{2} ((\u + \v)^2 - \u^2 - \v^2) \]

By elementary algebra, this implies:

\[ \u \cdot \v = \frac{1}{2} (\u\v + \v\u) \]

Hence when \(\u\) and \(\v\) are orthogonal, we have \(\u\v = -\v\u\).

As its name suggests, the geometric product is useful for geometry, and also possesses nice algebraic properties. Even division is possible sometimes: the inverse of a nonzero vector \(\v\) is \(\v / |\v|^2\).

[By the way, this section’s title is unoriginal. Also, while I could not have invented geometric algebra either, at least it’s simpler than vector algebra.]

The canonical basis

To recap:

  1. We can replace \(\v\v\) with \(|\v|^2\).

  2. We can replace \(\u\v\) with \(-\v\u\) when \(\v\) and \(\u\) are orthogonal.

Thus if \(\e_1, …​, \e_n\) is an orthonormal basis for \(\mathbb{R}^n\), then every multivector of \(\mathbb{G}^n\) can be written as a linear combination of products of the form:

\[ \e_{i_1}…​\e_{i_k} \]

where \(i_1,…​,i_k\) is a strictly increasing subsequence of \([1..n]\). These products are a canonical basis for \(\mathbb{G}^n\), which implies the basis for an \(n\)-dimensional vector space yields a canonical basis of size \(2^n\) for its geometric algebra.

Here’s a function showing the details. It computes the geometric product in canonical form, assuming each character that appears in the input represents an orthonormal basis vector of the space. For clarity, we employ the inefficient gnome sort:

-- Computes geometric product of multivectors written in terms of some
-- orthonormal basis. Returns result in canonical form.
mul :: Multivector -> Multivector -> Multivector
mul x y = M.filter (/= 0) $
  M.fromListWith (+) $ basisMul <$> M.assocs x <*> M.assocs y

basisMul (s, m) (t, n) = gnome [] (s ++ t) (m * n) where
  gnome pre (a:b:rest) c

    -- Vectors are in order: move on.
    | a < b  = gnome (pre ++ [a]) (b:rest) c

    -- Same vector: remove both copies since magnitude = 1.
    | a == b = back pre rest c

    -- Wrong order: flip and back up one step. Since the vectors are
    -- orthogonal, we flip the sign of the geometric product.
    | a > b  = back pre (b:a:rest) (-c)

    where
      back []  rest c = gnome []         rest            c
      back pre rest c = gnome (init pre) (last pre:rest) c

  gnome pre rest c = (pre ++ rest, c)

one = [("", 1)]
canon = mul one

prop_exampleOrthonormal = canon [("uvwuuv", 1)] == [("uw", -1)]

-- Returns canonical basis given orthonormal basis.
canonicalBasis :: String -> [String]
canonicalBasis = subsequences

We glossed over a potential problem. While it is clear we can reach a product of basis vectors with certain indices that strictly increase from left to right, why do we always get the same sign no matter how we apply the rules? This is easy to see for those familiar with alternating groups. In any case, we present a proof later.

If the \(k\) is the same for every term when an multivector is written canonically, then we say the multivector is a \(k\)-vector. A 0-vector is also called a scalar, a 1-vector a vector, a 2-vector a bi-vector, and a 3-vector a tri-vector:

-- Assumes input is in canonical form.
name :: Multivector -> String
name x
  | null x    = "zero multivector"
  | all ((== k) . length) ss = show k ++ "-vector" ++ t
  | otherwise = "mixed multivector"
  where
    s:ss = M.keys x
    k    = length s
    t | null (aka k) = ""
      | otherwise    = " or " ++ aka k

aka 0 = "scalar"
aka 1 = "vector"
aka 2 = "bivector"
aka 3 = "trivector"
aka _ = ""

prop_exampleNames = and
  ([ name []                     == "zero multivector"
  ,  name [("", 42)]             == "0-vector or scalar"
  ,  name [("x", -5), ("y", 98)] == "1-vector or vector"
  ,  name [("ab", 2), ("bc", 3)] == "2-vector or bivector"
  ,  name [("abc", 4)]           == "3-vector or trivector"
  ,  name [("abcde", 0.01)]      == "5-vector"
  ,  name [("", 2), ("bc", 3)]   == "mixed multivector"
  ] :: [Bool])

Lastly, \(n\)-vectors are also called pseudoscalars. We write \(\vv{I}\) for the pseudoscalar \(\e_1 …​ \e_n\), which is unique up to sign for any choice of basis. (This is implied by the omitted proof.) Then \(\vv{I}^{-1} = \e_n …​ \e_1 = (-1)^{n-1}\vv{I}\).

The fundamental identity

Let’s try a couple of 2D examples. Let \(\e, \f\) be an orthonormal basis of a plane.

prop_2Dsquare = join mul [("e", 3), ("f", 4)] == [("", 25)]

prop_2Dmul = mul
  [("e", 3), ("f", 4)] [("e", 5), ("f", 6)] == [("", 39), ("ef", -2)]

prop_2DmulOrth = mul
  [("e", 3), ("f", 4)] [("e", -4), ("f", 3)] == [("ef", 25)]

We see the geometric product of two vectors is a scalar and a bivector. Furthermore, appearances of \(0\) and \(25 = 5^2 = 3^2 + 4^2\) hints at a deeper truth. Expanding \((a \e + b \f)(c \e + d \f)\) proves the fundamental identity: for any vectors \(\u\) and \(\v\) in our plane,

\[ \u\v = \u\cdot\v + \u\wedge\v \]

where \(\u\wedge\v\) is the outer product or wedge product, the bivector component of the geometric product of two vectors, and satisfies:

\[ \u\wedge\v = |\u| |\v| \sin \theta \i \]

where \(\theta\) is the angle between \(\u\) and \(\v\), and \(\i\) is the bivector \(\e \f\).

We interpret \(\i\) to mean the plane generated by \(\e\) and \(\f\). With vector algebra, we represent planes with normal vectors, which only makes sense in exactly three dimensions. With geometric algebra, we represent a plane by multiplying two orthonormal vectors that generate them; this works in any dimension.

We find \(\i^2 = -1\), which explains the choice of notation. In fact, for any given plane we can construct a complex number system in this manner, that is, multivectors of the form \(a + b\i\). Such numbers are called geometric algebra complex numbers.

We can also write:

\[ \u\v = |\u| |\v| e^{\i\theta} \]

which resembles a standard complex number in polar form.

Suppose \(|\u| = |\v|\). Then left multiplying by \(u\) gives:

\[ \v = \u e^{\i\theta} \]

Thus the geometric algebra interpretation of \(e^{\i\theta}\) is a rotation by \(\theta\) in the plane \(\i\), rather than a particular point on a unit circle measured from some predefined zero bearing. This contrast is a rotational analogue of the difference between vectors and Cartesian points.

Rotations, in any dimension

We noted complex numbers excel at describing rotations in two dimensions, and quaternions in three. Geometric algebra complex numbers excel at describing rotations in any dimension.

Let \(\i\) be the product of two orthonormal vectors representing the plane in which we wish to rotate. Let \(\theta\) be the angle about the origin we wish to rotate. Let \(\u\) be a vector.

Decompose \(\u\) with respect to \(\i\):

\[ \u = \u_{\perp} + \u_{\parallel} \]

That is, we find vectors \(\u_{\perp} \cdot \i = 0\) and \(\u_{\parallel} \wedge \i = 0\) satisfying the above summation. These are unique and readily computed, but for this section we only need their existence.

Rotation only affects the second term, so \(\u\) rotates to:

\[ \begin{array}{ccl} \u_\perp + \u_\parallel e^{\i\theta} &=& \u_\perp e^{-\i\theta/2} e^{\i\theta/2} + \u_\parallel e^{\i\theta/2} e^{\i\theta/2} \\ &=& e^{-\i\theta/2} \u_\perp e^{\i\theta/2} + e^{-\i\theta/2} \u_\parallel e^{\i\theta/2} \\ &=& e^{-\i\theta/2} \u e^{\i\theta/2} \end{array} \]

where we have used \(\u_\perp \i = \i \u_\perp\) and \(\u_\parallel \i = - \i \u_\parallel\), identities that can be verified by setting \(\i = \e\f\) for orthonormal vectors \(\e,\f\) and a touch of algebra.

In 3D, choose a unit normal \(\n\) to \(\i\) so that \(\e\f\n =\vv{I}\). Then we can phrase a rotation of a vector \(\u\) about the axis \(\n\) by the angle \(\theta\) as:

\[ e^{-\n\vv{I}\theta/2} \u e^{\n\vv{I}\theta/2} \]

This formula implies Euler’s rotation theorem, as composing two rotations results in an expression of the same form, but our argument is incomplete until we deliver the missing proofs we’ve been promising.

The multivector \(e^{-\n\vv{I}\theta/2}\) is called a rotor.

We’ve glossed over more issues. We originally proved a result about rotations for a particular basis \(\e,\f\); why can we immediately apply the results to any two orthonormal vectors? Can we even be sure that a \(k\)-vector is always a \(k\)-vector after switching to a different orthonormal basis? Similarly, we defined \(\v\v = \v \cdot \v\) in one particular orthonormal basis. Why should it then hold in all of them? Even if it does, do extra relations materialize? Again, we postpone the proof.

Problem solved

We now solve our opening problem with a minimum of paperwork. Let \(\newcommand{\x}{\vv{x}} \newcommand{\y}{\vv{y}} \newcommand{\z}{\vv{z}} \x, \y, \z\) be an orthonormal basis of \(\mathbb{R}^3\), so \(\vv{I} = \x\y\z\). The following code takes coordinates to a multivector and back:

toGA :: [Double] -> Multivector
toGA = M.fromList . zip (pure <$> "xyz")

fromGA :: Multivector -> [Double]
fromGA m = (m!) . pure <$> "xyz"

We focus on the right half of the rotation formula. For a rotation of 90 degrees around the \(x\)-axis followed by a rotation of 90-degrees around the \(z\)-axis, it reads:

\[ \begin{array}{ccc} e^{\x\vv{I}45^\circ} e^{\z\vv{I}45^\circ} &=& (\cos 45^\circ + \y\z \sin 45^\circ)(\cos 45^\circ + \x\y \sin 45^\circ) \\ &=& \frac{1}{2} + \frac{\y\z + \x\y - \x\z}{2} \\ &=& \cos 60^\circ + c(\x + \z + \y)\vv{I} \sin 60^\circ \end{array} \]

for some constant \(c > 0\). Therefore the combined rotation is 120 degrees around the axis \(\x + \y + \z\), or in Cartesian coordinates, the axis through the origin and (1, 1, 1).

We can easily write code to compute the axis and angle of the composition of any two 3D rotations with axes through the origin, and use it to check our work:

rotr :: [Double] -> Double -> Multivector
rotr axis theta = canon [("", c), ("yz", nx*s), ("zx", ny*s), ("xy", nz*s)]
  where
    c            = cos (theta / 2)
    s            = sin (theta / 2)
    [nx, ny, nz] = map (/ sqrt(sum $ map (^2) axis)) axis

m ! s = M.findWithDefault 0 s m

axisAngle :: Multivector -> ([Double], Double)
axisAngle m = ([m!"yz", -(m!"xz"), m!"xy"], acos (m!"") * 2)

prop_rotationExample =
  (theta * 180 / pi) ~== 120 && all (~== 1) (map (/ head axis) axis)
  where
    a ~== b       = abs (a - b) < 10**(-6)
    x90           = rotr [1, 0, 0] $ pi/2
    z90           = rotr [0, 0, 1] $ pi/2
    (axis, theta) = axisAngle $ mul x90 z90

Let’s go for a spin!

We animate the rotations in question on this very webpage using our geometric algebra routines:

cubeVertices = replicateM 3 [-1, 1] :: [[Double]]
faces = [[0, 1, 3, 2], [4, 6, 7, 5],
      [0, 4, 5, 1], [2, 3, 7, 6],
      [1, 5, 7, 3], [0, 2, 6, 4]] :: [[Int]]

conjugate r u = rotInv r `mul` u `mul` r
rotInv = M.mapWithKey $ \k v -> if null k then v else -v

#ifdef __HASTE__
-- Darken, so the white side is visible.
rgbs :: [[Int]]
rgbs = map (\c -> c * 220 `div` 255) <$>
-- http://the-rubiks-cube.deviantart.com/journal/Using-Official-Rubik-s-Cube-Colors-268760351
  [[0, 0x51, 0xba], [0, 0x9e, 0x60],
   [0xff, 0xd5, 0], [0xff, 0xff, 0xff],
   [0xff, 0x58, 0], [0xc4, 0x1e, 0x3a]]

rx t = rotr [1, 0, 0] $ t * pi / 2
rz t = rotr [0, 0, 1] $ t * pi / 2

vshadeQ tRaw = conjugate r where
  r | even sec = rbase
    | odd  sec = rbase `mul` ([rx, rz]!!(div sec 2 `mod` 2)) t
  (sec, msec) = (tRaw `mod` 12000) `divMod` 1000
  rs = scanl mul one $ cycle [rx 1, rz 1]
  rbase = rs!!div sec 2
  t = fromIntegral msec / 1000

r111 t = rotr [1, 1, 1] $ t * 2 * pi / 3

vshadeA tRaw = conjugate r where
  r | even sec2 = rbase
    | otherwise = rbase `mul` r111 t
  (sec2, msec2) = (tRaw `mod` 12000) `divMod` 2000
  rs = scanl mul one $ repeat $ r111 1
  rbase = rs!!div sec2 2
  t = fromIntegral msec2 / 2000

paint vshade canvas t = renderOnTop canvas . f <$> filter vis (zip faces rgbs)
  where
  f (face, c) = color (toColor c) . fill . path $ project . (ws!!) <$> face
  ws = nudge . fromGA . vshade (round t) . toGA <$> cubeVertices
  toColor :: [Int] -> Color
  toColor [r, g, b] = RGB r g b
  -- Arrange signs so that x-axis goes right, y-axis goes up,
  -- and z-axis goes out of the screen.
  project :: [Double] -> (Double, Double)
  project [x, y, z] = ((-x/z + 0.5) * 320 + 160, (y/z - 0.5) * 320 + 160)
  -- Move the up, right, and into the screen.
  nudge = zipWith (+) [2, 2, 5]
  -- Cull faces pointing away from us.
  vis (face, _) = d!"" <= 0 where
    -- Use GA to compute a cross product and dot product.
    dual = (`mul` M.singleton "zyx" 1)
    (u:v:_) = zipWith (zipWith (-)) (tail ts) ts
    d = mul (dual (mul (toGA u) (toGA v))) $ toGA (head ts)
    ts = (ws!!) <$> face

main = withElems ["canvasQ", "canvasA"] $ \[qElem, aElem] -> do
  Just canvasQ <- fromElem qElem
  Just canvasA <- fromElem aElem
  paused <- newIORef False
  let
    anim t = do
      render canvasQ $ pure ()
      render canvasA $ pure ()
      sequence_ $ paint vshadeQ canvasQ t
      sequence_ $ paint vshadeA canvasA t
      b <- readIORef paused
      unless b $ void $ requestAnimationFrame anim
    pause = do
      b <- readIORef paused
      writeIORef paused $ not b
      when b $ void $ requestAnimationFrame anim
  _ <- qElem `onEvent` MouseDown $ const pause
  _ <- aElem `onEvent` MouseDown $ const pause
  requestAnimationFrame anim
#endif

We’ve barely scratched the surface. With the geometric algebra homogeneous model of \(\mathbb{R}^n\), we get:

  • 4x4 matrices for manipulating points in 3D, just like with vector algebra: rotations, translations, perspective projection, and so on.

  • Simple formulas for subspaces (lines, planes, and beyond). for example, a line is defined by two points \(p, q\), so its equation is \(p \wedge q\); a plane is defined by three points \(p, q, r\) , so its equation is \(p \wedge q \wedge r\).

  • Simple formulas for transforming subspaces, aka outermorphisms: \(f(p \wedge q) = f(p) \wedge f(q)\).

  • Simple formulas for finding the intersection of subspaces: \(a^* \cdot b\).

The proofs not given

At last we fill the gaps in our reasoning, though we’ll be a little informal. See also Alan Macdonald, An elementary construction of the geometric algebra. Intriguingly, he positions geometric algebra as the rightful successor to the reals in mathematical education: after \(\mathbb{N}\), we should learn \(\mathbb{Z}\), then \(\mathbb{Q}\), then \(\mathbb{R}\), and then geometric algebra, because it includes \(\mathbb{C}\), \(\mathbb{H}\) (quaternions), and even more exotic species like dual quaternions if we can tolerate a little degeneracy.

Our proof differs in that we use rotations rather than reflections. We painstakingly transform one orthonormal basis into another step by step, via rotations in planes spanned by two vectors of the current basis.

The road not taken

Let \(B = \{\e_1,…​,\e_n\}\) be an orthonormal basis of our vector space. Recall we can repeatedly apply \(\e_i \e_i = 1\) and \(\e_i \e_j = -\e_j \e_i\) for \(i \ne j\) to reduce any given product of basis vectors to the form:

\[ \pm \e_{i_1}…​\e_{i_k} \]

where \(i_1 < …​ < i_k\). To avoid double subscripts, we suppress \(\e\) in our notation for the time being: \(\pm i_1…​i_k\).

Why does canonicalization work? We see an index \(i\) is present if and only if there were an odd number of them to begin with, so our only worry is the sign: why do we always wind up with the same sign no matter how we apply the rules?

The trick is to consider the inversion number of \(i_1 …​ i_k\), which is a measure of unsortedness. An inversion is a pair \(m, n\) where \(i_m > i_n\) but \(m < n\). For example, a sorted list has no inversions. The inversion number is the total number of inversions. (It also counts the swaps performed by an unmodified gnome sort, which is forbidden to destroy parts of its input, unlike our variant of gnome sort!)

inversions :: Ord a => [a] -> [(a, a)]
inversions (x:xt) = [(x, y) | y <- xt, y < x] ++ inversions xt
inversions _ = []
inversionNumber :: Ord a => [a] -> Int
inversionNumber = length . inversions

prop_inversionsExample =
  inversions [1,4,3,2,5] == [(4,3),(4,2),(3,2)]
prop_inversionNumberExample =
  inversionNumber (reverse [1..10]) == 10*9`div`2

One rule flips the sign and swaps adjacent unequal indices, which changes the inversion number by \(\pm 1\). The other rule leaves the sign unchanged and introduces or eliminates two adjacent identical indices, which changes the inversion number by an even amount. Thus the sign faithfully records the parity of the inversion number. For example, if the inversion number was odd at the beginning and even at the end, the sign must have changed an odd number of times. Since we wind up sorting the indices, the inversion number is always zero at the end, that is, it is always winds up even, hence all roads lead to the same sign.

Recall we likened the geometric product to string concatenation, which is associative, and the product remains well-defined under our rules of reduction because they are reversable. If we concatenate two reduced strings, we may unreduce to get the concatenation of the unreduced strings. Reduction commutes with concatenation.

We leave it as an exercise to generalize to the case when each \(\e_i \e_i\) equals one of \(-1, 0, 1\). We will need this for conformal geometric algebra and projective geometric algebra.

Incidentally, permutations are called odd or even according to their inversion number, and the even permutations on \(n\) objects form the group \(A_n\), called the alternating group of order \(n\).

The orthonormal basis not taken

Given any orthonormal basis \(B' = \{\e'1,…​,\e'_n\}\), write \(\e'_1 = \sum_i a_i \e_i\) and \(\e'_2 = \sum_i b_i \e_i\). We find \(\e'_1 \e'_1 = 1\) and \(\e'_1 \e'_2 = -\e'_2 \e'_1\) using orthogonality, normality, and the relations between the \(\e_i\). This generalizes to the other \(\e'_i\) vectors, so we have at most \(2^n\) linearly independent products of the \(\e'_i\) vectors, which can all be written as \(\e'{i_1}…​\e'_{i_k}\) for some \(i_1 < …​ < i_k\).

Conversely, writing \(\e_i\) in terms of \(\e'_i\) shows the products of \(\e'_i\) span the entire geometric algebra vector space, so at least \(2^n\) of them are linearly independent. As the upper and lower bounds coincide, the products of vectors we just described form a geometric algebra basis.

Next, if \(B'\) is the same as \(B\) up to permutation and sign, then we can view them as the same basis by relabeling and appropriate negations.

Otherwise, for some \(\e' \in B'\), with suitable relabeling we have \(\e' = a \e_1 + b \e_2 + \v\) for nonzero \(a, b\), where \(\v\) is the rest of the decomposition with respect to \(B\). Rewrite the first two terms in polar form:

\[ a \e_1 + b \e_2 = (r \cos \theta)\e_1 + (r \sin \theta)\e_2 \]

Define \(\vv{R} = e^{-\i \theta /2} = \cos(\theta/2) + \e_1\e_2\sin(\theta/2)\) and let \(\vv{R}'\) be its inverse. We find:

\[ \vv{R}\e_1\vv{R}' = \e_1 \cos \theta + \e_2 \sin \theta \]

and

\[ \vv{R}\e_2\vv{R}' = - \e_1 \sin \theta + \e_2 \cos \theta \]

which we can check are orthonormal. Also, \(\vv{R}\e_i\vv{R}' = \e_i\) for the other basis vectors. Hence \(\vv{R}B\vv{R}' = \{ \vv{R}\e_1\vv{R}', …​, \vv{R}\e_n\vv{R}' \}\) is an orthonormal basis, and by construction:

\[ \e' = r (\vv{R}\e_1\vv{R}') + 0 (\vv{R}\e_2\vv{R}') + \v \]

That is, \(\e'\) has one fewer nonzero coefficient with respect to \(\vv{R}B\vv{R}'\). By repeating this process, we can eliminate all but one nonzero coefficient from \(\e'\) with respect to some basis \(T\); the sole nonzero coefficient must be \(\pm{1}\) by normality, and the other basis vectors must have a corresponding zero coefficient by orthogonality.

We iterate to eliminate all but one nonzero coefficient from all basis vectors. Once reduced to one nonzero coefficient, a basis vector stays that way; it is never involved in future rotations because all other vectors have a zero in its direction. Eventually our basis will change to a permutation of \(B'\) up to sign, and by construction there is some rotor \(\vv{R}\) such that \(\x \mapsto \vv{R}' \x \vv{R}\) describes \(\x\) with respect to the basis \(B'\).

Let \(O\) be the linear map representing the change of orthonormal basis from \(B\) to \(B'\). We extend it to the geometric algebra by defining:

\[O(ab) = O(a) O(b)\]

and extending linearly. This is well-defined, that is, \(ab = cd\) implies \(O(a)O(b) = O(c)O(d)\) because \(O(\e_i) = \vv{R}' \e_i \vv{R}\) for some rotor \(\vv{R}\) for all \(\e_i\). Thus \(O\) is an algebra isomorphism.

We prove \(O\) maps \(k\)-vectors to \(k\)-vectors by multiplying out all the cases. Let \(\v = \e_{i_1}…​\e_{i_k}\), and \(\vv{R} = e^{-\e_1\e_2\theta/2}\), so \(\vv{R}'\v\vv{R}\) describes \(\v\) with respect to the basis \(\vv{R}B\vv{R}' = \{\vv{R}\e_1\vv{R}',\vv{R}\e_2\vv{R}',\e_3,…​,\e_n\}\).

Then:

  • If neither \(\e_1\) or \(\e_2\) appear, then \(\vv{R}'\v\vv{R} = \v\).

  • If exactly one of them appear, say \(i_1 = 1\), then \(\vv{R}'\v\vv{R} = (\cos\theta \vv{R}\e_1\vv{R}' - \sin\theta \vv{R}\e_2\vv{R}') \e_{i_2} …​ \e_{i_k}\); the case \(i_1 = 2\) is similar.

  • If both appear, namely \(i_1 = 1, i_2 = 2\), then \(\vv{R}'\v\vv{R} = \v\).

We get \(k\)-vectors in all cases.

For the general situation when \(\e_i \e_i\) can also be 0 or -1, we use Sylvester’s law of inertia to show the \(\e'_i\) behave the same way.


Ben Lynn blynn@cs.stanford.edu 💡