{-# LANGUAGE CPP #-}
import Data.Function (on)
import Data.Either (partitionEithers)
import Data.List (minimumBy, intersperse, nub, sortOn)
import Data.Map (Map)
import qualified Data.Map as M
import Data.Maybe (catMaybes)
#ifdef __HASTE__
import Text.Parsec
import Control.Monad (void)
import Haste.DOM
import Haste.Events
import Haste.Foreign (ffi)
import Haste.JSString (pack)
lookupMax m = if M.null m then Nothing else Just $ M.findMax m
type Parser = Parsec String ()
digitChar = digit; letterChar = letter
anySingleBut = noneOf . pure; some = many1
#else
import Text.Megaparsec
import Text.Megaparsec.Char
lookupMax = M.lookupMax
type Parser = Parsec () String
#endif
Euclid’s Algorithm vs. Euclid’s Geometry
Euclid alone has looked on Beauty bare. On the other hand, Euclid’s algorithm alone shrouds the beauty of Euclidean geometry in a fog of algebra, yet is surprisingly effective at providing answers:
Theorem:
Divide and conquer
Euclid’s algorithm is powered by a division that finds a remainder smaller than the divisor. Traditionally, we start from two natural numbers \(s\) and \(p \ne 0\), and find \(q, r\) such that:
\[ s = p q + r \]
with \(r < p\). We iterate until we reach the smallest possible remainder, namely zero, and in doing so we find the answer seek, which in this case is the greatest common divisor of \(s\) and \(p\).
Euclid’s algorithm also works for polynomials if the coefficients are rational, or from some other field. In this setting, we measure size by looking at the degree of a polynomial, and once again the remainder polynomial is guaranteed to decrease until we reach zero. Namely, given the dividend and divisor polynomials \(s(x)\) and \(p(x) \ne 0\), we find quotient and remainder polynomials \(q(x), r(x)\) such that:
\[ s(x) = p(x) q(x) + r(x) \]
with \(\deg r(x) < \deg p(x)\). (Here, we consider the degree of 0 to be negative.)
But what if the coefficients are integers, or themselves polynomials in other variables? Then the first step of polynomial long division only works if the leading coefficient of the dividend is an exact multiple of the leading coefficient of the divisor. Even then, we run into the same issue in the next step of the division, and again require compatible leading coefficients.
This suggests loosening the rules. We permit multiplying the dividend by some constant, that is, we allow the choice of some constant factor \(a\) in the following pseudo-division:
\[ a s(x) = p(x) q(x) + r(x) \]
As before \(p \ne 0\) and we want \(\deg r < \deg p\).
If we choose \(a\) to be a sufficiently high power of the leading coefficient of the divisor \(p(x)\), then pseudo-division succeeds.
Wu’s method uses the Euclidean algorithm with pseudo-division to solve Euclidean geometry problems. See chapter 5 of John Harrison, Handbook of Practical Logic and Automated Reasoning.
Multivariate Madness
We represent a monomial (product of variables) with a map of variables to their multiplicities:
type Var = String
type Mo = Map Var Int
We represent a polynomial with a map of monomials to their coefficients:
newtype Poly = Poly (Map Mo Integer) deriving Eq
In some applications, the monomials should be ordered by (combined) degree. Fortunately, we won’t need this.
Unfortunately, we do need the ability to view a multivariate polynomial as a
univariate polynomial in some given variable x
. We represent a univariate
polynomial with a map from the degree of a nonzero term to its coefficient,
which is a multivariate polynomial from which x
should be absent.
type Univ = Map Int Poly
We now have three varieties of related maps bouncing around our code. It might
be best to hide the maps behind data types and use a Ring
typeclass so we can
reuse various mathematical operators. However, this requires more declarations
than I’d prefer for a toy demo, so I settled on only hiding the map for the
multivariate polynomials.
We give the multivariate polynomials pretty binary operators. In contrast, operations on monomials are mostly inlined.
noZ :: (Num n, Eq n) => Map k n -> Map k n
noZ = M.filter (/= (fromIntegral 0))
pZero = Poly mempty
pOne = Poly $ M.singleton mempty 1
pIntegral n = Poly $ M.singleton mempty $ fromIntegral n
pVar v = Poly $ M.singleton (M.singleton v 1) 1
pNull (Poly p) = M.null p
pNegate (Poly p) = Poly $ M.map negate p
(Poly p) .+ (Poly q) = Poly $ noZ $ M.unionWith (+) p q
p .- q = p .+ pNegate q
(Poly p) .* (Poly q) = Poly $ noZ $ M.fromListWith (+)
[(k1 `moMul` k2, v1 * v2) | (k1, v1) <- M.toList p, (k2, v2) <- M.toList q]
where moMul xs ys = noZ $ M.unionWith (+) xs ys
pSetZero v (Poly p) = Poly $ M.filterWithKey (\xs _ -> M.notMember v xs) p
Operations on univariate polynomials are similar. One difference arises in multiplication: instead of multiplying monomials, we add degrees.
uCanon = M.filter $ \(Poly p) -> not $ M.null p
uConst a = uCanon $ M.singleton 0 a
uAdd p q = uCanon $ M.unionWith (.+) p q
uNegate p = M.map pNegate p
uSub p q = uAdd p $ uNegate q
uMul p q = uCanon $ M.fromListWith (.+)
[(k1 + k2, v1 .* v2) | (k1, v1) <- M.toList p, (k2, v2) <- M.toList q]
Now for pseudo-division on polynomials, the star of the show.
Let s
and p
be univariate polynomials. Let a
be the leading coefficient
of the divisor. Then our next function returns k
and r
such that
\[ a^k s = p q + r \]
for some quotient q
.
Each step of the long division involves \(x^{m-n}\) times the divisor times an appropriate constant, where \(m\) is the degree of the dividend and \(n\) is the degree of the divisor.
We optimize slightly: if the leading coefficients happen to match then we
can skip one step of scaling up the dividend by a
.
euclid :: Univ -> Univ -> (Int, Univ)
euclid s p = go 0 s where
(n, a) = M.findMax p
go k s
| M.null s = (k, s)
| m < n = (k, s)
| a == b = go k $ s `uSub` p'
| otherwise = go (k + 1) $ (uConst a `uMul` s) `uSub` (uConst b `uMul` p')
where
(m, b) = M.findMax s
p' = M.mapKeysMonotonic (+ (m - n)) p
The respect
function views a multivariate polynomial as a univariate
polynomial with respect to a given variable. I only used its inverse
disrespect
during testing.
respect :: Var -> Poly -> Univ
respect v (Poly p) = M.map Poly $ M.fromListWith M.union $ go <$> M.toList p
where
go (xs, coeff) = case M.lookup v xs of
Nothing -> (0, M.singleton xs coeff)
Just k -> (k, M.singleton (M.delete v xs) coeff)
disrespect :: Var -> Univ -> Poly
disrespect v u = Poly $ M.fromList $
[(if k > 0 then M.insert v k xs else xs, r)
| (k, Poly p) <- M.toList u, (xs, r) <- M.toList p]
Triangles in Algebra
Cartesian coordinates transform questions of geometry into questions of algebra. Consider the theorem: if M is the midpoint of AC, and BM is perpendicular to AC, then AB = AC. We translate it as follows.
If:
-
\(A_x + C_x - 2 M_x = 0\)
-
\(A_y + C_y - 2 M_y = 0\)
-
\((A_x - C_x) (M_x - B_x) + (A_y - C_y) (M_y - B_y) = 0\)
Then:
-
\((A_x - B_x)^2 + (A_y - B_y)^2 - (B_x - C_x)^2 - (B_y - C_y)^2 = 0\)
More generally, theorems of Euclidean geometry often take the following form: when the multivariate polynomials \(p_1, …, p_n\) are all zero, the multivariate polynomial \(p\) is zero.
Suppose the variable \(x\) appears in at least two of the \(p_1,…,p_n\). We view them as univariate polynomials \(f(x)\) and \(g(x)\), labeled so that \(\deg f \ge \deg g\). By pseudo-division, we can scale up \(f\) and subtract a multiple of \(g\) to produce a polynomial \(f'\) with \(\deg f' < \deg g\).
The zeros of \(f\) and \(g\) are also zeros of \(f'\), so we may replace \(f\) with \(f'\). This is like Gaussian elimination, but each step is irreversible as the converse may be untrue. Recall the scaling factor possibly contains variables that we are temporarily viewing as constants, so it may really be a polynomial with its own zeros.
Iterating this removes \(x\) from all but one polynomial in \(p_1,…,p_n\). We have thus accomplished a noteworthy feat simply by chasing smaller and smaller remainders. Euclid’s algorithm strikes again!
We repeat on the remaining \(n - 1\) polynomials: we pick some other variable \(y\) and eliminate it from all but one of them, and so on. After \(n\) repetitions we reach a triangular form \(p_1,…,p_k\): for some variables \(x_1,…,x_n\), if we view all other variables as constants, then the polynomial \(p_k\) only contains the variables \(x_1,…,x_k\).
Our triangulate
function eliminates variables in the order they appear in
a given list. It returns a list of polynomials viewed as univariate polynomials
in the given variables. The variable names are recorded with their
corresponding polynomials.
For pseudo-division we choose a divisor from the candidates of lowest degree as heuristically this ought to reduce the degrees of the other polynomials most rapidly.
triangulate :: [Var] -> [Poly] -> [(Var, [Univ])]
triangulate [] [] = []
triangulate [] ps = error "leftovers"
triangulate (v:vt) ps = go [] $ respect v <$> ps where
findConstant p = case lookupMax p of
Nothing -> Left pZero
Just (0, c) -> Left c
_ -> Right p
go done todo
| length ncs <= 1 = (v, ncs) : triangulate vt (cs ++ done)
| otherwise = go (cs ++ done) $ p : filter (not . M.null) (map (\s -> snd $ euclid s p) ncs)
where
(cs, ncs) = partitionEithers $ findConstant <$> todo
p = minimumBy (compare `on` (fst . M.findMax)) ncs
Let us state our transformed problem: given a triangular set of polynomials \(p_1,…,p_n\) with variables \(x_1,…,x_n\) where all other variables are viewed as constants, our goal is to show that when they are all zero, \(p\) is also zero.
Once again, we call upon pseudo-division and recursion. Due to triangulation, only \(p_n\) can possibly shed any light on the effect of \(x_n\) on our answer. Viewing our polynomials as univariate in \(x_n\), we pseudo-divide:
\[ a p(x_n) = p_n(x_n) q(x_n) + r(x_n) \]
When \(p_n(x_n) = 0\), we have \(p(x_n) = 0\) if \(a \ne 0\) and \(r(x_n)\) is the zero polynomial, that is, each coefficient of \(r(x_n)\) is zero. Again, the implication is one-way; these conditions are sufficient, but may not be necessary.
We record \(a \ne 0\), and as for the coefficients, each is a polynomial not containing \(x_n\), so we can recursively generate more conditions with the triangular set \(p_1,…p_{n-1}\).
We bottom out when nothing remains in the triangular set. At this point, we have no choice but to simply generate the condition \(r = 0\). As an extreme example, if asked to prove \(AB = CD\) with no premises, then the triangular set is initially empty, and we simply return the condition \(AB = CD\).
Typically, Wu’s method generates conditions that prevent degeneracy. For example, an equation might state that a certain length is nonzero.
genDegens :: [(Var, [Univ])] -> Poly -> [(Poly, Bool)] -> [(Poly, Bool)]
genDegens [] p acc = (p, False):acc
genDegens ((v, qs):tri) p acc
| pNull p = acc
| null qs = genDegens tri p acc
| k == 0 = acc'
| otherwise = (snd $ M.findMax q, True):acc'
where
q = head qs
(k, r) = euclid (respect v p) q
acc' = foldr (genDegens tri) acc $ M.elems r
Popularity Contest
The order we process variables during triangulation affects the speed of the
algorithm, thus our wu
function takes a vars
list so the caller can choose
the order.
The results of Euclidean geometry are invariant under translation and rotation,
so we may choose one of the points to lie at the origin, and another one of the
points to the lie on one of the axes, or more generally, two points to lie on
one axis, and a point to lie on the other. If the theorem has certain
constraints, we may zero even more variables. Accordingly, our wu
function
also accepts a list of variables to replace with zero, which should be disjoint
from the vars
list.
Scale-invariance suggests we could replace another variable with unity, but we must take care when points can possibly coincide. We pass on this idea.
We allow multiple polynomials in the conclusion. We prove each is zero separately.
wu vars zeros antes concls =
(flip (,) False . pVar <$> zeros) ++
concat [genDegens tri p [] | p <- unpop concls]
where
tri = triangulate vars $ unpop antes
unpop = map $ \p -> foldr pSetZero p zeros
Heuristically, for a general problem, it seems we should always select the variable that appears in the fewest polynomials, so there is less to eliminate, and we should zero out the most frequently appearing variables to reduce our workload. We ensure the three winners of the popularity contest contain at least one \(x\)-coordinate and at least one \(y\)-coordinate to avoid losing generality.
wuPop antes concls = wu (reverse rest) pops antes concls where
fvs = map (\(Poly p) -> nub $ concatMap M.keys $ M.keys p) antes
freqs = foldr (M.unionWith (+)) mempty $
map (M.fromList . map (flip (,) 1)) fvs
byPop = map fst $ sortOn snd $ M.toList freqs
(pre0, (popx:post0)) = break ((== 'x') . last) $ reverse byPop
(pre1, (popy:post1)) = break ((== 'y') . last) $ pre0 ++ post0
(pop3:rest) = pre1 ++ post1
pops = [popx, popy, pop3]
Fermat and Descartes
We provide one grammar for polynomials, and a higher-level macro grammar so we can type things like "line A B C", which automatically expands to a polynomial representing that these points are collinear.
The last line is the conclusion of the theorem, and the other lines are the antecedents. A single line may expand into multiple polynomials.
macroDefs =
[ ("line", ["(1_x - 2_x) * (2_y - 3_y) - (1_y - 2_y) * (2_x - 3_x)"])
, ("para", ["(1_x - 2_x) * (3_y - 4_y) - (1_y - 2_y) * (3_x - 4_x)"])
, ("perp", ["(1_x - 2_x) * (3_x - 4_x) + (1_y - 2_y) * (3_y - 4_y)"])
, ("mid", ["2_x + 3_x - 2 * 1_x", "2_y + 3_y - 2 * 1_y"])
, ("intersect",
[ "(1_x - 2_x) * (2_y - 3_y) - (1_y - 2_y) * (2_x - 3_x)",
"(1_x - 4_x) * (4_y - 5_y) - (1_y - 4_y) * (4_x - 5_x)"])
, ("len", ["(1_x - 2_x)^2 + (1_y - 2_y)^2 - (3_x - 4_x)^2 - (3_y - 4_y)^2"])
, ("sumsq", ["(1_x - 2_x)^2 + (1_y - 2_y)^2 + (3_x - 4_x)^2 + (3_y - 4_y)^2 - (5_x - 6_x)^2 - (5_y - 6_y)^2"])
, ("same", ["1_x - 2_x", "1_y - 2_y"])
]
Fermat and Descartes both pioneered analytic geometry. Fermat started from algebraic equations, while Descartes started from geometric curves, so we name our parsers accordingly.
Thus fermat
parses a language representing multivariate polynomials and
returns a Poly
, while descartes
parses a macro language describing
geometric constraints and returns strings of our polynomial language.
spc = many $ char ' '
fermat :: Parser Poly
fermat = expr where
sp = (<* spc)
spch = sp . char
expr = foldl (flip ($)) <$> term <*> many
( (.+) <$> (spch '+' *> term)
<|> (flip (.-)) <$> (spch '-' *> term)
)
term = foldl (.*) <$> pwr <*> many (spch '*' *> pwr)
pwr = do
a <- atom
option a $ (iterate (a .*) pOne !!) <$> sp (spch '^' *> (read <$> some digitChar))
atom = between (spch '(') (spch ')') expr
<|> pIntegral . read <$> sp (some digitChar)
<|> pVar <$> var
var = sp $ some $ letterChar <|> char '_'
macro :: Parser ([String], [String])
macro = go . catMaybes <$> some line where
go ls = (concat $ init ls, last ls)
line = spc *> (newline *> pure Nothing <|> nonEmpty <* newline)
nonEmpty = do
mac <- some (anySingleBut ' ') <* spc
case mac of
"raw" -> Just . (:[]) <$> some (anySingleBut '\n')
"#" -> some (anySingleBut '\n') *> pure Nothing
_ -> Just <$> do
args <- some $ some letterChar <* spc
let
sub (n:t@('_':_)) = args !! (read [n] - 1) ++ sub t
sub (h:t) = h:sub t
sub x = x
maybe (fail mac) (pure . map sub) $ lookup mac macroDefs
descartes :: String -> Either String ([Poly], [Poly])
descartes s = either (Left . show) Right $ do
(antes, concls) <- parse macro "" s
(,) <$> mapM (parse fermat "") antes <*> mapM (parse fermat "") concls
Who’s a Pretty Polynomial?
We pretty-print equations in LaTeX format, so MathJax can render them above.
prMo xs = foldr (.) id $ intersperse (' ':) $ (\(x, k) -> (x++) . prPow k) <$> M.toList xs
where
prPow 1 = id
prPow k = showString "^" . shows k
prettyPoly (Poly p) = (if snd h < 0 then ('-':) else id) . go h .
foldr (.) id (map (\(k, v) -> ((if v < 0 then " - " else " + ")++) . go (k, v)) t)
where
go (k, v)
| M.null k = shows $ abs v
| otherwise = prCoeff (abs v) . prMo k
prCoeff 1 = id
prCoeff n = shows n . (' ':)
(h:t) = M.toList p
prettyEqn (p, b) = prettyPoly p . showString (if b then " \\ne 0" else " = 0")
I called the following function from GHCi extensively during development:
solve s = putStr $ either show (($ "") . foldr (.) id .
map ((. ('\n':)) . prettyEqn) . uncurry wuPop) $ descartes s
Frontend
Barebones code for a barebones user interface:
#ifdef __HASTE__
renderEqns es = ("<ul>"++)
. foldr (.) id [("<li>\\("++) . prettyEqn e . ("\\)</li>"++) | e <- es]
. ("</ul>"++)
main :: IO ()
main = withElems ["theorem", "out", "wu"] $ \[iEl, oEl, wuB] -> do
let
setTheorem s = do
setProp oEl "innerHTML" "<p><i>Push the button!</i></p>"
setProp iEl "value" s
handle buttonId s = do
Just b <- elemById buttonId
void $ b `onEvent` Click $ const $ setTheorem s
handle "isosceles" isosceles
handle "centroid" centroid
handle "oops" centroidOops
handle "two_one" twoToOne
handle "orthocenter" orthocenter
handle "simson" simson
handle "pythag" pythag
handle "desargues" desargues
handle "area" area
void $ wuB `onEvent` Click $ const $ do
s <- getProp iEl "value"
case descartes s of
Left e -> setProp oEl "innerHTML" $ "<pre>" ++ show e ++ "</pre>"
Right (as, cs) -> do
setProp oEl "innerHTML"
$ ("<p>"++)
. ("If:"++)
. renderEqns (flip (,) False <$> as)
. ("then:"++)
. renderEqns (flip (,) False <$> cs)
. ("provided"++)
. renderEqns (wuPop as cs)
. ("(up to translation and rotation)</p>"++)
$ ""
ffi $ pack "MathJax.typeset"
setTheorem isosceles
#endif
Geometric Algebra
Having been wowed by conformal geometric algebra as well as Wu’s method, I naturally wondered what might happen if we mixed the two. Could we tackle more problems? Would proofs be shorter and simpler? Could we look on Beauty bare?
It turns out others thought of this years ago. For example, see Li, Automated Theorem Proving in the Homogeneous Model with Clifford Bracket Algebra and other papers.