{-# LANGUAGE CPP #-}
#ifdef __HASTE__
import Control.Monad
import Haste
import Haste.DOM
import Haste.Events
import Haste.Foreign
import Haste.JSString (pack)
import System.Random
#else
import System.Environment
import System.IO
#endif
import Control.Arrow
import Data.Tree
import Text.Parsec
Lovely as a Tree
⚂
Of the many ways to draw binary trees, computer scientists often choose level-based layout, that is, nodes have the same y-coordinate if and only if they belong to the same level. By convention, deeper levels appear below shallower levels, and levels are evenly spaced.
The Reingold-Tilford algorithm efficiently produces some of the prettiest pictures of this style:
-
Left nodes appear to the left of right nodes.
-
Parent nodes are centered above their children.
-
No edges cross.
-
Drawing the reverse of a tree results in its mirror image.
-
A subtree appears the same no matter where it occurs on the tree.
-
The width of the drawing is minimal.
These slides show the Reingold-Tilford algorithm in action, which helped me understand it.
I found a summary of level-based algorithms for drawing trees, which is nice apart from a few errors of varying severity. It seems "Figure 7" should be "Figure 8" in some places; "right" should be "left" in one sentence; "n/3" should be "n/4"; "Principle 6" is misleading because the algorithm of Buchheim et al. merely opportunistically spreads out the siblings between two others at a certain point in the algorithm without moving any other siblings. (Insisting all siblings be evenly spaced results in wider drawings.)
See also the chapter on drawing general trees from the Handbook of Graph Drawing and Visualization.
Preliminaries
Let’s walk through an implementation of the Reingold-Tilford algorithm. We need a bunch of imports:
We use parser combinators to read a simple language that lets us easily supply trees to our program. The language is modeled on the notation for composing functions. The right-associative operator "." connects subtrees, and any alphanumeric string is a leaf.
data Expr = Fun String | Com Expr Expr
expr :: Parsec String () Expr
expr = spaces >> atm `chainr1` com where
atm = fun <|> between (sp $ char '(') (sp $ char ')') expr
fun = sp $ Fun <$> many1 alphaNum
com = const Com <$> sp (char '.')
sp = (<* spaces)
The naive approach
The idea behind the algorithm is simple and elegant. We build the drawing bottom-up. The y-coordinate of a node is implicit, as it is determined by the level. We only need to pick good x-coordinates.
The base case places a leaf node at x = 0.
simpleDraw :: Expr -> Tree (Double, String)
simpleDraw (Fun s) = Node (0, s) []
For the inductive step, we first find the contours of the two subtrees. The right contour of a tree is the sequence of the x-coordinates of the rightmost node in each level, and the left contour is the same for the leftmost nodes.
simpleDraw (Com l r) = Node (m, "") xs where
[ll, rr] = simpleDraw <$> [l, r]
[lCont, rCont] = [fst . head <$> levels rr, fst . last <$> levels ll]
We compare the right contour of the left subtree layout against the left contour of the right subtree layout. We intend to indent one of them so the minimum distance between the contours is 1 unit.
d = 1 - minimum (zipWith (-) lCont rCont)
To ensure the drawing’s leftmost point has x = 0, we indent the right subtree when they are too close, and the left subtree when they are too far apart.
xs | d >= 0 = [ ll, first (+ d) <$> rr]
| otherwise = [first (+(-d)) <$> ll, rr]
Lastly, the parent is given the x-coordinate halfway between its children.
m = (sum $ fst . rootLabel <$> xs) / 2
Though concise, the running time of the above code is quadratic in the number of nodes because:
-
Many nodes might be indented many times.
-
Tracing a contour requires traversing the whole tree.
Thanks to lazy evaluation and zipWith
, we only completely traverse the
shorter of the two subtrees. This behaviour becomes important when we improve
our algorithm.
From quadratic to linear
To fix the first issue, we record indent values in subtree roots. A final top-down pass recursively applies the indents cumulatively to compute the x-coordinates of each node. In essence, we are using:
fmap (x +) . fmap (y +) = fmap ((x + y) +)
(GHC has fusion optimizations that do this automatically for simpler cases.)
To fix the second issue, we maintain auxiliary edges on the tree so we can follow contours without traversing its entirety.
Instead of Data.Tree
we define a custom RT
tree data structure.
The shift
field records the indent amount for the first optimization, and
the link
field may hold an auxiliary edge for the second optimization.
Because of our first optimization, we need to record an indent modifier with
each link.
We parameterize so the node data structure can hold data of any given type.
data RT a = RT { xpos :: Int
, shift :: Int
, hold :: a
, link :: Maybe (Int, RT a)
, kids :: [RT a]
}
The following applies all indents of an RT
tree to produce a Data.Tree
representing the final drawing:
addx :: Int -> RT a -> Tree (Int, a)
addx i rt = Node (xpos rt + shift rt + i, hold rt) $
addx (i + shift rt) <$> kids rt
As for the link
field, observe we can mostly figure out the right contour by
starting at the root and recursively following the last child. Trouble arises
when we reach a leaf but there are still more levels to go.
To solve this problem, for rightmost leaf nodes except those in the last level,
we set link
to point at the rightmost node of the next level. When first
placing a node, it either lies on the deepest level or is guaranteed to be
an internal node, so link
is initially Nothing
.
When combining 2 subtrees, the weave
function traverses the rightmost
children and links of both subtrees until it bottoms out on at least one of
them. Then if needed, it follows at most one more edge or link on the other
subtree to create a new link. It also stores the difference between the total
indent of the link destination and that of the link source so we can update the
indent value accordingly when following links.
Left contours are similarly handled. In fact, to avoid code duplication, our
contour
function takes an f
argument that should be head
for left
contours and last
for right contours, and our weave
function calls a
helper function with id
or reverse
depending on whether it’s acting on
the left or the right side of the subtrees.
We could optimize further. For example, we know only one link on the shallower subtree needs adjustment. However, the asymptotic time complexity is unaffected.
Typical implementations modify data in place, but as our code is pure, we
create new RT
nodes instead.
contour :: ([RT a] -> RT a) -> (Int, RT a) -> [Int]
contour f (acc, rt) = h : case kids rt of
[] -> maybe [] (contour f . first (+ acc')) (link rt)
ks -> contour f (acc', f ks)
where acc' = acc + shift rt
h = acc'+ xpos rt
weave :: RT a -> RT a -> [RT a]
weave l r = [weave' id (0, l) (0, r), weave' reverse (0, r) (0, l)]
weave' :: ([RT a] -> [RT a]) -> (Int, RT a) -> (Int, RT a) -> RT a
weave' f (accL, l) (accR, r)
| Nothing <- follow = l
| Just (n, x) <- link l = l { link = Just (n, weave' f (n + accL', x) y) }
| (k:ks) <- f $ kids l = l { kids = f $ weave' f (accL', k) y : ks }
| otherwise = l { link = first (+(-accL')) <$> follow }
where
accL' = accL + shift l
accR' = accR + shift r
follow | (k:_) <- f $ kids r = Just (accR', k)
| otherwise = first (accR' +) <$> link r
Just y = follow
This time, we want a configurable minimum gap between siblings, as well as integral x-coordinates. So if necessary, we bump up the indent value so the average of the x values of the sibling roots is a whole number.
We also change our API to take any Tree a
instead of our Expr
data
structure.
padding :: Int -- Minimum horizontal gap between nodes.
padding = 50
placeRT :: Tree a -> RT a
placeRT (Node a []) = RT 0 0 a Nothing []
placeRT (Node a [l, r]) = RT m 0 a Nothing xs where
[ll, rr] = placeRT <$> [l, r]
g = padding - minimum (zipWith (-)
(contour head (0, rr)) (contour last (0, ll)))
s = xpos ll + xpos rr
gap = abs g + mod (abs g + s) 2 -- Adjust so midpoint is whole number.
m = div (s + gap) 2
xs = if g >= 0 then weave ll rr { shift = gap }
else weave ll { shift = gap } rr
placeRT _ = error "full binary trees only please"
drawRT :: Tree a -> Tree (Int, a)
drawRT = addx 0 . placeRT
drawExpr :: Expr -> Tree (Int, String)
drawExpr = drawRT . fromExpr where
fromExpr (Fun s) = Node s []
fromExpr (Com l r) = Node "" $ fromExpr <$> [l, r]
Web version
Each time the "Draw!" button is clicked, we read the tree in the input text area and draw it.
For some reason that is probably important to some committee, JavaScript’s
createElement
creates elements that can be added to the SVG, but on my
browser they are never rendered even though they appear in the DOM. Instead,
we must call createElementNS
to create SVG elements. Haste lacks a wrapper
for this function so we define our own.
#ifdef __HASTE__
newElemSVG :: String -> IO Elem
newElemSVG = ffi $ pack $
"(x => document.createElementNS('http://www.w3.org/2000/svg', x))"
We add a function that determines the maximum x- and y-coordinates of a
tree drawing so we can set an appropriate viewBox
for the SVG element.
maxXY :: Tree (Int, a) -> (Int, Int)
maxXY t = (maximum xs, length xs) where xs = fst . last <$> levels t
For the random tree feature, we have a function that generates a random expression with a given number of leaf nodes. We shall give it a random length whose expected value is 12.
randomExpr :: RandomGen g => g -> Int -> String
randomExpr g 1 = pure $ ['a'..'z'] !! mod (fst $ next g) 26
randomExpr g n = concat ["(", randomExpr gl m, ".", randomExpr gr (n - m), ")"]
where (r, g1) = next g
(gl, gr) = split g1
m = 1 + mod r (n - 1)
The rest of the code is mostly tedious web stuff.
draw :: Elem -> Int -> Tree (Int, String) -> IO ()
draw soil y (Node (x, s) ks) = do
forM_ ks $ \(Node (x2, _) _) -> appendChild soil =<< newElemSVG "line" `with`
[ attr "x1" =: show x
, attr "y1" =: show (40*y)
, attr "x2" =: show x2
, attr "y2" =: show (40*(y + 1))
, attr "stroke" =: "black"
]
if null ks then appendChild soil =<< newElemSVG "rect" `with`
[ attr "x" =: show (x - 12)
, attr "y" =: show (40*y - 12)
, attr "width" =: "24"
, attr "height" =: "24"
, attr "fill" =: "white"
, attr "stroke" =: "black"
]
else appendChild soil =<< newElemSVG "circle" `with`
[ attr "r" =: "3"
, attr "cx" =: show x
, attr "cy" =: show (40*y)
]
e <- newElemSVG "text" `with`
[ attr "x" =: show x
, attr "y" =: show (40*y)
, attr "text-anchor" =: "middle"
, attr "alignment-baseline" =: "central"
]
setProp e "textContent" s
appendChild soil e
mapM_ (draw soil $ y + 1) ks
main :: IO ()
main = withElems ["soil", "input", "drawB", "randB"] $
\[soil, input, drawB, randB] -> do
let
drawInput = do
s <- getProp input "value"
case parse expr "" s of
Left err -> alert $ pack $ "Parse error: " ++ show err
Right t -> do
clearChildren soil
let
drawing = drawExpr t
(x, y) = maxXY drawing
setAttr soil "viewBox" $
"-15 -15 " ++ show (x + 30) ++ " " ++ show (40*y + 30)
draw soil 0 drawing
drawInput
void $ drawB `onEvent` Click $ const drawInput
void $ randB `onEvent` Click $ const $ do
g <- getStdGen
let
(gn, g') = split g
(ge, g'') = split g'
n = 1 + length (takeWhile (> 0) $ randomRs (0 :: Int, 10) gn)
s = randomExpr ge n
setProp input "value" s
drawInput
setStdGen g''
Command-line version
We read the first argument as a tree and print an SVG drawing of it. If no arguments are provided then a default tree is used.
#else
render :: Int -> Tree (Int, String) -> String
render depth (Node (x, s) ks) = concat $
((\(Node (x2, _) _) -> concat
[ "<line x1='", show $ x + x0, "' y1='", show $ depth*40 + y0
, "' x2='", show $ x2 + x0, "' y2='", show $ (depth + 1)*40 + y0
, "' stroke='black'/>"
]) <$> ks) ++
(if null ks then
[ "<rect x='", show $ x - 12 + x0, "' y='", show $ depth*40 - 12 + y0
, "' width='24' height='24' stroke='black' fill='white'/>\n"
]
else
[ "<circle r='3' cx='", show $ x + x0, "' cy='", show $ depth*40 + y0
, "' stroke='black'/>\n"
]) ++
(if null s then [] else
[ "<text text-anchor='middle' alignment-baseline='central'"
, " x='", show $ x + x0, "' y='", show $ depth*40 + y0
, "'>", s, "</text>\n"
]) ++
(render (depth + 1) <$> ks)
where (x0, y0) = (15, 15)
main :: IO ()
main = do
args <- getArgs
let
(simpleMode, args1) = case args of
("-s":t) -> (True, t)
_ -> (False, args)
s = case args1 of (h:_) -> h
_ -> "(((1.2.3.4).5).(x.y)).(a.(b.((c.d).e).f))"
case parse expr "" s of
Left err -> hPutStrLn stderr $ show err
Right t -> putStrLn $ concat
[ "<svg xmlns='http://www.w3.org/2000/svg'>"
, render 0 $ if simpleMode
then first (round . (50*)) <$> simpleDraw t
else drawExpr t
, "</svg>"
]
#endif