sq x = x * x; main = sq (sq 3)
Laziness in Action
Let’s write an program that visualizes computation for a Haskell-like language. For example, the input:
begins as the graph:
The internal nodes depict function application: we apply the function
of the left child to the right child.
Lazy evaluation means we expand the outermost sq
first.
Eager evaluation would expand the innermost sq
first.
The outermost function is now the built-in multiplication function, which
forces evaluation of its arguments. Here, both arguments are the same, so
we wind up expanding sq 3
once and using it twice. This requires a pure
language: no such optimization is possible in a language where any
statement may have side effects.
We perform the bottommost multiplication:
Then the remaining multiplication yields the final answer: 81.
Slideshows of other programs (with stack nodes highlighted in red):
-
id
example; our interpreter only computes a reused subexpression once. -
straightfoward factorial recursion: lazy evaluation causes the expression to grow until the last minute.
-
factorial recursion with tail call: lazy evaluation still causes the expression to grow until the last minute.
-
factorial recursion, forcing evalution with case statement: at last, the expression stays the same size.
How to be lazy
See "Implementing Functional Languages: a tutorial" by Simon L Peyton Jones and David R Lester. Some of the above examples were taken from this free online book, and we will use its terminology. The talk "Into The Core" is also illuminating.
Our goal is a self-contained program that can lazily interpret recursive functions operating on integers. On the one hand, we want to get something working quickly, and on the other hand, we want something interesting. Recursion on integers means we can test our program on fun computations such as the Fibonacci numbers, yet our program will still be relatively small, because we’re only dealing with integers.
Our journey naturally meanders through two gems of functional programming: parser combinators, which we explore separately, and inductive data structures for graphs. Apart from these two, we’ll stay within the base Haskell system:
import Control.Monad
import Data.Char
import Data.Graph.Inductive.Graph
import Data.Graph.Inductive.PatriciaTree
import Data.Graph.Inductive.Query.DFS (reachable)
import Data.List
import qualified Data.Map as M
import Data.Tree
import Text.ParserCombinators.Parsec
Parser
The following parses a Core-like language featuring integers, infix operators
of different precedences and associativities, and case statements.
For simplicity, we parse expressions into Tree String
structures: this makes
it easier to reuse this parser in other projects.
program :: Parser [([String], Tree String)]
program = spaces >> sc `sepBy` want ";" <* eof where
sc = do
lhs <- many1 var
want "="
rhs <- expr
return (lhs, rhs)
expr = caseExpr <|>
molecule `chainr1` chop "^" `chainl1` chop "*/" `chainl1` chop "+-"
caseExpr = do
x <- between (want "case") (want "of") expr
as <- alt `sepBy` want ";"
return $ Node "case" (x:as)
alt = do
p <- lif num <|> lif var
want "->"
x <- expr
return $ Node "->" [p, x]
chop :: String -> Parser (Tree String -> Tree String -> Tree String)
chop xs = try $ do
s <- foldl1' (<|>) $ map (want . pure) xs
return $ \x y -> Node "" [Node "" [Node s [], x], y]
molecule = foldl1' (\x y -> Node "" [x, y]) <$> many1 atom
atom = between (want "(") (want ")") expr <|> lif var <|> lif num
var = try $ do
s@(h:_) <- tok
when (not (isLetter h) || s `elem` words "case of _") $ fail ""
return s
num = try $ do
s <- tok
unless (all isDigit s) $ fail ""
return s
lif = (flip Node [] <$>)
want t = try $ do
s <- tok
unless (s == t) $ fail $ "expected " ++ t
return s
tok = do
s <- many1 (alphaNum <|> char '_') <|> many1 (oneOf "+-/*^><=") <|>
string ";" <|> string "(" <|> string ")"
spaces
return s
Inductive Graphs
While data structures like lists and trees have been viewed recursively for years, graph algorithms are traditionally imperative. For example, descriptions of graph traversal may describe marking a node to prevent revisiting it.
Instead, if we define graphs with an inductive data type, we can write graph algorithms in a functional style.
Using such a library leads to succinct code for modifying the graph.
Instead of introducing indirection nodes to save work when the id
function
is present, we can simply reroute edges. This is probably slower, but
our goal is visualization, not an industrial-strength compiler.
We define the Snapshot
type to describe the interpeter state. It consists of
a graph and stack of nodes. Each node holds a function that describes how it
changes a Snapshot
, along with a string. We use the string for printing the
node, and also for storing integer values. The stack of nodes holds the spine,
and also temporarily holds the results of expressions.
The details are a little finicky. We label all the edges to distinguish left children from right children. Nodes are referenced by integers, and we must devise a numbering scheme. A self-loop trick ensures the stack nodes work correctly when dereferencing variables introduced in case alternatives. We could easily remove self-loops from our slideshows, but again, we want to keep our code short.
Since we want to show the changes to the graph over time, functions operate on lists of snapshots, so older states are preserved for display later.
We show expanding a supercombinator (top-level definition) in a single step. Finer resolution, such as stepping through case expressions, is only a bit more work, but we want to keep our code short. We skip garbage collection for the same reason.
data Funk = Funk String ([Snapshot] -> [Snapshot])
instance Show Funk where show (Funk s _) = s
type Prograph = Gr Funk Int
type Snapshot = (Prograph, [Int])
nsuc i g n = fst $ head $ filter ((== i) . snd) $ lsuc g n
intern i xs = foldl' (\m (id, n) -> M.insert id n m) M.empty $ zip xs [i..]
nip ((g, x:_:sp):rest) = (g, x:sp):rest
run :: [([String], Tree String)] -> [Snapshot]
run t = do
let
scs = intern 0 $ head . fst <$> t
funky s = Funk s f where
f | null s = \h@((g, n:_):_) -> whnf h $ nsuc 0 g n
| all isDigit s = id
| Just n <- M.lookup s scs = graft $ t!!n
| otherwise = case s of
"+" -> binOp (+)
"*" -> binOp (*)
"-" -> binOp (-)
"^" -> binOp (^)
"div" -> binOp div
"mod" -> binOp mod
"min" -> binOp min
"max" -> binOp max
bloom h@((g, sp):_) (vars, Node s kids)
| s == "case" = let
h1@((gc, nc:fn:ns):_) = step $ bloom h (vars, head kids)
findAlt (Node "->" [Node p@(c:_) [], e]:as)
| isLetter c = nip $
bloom ((insEdge (nc, nc, 1) gc, fn:nc:ns):h1) (p:vars, e)
| p == show (intAt gc nc) = bloom ((gc, fn:ns):h1) (vars, e)
| otherwise = findAlt as
in findAlt $ tail kids
| Just n <- elemIndex s vars = (g, nsuc 1 g (sp!!(n + 1)):sp):tail h
| otherwise = let
((gg, ns):hh, _) =
foldl' f ((([], n, funky s, []) & g, sp):tail h, 0) kids
n = if isEmpty g then 0 else 1 + snd (nodeRange g)
f (h1, i) kid = ((insEdge (n, m, i) ga, sp):h2, i + 1) where
(ga, m:sp):h2 = bloom h1 (vars, kid)
in (gg, n:ns):h
graft (vars, t) h@((g, spine):_) = let
h1@((g1, sp@(n:_)):_) = bloom h (tail vars, t)
(Just (ins, _, _, _), g2) = match (sp!!length vars) g1
g3 = insEdges ((\(lbl, src) -> (src, n, lbl)) <$> ins) g2
in step $ (g3, n : drop (length vars + 1) sp):h
step [(mkGraph [(0, funky "main")] [], [0])]
step h@((g, n:_):_) = f h where Just (Funk _ f) = lab g n
whnf ((g, sp):rest) n = step ((g, n:sp):rest)
intAt g n = read s :: Integer where Just (Funk s _) = lab g n
binOp f h0@((g, _:x:_):_) = (gz, y2:rest):h2 where
h1@((gx, _:_:_:y1:_):_) = whnf h0 $ nsuc 1 g x
h2@((gy, yv:xv:_:_:y2:rest):_) = whnf h1 $ nsuc 1 gx y1
z = f (intAt gy xv) (intAt gy yv)
(Just (ins, _, _, _), g1) = match y2 gy
gz = (ins, y2, Funk (show z) id, []) & g1
Output
The output is a slideshow in Asciidoc format that uses the graphviz plugin to draw graphs. The HTML slides above were generated by running:
asciidoc --backend=slidy OUTPUT_FILE
on the output of our program, which generates a DOT description of a graph:
dotty (g, sp) = unlines [
"== " ++ show (length sub) ++ " nodes ==",
"",
"[\"graphviz\"]", "----------------", "digraph program {",
"node [fontname=\"Inconsolata\", shape=box]", "ordering=out",
unlines $ dump <$> sub,
"}", "----------------"]
where
sub = reachable (last sp) g
dump n = let Just s = lab g n in concat [show n,
if null (show s) then "[label=\"\", shape=circle, width=0.4, height=0.4" ++ co ++ "]" else "[label=\"" ++ show s ++ "\"" ++ co ++ "]",
"\n", unlines $ dumpEdge n . fst <$> sortOn snd (lsuc g n)]
where co | n `elem` sp = ", style=filled, fillcolor=red"
| otherwise = ""
dumpEdge n t = show n ++ " -> " ++ show t
main = do
s <- getContents
case parse program "" s of
Left err -> putStrLn $ "parse error: " ++ show err
Right t -> do
putStr $ unlines ["= Graph Reduction =", "",
"----------------", s, "----------------"]
putStr $ unlines $ map dotty $ reverse $ run t