Code That Counts

Combinatorial calculations with combinator calculus.

Several explanations in Enumerative Combinatorics Volume 1 by Richard P. Stanley are outlines of algorithms. We follow along parts of the first chapter, but replace descriptions of algorithms with real code. [I began with the first edition; over time I’ll update this with material from the second edition.]

We use lists to represent sets. Unlike sets, lists can contain duplicate elements, and are ordered. This works to our advantage when we wish to consider multisets or ordered sets. Otherwise we rely on the “honour system”: we trust the relevant lists contain distinct elements and consider two lists to represent the same set if they consist of the same elements.

Outside of the base install, our code below depends on Data.List.Split:

import Control.Monad
import Data.Function
import Data.List
import Data.List.Split
import Data.Ord
import Data.Tree

Subsets

Subsets are also called combinations. The power set is the set of all subsets. Since we’re using lists to represent sets, the subsequences function lists all subsets of set. Or we can bust out a well-known mind-blowing trick:

powerset = filterM $ const [False, True]

The power set of an \(n\)-set has size \(2^n\), as suggested by the length of the list [False, True].

Subsets of size k are called k-subsets or k-combinations. Write \({n \choose k}\) for the number of k-subsets of a set of size n. We read this as “n choose k”. These numbers are called the binomial coefficients and can be computed as follows:

choose n k = product [n,n-1..n-k+1] `div` product [1..k]

The above falling factorials arise because there are \(n(n-1)..(n-k+1)\) ways of picking \(k\) elements from \(n\) where ordering matters, which we divide by \(k!\), the number of ways to order \(k\) elements.

The explanation for the falling factorial can be seen in the following recursion that generates the permutations of a list:

perms [] = [[]]
perms xs = concatMap (\x -> (x:) <$> perms (delete x xs)) xs

We wrote this code only to illustrate a point; the function Data.List.permutations should be used in practice.

To generate all combinations, instead of an approach based on the above formula, we use a recursion based on the recursion \({n \choose k} = {n - 1 \choose k - 1} + {n - 1 \choose k }\) of Pascal’s triangle:

subsets xs 0 = [[]]
subsets xs k
  | length xs < k = []
  | otherwise     = subsets t k ++ map (h:) (subsets t (k - 1))
  where (h:t) = xs

By considering multiplication of polynomials, we find:

\[ (1+x)^n = \sum_k {n\choose k} x^k \]

To translate this to code, we represent polynomials with lists of integers, with \([a_0, a_1, ...]\) representing \(a_0 + a_1 x + ...\). Then addition and multiplication of polynomials are given by:

p .+. q = dropWhileEnd (== 0) $ zipWith (+) (pad p) (pad q)
  where pad = take (on max length p q) . (++ repeat 0)
p      .*. [] = []
[]     .*. q  = []
(a:as) .*. q  = map (a *) q .+. (0:(as .*. q))

Now we can write a little test such as:

prop_binomial n =
  (choose n <$> [0..n]) == foldr1 (.*.) (replicate n [1, 1])

It is advantageous to redefine \({n\choose k}\) to be the coefficient of \(x^k\) in \((1 + x)^n\), so we can generalize beyond positive integral \(n\).

A composition of \(n\) is an ordered list of positive integers that sum to \(n\). A k-composition of \(n\) is a composition of \(n\) into \(k\) parts.

A weak composition of \(n\) is an ordered list of nonnegative integers that sum to \(n\). A weak k-composition of \(n\) is a weak composition of \(n\) into \(k\) parts.

We can generate all \(k\)-compositions and weak \(k\)-compositions of \([1..n]\) as follows:

compositions n k = f <$> subsets [1..n-1] (k-1)
  where f s = zipWith (-) (s ++ [n]) (0:s)

weakCompositions n k = map (+(-1)) <$> compositions (n+k) k

Thus the number of \(k\)-compositions of \(n\) is \({n-1\choose k-1}\), and the number of weak \(k\)-compositions of \(n\) is \({n+k-1\choose k-1}\).

Informally, a multiset is a set where elements may be repeated. More rigorously, a finite multiset \(M\) on a set \(S\) is a function \(v:S\to\Bbb{N}\) with \(\sum_{s\in S} v(s) < \infty\).

A k-combination with repetition of a set \(S\) is a multiset on \(S\) of size \(k\). The number of \(k\)-combinations with repetition of an \(n\)-set is denoted by \(\left({n \choose k}\right)\).

We can generate all \(k\)-combinations with repetition of \([1..n]\) as follows:

multisets n k = zipWith (+) [0,-1..] <$> subsets [1..n + k - 1] k

which implies \(\left({n \choose k}\right) = {n + k - 1 \choose k}\):

multichoose n k = choose (n + k - 1) k

We also have:

\[ (1 + x + x^2 + ...)^n = \sum x^{|M|} [M \text{ is a multiset on } [1..n]] = \sum_{k\ge 0} \left({n \choose k}\right) x^k \]

Thus:

\[ (1 + x + x^2 +... )^n = (1 - x)^{-n} = \sum_{k\ge 0} {-n \choose k}(-1)^k x^k \]

and therefore \(\left({n\choose k}\right) = (-1)^k {-n\choose k} = {n + k - 1\choose k}\), which shows how to extend binomial coefficients to negative numbers. The formula:

\[ \left({n\choose k}\right) = (-1)^k {-n\choose k} \]

is an example of a combinatorial reciprocity theorem.

Binomial cofficients can be generalized to multinomial coefficients. If \(a_1,...,a_m\) are nonnegative integers that sum to \(n\), that is, a weak \(m\)-composition of \(n\), then define \({n\choose a_1,...,a_m}\) to be number of ways to divide \([1..n]\) into \(m\) different categories. We can also view this as the number of permutations of the multiset where there are \(a_i\) copies of the \(i\)th element. Additionally, we have:

\[ (x_1 + ... + x_m)^n = \sum {n\choose a_1,...,a_m} x_1^{a_1}...x_m^{a_m} \]

When \(m = 2\), we have binomial coefficients. Conventionally, we write \({n\choose k}\) instead of \({n\choose k,n-k}\). The latter notation does have advantages: we immediately have \({n\choose k} = {n\choose n-k}\), and we’re mindful that choosing \(k\) elements out of an \(n\)-set is the same as choosing to omit \(n-k\) elements. The recurrence of Pascal’s triangle also becomes symmetric: \({n\choose a,b} = {n-1\choose a-1, b} + {n-1\choose a,b-1}\).

Multinomials can be computed by:

\[ {n\choose a_1,...,a_m} = \frac{n!}{a_1!...a_m!} \]

and satisfy the recurrence:

\[ {n\choose a_1,...,a_m} = {n\choose a_1-1,...,a_m} + ... + {n\choose a_1,...,a_m-1} \]

which forms the basis of the following program to generate all permutations of a multiset (or ways to categorize \(n\) elements so there are \(a_i\) elements in the \(i\)th category). Our code expects identical elements in the input list to be grouped together, for instance, "baaann" instead of "banana".

multiPerms [] = [[]]
multiPerms xs = concatMap f gs where
  gs = group xs
  f ys = (head ys:) <$> multiPerms $ concat $ a ++ (tail ys:b) where
    (a, _:b) = span (/= ys) gs

Permutations

A permutation of a set can be represented as a word. In particular, we can represent the permutation \(\pi\) on \([1..n]\) defined by \(\pi(i) = a_i\) as the word \(a_1 ... a_n\).

Given a permutation \(\pi\) on a finite set \(S\), for any \(x\in S\) the sequence \(x, \pi(x), \pi(\pi(x)), ...\) must eventually repeat, thus permutations can be written as a product of cycles:

cycles xs = f [] [1..n] where
  f [] [] = []
  f [] (i:is) = f [i] is
  f cs is | head cs == next = cs : f [] is
          | otherwise       = f (cs ++ [next]) $ delete next is
          where next = xs!!(last cs - 1)
  n = length xs

prop_example1_3a = cycles [4,2,7,1,3,6,5] == [[1,4],[2],[3,7,5],[6]]

We define a standard representation for writing products of cycles. We sort cycles in increasing order by their maximum elements, and each cycle is written starting with its maximum element. We write \(\hat\pi\) for the standard representation of \(\pi\).

Such a representation is unique, and furthermore, we can then remove all bracketing: a new cycle begins precisely when an element exceeds the previous element, namely, on all the left-to-right maxima. Thus the standard representation defines a bijection on permutations.

standardize :: Ord a => [[a]] -> [a]
standardize = concatMap (\c -> uncurry (flip (++)) $ span (< maximum c) c) .
  sortOn maximum

cyclesFromStandard []     = []
cyclesFromStandard (x:xs) = (x:a) : cyclesFromStandard b where
  (a, b) = span (< x) xs

prop_example1_3b = standardize (cycles [4,2,7,1,3,6,5]) == [2,4,1,6,7,5,3]

Given a permutation \(\pi\), let \(c_i(\pi)\) be the number of cycles of length \(i\). Then the type of a permutation \(\pi\) is the sequence \((c_1(\pi),...,c_n(\pi))\), and the total number of cycles of \(\pi\) is the sum of this sequence and denoted \(c(\pi)\).

cyclesOfLength xs k = length $ filter (== k) $ length <$> cycles xs

permType xs = cyclesOfLength xs <$> [1..length xs]

numCycles = length . cycles

The number of permutations of type \((c_1,...,c_n)\) is:

\[ \frac{n!}{1^{c_1}c_1!...n^{c_n}c_n!} \]

Why? Suppose we write \([1..n]\) in every possible order, and for each such ordering, from left to right we divide the elements into \(c_1\) cycles of length 1, then \(c_2\) cycles of length 2, and so on up to \(c_n\) cycles of length \(n\).

We have produced all permutations of the desired type, but we have counted each permutation multiple times. This is because we can write a cycle of length \(k\) in \(k\) different ways, and also because for each \(i\), we can permute \(c_i\) cycles amongst themselves without changing the product. The following code generates such duplicate representations of a given product of cycles.

dupPerms :: Ord a => [[a]] -> [[[a]]]
dupPerms = concatMap (mapM rots . concat) <$>
  mapM permutations . groupBy ((==) `on` length) . sortOn length where
    rots xs = take (length xs) $ iterate (\(a:as) -> as ++ [a]) xs

Write \(\begin{bmatrix}n\\k\end{bmatrix}\) for the number of permutations of a set of size \(n\) with exactly \(k\) cycles. These are known as the signless Stirling numbers of the first kind. (We adopt the notation advocated by Knuth rather than the \(c(n, k)\) that Stanley uses.)

We generate all permutations of \([1..n]\) with \(k\) cycles as follows:

zipSplits f xs = zipWith f (inits xs) (tails xs)

kCyclePerms :: Int -> Int -> [[[Int]]]
kCyclePerms 0 0 = [[]]
kCyclePerms _ 0 = []
kCyclePerms 0 _ = []
kCyclePerms n k = concatMap (concat . zipSplits f) (kCyclePerms (n-1) k) ++
  (([n]:) <$> kCyclePerms (n-1) (k-1)) where
    f xs [] = []
    f xs (y:zs) = (xs ++) . (:zs) <$> tail (zipSplits (\a b -> a ++ (n:b)) y)

The above explains the recursion:

\[ \begin{bmatrix}n\\k\end{bmatrix} = (n - 1) \begin{bmatrix}n - 1\\k\end{bmatrix} + \begin{bmatrix}n - 1\\k - 1\end{bmatrix} \]

which we also translate to code:

stirling1 0 0 = 1
stirling1 0 k = 0
stirling1 n 0 = 0
stirling1 n k = (n - 1) * stirling1 (n - 1) k + stirling1 (n - 1) (k - 1)

Thanks to standard representations, the number of permutations with \(k\) left-to-right maxima is \(\begin{bmatrix}n\\k\end{bmatrix}\).

We can prove:

\[ \sum_{k=0}^n \begin{bmatrix}n\\k\end{bmatrix} x^k = x(x+1)...(x+n-1) \]

by showing the coefficients satisfy the same recursive relations, but there are other methods that are more fun.

Another proof: the \(x^k\) terms on the right-hand side correspond to \((n-k)\)-subsets of \([1..n-1]\). The product of such a subset can be viewed as the number of functions \(f\) from its \(n-k\) members to \([1..n-1]\) satisfying \(f(i) \le i\). Thus we need to show that each choice of an \((n-k)\)-subset along with a function \(f\) corresponds to a k-cycle permutation:

-- Answer given in standard representation.
permFromSubsetFunction n s f = foldl ins t $ zip f ([n,n-1..1] \\ t) where
  t = [1..n] \\ map (n -) s
  ins u (a, b) = u0 ++ (b:u1) where
    Just (u0, u1) = find ((== a) . length . filter (> b) . fst) $
      zip (inits u) (tails u)

Yet another proof: we show the polynomials in question agree for all positive integers \(x\). For such an \(x\), the left-hand side is the number of \(k\)-cycle permutations multiplied by the number of functions from \(k\) objects to \([1..x]\), while the right-hand side is the number of integer sequences \((a_1,...,a_n)\) satisfying \(0 \le a_i \le x - 1 + n - i\). (The \(a_i\) are “backwards” and zero-indexed for historical reasons; our code will undo this.)

We can produce a permutation and function from such a sequence as follows:

permFunctionFromSequence n x as = foldl f ([], []) $ zip bs [n, n-1..] where
  f (ps, fs) (b, n) | b <= x    = ([n]:ps, b:fs)
                    | otherwise = (ins n (b - x) ps, fs)
  bs = (+1) <$> reverse as
  ins n k (p:ps) | length p >= k = (p0 ++ n:p1) : ps
                 | otherwise     = p : ins n (k - length p) ps
                 where (p0, p1) = splitAt k p

Setting \(x = 1\) shows for all positive integers \(n, k\), the number of sequences \((a_1, ..., a_n)\) with \(0 \le a_i \le n - i\) containing exactly \(k\) zeroes is \(\begin{bmatrix}n\\k\end{bmatrix}\). Or in code:

prop_137 n k =
  stirling1 n k == length (filter ((k ==) . length . filter (0 ==)) $
    sequence $ enumFromTo 0 . (n -) <$> [1..n])

We can build a permutation \(\pi\) in the form of a word more directly from such a sequence:

permFromInversionTable n as = foldl f [] $ zip (reverse as) [n,n-1..] where
  f p (a, n) = p0 ++ n:p1 where (p0, p1) = splitAt a p

In the sequence, \(a_i\) is the number of entries \(j\) in \(\pi\) to the left of \(i\) such that \(j > i\). As hinted by our function name, if \(\pi = b_1 ... b_n\) then \((b_i, b_j)\) is an inversion if \(i < j\) and \(b_i > b_j\), and the sequence \(a_i\) is called the inversion table of the permutation \(\pi\), and written \(I(\pi)\). An inversion table is another way to represent a permutation.

inversionTable p = (\a -> length $ filter (> a) $ takeWhile (/= a) p) <$> p

If \(i(\pi)\) denotes the number of inversions in the permutation \(\pi\), we see:

\[ \sum_{\pi\in S_n} q^{i(\pi)} = (1 + q)(1 + q + q^2)...(1 + q + ... + q^{n-1}) \]

The descent set \(D(\pi)\) of a permutation \(\pi = a_1 ... a_n\) is defined by:

descentSet p = map fst $ filter snd $ zip [1..] $ zipWith (>) p $ tail p
-- With a list comprehension (and sadly not pointfree):
descentSet' p = [i | i <- [1..length p - 1], p!!(i-1) > p!!i]

For a subset \(S\) of \([1..n-1]\), define \(\alpha(S)\) and \(\beta(S)\) as follows:

alpha n s = length [p | p <- permutations [1..n],
  descentSet p `isSubsequenceOf` s]
beta n s  = length [p | p <- permutations [1..n], descentSet p == s]

Thus an alternative definition of \(\alpha(S)\) is:

alpha' n s = sum [beta n t | t <- subsequences s]

If we write \(S\) as an increasing sequence \(s_1, ..., s_k\), then

\[ \alpha(S) = {n \choose {s_1, s_2 - s_1, ..., n - s_k}} \]

For a permutation \(\pi\), write \(des(\pi)\) for the number of descents \(|D(\pi)|\). Then

\[ A_n(x) = \sum_{\pi\in S_n} x^{1+des(\pi)} \]

is an Eulerian polynomial. We write \(A(n, k)\) for the coefficient of \(x^k\) in \(A_n(x)\), and we call it an Eulerian number. Hence:

eulerian n k = length $ filter (((k - 1) ==) . length . descentSet) $ permutations [1..n]

eulerianPolys = [[eulerian n k | k <- [1..n]] | n <- [1..]]

The definition of the standard representation of a permutation implies:

\[ n - des(\hat\pi) = \#\{i \in [1..n] | \pi(i) \ge i\} \]

A number \(i\) is called a weak excedance of \(\pi\) if \(\pi(i)\ge i\), and an excedance if \(\pi(i) > i\):

weakExcedances  p = map fst $ filter (uncurry (>=)) $ zip p [1..]
excedances      p = map fst $ filter (uncurry (>))  $ zip p [1..]
-- Less fun versions:
weakExcedances' p = [i | i <- [1..length p], p!!(i-1) >= i]
excedances'     p = [i | i <- [1..length p], p!!(i-1) >  i]

From:

prop_weakVsStrictExcedances p =
  length (weakExcedances p) == n - length (excedances $ (n + 1 -) <$> reverse p)
  where n = length p
prop_descentCounts p = null p ||
  length (descentSet p) == n - 1 - length (descentSet $ reverse p)
  where n = length p

it follows that the Eulerian number \(A(n, k+1)\) is equal to the number of permutations of \(S_n\) with \(k\) excedances and also the number of permutations of \(S_n\) with \(k+1\) weak excedances.

The sum of the descent set of a permutation \(\pi\) is called the greater index and is denoted \(\iota(\pi)\). It is also called the major index and denoted \(maj(\pi)\). It turns out:

prop_remarkable n =
  (sort $ sum . inversionTable <$> permutations [1..n]) ==
  (sort $ sum . descentSet     <$> permutations [1..n])

Trees

By recursively finding the minimum, we can represent a permutation as a binary tree where each child node exceeds the parent node, which we’ll call the heap property.

heap [] = Node Nothing  []
heap p  = Node (Just m) [heap p0, heap p1]
  where (p0, m:p1) = span (/= minimum p) p

Define an element \(w_i\) of a word \(w_1...w_n\) to be

a double rise or double ascent

if \(w_{i-1} < w_i < w_{i+1} \)

a double fall or double descent

if \(w_{i-1} > w_i > w_{i+1} \)

a peak

if \(w_{i-1} < w_i > w_{i+1} \)

a valley

if \(w_{i-1} > w_i < w_{i+1} \)

where we set the sentinels \(w_0 = w_{n+1} = 0\). These four classes of “terrain” determine whether a node has no children, a left child, a right child, or both.

From this bijection, we see:

  • The number of binary trees with nodes \([1..n]\) with the heap property is \(n!\).

  • The number of such trees with \(k\) nodes with left children is \(A(n, k + 1).\)

  • The number of such trees with \(k\) leaf nodes is equal to the number of such trees with \(k\) nodes with two children.

  • The number of such trees that are full (that is, each node has zero or two children) with \(2n + 1\) nodes is equal to the number of alternating permutations of \(S_{2n+1}\):

    \[ a_1 > a_2 < a_3 > ... < a_{2n+1} \]

We can also represent a permutation \(\pi\) as an unoriented heap. We prepend the sentinel value of 0 to our permutation, and start with root node of 0. Then the children of the node \(n\) are all nodes \(a\) that appear to the right of \(n\) in \(\pi\) satisfying \(n < a\) and \(b > a\) for all numbers \(b\) that appear between \(n\) and \(a\) in \(\pi\).

unorientedHeap p = f (0:p) 0 where
  f p n = Node n $ f p <$>
    filter (\a -> n < a && head (dropWhile (> a) q) == a) q
      where q = tail $ dropWhile (/= n) p

Thus:

  • The number of unoriented heaps with \(n+1\) nodes is \(n!\).

  • The number of such trees where the root has \(k\) children is \(\begin{bmatrix}n\\k\end{bmatrix}\).

  • The number of such trees with \(k\) leaf nodes is \(A(n, k)\).

Partitions

A partition of a natural number \(n\) is a non-increasing sequence of integers that sums to \(n\). We consider two partitions to be equivalent if we ignore terminating 0s. A partition is like a composition, except the order of the summands is unimportant.

Write \(p(n)\) for the number of partitions of \(n\), \(p_k(n)\) for the number of partitions of \(n\) into \(k\) parts, and \(p(j, k, n)\) for the number of partitions of \(n\) into \(k\) parts with largest part at most \(j\).

We can generate all partitions with:

intPartitions 0 0 = [[]]
intPartitions n k
  | k <= 0 || n <= 0 = []
  | otherwise = ((++ [1]) <$> intPartitions (n - 1) (k - 1)) ++
    (map (+1) <$> intPartitions (n - k) k)

which gives the recurrence:

\[ p_k(n) = p_{k-1}(n - 1) + p_k(n - k) \]

that is:

-- Awkward name to avoid defining `p` at top-level.
pPar 0 0 = 1
pPar n k
  | k<= 0 || n <= 0 = 0
  | otherwise       = pPar (n - 1) (k - 1) + pPar (n - k) k

The following code produces the Ferrers diagram or Ferrers graph of a partition:

ferrers = unlines . map (`replicate` '*')

If we had drawn squares instead, then we would have a Young diagram.

A partition of a finite set \(N\) is a set of disjoint subsets \(\{B_1,...,B_k\}\) of \(N\) whose union is \(N\). A subset \(B_i\) is called a block.

Roughly speaking, partitions of a set are like partitions of an integer, except the elements of a set are distinguishable. We can generate them with:

partitions [] 0 = [[]]
partitions [] _ = []
partitions _  0 = []
partitions (x:xs) k =
  -- Either add `x` to an existing block of `partitions xs k`...
  concat (zipWith (zipWith ($)) (iterate (id:) $
  (x:):repeat id) . replicate k <$> partitions xs k) ++
    -- ...or create a new block containing just `x`.
    (([x]:) <$> partitions xs (k - 1))

Define \(\begin{Bmatrix}n\\k\end{Bmatrix}\) to be the number of ways to partition \(N\) into \(k\) blocks. These are called the Stirling numbers of the second kind. (Again, we adopt Knuth’s recommendation instead of the more standard \(S(n, k)\).)

Our partitions function implies:

stirling2 0 0 = 1
stirling2 _ 0 = 0
stirling2 0 _ = 0
stirling2 n k = k * stirling2 (n - 1) k + stirling2 (n - 1) (k - 1)

The total number of partitions of an \(n\)-set is called a Bell number and is denoted \(B(n)\).

bell n = sum $ stirling2 n <$> [0..n]

prop_bell10 = map bell [0..10] ==  [1,1,2,5,15,52,203,877,4140,21147,115975]

The Twelvefold Way

Two functions \(f, g : N \to X\) are equivalent with \(N\) indistinguishable if there exists a bijection \(\pi : N \to N\) such that \(f \circ \pi = g\).

They are equivalent with \(X\) indistinguishable if there exists a bijection \(\sigma : X \to X\) such that \(\sigma \circ f = g\).

They are equivalent with \(N\) and \(X\) indistinguishable if there exists bijections \(\pi : N \to N\) and \(\sigma : X \to X\) such that \(\sigma \circ f \circ \pi = g\).

Given \(n, x\) and two booleans representing the distinguishability of an \(n\)-set \(N\) and and \(x\)-set \(X\), the following function returns a list containing the number of functions, injections, and surjections from \(N\) to \(X\).

numFunctions n x nDist xDist = case (nDist, xDist) of
  (True , True)  -> [x^n, product [x-n+1..x], product [1..x] * s n x]
  (False, True)  -> [multichoose x n, choose x n, multichoose x $ n - x]
  (True , False) -> [sum $ s n <$> [1..x], fromEnum $ n <= x, s n x]
  (False, False) -> [sum $ p n <$> [1..x], fromEnum $ n <= x, p n x]
  where
    p = pPar
    s = stirling2

DIY

The source of this webpage is literate Haskell. THe following should work:

$ cabal install split
$ wget https://crypto.stanford.edu/~blynn/haskell/count.lhs
$ ghci count.lhs

Try typing:

choose 10 5
compositions 4 2
intPartitions 4 2
partitions [1..4] 2
bell 7

QuickCheck-style properties are sprinkled throughout. They are prefixed with prop_ and were written to help explain the material. However, they can also be run, or even checked with QuickCheck. Care is needed as some of the computations have exponential complexity. Some also assume an input list is a permutation.


Ben Lynn blynn@cs.stanford.edu 💡