Determinants
We’ve managed to reach the Jordan canonical form without encountering determinants. But at some point, anyone studying linear algebra must learn about these things.
- 1693: Leibniz defines the determinant, albeit with an awkward and impractical method of finding the sign of a permutation.
- 1748: Fontaine finds identities involving some kinds of determinants.
- 1750: Cramer introduces Cramer’s rule, a general formula that solves for any variable in a linear system; the determinant is the denominator of the answer.
(Muir assumes the reader is fluent in French and other languages; see Antoni Kosinski, Cramer’s Rule Is Due to Cramer, for an English version of the relevant paragraphs.)
Cramer’s work was noticed and spread: gradually, then suddenly. My guess is that he tired of repeating Gaussian elimination, and so slogged through one epic elimination on a matrix whose entries were all variables.
Even today, the resulting elegant combinatorial expression is valuable for humans, though it is less useful for computers: the algorithmic complexity is unchanged, and we lose the freedom to order operations to improve accuracy when using floating point.
Permutations
We need to enumerate permutations and compute their signs:
-- Bird and Gibbons, Algorithm Design with Haskell
inserts x = \case
  [] -> [[x]]
  (y:ys) -> (x:y:ys):map (y:) (inserts x ys)

perms = foldr (\x xss -> inserts x =<< xss) [[]]

sgnPerm = \case
  [] -> id
  x:xs -> foldr (.) (sgnPerm xs) [negate | y <- xs, x > y]
We show the permutations of \([1..4]\) and their signs:
[(pi, sgnPerm pi 1) | pi <- perms [1..4]]
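As a sanity check (sgnByInversions is my own throwaway helper, not from the text), the sign computed by sgnPerm should match \((-1)^k\), where \(k\) is the number of inversions:

-- Count pairs that appear out of order, then take (-1) to that power.
sgnByInversions pi = (-1) ^ length [() | (i, x) <- zip [0..] pi, y <- drop (i + 1) pi, x > y]
all (\pi -> sgnPerm pi 1 == sgnByInversions pi) $ perms [1..5]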
Cramer’s Rule
Suppose we wish to solve the linear system:
\[ \DeclareMathOperator{\sgn}{sgn} \begin{align} a_{11} v_1 + & … + a_{1n} v_n = b_1 \\ & \vdots \\ a_{n1} v_1 + & … + a_{nn} v_n = b_n \\ \end{align} \]
for \(v_1, …, v_n\). In modern notation:
\[ A \begin{bmatrix} v_1 \\ \vdots \\ v_n \end{bmatrix} = \begin{bmatrix} b_1 \\ \vdots \\ b_n \end{bmatrix} \]
where \(A\) is an \(n\times n\) matrix with coefficients \(a_{ij}\). Define the determinant of \(A\) by:
\[ \sum_{\pi\in S_n} \sgn(\pi) a_{\pi(1)1} … a_{\pi(n)n} \]
where \(S_n\) is the set of permutations of \( [1..n] \). This is equivalent to:
\[ \sum_{\pi\in S_n} \sgn(\pi) a_{1\pi(1)} … a_{n\pi(n)} \]
because a permutation and its inverse have the same sign. We denote the determinant of \(A\) by \(\det(A)\) or \(|A|\). The vertical bars are handy when writing out matrices in full:
\[ \begin{vmatrix} 1 & 2 \\ 3 & 4 \end{vmatrix} = 1\times 4 - 2\times 3 = -2 \]
det a = sum [sgnPerm pi $ product $ zipWith get pi [1..n]
| pi <- perms [1..n]]
where
n = length a
get i j = a!!(i - 1)!!(j - 1)
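As a quick check, the \(2\times 2\) example above should come out to \(-2\), and the row-indexed expansion should agree with the column-indexed one. Here detRows is a throwaway variant of my own, defined only for this comparison:

det [[1, 2], [3, 4]]
-- Like det, but with the product taken as a_{1 pi(1)} ... a_{n pi(n)}.
detRows m = sum [sgnPerm pi $ product $ zipWith get [1..n] pi
  | pi <- perms [1..n]]
  where
  n = length m
  get i j = m!!(i - 1)!!(j - 1)
all (\m -> det m == detRows m) [[[1,2],[3,4]], [[2,0,1],[1,3,5],[4,2,6]]]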
-- Each row lists an equation’s coefficients followed by its right-hand side.
riceEqns = [ [3, 2, 1, 39]
           , [2, 3, 1, 34]
           , [1, 2, 3, 26]
           ]
a = init <$> riceEqns
bs = last <$> riceEqns
det a
We turn this expression into a function: detCol k a xs computes the determinant of a with column k replaced by the entries of xs. For the first column, define \(f : K^n → K\) by:
\[ f(x_1,…,x_n) = \sum_{\pi\in S_n} \sgn(\pi) x_{\pi(1)} a_{\pi(2)2} … a_{\pi(n)n} \]
detCol k a xs = sum [sgnPerm pi $ product $ zipWith get pi [1..n]
| pi <- perms [1..n]]
where
n = length a
get i j
| j == k = xs!!(i - 1)
| otherwise = a!!(i - 1)!!(j - 1)
f = detCol 1 a
We could have started from this function and defined:
\[ \det(A) = f(a_{11}, …, a_{n1}) \]
f $ head <$> a
Then if \(\det(A) \ne 0\), we can solve for \(v_1\):
\[ v_1 = \frac{f(b_1,…,b_n)}{\det(A)} \]
f bs / det a
The numerator can be seen as the determinant of the matrix \(A\) except the first column has been replaced by \(b_1,…,b_n\).
Solving for \(v_2, …, v_n\) is similar.
detCol 2 a bs / det a
detCol 3 a bs / det a
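As a double check, plugging the solution vector back into the system should reproduce the right-hand sides. (The name vs is my own, and this snippet relies on GHCi defaulting the numbers to Double.)

vs = [detCol k a bs / det a | k <- [1..3]]
[sum $ zipWith (*) row vs | row <- a]  -- expect [39.0, 34.0, 26.0]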
Proof: Consider applying \(f\) to the second column of \(A\). The result is \(\det(A')\) for some matrix \(A'\) whose first column is identical to its second column; that is, if \(a'_{ij}\) denotes the entries of \(A'\), we have:
\[ a'_{k1} = a'_{k2} \]
for all rows \(k\). In other words:
\[ f(a_{12}, ..., a_{n2}) = \det(A') = \sum_{\pi\in S_n} \sgn(\pi) a'_{1\pi(1)} a'_{2\pi(2)} ... a'_{n\pi(n)} \]
Let \(\tau = (1\:2) \), the permutation that transposes 1 and 2. Then we also have:
\[ \begin{align} \det(A') &= \sum_{\pi\in S_n} \sgn(\tau \pi) a'_{1\tau\pi(1)} a'_{2\tau\pi(2)} ... a'_{n\tau\pi(n)} \\ & = \sum_{\pi\in S_n} \sgn(\tau \pi) a'_{1\pi(1)} a'_{2\pi(2)} ... a'_{n\pi(n)} \end{align} \]
Since \(\sgn(\tau) = -1\) we have shown \(\det(A') = -\det(A')\) and hence \(f(a_{12},…,a_{n2}) = 0\). In general:
\[ f(a_{1k},…,a_{nk}) = 0 \]
for all columns \(k > 1\).
f ((!!1) <$> a)
f ((!!2) <$> a)
Next, from its definition, we see that \(f\) is a linear combination of its inputs, that is:
\[ f(x_1,…,x_n) = c_1 x_1 + … + c_n x_n \]
for some \(c_1, …, c_n \in K\). Multiplying the \(i\)th equation by \(c_i\) and summing them all yields:
\[ \begin{align} f(a_{11},…,a_{n1}) v_1 + f(a_{12},…,a_{n2}) v_2 + … + f(a_{1n},…,a_{nn}) v_n \\ = f(b_1,…,b_n) \end{align} \]
From above, most of these terms disappear, and we’re left with:
\[ \det(A) v_1 = f(b_1,…,b_n) \]
QED.
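We can also watch the key linearity step in action: feeding the standard basis vectors to \(f\) extracts the coefficients \(c_1, …, c_n\), after which \(f(b_1,…,b_n)\) is the matching linear combination. (The name cs is my own.)

cs = [f [if i == j then 1 else 0 | j <- [1..3]] | i <- [1..3]]
f bs == sum (zipWith (*) cs bs)  -- expect True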
This proof has some elements in common with the Chinese Remainder Theorem and the Lagrange interpolating polynomial. In all cases, we cleverly choose a linear combination where all but one of the terms disappear at the crucial moment.
If there is no solution, or if there are infinitely many solutions, then we must have \(\det(A) = 0\) to avoid contradictions. However, we have not shown that the existence of a unique solution implies \(\det(A) \ne 0\).
We can prove this by first proving:
\[ \det(A)\det(B) = \det(AB) \]
If \(A\) is noninvertible then \(AB\) must also be noninvertible, otherwise we can construct an inverse of \(A\). Thus \(\det(A) = 0\) implies \(\det(AB) = 0\). Similarly for noninvertible \(B\).
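A small numerical illustration; the naive matrix product mul, the transpose import, and the sample matrices are my own additions:

import Data.List (transpose)
-- Naive matrix multiplication: dot each row of x with each column of y.
mul x y = [[sum $ zipWith (*) row col | col <- transpose y] | row <- x]
s = [[1, 2, 3], [4, 5, 6], [2, 4, 6]]  -- noninvertible: row 3 is twice row 1
(det s, det (mul s [[1, 4, 7], [0, 2, 5], [3, 1, 6]]))  -- expect (0, 0)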
Otherwise, consider an invertible matrix \(A\). Then we can reach the identity matrix via elementary row operations. The corresponding elementary matrices are simple enough that we can easily compute their determinants:
- the matrix scaling a row by \(a\) has determinant \(a\)
- the matrix swapping any two rows has determinant \(-1\)
- the matrix adding a multiple of one row to another has determinant \(1\)
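We can spot-check all three cases on \(3\times 3\) examples (the particular matrices are my own choices):

det [[5, 0, 0], [0, 1, 0], [0, 0, 1]]  -- scale row 1 by 5: expect 5
det [[0, 1, 0], [1, 0, 0], [0, 0, 1]]  -- swap rows 1 and 2: expect -1
det [[1, 0, 0], [7, 1, 0], [0, 0, 1]]  -- add 7 times row 1 to row 2: expect 1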
From the formula for the determinant, one can check that performing the corresponding operation on the matrix \(A\) scales \(\det(A)\) by that same factor. The toughest case is adding a multiple of one row to another: in brief, it works out because multiplication distributes over addition, and because two identical rows lead to a zero determinant (just like two identical columns, which we used above).
Thus we show \( \det(A)\det(B) = \det(AB) \) for invertible matrices by breaking them down into elementary matrices.
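A spot check of multiplicativity, reusing mul from above; the matrix b2 is an arbitrary invertible example of my own:

b2 = [[1, 4, 7], [0, 2, 5], [3, 1, 6]]
det (mul a b2) == det a * det b2  -- expect True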
The Characteristic Equation
Now that we have the determinant at our disposal, we can clear a different path to eigenvalues. The characteristic equation of an \(n\times n\) matrix \(A\) is:
\[ \det (A - x) \newcommand{\v}{\mathbf{v}} \]
A scalar \(\lambda\) is a root if and only if \(\det (A - \lambda) = 0\). This happens exactly when \(A - \lambda\) is non-invertible, and any vector \(\v\) in its nontrivial kernel satisfies:
\[ A \v = \lambda \v \]
In other words, we see \(\v\) is an eigenvector with eigenvalue \(\lambda\).
The characteristic equation of the transition matrix \(T\) from our rabbits example is:
\[ \begin{vmatrix} 1 - x & 1 \\ 1 & -x \end{vmatrix} = (1 - x)(-x) - 1 = x^2 - x - 1 \]
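We can evaluate this characteristic polynomial pointwise with our det function and compare (charT is my own name):

charT x = det [[1 - x, 1], [1, -x]]
[charT x == x^2 - x - 1 | x <- [0..5]]  -- expect all True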
The roots of \(x^2 - x - 1\) are:
\[ \begin{align} \phi &= \frac{1 + \sqrt{5}}{2} \\ \psi &= \frac{1 - \sqrt{5}}{2} \end{align} \]
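Numerically, as Doubles (phi and psi are my own definitions here):

phi = (1 + sqrt 5) / 2
psi = (1 - sqrt 5) / 2
(phi^2 - phi - 1, psi^2 - psi - 1)  -- both roughly 0, up to rounding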
In this case, there are 2 distinct roots, so the Jordan normal form of \(T\) must be:
\[ \begin{bmatrix} \phi & 0 \\ 0 & \psi \end{bmatrix} \]
Next, with Gaussian elimination, we solve:
\[ \begin{bmatrix} 1 - \phi & 1 \\ 1 & -\phi \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \]
to find the eigenspace with eigenvalue \(\phi\). We soon find that the solutions are generated by the vector:
\[ \begin{bmatrix} \phi \\ 1 \end{bmatrix} \]
Similarly, we find the eigenvectors corresponding to the eigenvalue \(\psi\) are generated by:
\[ \begin{bmatrix} \psi \\ 1 \end{bmatrix} \]
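We can confirm both eigenvectors numerically. Here t reconstructs the transition matrix whose characteristic polynomial we computed above, and mulVec is my own matrix-vector helper:

t = [[1, 1], [1, 0]]
-- Apply a matrix to a vector: dot each row with the vector.
mulVec m v = [sum $ zipWith (*) row v | row <- m]
(mulVec t [phi, 1], map (phi *) [phi, 1])  -- equal up to rounding
(mulVec t [psi, 1], map (psi *) [psi, 1])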
This lets us determine the matrix that changes the basis from the standard basis to the above two eigenvectors, along with its inverse, and we have revealed the secret behind the magic numbers that helped us compute the long-term behaviour of the rabbit population.