Horner's Method

| math

I was in the library the other day and an introduction to numerical mathematics1 fell into my hands. It's a nice little book, but pretty dense and maybe not ideal for self-study. Already in the second chapter I had to work things out with pen and paper. This is a clean and more verbose version of my paper notes.

Polynomial definition

A polynomial is defined as the following function. $$ p_n(x) = \sum_{i=0}^n a_i x^i $$ where the coefficients \(a_i\) can be real or complex numbers. If \(a_n \neq 0\), we say that the degree of the polynomial is \(n.\) To evaluate a polynomial for some \(x=x_0\), we can simply plug this value into the formula and calculate the result. Since there are \(n+1\) terms in the sum, we'll need \(n\) additions, and because each term has \(i = 0, 1, \ldots, n\) factors \(x\), we'll need \(1 + 2 + \cdots n = \frac{n (n+1)}{2}\) multiplications.

Efficient evaluation

We can do with less, though. With a bit of clever nesting, we get the same result with only \(n\) additions and \(n\) multiplications. $$ p_n(x) = a_0 + x_0 ( a_1 + x_0 ( a_2 + \ldots + x_0( a_{n-1} + x_0 a_n )) \cdots ) $$

The expression on the right should have \(n-1\) pairs of parentheses if fully expanded. This is called Horner's method.

Instead of the equation above with the handwavy dots, we can also give a short recursive definition of the same method. $$ \begin{align*} b_n &= a_n\\ b_i &= a_i + x_0 b_{i+1},\quad i = n-1, n-2, \ldots, 0 \end{align*} $$

Applying this definition, of course, yields the same result.

Synthetic Division

Another thing we can do with the recursive definition is to solve for \(a_i = b_i - x_0 b_{i+1}\) and plug this back into our polynomial definition. It's not obvious why we would want to do this, but we'll just go ahead. $$ \begin{align*} p_n(x) &= \sum_{i=0}^n a_i x^i \\ &= \sum_{i=0}^{n-1} a_i x^i + a_n x^n\\ &= \sum_{i=0}^{n-1} ( b_i - x_0 b_{i+1}) x^i + b_n x^n\\ \end{align*} $$

We pulled the base case of the recursion (\(b_n = a_n\)) out of the sum and did the substitution. With a bit of shuffling we get an interesting formula which involves a new polynomial2 \(p_{n-1} = \sum_{i=1}^n b_i x^{i-1}\) and the value of \(p_n\) at \(x=x_0,\) which is \(b_0.\) $$ \begin{align*} p_n(x) &= b_0 + \sum_{i=1}^{n-1} b_i x^i - x_0 \sum_{i=1}^n b_i x^{i-1} + b_n x^n\\ &= b_0 + x \sum_{i=1}^n b_i x^{i-1} - x_0 \sum_{i=1}^n b_i x^{i-1}\\ &= \left( \sum_{i=1}^n b_i x^{i-1} \right) (x - x_0) + b_0\\ &= p_{n-1}(x) (x - x_0) + p_n(x_0) \end{align*} $$

What's the significance of this formula? It's a special case of synthetic division called Ruffini's rule, in which the divisor is a linear factor \(x - x_0.\) The quotient \(p_{n-1}\) is a polynomial of one degree less than the dividend \(p_n.\)

Manually performing such a division is best done in a table, for example, with \(p_3 = 2x^3 + x^2 - 4x -7\) and \(x_0 = 2,\) we can set up the following table.

\(x_0 = 2\) \(a_3 = 2\) \(a_2 = 1\) \(a_1 = -4\) \(a_0 = -7\)
\(x_0 b_3 = 4\) \(x_0 b_2 = 10\) \(x_0 b_1 = 12\)
\(b_3 = 2\) \(b_2 = 5\) \(b_1 = 6\) \(b_0 = 5\)

The first row has \(x_0\) and the coefficients of the dividend. From the recursive definition of Horner's rule, we know that \(b_3 = a_3.\) Knowing \(b_3,\) we can calculate the first entry in the second row \(x_0 b_3 = 4\). Adding this value to the coefficient above gives us \(b_2 = a_2 + x_0 b_3 = 5\) (following the second case of Horner's rule), and so on. The result can be read off the last row: \(b_1\) to \(b_3\) are the coefficients of \(p_2,\) and \(b_0 = p_3(x_0)\) is the remainder. $$ p_3(x) = ( 2x^2 + 5x + 6 ) ( x - 2 ) + 5 $$

But the usefulness of Horner's method doesn't end here.

Differentiation

If we take the derivative of \(p_n(x) = p_{n-1}(x) (x - x_0) + p_n(x_0)\), the second summand is dropped because it's constant while we can apply the product rule to the first summand. $$ p_n'(x) = p_{n-1}'(x) (x - x_0) + p_{n-1}(x) $$

What about the second derivative? We simply calculate the first derivative of the first derivative. $$ \begin{align*} p_n''(x) &= p_{n-1}''(x) (x - x_0) + p_{n-1}'(x) + p_{n-1}'(x)\\ &= p_{n-1}''(x) (x - x_0) + 2 \cdot p_{n-1}'(x) \end{align*} $$

I think I see a pattern here. Let's replace the ticks with a superscript number and calculate the third derivative. $$ \begin{align*} p_n^{(3)}(x) &= p_{n-1}^{(3)}(x) (x - x_0) + p_{n-1}^{(2)}(x) + 2 \cdot p_{n-1}^{(2)}(x)\\ &= p_{n-1}^{(3)}(x) (x - x_0) + 3 \cdot p_{n-1}^{(2)}(x) \end{align*} $$

Clearly we're seeing this pattern for \(0 \leq k \leq n\): $$ p_n^{(k)}(x) = p_{n-1}^{(k)}(x) (x - x_0) + k \cdot p_{n-1}^{(k-1)}(x) $$

Now if we're only interested in calculating the derivative at a certain \(x = x_0\), the formula simplifies. $$ p_n^{(k)}(x_0) = k \cdot p_{n-1}^{(k-1)}(x_0) $$

This equation can be applied recursively, so $$ k \cdot p_{n-1}^{(k-1)}(x_0) = k (k-1) \cdot p_{n-2}^{(k-2)}(x_0) $$ and following through with these applications $$ p_n^{(k)}(x_0) = k! \cdot p_{n-k}(x_0). $$

This means that we can calculate the kth derivative of \(p_n\) by \(k+1\) successive applications of Horner's rule. For example, we can continue our table from above. I omit the coefficient names for readability and to avoid confusion between as and bs.

\(x_0 = 2\) \(2\) \(1\) \(-4\) \(-7\)
\(4\) \(10\) \(12\)
\(2\) \(5\) \(6\) \(5 = p_3(2)\)
\(4\) \(18\)
\(2\) \(9\) \(24 = \frac{p'_3(2)}{1!} \)
\(4\)
\(2\) \(13 = \frac{p^{(2)}_3(2)}{2!}\)
\(2 = \frac{p^{(3)}_3(2)}{3!}\)

With each application of Horner's rule, the degree of the polynomial decrements by one, so we need \(n + (n-1) + \ldots + (n-k) = (k+1) \frac{2n - k}{2}\) additions and multiplications.


1 Opfer, Gerhard (2008). Numerische Mathematik für Anfänger, 5. Auflage. Vieweg-Teubner.

2 The indices of the coefficients run from \(1\) through \(n\) here, but that doesn't really matter. To comply with our initial definition of a polynomial, we could simply shift all the \(b\) indices by \(-1\), so \(p_n(x_0) = b_{-1}\), or define the recursion as \(b_{n-1} = a_n,\) \(b_j = a_{j+1} + x_0 b_{j+1}\) for \(j = n-2, n-3, \ldots, 0\) and \(p_n(x_0) = a_0 + x_0 b_0.\) The latter is what the book does, but I found it a bit unwieldly (basically you need to understand the whole derivation before you know why the recursion was set up in this weird way).