Horner's Method

Aug 26, 2023 | math

I was in the library the other day and an introduction to numerical mathematics¹ fell into my hands. It's a nice little book, but pretty dense and maybe not ideal for self-study. Already in the second chapter I had to work things out with pen and paper. This is a clean and more verbose version of my paper notes.

Polynomial definition

A polynomial is defined as the following function. $$ p_n(x) = \sum_{i=0}^n a_i x^i $$ where the coefficients $a_i$ can be real or complex numbers. If $a_n \neq 0$, we say that the degree of the polynomial is $n.$ To evaluate a polynomial for some $x=x_0$, we can simply plug this value into the formula and calculate the result. Since there are $n+1$ terms in the sum, we'll need $n$ additions, and because each term has $i = 0, 1, \ldots, n$ factors $x$, we'll need $1 + 2 + \cdots n = \frac{n (n+1)}{2}$ multiplications.

Efficient evaluation

We can do with less, though. With a bit of clever nesting, we get the same result with only $n$ additions and $n$ multiplications. $$ p_n(x) = a_0 + x_0 ( a_1 + x_0 ( a_2 + \ldots + x_0( a_{n-1} + x_0 a_n )) \cdots ) $$

The expression on the right should have $n-1$ pairs of parentheses if fully expanded. This is called Horner's method.

Instead of the equation above with the handwavy dots, we can also give a short recursive definition of the same method. $$ \begin{align*} b_n &= a_n\\ b_i &= a_i + x_0 b_{i+1},\quad i = n-1, n-2, \ldots, 0 \end{align*} $$

Applying this definition, of course, yields the same result.

Synthetic Division

Another thing we can do with the recursive definition is to solve for $a_i = b_i - x_0 b_{i+1}$ and plug this back into our polynomial definition. It's not obvious why we would want to do this, but we'll just go ahead. $$ \begin{align*} p_n(x) &= \sum_{i=0}^n a_i x^i \\ &= \sum_{i=0}^{n-1} a_i x^i + a_n x^n\\ &= \sum_{i=0}^{n-1} ( b_i - x_0 b_{i+1}) x^i + b_n x^n\\ \end{align*} $$

We pulled the base case of the recursion ($b_n = a_n$) out of the sum and did the substitution. With a bit of shuffling we get an interesting formula which involves a new polynomial² $p_{n-1} = \sum_{i=1}^n b_i x^{i-1}$ and the value of $p_n$ at $x=x_0,$ which is $b_0.$ $$ \begin{align*} p_n(x) &= b_0 + \sum_{i=1}^{n-1} b_i x^i - x_0 \sum_{i=1}^n b_i x^{i-1} + b_n x^n\\ &= b_0 + x \sum_{i=1}^n b_i x^{i-1} - x_0 \sum_{i=1}^n b_i x^{i-1}\\ &= \left( \sum_{i=1}^n b_i x^{i-1} \right) (x - x_0) + b_0\\ &= p_{n-1}(x) (x - x_0) + p_n(x_0) \end{align*} $$

What's the significance of this formula? It's a special case of synthetic division called Ruffini's rule, in which the divisor is a linear factor $x - x_0.$ The quotient $p_{n-1}$ is a polynomial of one degree less than the dividend $p_n.$

Manually performing such a division is best done in a table, for example, with $p_3 = 2x^3 + x^2 - 4x -7$ and $x_0 = 2,$ we can set up the following table.

$x_0 = 2$	$a_3 = 2$	$a_2 = 1$	$a_1 = -4$	$a_0 = -7$
		$x_0 b_3 = 4$	$x_0 b_2 = 10$	$x_0 b_1 = 12$
	$b_3 = 2$	$b_2 = 5$	$b_1 = 6$	$b_0 = 5$

The first row has $x_0$ and the coefficients of the dividend. From the recursive definition of Horner's rule, we know that $b_3 = a_3.$ Knowing $b_3,$ we can calculate the first entry in the second row $x_0 b_3 = 4$. Adding this value to the coefficient above gives us $b_2 = a_2 + x_0 b_3 = 5$ (following the second case of Horner's rule), and so on. The result can be read off the last row: $b_1$ to $b_3$ are the coefficients of $p_2,$ and $b_0 = p_3(x_0)$ is the remainder. $$ p_3(x) = ( 2x^2 + 5x + 6 ) ( x - 2 ) + 5 $$

But the usefulness of Horner's method doesn't end here.

Differentiation

If we take the derivative of $p_n(x) = p_{n-1}(x) (x - x_0) + p_n(x_0)$, the second summand is dropped because it's constant while we can apply the product rule to the first summand. $$ p_n'(x) = p_{n-1}'(x) (x - x_0) + p_{n-1}(x) $$

What about the second derivative? We simply calculate the first derivative of the first derivative. $$ \begin{align*} p_n''(x) &= p_{n-1}''(x) (x - x_0) + p_{n-1}'(x) + p_{n-1}'(x)\\ &= p_{n-1}''(x) (x - x_0) + 2 \cdot p_{n-1}'(x) \end{align*} $$

I think I see a pattern here. Let's replace the ticks with a superscript number and calculate the third derivative. $$ \begin{align*} p_n^{(3)}(x) &= p_{n-1}^{(3)}(x) (x - x_0) + p_{n-1}^{(2)}(x) + 2 \cdot p_{n-1}^{(2)}(x)\\ &= p_{n-1}^{(3)}(x) (x - x_0) + 3 \cdot p_{n-1}^{(2)}(x) \end{align*} $$

Clearly we're seeing this pattern for $0 \leq k \leq n$: $$ p_n^{(k)}(x) = p_{n-1}^{(k)}(x) (x - x_0) + k \cdot p_{n-1}^{(k-1)}(x) $$

Now if we're only interested in calculating the derivative at a certain $x = x_0$, the formula simplifies. $$ p_n^{(k)}(x_0) = k \cdot p_{n-1}^{(k-1)}(x_0) $$

This equation can be applied recursively, so $$ k \cdot p_{n-1}^{(k-1)}(x_0) = k (k-1) \cdot p_{n-2}^{(k-2)}(x_0) $$ and following through with these applications $$ p_n^{(k)}(x_0) = k! \cdot p_{n-k}(x_0). $$

This means that we can calculate the kth derivative of $p_n$ by $k+1$ successive applications of Horner's rule. For example, we can continue our table from above. I omit the coefficient names for readability and to avoid confusion between as and bs.

$x_0 = 2$	$2$	$1$	$-4$	$-7$
		$4$	$10$	$12$
	$2$	$5$	$6$	$5 = p_3(2)$
		$4$	$18$
	$2$	$9$	$24 = \frac{p'_3(2)}{1!} $
		$4$
	$2$	$13 = \frac{p^{(2)}_3(2)}{2!}$
	$2 = \frac{p^{(3)}_3(2)}{3!}$

With each application of Horner's rule, the degree of the polynomial decrements by one, so we need $n + (n-1) + \ldots + (n-k) = (k+1) \frac{2n - k}{2}$ additions and multiplications.

¹ Opfer, Gerhard (2008). Numerische Mathematik für Anfänger, 5. Auflage. Vieweg-Teubner.

² The indices of the coefficients run from $1$ through $n$ here, but that doesn't really matter. To comply with our initial definition of a polynomial, we could simply shift all the $b$ indices by $-1$, so $p_n(x_0) = b_{-1}$, or define the recursion as $b_{n-1} = a_n,$ $b_j = a_{j+1} + x_0 b_{j+1}$ for $j = n-2, n-3, \ldots, 0$ and $p_n(x_0) = a_0 + x_0 b_0.$ The latter is what the book does, but I found it a bit unwieldly (basically you need to understand the whole derivation before you know why the recursion was set up in this weird way).

\(x_0 = 2\)	\(a_3 = 2\)	\(a_2 = 1\)	\(a_1 = -4\)	\(a_0 = -7\)
		\(x_0 b_3 = 4\)	\(x_0 b_2 = 10\)	\(x_0 b_1 = 12\)
	\(b_3 = 2\)	\(b_2 = 5\)	\(b_1 = 6\)	\(b_0 = 5\)

\(x_0 = 2\)	\(2\)	\(1\)	\(-4\)	\(-7\)
		\(4\)	\(10\)	\(12\)
	\(2\)	\(5\)	\(6\)	\(5 = p_3(2)\)
		\(4\)	\(18\)
	\(2\)	\(9\)	\(24 = \frac{p'_3(2)}{1!} \)
		\(4\)
	\(2\)	\(13 = \frac{p^{(2)}_3(2)}{2!}\)
	\(2 = \frac{p^{(3)}_3(2)}{3!}\)