I was in the library the other day and an introduction to numerical mathematics^{1} fell into my hands.
It's a nice little book, but pretty dense and maybe not ideal for self-study.
Already in the second chapter I had to work things out with pen and paper.
This is a clean and more verbose version of my paper notes.

A polynomial is defined as the following function.
$$
p_n(x) = \sum_{i=0}^n a_i x^i
$$
where the coefficients \(a_i\) can be real or complex numbers.
If \(a_n \neq 0\), we say that the *degree* of the polynomial is \(n.\)
To evaluate a polynomial for some \(x=x_0\), we can simply plug this value into the formula and calculate the result.
Since there are \(n+1\) terms in the sum, we'll need \(n\) additions, and because the term with index \(i\) has \(i\) factors \(x\), we'll need \(1 + 2 + \cdots + n = \frac{n (n+1)}{2}\) multiplications.
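As a sanity check, here is a direct translation of the naive sum into code (a Python sketch of my own; the inner loop makes the multiplication count explicit).

```python
def eval_naive(coeffs, x0):
    """Evaluate p(x0) = sum of a_i * x0**i term by term.

    coeffs[i] is the coefficient a_i. Building x0**i by repeated
    multiplication costs i multiplications for term i, so the total
    is 1 + 2 + ... + n = n*(n+1)/2 multiplications and n additions.
    """
    result = 0
    for i, a in enumerate(coeffs):
        term = a
        for _ in range(i):   # i multiplications by x0
            term *= x0
        result += term       # accumulate the sum
    return result

# p(x) = 2x^3 + x^2 - 4x - 7 at x = 2: 16 + 4 - 8 - 7 = 5
print(eval_naive([-7, -4, 1, 2], 2))  # 5
```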

We can make do with fewer operations, though. With a bit of clever nesting, we get the same result with only \(n\) additions and \(n\) multiplications. $$ p_n(x_0) = a_0 + x_0 ( a_1 + x_0 ( a_2 + \cdots + x_0 ( a_{n-1} + x_0 a_n ) \cdots )) $$

The expression on the right should have \(n-1\) pairs of parentheses if fully expanded.
This is called *Horner's method*.

Instead of the equation above with the handwavy dots, we can also give a short recursive definition of the same method. $$ \begin{align*} b_n &= a_n\\ b_i &= a_i + x_0 b_{i+1},\quad i = n-1, n-2, \ldots, 0 \end{align*} $$
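The recursive definition translates almost literally into code. A minimal Python sketch (the function name `horner` is my own choice):

```python
def horner(coeffs, x0):
    """Evaluate p(x0) with Horner's method.

    coeffs = [a_n, a_{n-1}, ..., a_0], highest degree first,
    mirroring the recursion b_n = a_n, b_i = a_i + x0 * b_{i+1}.
    Uses exactly n additions and n multiplications.
    """
    b = coeffs[0]           # b_n = a_n
    for a in coeffs[1:]:    # i = n-1, n-2, ..., 0
        b = a + x0 * b      # b_i = a_i + x0 * b_{i+1}
    return b                # b_0 = p(x0)

print(horner([2, 1, -4, -7], 2))  # 5
```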

Applying this definition, of course, yields the same result.

Another thing we can do with the recursive definition is to solve for \(a_i = b_i - x_0 b_{i+1}\) and plug this back into our polynomial definition. It's not obvious why we would want to do this, but we'll just go ahead. $$ \begin{align*} p_n(x) &= \sum_{i=0}^n a_i x^i \\ &= \sum_{i=0}^{n-1} a_i x^i + a_n x^n\\ &= \sum_{i=0}^{n-1} ( b_i - x_0 b_{i+1}) x^i + b_n x^n\\ \end{align*} $$

We pulled the base case of the recursion (\(b_n = a_n\)) out of the sum and did the substitution.
With a bit of shuffling we get an interesting formula which involves a new polynomial^{2} \(p_{n-1}(x) = \sum_{i=1}^n b_i x^{i-1}\) and the value of \(p_n\) at \(x=x_0,\) which is \(b_0.\)
$$
\begin{align*}
p_n(x) &= b_0 + \sum_{i=1}^{n-1} b_i x^i - x_0 \sum_{i=1}^n b_i x^{i-1} + b_n x^n\\
&= b_0 + x \sum_{i=1}^n b_i x^{i-1} - x_0 \sum_{i=1}^n b_i x^{i-1}\\
&= \left( \sum_{i=1}^n b_i x^{i-1} \right) (x - x_0) + b_0\\
&= p_{n-1}(x) (x - x_0) + p_n(x_0)
\end{align*}
$$

What's the significance of this formula?
It's a special case of *synthetic division* called *Ruffini's rule*, in which the divisor is a linear factor \(x - x_0.\)
The quotient \(p_{n-1}\) is a polynomial of one degree less than the dividend \(p_n.\)

Manually performing such a division is best done in a table. For example, with \(p_3(x) = 2x^3 + x^2 - 4x - 7\) and \(x_0 = 2,\) we can set up the following table.

| \(x_0 = 2\) | \(a_3 = 2\) | \(a_2 = 1\) | \(a_1 = -4\) | \(a_0 = -7\) |
| --- | --- | --- | --- | --- |
| | | \(x_0 b_3 = 4\) | \(x_0 b_2 = 10\) | \(x_0 b_1 = 12\) |
| | \(b_3 = 2\) | \(b_2 = 5\) | \(b_1 = 6\) | \(b_0 = 5\) |

The first row has \(x_0\) and the coefficients of the dividend. From the recursive definition of Horner's rule, we know that \(b_3 = a_3.\) Knowing \(b_3,\) we can calculate the first entry in the second row \(x_0 b_3 = 4\). Adding this value to the coefficient above gives us \(b_2 = a_2 + x_0 b_3 = 5\) (following the second case of Horner's rule), and so on. The result can be read off the last row: \(b_1\) to \(b_3\) are the coefficients of \(p_2,\) and \(b_0 = p_3(x_0)\) is the remainder. $$ p_3(x) = ( 2x^2 + 5x + 6 ) ( x - 2 ) + 5 $$
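The tabular scheme is easy to carry out in code as well. A Python sketch (the helper name `ruffini` is my own choice):

```python
def ruffini(coeffs, x0):
    """Divide a polynomial by the linear factor (x - x0).

    coeffs = [a_n, ..., a_0], highest degree first. Returns the
    quotient coefficients [b_n, ..., b_1] and the remainder
    b_0 = p(x0), exactly as in the table above.
    """
    bs = [coeffs[0]]                 # b_n = a_n
    for a in coeffs[1:]:
        bs.append(a + x0 * bs[-1])   # b_i = a_i + x0 * b_{i+1}
    return bs[:-1], bs[-1]

quotient, remainder = ruffini([2, 1, -4, -7], 2)
print(quotient, remainder)  # [2, 5, 6] 5
# i.e. 2x^3 + x^2 - 4x - 7 = (2x^2 + 5x + 6)(x - 2) + 5
```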

But the usefulness of Horner's method doesn't end here.

If we take the derivative of \(p_n(x) = p_{n-1}(x) (x - x_0) + p_n(x_0)\), the second summand drops out because it's constant, and the product rule applies to the first summand. $$ p_n'(x) = p_{n-1}'(x) (x - x_0) + p_{n-1}(x) $$

What about the second derivative? We simply calculate the first derivative of the first derivative. $$ \begin{align*} p_n''(x) &= p_{n-1}''(x) (x - x_0) + p_{n-1}'(x) + p_{n-1}'(x)\\ &= p_{n-1}''(x) (x - x_0) + 2 \cdot p_{n-1}'(x) \end{align*} $$

I think I see a pattern here. Let's replace the ticks with a superscript number and calculate the third derivative. $$ \begin{align*} p_n^{(3)}(x) &= p_{n-1}^{(3)}(x) (x - x_0) + p_{n-1}^{(2)}(x) + 2 \cdot p_{n-1}^{(2)}(x)\\ &= p_{n-1}^{(3)}(x) (x - x_0) + 3 \cdot p_{n-1}^{(2)}(x) \end{align*} $$

This suggests a general pattern for \(1 \leq k \leq n\), which can be verified by induction: $$ p_n^{(k)}(x) = p_{n-1}^{(k)}(x) (x - x_0) + k \cdot p_{n-1}^{(k-1)}(x) $$

Now if we're only interested in calculating the derivative at a certain \(x = x_0\), the formula simplifies. $$ p_n^{(k)}(x_0) = k \cdot p_{n-1}^{(k-1)}(x_0) $$

This equation can be applied recursively, so $$ k \cdot p_{n-1}^{(k-1)}(x_0) = k (k-1) \cdot p_{n-2}^{(k-2)}(x_0) $$ and following through with these applications $$ p_n^{(k)}(x_0) = k! \cdot p_{n-k}(x_0). $$

This means that we can calculate the *k*th derivative of \(p_n\) at \(x_0\), up to the factor \(k!\), by \(k+1\) successive applications of Horner's rule.
For example, we can continue our table from above.
I omit the coefficient names for readability and to avoid confusion between *a*s and *b*s.

| \(x_0 = 2\) | \(2\) | \(1\) | \(-4\) | \(-7\) |
| --- | --- | --- | --- | --- |
| | | \(4\) | \(10\) | \(12\) |
| | \(2\) | \(5\) | \(6\) | \(5 = p_3(2)\) |
| | | \(4\) | \(18\) | |
| | \(2\) | \(9\) | \(24 = \frac{p'_3(2)}{1!}\) | |
| | | \(4\) | | |
| | \(2\) | \(13 = \frac{p^{(2)}_3(2)}{2!}\) | | |
| | \(2 = \frac{p^{(3)}_3(2)}{3!}\) | | | |

With each application of Horner's rule, the degree of the polynomial decreases by one, so we need \(n + (n-1) + \ldots + (n-k) = (k+1) \frac{2n - k}{2}\) additions and the same number of multiplications.
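Putting it all together, repeated application of the scheme yields all derivatives at \(x_0\). A Python sketch (the function name `derivatives_at` is my own), reproducing the table above:

```python
from math import factorial

def derivatives_at(coeffs, x0):
    """All derivatives of p at x0 via repeated Horner passes.

    coeffs = [a_n, ..., a_0], highest degree first. Pass k leaves
    the remainder p^{(k)}(x0) / k!; multiplying by k! recovers
    the k-th derivative.
    """
    results = []
    cs = list(coeffs)
    for k in range(len(coeffs)):
        bs = [cs[0]]
        for a in cs[1:]:
            bs.append(a + x0 * bs[-1])   # one Horner pass
        results.append(factorial(k) * bs[-1])
        cs = bs[:-1]                     # quotient for the next pass
    return results

# p_3(x) = 2x^3 + x^2 - 4x - 7 at x0 = 2
print(derivatives_at([2, 1, -4, -7], 2))  # [5, 24, 26, 12]
```

The output matches the table: \(p_3(2) = 5,\) \(p'_3(2) = 1! \cdot 24 = 24,\) \(p''_3(2) = 2! \cdot 13 = 26,\) and \(p^{(3)}_3(2) = 3! \cdot 2 = 12.\)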

^{1} Opfer, Gerhard (2008). Numerische Mathematik für Anfänger, 5. Auflage. Vieweg-Teubner.

^{2} The indices of the coefficients run from \(1\) through \(n\) here, but that doesn't really matter. To comply with our initial definition of a polynomial, we could simply shift all the \(b\) indices by \(-1\), so \(p_n(x_0) = b_{-1}\), or define the recursion as \(b_{n-1} = a_n,\) \(b_j = a_{j+1} + x_0 b_{j+1}\) for \(j = n-2, n-3, \ldots, 0\) and \(p_n(x_0) = a_0 + x_0 b_0.\) The latter is what the book does, but I found it a bit unwieldy (basically you need to understand the whole derivation before you know why the recursion was set up in this weird way).