This question is asked here and here.

I’m going to try to answer it for my own edification, and for the next poor soul that comes across it and has trouble 🙂

I’m using Spivak’s Answer Book solution, but adding notes to hopefully clarify things.

Here goes:

From Calculus by Michael Spivak 3rd Edition, Chapter 10, Exercise 19.

10-19. Prove that if $f^{(n)}(g(a))$ and $g^{(n)}(a)$ both exist, then $(f\circ g)^{(n)}(a)$ exists. A little experimentation should convince you that it is unwise to seek a formula for $(f\circ g)^{(n)}(a)$. In order to prove that $(f\circ g)^{(n)}(a)$ exists you will therefore have to devise a reasonable assertion about $(f\circ g)^{(n)}(a)$ which can be proved by induction. Try something like: “$(f\circ g)^{(n)}(a)$ exists and is a sum of terms each of which is a product of terms of the form…”

First, let’s look at the first few derivatives of $(f\circ g)$. (These are calculated via straightforward application of the Chain Rule):

$$(f\circ g)^{\prime}(x) = f^{\prime}(g(x)) \cdot g^{\prime}(x)$$

$$(f\circ g)^{\prime\prime}(x) = f^{\prime\prime}(g(x)) \cdot g^{\prime}(x)^2 + f^{\prime}(g(x)) \cdot g^{\prime\prime}(x)$$

$$(f\circ g)^{\prime\prime\prime}(x) = f^{\prime\prime\prime}(g(x)) \cdot g^{\prime}(x)^3 + 3f^{\prime\prime}(g(x)) \cdot g^{\prime}(x)g^{\prime\prime}(x) + f^{\prime}(g(x)) \cdot g^{\prime\prime\prime}(x)$$

These suggest a general form for derivatives of $(f\circ g)$: the $n$-th derivative of $f\circ g$ seems to be made up of a sum of terms that are each the product of some constant, times some derivative of $f$ at $g(x)$, times some derivatives of $g$ at $x$, with these derivatives of $g$ perhaps raised to some power. None of the derivatives (of $f$ or $g$) are of order higher than $n$.
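As a sanity check, the first few derivatives above can be reproduced symbolically. This is a quick sketch, assuming SymPy is available; `f` and `g` are left as undefined functions, and it is not part of the proof:

```python
# Sanity check of the derivative pattern with SymPy (assumed available).
import sympy as sp

x = sp.symbols('x')
f, g = sp.Function('f'), sp.Function('g')

# n-th derivatives of the composition f(g(x)) for n = 1, 2, 3.
for n in (1, 2, 3):
    d = sp.expand(sp.diff(f(g(x)), x, n))
    # Each derivative comes out as a sum of products of a constant,
    # derivatives of g at x, and one derivative of f at g(x) --
    # exactly the pattern observed above.
    print(n, len(sp.Add.make_args(d)), d)
```

The printed expressions match the three displayed derivatives, with the $n$-th derivative containing no derivative of $f$ or $g$ of order higher than $n$.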

**Conjecture: If $f^{(n)}(g(a))$ and $g^{(n)}(a)$ both exist, then $(f\circ g)^{(n)}(a)$ exists and is a sum of terms of the form
$$c\cdot(g^{\prime}(a))^{m_1}\cdot(g^{\prime\prime}(a))^{m_2}\cdots(g^{(n)}(a))^{m_n}\cdot f^{(k)}(g(a))$$
for some number $c$, nonnegative integers $m_1,\dots,m_n$, and a natural number $k \leq n$.**

(Aside: The conjecture concerns the existence of the derivative *at a single point*, $a$. It says, “if $g$ is $n$-times differentiable at $a$ and $f$ is $n$-times differentiable at $g(a)$, then $f\circ g$ is $n$-times differentiable at $a$.” If there’s a second point $b$ such that $g$ is $n$-times differentiable at $b$ and $f$ is $n$-times differentiable at $g(b)$, then $f\circ g$ is $n$-times differentiable at $b$, and $(f\circ g)^{(n)}(b)$ is the sum of terms of the form

$$c\cdot(g^{\prime}(b))^{m_1}\cdot(g^{\prime\prime}(b))^{m_2}\cdots(g^{(n)}(b))^{m_n}\cdot f^{(k)}(g(b))$$

where $c$, $m_1, \dots, m_n$, $k$ in each term are all identical to those in the corresponding term for $(f\circ g)^{(n)}(a)$.)

**Proof:**

Let’s restate the conjecture we’re trying to prove.

Conjecture:

If $f^{(n)}(g(a))$ and $g^{(n)}(a)$ both exist, then $(f\circ g)^{(n)}(a)$ exists and is a sum of terms of the form

$$c\cdot(g^{\prime}(a))^{m_1}\cdot(g^{\prime\prime}(a))^{m_2}\cdots(g^{(n)}(a))^{m_n}\cdot f^{(k)}(g(a))$$

with $k \leq n$.

Proof of the conjecture is by induction on $n$.

*Case for $n = 1$*

If $f^{\prime}(g(a))$ and $g^{\prime}(a)$ both exist, then the Chain Rule states that $(f\circ g)^{\prime}(a)$ exists and is equal to $f^{\prime}(g(a)) \cdot g^{\prime}(a)$.

Thus our conjecture is true for $n = 1$ (with $c = m_1 = k = 1$).

*Case for $n + 1$*

We will assume the conjecture is true for $n$; that is, *if* $f^{(n)}(g(a))$ and $g^{(n)}(a)$ both exist, *then* $(f\circ g)^{(n)}(a)$ exists and is a sum of terms of the form

$$c\cdot(g^{\prime}(a))^{m_1}\cdot(g^{\prime\prime}(a))^{m_2}\cdots(g^{(n)}(a))^{m_n}\cdot f^{(k)}(g(a))$$

Note that we are *not* assuming that $(f\circ g)^{(n)}(a)$ exists. Our assumption is that *if* $f^{(n)}(g(a))$ and $g^{(n)}(a)$ both exist, *then* $(f\circ g)^{(n)}(a)$ exists.

(This distinction is important and sets this proof apart from inductive arguments I’ve previously encountered. Our assumption isn’t simply “thing is true”. Our assumption is that “*if* this condition is met, thing is true.” This is in agreement with the theorem we’re trying to prove. The theorem doesn’t say the $n$-th derivative of $f\circ g$ exists. Only that, *if* certain conditions are met, the derivative exists.)

Now we need to prove the conjecture for the $n + 1$ case. We need to show that *if* $f^{(n+1)}(g(a))$ and $g^{(n+1)}(a)$ both exist, *then* $(f\circ g)^{(n+1)}(a)$ exists, and is a sum of terms of the form

$$c\cdot(g^{\prime}(a))^{m_1}\cdots(g^{(n+1)}(a))^{m_{n+1}}\cdot f^{(k)}(g(a))$$

with $k \leq n+1$.

Suppose $f^{(n+1)}(g(a))$ and $g^{(n+1)}(a)$ both exist.

This means that $f^{(n)}(y)$ exists for all $y$ in some interval around $g(a)$, and likewise $g^{(n)}(x)$ exists for all $x$ in some interval around $a$. Why? Because, by definition, the $(n+1)$-th derivatives are

$$f^{(n+1)}(g(a)) = \lim_{y \to g(a)}\frac{f^{(n)}(y) - f^{(n)}(g(a))}{y-g(a)}$$

$$g^{(n+1)}(a) = \lim_{x \to a}\frac{g^{(n)}(x) - g^{(n)}(a)}{x-a}$$

and in order for these limits to exist, the $n$-th-order derivatives *must* exist in some intervals around these points.

Furthermore, $f^{(k)}$ and $g^{(k)}$ must also exist in these intervals for all $1 \leq k < n$.

Because $f^{(n)}$ exists at and around $g(a)$, there is some $\varepsilon_f > 0$ such that for all $y$ with $|y - g(a)| < \varepsilon_f$, $f^{(n)}(y)$ exists. Likewise, since $g^{(n)}$ exists at and around $a$, there is some $\delta_g > 0$ such that for all $x$ with $|x-a| < \delta_g$, $g^{(n)}(x)$ exists.

We know that $g$ is continuous at $a$ (it’s differentiable there). As such, there exists some $\delta_f > 0$ such that for all $x$ with $|x-a| < \delta_f$, we have $|g(x) - g(a)| < \varepsilon_f$. If we set $\delta_{\min} = \min(\delta_g,\delta_f)$, then for all $x$ with $|x-a| < \delta_{\min}$, both $g^{(n)}(x)$ and $f^{(n)}(g(x))$ exist.

In this interval both $g^{(n)}(x)$ and $f^{(n)}(g(x))$ exist, so the $n$-case assumption tells us $(f\circ g)^{(n)}(x)$ exists and is a sum of terms of the form

$$c\cdot(g^{\prime}(x))^{m_1}\cdot(g^{\prime\prime}(x))^{m_2}\cdots(g^{(n)}(x))^{m_n}\cdot f^{(k)}(g(x))$$

Worth noting: we’re no longer talking only about the value of $(f\circ g)^{(n)}$ at the single point $a$. $(f\circ g)^{(n)}(x)$ exists *for all* $x$ in this interval around $a$.

$(f\circ g)^{(n)}$ is a *function* that’s defined at and around $a$. Furthermore, $(f\circ g)^{(n)}(x)$ is a sum of products of constants and derivatives of $f$ and $g$, and these derivatives are all themselves differentiable at $a$: each $g^{(i)}$ (for $i \leq n$) because $g^{(n+1)}(a)$ exists, and each $f^{(k)}\circ g$ (for $k \leq n$) by the Chain Rule, because $f^{(k+1)}(g(a))$ exists.

Therefore, the derivative of $(f\circ g)^{(n)}$ at $a$ exists and can be calculated using the standard sum, product, and chain rules for derivatives.

Doing so results in terms of three possible shapes. Differentiating a factor $(g^{(\alpha)}(x))^{m_\alpha}$ with $\alpha < n$ gives terms like this

$$c\cdot(g^{\prime}(a))^{m_1}\cdots m_{\alpha}(g^{(\alpha)}(a))^{m_{\alpha}-1}\cdot(g^{(\alpha+1)}(a))^{m_{\alpha+1}+1}\cdots(g^{(n)}(a))^{m_n}\cdot f^{(k)}(g(a))$$

differentiating the factor $(g^{(n)}(x))^{m_n}$ gives this

$$c\cdot(g^{\prime}(a))^{m_1}\cdots m_n(g^{(n)}(a))^{m_n-1}\cdot(g^{(n+1)}(a))\cdot f^{(k)}(g(a))$$

and differentiating the factor $f^{(k)}(g(x))$ (via the Chain Rule, which contributes an extra $g^{\prime}(a)$) gives this

$$c\cdot(g^{\prime}(a))^{m_1+1}\cdot(g^{\prime\prime}(a))^{m_2}\cdots(g^{(n)}(a))^{m_n}\cdot f^{(k+1)}(g(a))$$

all of which fit the required form for terms of the $n+1$ case.
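To see these three shapes concretely, one can differentiate the terms of the $n = 2$ case by hand or symbolically. Below is a sketch assuming SymPy is available; it illustrates the step, but is not part of the proof:

```python
# Differentiate each term of (f∘g)'' and inspect the resulting shapes.
# Sketch assuming SymPy is available; not part of the formal proof.
import sympy as sp

x = sp.symbols('x')
f, g = sp.Function('f'), sp.Function('g')

# (f∘g)''(x) = f''(g(x))·g'(x)^2 + f'(g(x))·g''(x)
d2 = sp.expand(sp.diff(f(g(x)), x, 2))
for term in sp.Add.make_args(d2):
    # Each term differentiates (by the sum, product, and chain rules)
    # into terms of the same general shape, now involving at most
    # g''' and f''' -- i.e. derivatives of order n + 1 = 3.
    print(term, '->', sp.expand(sp.diff(term, x)))
```

Summing the per-term derivatives recovers $(f\circ g)^{\prime\prime\prime}(x)$, as the inductive step requires.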

Therefore, if $f^{(n+1)}(g(a))$ and $g^{(n+1)}(a)$ both exist, *then* $((f\circ g)^{(n)})^{\prime}(a) = (f\circ g)^{(n+1)}(a)$ exists, and is a sum of terms of the form

$$c\cdot(g^{\prime}(a))^{m_1}\cdots(g^{(n+1)}(a))^{m_{n+1}}\cdot f^{(k)}(g(a))$$

with $k \leq n+1$.

This completes the proof.

I hope.