Wednesday, April 17, 2024

A (categorical) diagram for the Chain Rule

The standard proof of the Chain Rule, 
for the derivative $(\blue f \green g)'$ of the composition of two differentiable functions between three vector spaces:
\[ \begin{array} {} && F \\ & \blue{ \llap f \nearrow } && \green{ \searrow \rlap g } \\ \blue E & {} \rlap{ \kern-.5em \xrightarrow [\textstyle \blue f \green g] {\kern6em} } &&& \green G \\  \end{array} \]
presents a series of equations, some of which are substituted into others to yield the desired result.

As an example of this, see the proof of the Chain Rule 
given in Chapter XVII, Section 3 of Serge Lang's well-known <I>Undergraduate Analysis</I>.
(Actually, he only assumes $f$ and $g$ are defined on suitable open sets of $E$ and $F$, but for simplicity we will assume they are globally defined.)

Giving only the equations omits some important information: in which vector space does each equation hold, and in which vector space does each variable live?
Of course the knowledgeable reader can infer that, but it seems to me that making this "typing" information explicit would aid understanding of both the proof and the situation.
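
As a rough illustration of the kind of typing meant here, the objects in the proof could be annotated as below. This is only a sketch: it uses the diagrammatic-order notation of this post, in which $xf$ means $f$ applied to $x$ and $fg$ means $f$ followed by $g$, and $L(E,F)$ is assumed here to denote the continuous linear maps from $E$ to $F$ (a symbol not used elsewhere in this post).

\[ \begin{array}{ll} x,\ h,\ x+h \in E & \text{a point, an increment, and the shifted point} \\ xf,\ (x+h)f \in F & \text{values of } f \\ xfg,\ (x+h)fg \in G & \text{values of the composite } fg \\ xf' \in L(E,F) & \text{the derivative of } f \text{ at } x \\ (xf)g' \in L(F,G) & \text{the derivative of } g \text{ at } xf \\ x(fg)' \in L(E,G) & \text{the derivative of } fg \text{ at } x \\ \end{array} \]
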

That is easy to do with a categorical diagram.

I don't have sophisticated enough diagramming software to let me show you the full diagram in this blog, but I can tell you how to draw it for yourself on a sheet of paper.

Here is the key part, which in the full diagram is inscribed in the $E\to F\to G$ triangle drawn above.

\[ \begin{array} {} &&& F &&& \\ && \llap{\blue{xf}} \Bigg\uparrow & \blue{ \begin{array} {} h \black k \\ \black\Longrightarrow \\ h(xf') \mathrel {\black +} (h\psi_f) \mathrel \cdot |h| \\ \end{array} } & \Bigg\uparrow \rlap{ \blue{ (x+h)f = xf \mathrel{\black+} h \black k = xf + h(xf') + (h\psi_f) \cdot |h| } } \\ & \blue{  \xleftarrow [\kern3em] {\textstyle x+h} } &&&& \xrightarrow [\kern18em] {\textstyle xfg} \\ \blue E & \blue{ \smash{\Bigg\Uparrow} \rlap h } && \rightadj 1 && \smash{ \Bigg\Downarrow } \rlap{ \kern-6em (hk) (xfg') + \big( (hk)\psi_g \big) \cdot |hk| } & G \\ & \blue{ \xleftarrow[\textstyle x] {} } &&&& \xrightarrow [ \textstyle (x+h)fg = (xf+hk)g ] {} \\ \\ \hline \\ {} \rlap{ \blue{ \big(h(xf') \mathrel {\black +}  (h\psi_f) \mathrel \cdot |h|  \big) } (xfg') + \green{ \big( (hk)\psi_g \big) \cdot |hk| } } \\ {} \rlap{ \blue{h(xf')} (xfg') \mathrel {\black +} \blue{ \big((h\psi_f) \mathrel \cdot |h| \big) } (xfg') + \green{ \big( (hk)\psi_g \big) \cdot |hk| } } \\ \end{array} \]

----------

There are several notational conventions at play here.

\[ \begin{array} {l|cccc|cc|cccc} & \text{mult} & \text{appl} & \text{comp} & \text{appl+comp} & \lambda\text{ fn} & \lambda\text{ number} & \rightadj{ \text{the derivative} } \\ \hline \text{Lang} & xy & f(x) & g\circ f & g(f(x)) & \lambda(h) & \lambda h & f(\source{x+h}) \mathrel{\target=} f(\source x) \mathrel{\target+} \big(f'(\source x)= \rightadj D f(\source x)\big)(\source h) \mathrel{\target+} \source{|h|}\psi(\source h) \\ \text{Harbaugh} & x\cdot y & xf & fg & xfg & h\lambda & h\cdot\lambda & \source{(x+h)}f \mathrel{\target=} \source x f \mathrel{\target+} \source h \cdot  \big(\source x (f'=f\rightadj D)\big) \mathrel{\target+} (\source h \psi)\cdot \source{|h|} \\ \text{example} &&&&&&& \source{(x+h)}^2 \mathrel{\target=} \source x ^2 \mathrel{\target+} \source h \cdot ( \source x \cdot \rightadj 2 ) \mathrel{\target+} \source{ \big( h\cdot h = ({h\cdot h \over |h|}) \cdot |h| \big) } \\ \end{array} \]
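
The substitution recorded at the bottom of the diagram can also be written as a linear chain of equations in the Harbaugh notation of the table. This is a sketch under stated assumptions: $\psi_f$ and $\psi_g$ are taken to be the error functions of $f$ at $x$ and of $g$ at $xf$ (each tending to $0$ with its argument), $hk$ abbreviates $(x+h)f - xf$ as in the diagram, and $\|xf'\|$ is the operator norm of $xf'$, which the diagram itself does not mention.

\[ \begin{array}{rcll} hk & = & h(xf') + (h\psi_f)\cdot|h| & \text{in } F \\ (x+h)fg & = & (xf+hk)g & \text{in } G \\ & = & xfg + (hk)(xfg') + \big((hk)\psi_g\big)\cdot|hk| & \text{in } G \\ & = & xfg + h(xf')(xfg') + \big((h\psi_f)\cdot|h|\big)(xfg') + \big((hk)\psi_g\big)\cdot|hk| & \text{in } G \\ \end{array} \]

The last two terms are small relative to $|h|$. By linearity of $xfg'$, the first equals $\big((h\psi_f)(xfg')\big)\cdot|h|$, and $h\psi_f \to 0$ as $h \to 0$. For the second, $|hk| \le \big(\|xf'\| + |h\psi_f|\big)\cdot|h|$, and $(hk)\psi_g \to 0$ because $hk \to 0$. Reading off the term linear in $h$ gives $x(fg)' = (xf')(xfg')$, which in Lang's notation is $(g\circ f)'(x) = g'(f(x))\circ f'(x)$.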