Friday, April 26, 2024

Dominique Bourn's "From Groups to Categorial Algebra"

DRAFT. ONLY BEING "PUBLISHED" SO I CAN SEE WHAT IT LOOKS LIKE WHEN PROCESSED BY MATHJAX.

This book actually covers a lot of ground.

I want to focus on a small, but significant, part of that.

Let us consider three topics:
equivalence relation,
equivalence class (or, in group theory, coset), and
normal subgroup.

To what extent can these concepts be generalized beyond the theory of groups, and how may they be generalized?


Consider the important and ubiquitous concept of "equivalence relation":
https://en.wikipedia.org/wiki/Equivalence_relation
This is normally defined as a relation, say $R$, on a set, say $X$, which is reflexive, symmetric, and transitive.
These are in turn defined by the effect of $R$ on elements of $X$.
E.g., saying that $R$ is symmetric means that,
writing the relation $R$ as $\sim $, 
for all $x,y\in X$, $x\sim y \implies y\sim x$.
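Since the definition is elementwise, it can be checked mechanically on finite sets. Here is a minimal Python sketch (the function names are my own):

```python
# Checking the three axioms for a finite relation R on a finite set X,
# with R represented as a set of ordered pairs (hypothetical helper names).

def is_reflexive(R, X):
    return all((x, x) in R for x in X)

def is_symmetric(R):
    return all((y, x) in R for (x, y) in R)

def is_transitive(R):
    return all((x, w) in R
               for (x, y) in R
               for (z, w) in R if y == z)

def is_equivalence(R, X):
    return is_reflexive(R, X) and is_symmetric(R) and is_transitive(R)

X = {0, 1, 2, 3}
R = {(a, b) for a in X for b in X if (a - b) % 2 == 0}  # congruence mod 2
print(is_equivalence(R, X))  # True
```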

Now, Bourn's strategy is to translate these conditions into category theory, 
so they may be applied to a wide variety of categories, 
beyond merely the category of sets.

As background, Bourn notes that a relation in $\Set$ is a special case of a graph in $\Set$, 
which can be described as a parallel pair of arrows $A \rightrightarrows N$ from the set of arrows in the graph to the set of nodes in the graph.
A relation is the special case of a graph where the parallel pair is jointly monomorphic, that is, for any two nodes $x$ and $y$, there is at most one arrow from $x$ to $y$.
These two definitions, that 
a graph is a parallel pair of arrows and 
a relation is a jointly monomorphic parallel pair of arrows,
may easily be interpreted in an arbitrary category, not just the category of sets.

The reflexivity condition may easily be interpreted, both in $\Set$ and in an arbitrary category, and for both graphs and relations, as a joint section for the parallel pair.
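In $\Set$ these definitions can be checked directly. Here is a small Python sketch (the encoding and names are my own) of the "jointly monomorphic" test that cuts relations out of graphs:

```python
# A finite graph as a parallel pair of functions d0, d1 : A -> N
# (source and target of each arrow); the pair is "jointly monomorphic"
# exactly when no two distinct arrows share both source and target.

def is_relation(arrows, d0, d1):
    seen = set()
    for a in arrows:
        pair = (d0[a], d1[a])
        if pair in seen:   # two arrows with the same source and target
            return False
        seen.add(pair)
    return True

# A graph with a doubled edge 0 -> 1 is a graph but not a relation:
arrows = ["e", "f"]
d0 = {"e": 0, "f": 0}
d1 = {"e": 1, "f": 1}
print(is_relation(arrows, d0, d1))  # False
```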

-----

Next, Bourn gives a single condition on (the reflexive relation $R$) which is logically equivalent to (the conjunction of the symmetry and transitivity conditions on $R$).

But that single condition, 
on the relation $R$ (on $X$), 
is a very special case of a much more widely applicable condition, 
so let me put it in that broader context.

If $a,b\in X$ are related by the relation $R$, 
it is conventional to indicate that by writing $a\sim b$.
I am going to write that as a labeled arrow $a \xrightarrow {\textstyle r} b$, 
labelling the specific relation with a letter.
Likewise writing $a\sim c$ as 
$a \xrightarrow {\textstyle s} c$.

Now let us start with the situation 

\[ \begin{array} {} && b \\ & \llap{r} \nearrow \\ a & \xrightarrow [\textstyle s] {} & c \\ \end{array} \]

I.e., $a$ is related to both $b$ and $c$ by relations labelled respectively $r$ and $s$.

Given that situation, there is a condition, which Bourn calls the "horn-filler condition" but which might equally well be called the "extension condition".
The "extension condition" is that, in that situation, 
there exists an arrow (a relation) $r\backslash s : b\to c$.
Thus the extension condition means that we have three arrows, i.e. relations, as shown here:

\[ \begin{array} {} && b \\ & \llap{r} \nearrow & \downarrow \rlap{\exists r\backslash s} \\ a & \xrightarrow [\textstyle s] {} & c \\ \end{array} \]

Now let us see how the horn-filler or extension condition, for a reflexive relation, implies both the symmetry and transitivity conditions.

We combine both into a single diagram:

\[ \begin{array} {} && b \rlap{ \xrightarrow[\kern7.5em]{\textstyle s} } &&&& c \\  & \llap r \nearrow && \searrow \rlap{ r^{-1} } \\ a \rlap{ \xrightarrow[\textstyle 1_a]{\kern9em} } &&&& a \\ \end{array} \]
Note that here we apply the horn-filler, or extension, condition twice, 
first to get $r\inv$, and thus the symmetry condition, and 
second, using $r\inv$, to get a relation $rs$ (not shown) from $a$ to $c$, 
which gives the transitivity condition.
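For finite relations in $\Set$, the horn-filler condition is easy to test directly. A Python sketch (the encoding of $R$ as a set of ordered pairs is my own):

```python
# For a relation R on a set X given as ordered pairs, the horn-filler /
# extension condition says: whenever (a, b) in R and (a, c) in R
# (a "horn" at a), the "filler" (b, c) is also in R.

def has_horn_filler(R):
    return all((b, c) in R
               for (a, b) in R
               for (a2, c) in R if a == a2)

X = {0, 1, 2}
refl = {(x, x) for x in X}

# a reflexive relation that is symmetric and transitive:
R_good = refl | {(0, 1), (1, 0)}
# a reflexive relation that is not symmetric:
R_bad = refl | {(0, 1)}

print(has_horn_filler(R_good))  # True
print(has_horn_filler(R_bad))   # False: horn r=(0,1), s=(0,0) has no filler (1,0)
```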

Having thus triumphantly shown how both the symmetry and transitivity conditions follow from a single condition, 
the horn-filler or extension condition, 
Bourn next turns to showing how that condition may be translated into category theory.

He does that using the categorical notion of pullback.
In particular, the collection of "horns" (which may be described more categorically as "spans")
is precisely the kernel pair of $d_0 : R \to X$, 
which is the pullback (of $\Delta_X$) in the pullback diagram
\[ \begin{array} {} \boxed{ KP(d_0:R\to X) } = \{<r:a\to b, s:c\to d> : a=c \} & \xrightarrow{} & X & \ni & x\\ \downarrow && \downarrow \rlap{ \Delta_X } && \downarrow \\ R\times R & \xrightarrow [\textstyle d_0\times d_0] {} & X\times X & \ni & <x,x> \\ <r:a\to b, s:c\to d> & \mapsto & <a,c> \\ \end{array} \]
Thus $KP(d_0:R\to X) = \{<r:a\to b, s:a\to c> \}$, the set of all spans, or horns, in $R$.

OTOH, the collection of horns/spans satisfying the extension condition also is a pullback!
Namely the pullback (of $R\to X\times X$) in the diagram 
(where $KP(d_0:R\to X)$ is abbreviated to $KP(d_0)$, or even just $KP$)
\[ \begin{array} {} &&&& R \\ &&&& \downarrow \\ KP(d_0) & \rightarrowtail & R\times R & \xrightarrow [d_1\times d_1] {} & X\times X \\ && \Vert & \Vert & \Vert \\ && \nabla R & \xrightarrow [\nabla d_1] {} & \nabla X \\<r:a\to b, s:a\to c> & \mapsto & <r,s> & \mapsto & <b,c> \\ \end{array} \]
That is,
$({(d_1\times d_1)} | KP : KP\to X\times X)\inv (R) = \{<r:a\to b, s:a\to c> : b \mathrel{ {\sim_R} } c \}$

Bourn in fact considers a larger set, not merely a subset of the horns:

\[ \begin{array} {} <r:a\to b, s:c\to d, \rho: b\to d > && \rho: b\to d \\ \boxed{ {d_1}\inv R } & \xrightarrow {} & R \\ \downarrow&& \downarrow \\  R\times R & \xrightarrow [d_1\times d_1] {} & X\times X \\  \Vert & \Vert & \Vert \\ \nabla R & \xrightarrow [\nabla d_1] {} & \nabla X \\<r:a\to b, s:c\to d>  & \mapsto & <b,d> \\ \end{array} \]
That is,
${(d_1\times d_1)}\inv (R) = \{ <r:a\to b, s:c\to d> : b \mathrel{ {\sim_R} } d \}$

Then the condition that every horn/span has an extension becomes 
\[ \boxed{ KP(d_0) \subseteq {d_1}\inv R } \]
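In $\Set$, with $R$ a finite set of ordered pairs, both pullbacks are set comprehensions and the boxed inclusion can be tested directly. A sketch (the function names are my own):

```python
# R is a set of ordered pairs; d0 and d1 are the projections
# r = (a, b) |-> a and b.
# KP(d0) is the set of horns: pairs of elements of R with equal source.
# d1^{-1}R is the set of pairs of elements of R whose targets are R-related.

def kernel_pair_d0(R):
    return {(r, s) for r in R for s in R if r[0] == s[0]}

def d1_preimage_of_R(R):
    return {(r, s) for r in R for s in R if (r[1], s[1]) in R}

def extension_condition(R):
    # Bourn's boxed condition: KP(d0) is contained in d1^{-1}R
    return kernel_pair_d0(R) <= d1_preimage_of_R(R)

X = range(4)
R_mod2 = {(a, b) for a in X for b in X if (a - b) % 2 == 0}
print(extension_condition(R_mod2))  # True
```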

-----

We can combine the various pieces of the above into one quite useful large diagram.

\[ \begin{array} {} &&&& <r:a\to b, s:c\to d, \rho: b\to d > && \rho: b\to d \\ && \big\{ <r:a\to b, s:a\to c, \rho: b\to c> \big\} & \rightarrowtail & \boxed{ {d_1}\inv R } & \xrightarrow {} & R \\ && \downarrow && \downarrow&& \downarrow \\ <r:a\to b, s:a\to c> & \in & \boxed{ KP(d_0) } & \rightarrowtail & R\times R & \xrightarrow [d_1\times d_1] {} & X\times X \\ &&&& \Vert & \Vert & \Vert \\ &&&& \nabla R & \xrightarrow [\nabla d_1] {} & \nabla X \\ &&&& <r:a\to b, s:c\to d>  & \mapsto & <b,d> \\ && \downarrow && \downarrow \rlap{d_0\times d_0} \\ a & \in & X & \xrightarrow [\Delta_X] {} & X\times X \\ &&&& <a,c> \\ \end{array} \]

------------------------------

Suppose $\boxed{ R \rightarrowtail G\times G }$ is an internal relation in the category of groups.
Then relations have inverses, and we can multiply them.
So, writing the relation as $\sim$, 
if $g\sim h$ then, taking the inverse of the relation, also $g\inv \sim h\inv$.
And if as well $a\sim b$, then, multiplying the relations, $ga \sim hb$.

If further $R$, or $\sim$, is a reflexive relation, i.e. $(\forall g\in G) (g\sim g)$, 
then, somewhat surprisingly, $\sim$ is in fact an internal equivalence relation!

Let's see why that is so.

Reflexivity gives $(\forall a\in G)(a\sim a)$.
Then if also $g\sim h$, multiplication of the relations gives $ga \sim ha$ and $ag\sim ah$.

For symmetry, we invert $g\sim h$, then multiply through by the reflexivity instances $h\sim h$ and $g\sim g$:
\[\begin{array} {} g\sim h \\ g\inv \sim h\inv \\ h\sim h \qquad g\sim g \\ \hline h = h g\inv g \sim h h\inv g = g \\ \end{array} \]

For transitivity, we use symmetry (just established) to pass from $g\sim h$ to $h\inv \sim g\inv$, and finish by multiplying through by $g\sim g$:
\[ \begin{array} {} g\sim h \\ h\inv \sim g\inv \\ h \sim k \\ \hline e = h\inv h \sim g\inv k \\ g = ge \sim g g\inv k = k \\ \end{array} \]
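The whole argument can be checked numerically on a small example. A Python sketch (my own construction, taking $G=\mathbb{Z}_6$): the subgroup of $G\times G$ generated by the diagonal and the single extra pair $(0,2)$ is reflexive by construction, and comes out symmetric and transitive automatically:

```python
# G = Z_6; an internal relation is a subgroup R of G x G.
# We generate the reflexive relation containing the diagonal and (0, 2).
n = 6

def subgroup_generated(gens):
    # close under addition; in a finite group this also gives inverses
    elems = set(gens) | {(0, 0)}
    changed = True
    while changed:
        changed = False
        for (a, b) in list(elems):
            for (c, d) in list(elems):
                e = ((a + c) % n, (b + d) % n)
                if e not in elems:
                    elems.add(e)
                    changed = True
    return elems

diag = {(x, x) for x in range(n)}
R = subgroup_generated(diag | {(0, 2)})

symmetric = all((b, a) in R for (a, b) in R)
transitive = all((a, c) in R
                 for (a, b) in R
                 for (b2, c) in R if b == b2)
print(symmetric and transitive)  # True
```

(Here $R$ comes out as congruence mod 2 on $\mathbb{Z}_6$.)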

Let us also note that finitely complete categories in which every internal reflexive relation is an internal equivalence relation 
(like the category of groups)
are called, by Bourn and others, <b>Mal'tsev categories</b>.
https://ncatlab.org/nlab/show/Malcev+category

-------------------

This information, i.e. structure, may be described as a 2-category, 
with only one 0-cell, 
whose 1-cells are the elements $g$ of the group $G$, 
whose 2-cells are the relations $r$ in the equivalence relation $R$, 
whose horizontal composition on 1-cells is the group composition in $G$, and 
whose horizontal composition on 2-cells (i.e., elements of $R$) is the group operation in $R$.
Vertical composition of 2-cells is composition of relations.
To see this, consider the equality of 2-diagrams
\[ \begin{array} {} \xrightarrow [\kern3em] {\textstyle g} && \xrightarrow [\kern3em] {\textstyle a} && \xrightarrow [\kern6em] {\textstyle ga} \\ \llap r \Big\Downarrow && \Big\Downarrow \rlap s  && \Big\Downarrow \rlap {rs} \\ \xrightarrow [\textstyle h] {} && \xrightarrow [\textstyle b] {} & = & \xrightarrow [\textstyle hb] {} \\ \Big\Downarrow && \Big\Downarrow  && \Big\Downarrow \\ \xrightarrow [\textstyle k] {} && \xrightarrow [\textstyle c] {} && \xrightarrow [\textstyle kc] {} \\ \end{array} \]

This may alternatively be depicted as
\[ \begin{array} {} \xrightarrow [\kern3em] {\textstyle g} && \xrightarrow [\kern3em] {\textstyle h} && \xrightarrow [\kern6em] {\textstyle gh} \\ \llap r \Big\Downarrow && \Big\Downarrow \rlap {s} && \Big\Downarrow \rlap {rs} \\ \xrightarrow [\textstyle g'] {} && \xrightarrow [\textstyle h'] {} & = & \xrightarrow [\textstyle g'h'] {} \\  \llap{r'} \Big\Downarrow && \Big\Downarrow \rlap{s'}  && \Big\Downarrow \rlap{r's'} \\ \xrightarrow [\textstyle g''] {} && \xrightarrow [\textstyle h''] {} && \xrightarrow [\textstyle g''h''] {} \\ \end{array} \]
or
\[ \begin{array} {} \xrightarrow [\kern3em] {\textstyle g} && \xrightarrow [\kern3em] {\textstyle g'} && \xrightarrow [\kern6em] {\textstyle gg'} \\ \llap r \Big\Downarrow && \Big\Downarrow \rlap {r'}  && \Big\Downarrow \rlap {rr'} \\ \xrightarrow [\textstyle h] {} && \xrightarrow [\textstyle h'] {} & = & \xrightarrow [\textstyle hh'] {} \\ \llap s \Big\Downarrow &&  \Big\Downarrow \rlap {s'} && \Big\Downarrow \rlap{ss'} \\ \xrightarrow [\textstyle k] {} && \xrightarrow [\textstyle k'] {} && \xrightarrow [\textstyle kk'] {} \\ \end{array} \]

We may depict this structure as a diagram of sets:
\[ \begin{array} {} \calC_0 & \leftarrow & \calC_h = G \ni g & \to & \calC_0 \\ \uparrow && \uparrow &&  \uparrow \\ \calC_v & \leftarrow & \calC_2 = R \ni r & \to & \calC_v \\ \downarrow && \downarrow && \downarrow \\ \calC_0 & \leftarrow & \calC_h = G \ni h & \to & \calC_0 \\ \end{array} \]
where $\calC_0$ and $\calC_v$ are both trivial, i.e., one-element sets.
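The interchange law implicit in the equality of those 2-diagrams can be verified exhaustively on a small example. A Python sketch (my own encoding, with $G=\mathbb{Z}_6$ and $R$ the congruence mod 2):

```python
# 2-cells are pairs (g, h) in R; vertical composition glues
# (g, h) and (h, k) into (g, k); horizontal composition is the group
# operation of R, i.e. componentwise addition in Z_6 x Z_6.
n = 6
R = {(g, h) for g in range(n) for h in range(n) if (g - h) % 2 == 0}

def vert(r, s):    # r : g => h followed by s : h => k gives g => k
    assert r[1] == s[0]
    return (r[0], s[1])

def horiz(r, s):   # componentwise addition
    return ((r[0] + s[0]) % n, (r[1] + s[1]) % n)

# interchange: composing vertically then horizontally agrees with
# composing horizontally then vertically, for all composable quadruples
ok = all(
    horiz(vert(r, r2), vert(s, s2)) == vert(horiz(r, s), horiz(r2, s2))
    for r in R for r2 in R if r[1] == r2[0]
    for s in R for s2 in R if s[1] == s2[0]
)
print(ok)  # True
```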


Wednesday, April 17, 2024

A (categorical) diagram for the Chain Rule

The standard proof of the Chain Rule, 
for the derivative $(\blue f \green g)'$ of the composition of two differentiable functions between three vector spaces:
\[ \begin{array} {} && F \\ & \blue{ \llap f \nearrow } && \green{ \searrow \rlap g } \\ \blue E & {} \rlap{ \kern-.5em \xrightarrow [\textstyle \blue f \green g] {\kern6em} } &&& \green G \\  \end{array} \]
 shows a bunch of equations, with some being substituted into others to yield the final desired result.

As an example of this, see the proof of the Chain Rule 
given in Chapter XVII, Section 3 of Serge Lang's well-known <I>Undergraduate Analysis</I>.
(Actually, he only assumes $f$ and $g$ are defined on suitable open sets of $E$ and $F$, but for simplicity we will assume they are globally defined.)
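The first-order expansions that get substituted into one another can at least be sanity-checked numerically. A Python sketch (the concrete $f$ and $g$ are my own choices, with $E=F=G=\mathbb{R}$): the error of the chain-rule approximation should vanish faster than $|h|$:

```python
import math

f, df = math.sin, math.cos                    # f : E -> F and its derivative
g, dg = (lambda y: y * y), (lambda y: 2 * y)  # g : F -> G and its derivative

def chain_rule_error(x, h):
    # compare g(f(x+h)) against its first-order (chain-rule) approximation
    exact = g(f(x + h))
    approx = g(f(x)) + dg(f(x)) * df(x) * h
    return abs(exact - approx)

x = 1.0
# if the chain rule holds, error/|h| -> 0 as h -> 0:
for h in (1e-2, 1e-4, 1e-6):
    print(chain_rule_error(x, h) / abs(h))
```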

Just giving the equations omits some important information: in which vector space are the equations holding, and in which vector space do the variables live?
Of course the knowledgeable reader can infer that, but it seems to me it would aid the understanding of the proof, and the situation, to make that "typing" information explicit.

That is easy to do with a categorical diagram.

I don't have sophisticated enough diagramming software to let me show you the full diagram in this blog, but I can tell you how to draw it for yourself on a sheet of paper.

Here is the key part, which in the full diagram is inscribed in the $E\to F\to G$ triangle drawn above.

\[ \begin{array} {} &&& F &&& \\ && \llap{\blue{xf}} \Bigg\uparrow & \blue{ \begin{array} {} h \black k \\ \black\Longrightarrow \\ h(xf') \mathrel {\black +} (h\psi_f) \mathrel \cdot |h| \\ \end{array} } & \Bigg\uparrow \rlap{ \blue{ (x+h)f = xf \mathrel{\black+} h \black k = xf + h(xf') + (h\psi_f) \cdot |h| } } \\ & \blue{  \xleftarrow [\kern3em] {\textstyle x+h} } &&&& \xrightarrow [\kern18em] {\textstyle xfg} \\ \blue E & \blue{ \smash{\Bigg\Uparrow} \rlap h } && \rightadj 1 && \smash{ \Bigg\Downarrow } \rlap{ \kern-6em (hk) (xfg') + \big( (hk)\psi_g \big) \cdot |hk| } & G \\ & \blue{ \xleftarrow[\textstyle x] {} } &&&& \xrightarrow [ \textstyle (x+h)fg = (xf+hk)g ] {} \\ \\ \hline \\ {} \rlap{ \blue{ \big(h(xf') \mathrel {\black +}  (h\psi_f) \mathrel \cdot |h|  \big) } (xfg') + \green{ \big( (hk)\psi_g \big) \cdot |hk| } } \\ {} \rlap{ \blue{h(xf')} (xfg') \mathrel {\black +} \blue{ \big((h\psi_f) \mathrel \cdot |h| \big) } (xfg') + \green{ \big( (hk)\psi_g \big) \cdot |hk| } } \\ \end{array} \]

----------

There are several notational conventions at play here.

\[ \begin{array} {l|cccc|cc|cccc} & \text{mult} & \text{appl} & \text{comp} & \text{appl+comp} & \lambda\text{ fn} & \lambda\text{ number} & \rightadj{ \text{the derivative} } \\ \hline \text{Lang} & xy & f(x) & g\circ f & g(f(x)) & \lambda(h) & \lambda h & f(\source{x+h}) \mathrel{\target=} f(\source x) \mathrel{\target+} \big(f'(\source x)= \rightadj D f(\source x)\big)(\source h) \mathrel{\target+} \source{|h|}\psi(\source h) \\ \text{Harbaugh} & x\cdot y & xf & fg & xfg & h\lambda & h\cdot\lambda & \source{(x+h)}f \mathrel{\target=} \source x f \mathrel{\target+} \source h \cdot  \big(\source x (f'=f\rightadj D)\big) \mathrel{\target+} (\source h \psi)\cdot \source{|h|} \\ \text{example} &&&&&&& \source{(x+h)}^2 \mathrel{\target=} \source x ^2 \mathrel{\target+} \source h \cdot ( \source x \cdot \rightadj 2 ) \mathrel{\target+} \source{ \big( h\cdot h = ({h\cdot h \over |h|}) \cdot |h| \big) } \\ \end{array} \]