Thursday, July 27, 2023

A categorical setting for elementary probability theory

(Early draft; I will be expanding this in the future.)

The basic idea is this:
Given a finite set $X$ and a subset $A\subseteq X$, 
we may regard the ratio of the size of $A$ to the size of $X$ 
as being the <I>probability</I> 
that a randomly chosen element of $X$ will be in $A$.

Thus given $A$ and its superset $X$, 
we have not only 
the inclusion relation between $A$ and $X$, which we may regard as being an arrow in the usual category $\Set$ of sets and functions, 
but also 
the positive rational number $\boxed{ |A| \over |X| }$, a number often called by probabilists <I>the probability of $A$</I> 
(when $X$ is assumed known).
Now, the positive rational numbers in fact form the objects of a discrete closed monoidal category under multiplication (details below).
Thus the above allows us to view the collection of finite sets as the objects of an enriched category, enriched in the discrete closed monoidal category of positive rational numbers.
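
As a quick sanity check of the composition law this enrichment requires, here is a minimal sketch of my own (not part of the original argument), using Python's exact rationals and made-up nested sets:

```python
from fractions import Fraction

def hom(A, B):
    """The "hom-object" from A to its superset B: the ratio |A|/|B|,
    i.e. the probability that a random element of B lies in A."""
    assert A <= B
    return Fraction(len(A), len(B))

# Hypothetical nested finite sets A ⊆ B ⊆ X.
X = set(range(12))
B = {0, 1, 2, 3, 4, 5}
A = {0, 1, 2}

# Composition of inclusions corresponds to multiplication of the ratios,
# which is exactly the monoidal structure (multiplication) referred to above.
assert hom(A, B) * hom(B, X) == hom(A, X)   # (1/2) * (1/2) == 1/4
```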

<hr />

That covers the notion of absolute probability. 
Let us now consider the notion of relative or conditional probability.

Suppose $X$ has two subsets, $A$ and $B$.
We may ask:
if $A$ holds (i.e., a randomly chosen element of $X$ lies in $A$), what is the probability that $B$ holds, and vice versa?
We may also ask:
what is the relation between those two probabilities?
(The equation giving that relation is called "Bayes' Theorem".)

Let us depict this situation with a diagram:

\[ \boxed{ \begin{array} {} & A\cap B & \xrightarrow{ \textstyle{ A\cap B \over B } } & B & \\ & \llap{ A\cap B \over A } \Bigg\downarrow & \llap{ \scriptstyle \text{Bayes} } \nearrow \rlap{ {A \over B} } & \Bigg\downarrow \rlap{ {B \over X} } & \\ & A & \xrightarrow[ \textstyle{ A \over X } ]{} & X \\ \end{array} } \]

In the diagram,
at the vertices, the symbols $X, A, B, A\cap B$ denote <I>sets</I>;
when labelling arrows, they denote <I>positive integers</I>, viz. the number of elements in the corresponding set (elsewhere often written $|A|$).
Thus the ratios shown are ratios of positive integers, thus rational numbers.

Thus in the diagram
the bottom and right arrows give the absolute probabilities of $A$ and $B$ respectively;
the left arrow gives the relative or conditional probability of $B$ relative to $A$;
the top arrow gives that of $A$ relative to $B$.
The commutativity of each triangle follows from elementary algebra.
The commutativity of the upper left triangle, when interpreted in terms of probabilities, is Bayes' Theorem:
\[ P(A|B) = P(B|A){P(A) \over P(B)} \]
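
A small numerical check of that commutativity, in a sketch of my own (the two subsets are made up):

```python
from fractions import Fraction

X = set(range(20))
A = {0, 1, 2, 3, 4, 5, 6, 7}          # |A| = 8
B = {5, 6, 7, 8, 9, 10, 11, 12, 13}   # |B| = 9, |A ∩ B| = 3

P    = lambda S: Fraction(len(S), len(X))           # absolute probability S/X
cond = lambda S, T: Fraction(len(S & T), len(T))    # conditional probability P(S|T)

# Bayes' Theorem, read off from the upper-left triangle of the diagram:
assert cond(A, B) == cond(B, A) * P(A) / P(B)       # 1/3 == (3/8) * (8/20)/(9/20)
```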

Another example: 
A bipartite partition $A,B$ of $X$ together with a third subset $W$.
Note that we have changed notation so that $A,B$ form the two parts of the bipartite partition.
We will diagram this situation as:

\[ \boxed{ \begin{array} {} {} \rlap{ \kern-3em {A\cap W \over A} {A\over X} +  {B\cap W \over B} {B\over X} \xlongequal{\text{cancel}} {A\cap W \over X} + {B\cap W \over X} \xlongequal{\text{distribute}} {(A\cup B)\cap W \over X } \xlongequal{A\cup B = X} {W\over X} } \\ \\ \hline \\ \kern5em & A\cap W & \xrightarrow{ \textstyle{A\cap W \over W} } & W & \xleftarrow{ \textstyle{B\cap W \over W} } & B\cap W & \kern5em \\ & \llap{ A\cap W \over A } \Bigg\downarrow && \Bigg\downarrow \rlap{ W \over X } && \Bigg\downarrow \rlap{ B\cap W \over B } & \\ & A & \xrightarrow[ \textstyle{A \over X} ] {} & X & \xleftarrow[ \textstyle{B \over X} ] {} & B & \\ \end{array} } \]

In several of the examples in the chapter on conditional (or relative) probability in <I>Fat Chance</I>, 
we are given:
0. The numerical values of the bottom two arrows and the two side arrows.
From those values we may compute (setting $|X| = 1$):
1. The numerical values of the upper two corners $A\cap W, B\cap W$.
2. Using the fact that those last two form a partition of $W$ (the pullback of a partition is a partition), we add those last two values to get $|W|$.
3. Now that this is known, we may compute any and all of the three remaining conditional probabilities given by the three remaining arrows.
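
Here is a small computational sketch of steps 0 through 3 (my own; the input values happen to be those of the Section 9.4 example worked out further below):

```python
from fractions import Fraction as F

# Step 0: the given data (normalizing |X| = 1).
P_A, P_B = F(1, 5), F(4, 5)                  # bottom two arrows: A/X and B/X
P_W_given_A, P_W_given_B = F(3, 4), F(2, 5)  # side arrows: (A∩W)/A and (B∩W)/B

# Step 1: the upper two corners, as fractions of X.
P_AW = P_W_given_A * P_A                     # (A∩W)/X = 3/20 = 15/100
P_BW = P_W_given_B * P_B                     # (B∩W)/X = 8/25 = 32/100

# Step 2: A∩W and B∩W partition W, so their measures add up to W/X.
P_W = P_AW + P_BW                            # 47/100

# Step 3: the two top arrows (the middle arrow W/X is P_W, already computed).
P_A_given_W = P_AW / P_W                     # (A∩W)/W = 15/47
P_B_given_W = P_BW / P_W                     # (B∩W)/W = 32/47
```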

Such diagrams are useful in displaying numerical information associated with probability situations.
For example, consider the "Monty Hall" situation described in Section 9.1 of <I>Fat Chance</I>.
(And whose precise assumptions are emphasized here:
https://en.wikipedia.org/wiki/Monty_Hall_problem#Standard_assumptions )
The various expressions mentioned in the book are shown in this diagram:
\[ \boxed{ \begin{array} {} \kern8em & A\land W & \xrightarrow{  } & W & \xleftarrow{ } & B\land W & \kern9em \\ & \llap{ P(W|A) = {k-1 \over n-2} } \Bigg\downarrow &&  && \Bigg\downarrow \rlap{ P(W|B) = {k \over n-2} } & \\ & \llap{ \text{your guess is right} = {} } A & \xrightarrow[ \textstyle{ P(A) = {k \over n} } ] {} & X & \xleftarrow[ \textstyle{ P(B) = {n-k \over n} } ] {} & B \rlap{ {} = \text{ your guess is wrong} } & \\ &&& {} \rlap{\kern-10em X = \text{set of doors;} \; |X| = n = \text{number of doors;} \; k = \text{number of cars} } \\ \end{array} } \]
To get the original, basic, Monty Hall situation, take $n=3$, $k=1$.
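
Under the standard assumptions (the host always opens an unchosen door hiding no car, and you then switch to one of the remaining unopened doors), the probability that the door you switch to hides a car can be computed by composing along the two paths of the diagram. Here is a sketch of that computation (mine, not the book's; "W" here is my reading of the diagram's event):

```python
from fractions import Fraction as F

def p_switch_wins(n, k):
    """Probability that the (randomly chosen) door you switch to hides a car,
    computed along the two paths of the diagram above."""
    P_A = F(k, n)                  # your original guess is right
    P_B = F(n - k, n)              # your original guess is wrong
    P_W_given_A = F(k - 1, n - 2)  # k-1 cars remain among the n-2 unopened doors
    P_W_given_B = F(k, n - 2)      # all k cars are among the n-2 unopened doors
    return P_W_given_A * P_A + P_W_given_B * P_B

print(p_switch_wins(3, 1))   # 2/3, the classical Monty Hall answer (vs 1/3 for staying)
```

For general $n$ and $k$ this sum simplifies to ${k(n-1) \over n(n-2)}$, which is always greater than $P(A) = {k\over n}$, so switching is always advantageous.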

Saturday, July 8, 2023

Bayes' Theorem -- relations between ratios

This is a preliminary draft.
$\def\bA{{\blue A}} \def\gB{{\green B}}$

First, we give a diagrammatic version of the Bayes situation, 
using the abstract letters $\bA$ and $\gB$ for two subsets of the larger, but finite, set $X$:

\[ \boxed{ \begin{array} {} && \bA \rlap{ \blue{ \xrightarrow [\kern9em] {\textstyle {A \over \black X}} } } &&&& X \\ & \blue{ \llap{ A \cap \gB \over A } \nearrow } \rlap{ \scriptstyle \kern.8em \red{\text{Bayes}} } && \searrow \rlap{\bA \over \gB} && \green{ \nearrow \rlap{ B \over \black X } } \\ \rightadj{ \boxed{\bA \mathrel{\rightadj \cap} \gB} } \rlap{ \green{ \xrightarrow [\textstyle {\bA \mathrel{\rightadj \cap} B \over B} ] {\kern9em} } } &&&& \gB \\ \end{array} } \]

Here the expressions $\bA, \gB, \bA \mathrel{\rightadj \cap} \gB, X$, 
when they are <I>vertices</I> of the diagram, represent sets, 
and the arrows between them represent set-theoretic inclusion.
On the other hand, 
when those expressions are part of <I>the labels</I> of the arrows of the diagram, 
they represent the natural numbers which are the number of elements in the corresponding set.
Thus the quotients shown are either absolute or relative (i.e., conditional) probabilities:
the two shown arrows going into $X$ are labeled with <I>absolute</I> probabilities, 
while the two shown arrows going out of $\bA \mathrel{\rightadj \cap} \gB$ are labeled with <I>relative</I> or <I>conditional</I> probabilities.

Now we present Bayes' Theorem as it is traditionally stated, using only numbers and elementary algebra.
In its simplest, clearest form, it is the top equation, Bayes-0 (which is trivially true), in the box below.
(For the time being, ignore the lower equation;
it differs from the top equation only in notation and a trivial equality.)

\[ \boxed{  \begin{array} {ccccc|l}  {H \cap D \over D} & \xlongequal{\text{Bayes-0}} & {H \cap D \over H} & \times & \red{ {H \over D} } & \text{ numbers} \\ \\ \hline \\ P(H|D) & \xlongequal{\text{Bayes}}  & P(D|H) & \times & \red{ P(H) \over P(D) } & \text{ (sub)sets or properties} \\  \end{array}  }  \]

That's it! That bit of very elementary algebra  
(which we might call "the switch of denominators, i.e. contexts", relative to ${H\cap D}$)  
in the top line, Bayes-0,
is the essence of Bayes' Theorem.  

What remains is to explain how that relates to the setup for, and traditional statement of, Bayes' Theorem.
I.e., what it all means.


Here is the setup for the situation. 
For concreteness, we will consider a special case of the general situation which is easily understood.

Suppose we have a population, a specific group of Americans.
We poll them with two questions: are you a Democrat (a member of the Democratic Party), and are you a homosexual?
We will call the set of people self-describing as Democrats set $\boxed D$, and similarly we let set $\boxed H$ consist of those who self-describe as homosexuals.
We also, for simplicity, let the symbols $D$ and $H$ denote the number of individuals in each set.
The context should make clear whether the symbol denotes a set or a number.

Next we have the intersection of sets $D$ and $H$, consisting of those individuals who described themselves as both Democrats and homosexual.
We denote this set by $\boxed{ H\cap D }$; so $H \cap D$ is the set of homosexual Democrats.
Again,  $H\cap D$ will also denote the number of individuals in that set.

With that setup and those definitions out of the way, we can now pose some questions. 

1. What fraction of the Democrats are homosexual? The answer is easy: it is the fraction $H\cap D \over D$.

2. What fraction of the homosexuals are Democrats? The answer is again easy: it is the fraction $H\cap D\over H$.

3. What relation, if any, is there between those two fractions?
The answer is given in the boxed equation Bayes-0 above (whose proof is a triviality), which is called "Bayes' Theorem".

4. What is the use of this equation? Well, if you know any two of those three ratios, it shows the third is determined, and gives you an easy way to calculate it.

Let us illustrate this with a hypothetical numerical example. 
Suppose 80% of homosexuals are Democrats, so ${H\cap D  \over H} = .8$, and
the ratio of Democrats to ALL homosexuals (20% of whom are not Democrats) is 10 to 1, so $\red{ {H\over D} = .1 }$. 
Then, instantiating the equation Bayes-0 with these two assumptions, we get 
\[ \boxed{ \begin{array} {} {H\cap D \over D} & \xlongequal{\text{Bayes-0}} & {H\cap D \over H} & \times & \red{ {H \over D} } \\ \\ \hline \\  {H\cap D \over D} & \xlongequal{\text{Bayes-0}} & .8 & \times & \red{.1} & = & .08 \end{array}  } \]
i.e. our two assumptions imply 8% of Democrats are homosexual.

For a diagrammatic presentation (with some possible numbers for population sizes), see the following:
\[ \boxed{ \begin{array} {} H = 10 & \xleftarrow{ \textstyle {H\cap D \over H} = .8 } & H\cap D = 8 \\ \downarrow & \red{ \searrow \rlap{ {H\over D} = .1 } } & \downarrow \rlap{ {H\cap D \over D} = .08 } \\ X & \xleftarrow{} & D = 100 \\ \end{array} } \]
You can visualize this geometrically.
Imagine a strip divided into three regions.
The first is two units long.
The second is eight units long.
The last is 92 units long.
Let $H$ be the union of the first two parts,
$D$ the union of the last two parts.
Then $H\cap D$ is the second part.
\[ \color{lightpink} { H=10 \atop \Rule{10mm}{5mm}{0mm} } {} \rlap{ \kern-6mm 8 } {} \rlap{ \color{blue} { \kern-10mm \lower5ex { \Rule{100mm}{5mm}{0mm} \atop D=100 } } } \]
(This is a form of Venn diagram.)

So under our two assumptions we have
80% of homosexuals are Democrats, while 
8% of Democrats are homosexuals.
Further, these two statistics IMPLY the ratio of Democrats to homosexuals is 10 to 1, 
since any two of the ratios in the Bayes equation determine the third.

The fact that we want to stress is:
If 80% of homosexuals are Democrats, 
that DOES NOT imply that
80% of Democrats are homosexuals. 
I.e., you can't invert conditional probabilities.
In words, the fact that $H\cap D$ is quite large relative to $H$ (80% in our hypothetical example) says nothing about how large $H\cap D$ is relative to $D$. 
That depends entirely on the ratio of $D$ to $H$.
Precisely, Bayes' Theorem says:
\[ \boxed{ \begin{array} {}  { {H\cap D}/H \over {H\cap D}/D } &  \xlongequal {\text{Bayes-0}} & \red{ {D \over H} } \\ \\ \hline \\ {80\% \over 8\%} & \xlongequal {\text{Bayes}}  & \red{ {10 \over 1} } \\  \end{array}  } \; . \]

There is a folk saying related to this phenomenon:
"A big fish in a small pond versus 
a small fish in a large pond."
Here clearly $H\cap D$ (the homosexual Democrats) plays the role of the "fish", 
while the "small pond" and "large pond" are respectively the homosexuals and the Democrats.
Bayes' Theorem states the relation between the various ratios involved: 
the fish to each pond, and the ponds to each other.
To put it in common sense terms:
If you know the ratio of the pond sizes, 
and the fraction of the small pond the fish takes up, 
then you can easily calculate the fraction of the large pond it would take up.
E.g.,
if the fish is half of the small pond, 
and the large pond is three times the size of the small pond, 
then the fish would be $1/6$ of the large pond.
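In symbols (the same "switch of denominators" as in Bayes-0):
\[ {\text{fish} \over \text{large pond}} \;=\; {\text{fish} \over \text{small pond}} \times {\text{small pond} \over \text{large pond}} \;=\; {1\over2} \times {1\over3} \;=\; {1\over6} \; . \]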

For another, perhaps hypothetical, example, 
if 90% of men with prostate cancer have high levels of PSA, 
that DOES NOT imply that
90% of men with high levels of PSA have prostate cancer.

There is a geometrical way of viewing this situation.
Imagine two rectangles, one with its long side horizontal, labeled $H$, 
and one with its long side vertical, labeled $V$.
Suppose they have some overlap, i.e., a non-empty intersection, labeled $H\cap V$.
Again, we let the same label stand for both 
the designated geometric region (a set of points) and 
its area (a number).

Considering the three numbers, i.e. the areas, 
we have as before the three ratios and the simple relation between those ratios

\[ \boxed{ \begin{array} {} H\cap V \over V & \xlongequal{\text{Bayes}} & H\cap V \over H & \times & \red{ {H \over V} } \\ \\ \hline \\ P(H|V) & \xlongequal{\text{Bayes}} & P(V|H) & \times & \red{ P(H) \over P(V) } \\ \end{array} } \]

relating the ratios of their areas.
In fact, this same equation applies to any two measurable regions with a measurable intersection, but it is easier to visualize for rectangles.


For the general situation, 3Blue1Brown has a good elementary discussion:

https://youtube.com/watch?v=HZGCoVF3YvM

Here we instantiate our original presentation of Bayes' Theorem into an instance relevant to that video.
Here $L$ = the set of librarians (or the number of such), and $B$ = the set of book-lovers (or the number of such).
\[ \boxed{ \begin{array} {ccccc|l} {B \cap L \over L} & \xlongequal{\text{Bayes-0}} & {B \cap L \over B} & \times & \red{ {B \over L} } & \text{ numbers} \\ \\ \hline \\ P(B|L) & \xlongequal{\text{Bayes}} & P(L|B) & \times & \red{ P(B) \over P(L) } & \text{ (sub)sets or properties} \\ \hline \text{high} && \text{low} & \times & \red{\text{large}} & \text{ qualitative description} \\ \hline .8 & = & .016 & \times & \red{50} & \text{ made-up, but plausible, numbers} \end{array} }  \]
Here we repeat our original description of Bayes' Theorem, but using the notation of that video, and introduce the terminology it uses for various parts of the equation. 
Here $H$ = hypothesis, $E$ = evidence.

\[ \boxed{  \begin{array} {ccccc|l}  H\cap E \over E & \xlongequal{\text{Bayes-0}} & H\cap E \over H & \times & \red{ H \over E } & \text{ numbers} \\ \\ \hline \\ P(H|E) & \xlongequal{\text{Bayes}}  & P(E|H) & \times & \red{ P(H) \over P(E) } & \text{ (sub)sets or properties} \\ \hline \text{posterior} & \xlongequal{\text{Bayes}} & \text{likelihood} & \times & \red{ \text{prior} \over {} \text{evidence} } & \text{ descriptive words} \end{array}  }  \]

There is also a five minute video:

https://youtu.be/XQoLVl31ZfQ

-----

Diagrams in preliminary states:

\[ \boxed{ \begin{array} {} \kern2em & {} \rlap{ \xrightarrow [\kern8em] {L\cap B = 8}  } && \kern1em && {} \rlap{ \xrightarrow [\kern8em] {L=10} } \\ && \llap{L = 10} \searrow && \llap{ {L \cap B \over L} \kern-.5em } \nearrow & \kern1em {} \rlap{ \lower3ex{ \scriptstyle \kern-2em \text{Bayes-0} } } & \llap{ {L \over B} \kern-.5em } \searrow && \nearrow \rlap{ B = 500 } & \kern4em \\ &&& {} \rlap{ \xrightarrow [ .016 = {8 \over 500} = {L \cap B \over B } = P(L|B) ] {\kern8em} } &&&&  \\  \end{array} } \]

\[ \boxed{ \begin{array} {} && \green{B \cap L = 4}  \\  & \green{ \llap{ \boxed{{L\over B\cap L } = {10\over 4} = 2.5 } } \swarrow } && \red{  \nwarrow \rlap { {B\cap L \over B\cap F} = {4\over 20} } } \\  \green{ \boxed{L = 10} } &    & \green\downarrow &&   \\  && & &&  \\ \green\downarrow & \red{ \nwarrow \rlap{ \scriptstyle {L\over F} = {10\over 200} } } & B & \xleftarrow {} & \green{B \cap F = 20} \\ & \swarrow && \swarrow \rlap{ \boxed{{F\over B\cap F} = {200\over 20} = 10} } \\ X & \xleftarrow{} & \boxed{F = 200} \\ \end{array} } \]
An equation:
\[ \boxed{ \begin{array} {} \red{4\over 20} & = & \green{4\over 10} & \times & \red{10\over 200} & \times & {200\over 20} \\ \hline  \red{B\cap L \over B\cap F} & = & \green{B\cap L \over L} & \times & \red{L \over F} & \times & {F \over B\cap F} \\ \hline & = & {{B\cap L} / L} \over {{B\cap F} / F} & \times & \red{L \over F}  \\ \hline & = & {4/10} \over {20/200} & \times & \red{10 \over 200} \\ \hline & = & 4 & \times & \red{1 \over 20} \\ \end{array} } \]

Two bipartite partitions of a set, say $X$, yield this diagram:

The general case, say $X=A+B$ and $X=W+L$:
\[ \boxed{ \begin{array} {} \kern1em & A\cap W & \xrightarrow{ \textstyle {A\cap W \over W} } & W & \xleftarrow{ \textstyle {B\cap W \over W} } & B\cap W & \kern1em \\ & \llap{ A\cap W \over A } \Bigg\downarrow & \raise1ex{ A\cap W \over X } & \Bigg\downarrow \rlap{ W\over X } & \raise1ex{ B\cap W \over X } & \Bigg\downarrow \rlap{ B\cap W \over B } & \\ & A & \xrightarrow { \raise0ex{ \smash{ \textstyle{ A\over X } } } } & X & \xleftarrow { \raise0ex { \smash{ \textstyle{B\over X} } } } & B & \\ & \llap{ {A\cap L \over A} } \Bigg\uparrow & A\cap L \over X & \Bigg\uparrow \rlap{ L\over X } &  B\cap L \over X & \Bigg\uparrow \rlap{ B\cap L \over B } & \\ & A\cap L & \xrightarrow [ \textstyle{A\cap L \over L} ] {} & L & \xleftarrow [ \textstyle{B\cap L \over L} ] {} & B\cap L & \\ \end{array} } \]

For the special case in Section 9.4 of <I>Fat Chance</I>, A=Left, B=Right and W=Tracy, L=Paul, 
we get this diagram:
\[ \boxed{ \begin{array} {} \kern4em & T\cap L = 15 & \xrightarrow{} & T = 47 & \xleftarrow{} & T\cap R = 32 & \kern4em \\ & \llap{ \boxed{ {T\cap L \over L} = {3\over 4} } } \Bigg\downarrow && \Bigg\downarrow && \Bigg\downarrow \rlap{ \boxed{  {T\cap R \over R} = {2\over 5} } }  & \\ & L = 20 &  \xrightarrow { \raise2ex{  \smash{ \boxed{  {L\over X} = {1\over 5} = .2 } } } } & X = 100 &  \xleftarrow { \raise2ex { \smash{ \boxed{ {R\over X} = {4\over 5} = .8 } } } } & R = 80 & \\ & \llap{ \boxed{ {P\cap L \over L} = {1\over 4} } } \Bigg\uparrow && \Bigg\uparrow && \Bigg\uparrow \rlap{ \boxed{ {P\cap R \over R} = {3\over 5} } }  & \\ & P\cap L = 5 & \xrightarrow{} & P = 53 & \xleftarrow{} & P\cap R = 48 & \\ \end{array} } \]

--------

The following pertains to the video https://youtu.be/R13BD8qKeTg .

The situation is this:
There is a certain disease $\boxed H$, and a certain test $\boxed E$ for that disease.
(See below for the letters.)

You have been tested for the disease,
and tested positive.
But that doesn't necessarily mean you have it; the test is not a perfect indicator. 
Some people without the disease test positive (false positives). $\lnot H \land E = E-H$.
And some people with the disease will not be detected by the test (false negatives). $H \land \lnot E = H-E$.
If the test were a perfect indicator we would have $H=E$ and both the above differences would be zero (or empty).

To analyze this situation,
let $\boxed H$ be the <I>hypothesis</I>, that you have the disease; $H$ will also denote the number representing the probability that you have it, that is, the fraction of the general population that has it.
Let $\boxed E$ be the <I>evidence</I>, that you tested positive under the test;
$E$ will also denote the number representing the probability that a general member of the population will test positive in that particular test.

So clearly what we are interested in is $\boxed{P(H|E) = {H\land E \over E}}$, that is, if you tested positive in that particular test (the E), what is the probability that you actually have the disease (the H)?

In the video, the narrator gives three items of numerical information (numbers), using words to describe what those numbers mean:

$\boxed{P(H) = .001 = {1 \over 1000}}$ = fraction of people with the disease.
$\boxed{P(E|H) = {H\land E \over H} = .99 = {99\over 100}}$ = fraction of those with the disease who test positive = the rate of valid positives.
$\boxed{P(E|\lnot H) = {\lnot H \land E \over \lnot H} = .01 = {1\over 100}}$ = fraction of those without the disease who test positive = the rate of false positives.
That is all he tells you.
Those are the knowns.
Note that the last two tell you what the testing probabilities will be IF you know whether you have the disease or not.
I.e., they are backward from what we want: to go from the test result to the disease probability.

From that we infer 
$P(\lnot H) = 1-P(H) = 1-.001 = .999$ (the fraction without the disease).

Now let's see how we can use that information to solve the problem, 
i.e. calculate $P(H|E)$.

First we must calculate $P(E)$:
\[ \begin{array} {} P(E) & = && P\big( E\land {(H\lor \lnot H)} \big) \\ & = && P\big( {(E\land H)} \lor {(E\land \lnot H)} \big) \\ & = & P(E\land H) & + & P(E\land{\lnot H}) \\ & = & \boxed{P(E|H)}\,\boxed{P(H)} & + & \boxed{P(E|\lnot H)}\,P(\lnot H) \\ & = & \boxed{.99} \times \boxed{.001} & + & \boxed{.01} \times .999 \\ & = & .00099 & + & .00999 \\ && \text{valid positives} && \text{false positives} \\ & = && .01098 \\ & \sim && 1.1\% \\ &&& \text{all positives} \\ \end{array} \]

With that calculation out of the way, now we may apply the simple, basic Bayes' Theorem to get the answer:

First a little recall:
Again, $H$ = hypothesis = you have the disease, 
$E$ = evidence = your test returned positive.

\[ \boxed{ \begin{array} {ccccc|l} H\cap E \over E & \xlongequal{\text{Bayes-0}} & H\cap E \over H & \times & \red{ H \over E } & \text{ numbers} \\ \\ \hline \\ P(H|E) & \xlongequal{\text{Bayes}} & P(E|H) & \times & \red{ P(H) \over P(E) } & \text{ (sub)sets or properties} \\ \hline \text{posterior} & \xlongequal{\text{Bayes}} & \text{likelihood} & \times & \red{ \text{prior = hypothesis} \over {} \text{evidence} } & \text{ descriptive words} \\ \\ \hline \\ P(H|E) & \xlongequal{\text{Bayes}} & .99 & \times & \red{ .001 \over .01098 } & \\ &  = & .99 & \times & \red{ 1 \over 10.98 } \\ & \sim & .99 & \times & \red{ 1 \over 11 } \\ &  = & .09 \\ & = & 9\% \\ \end{array} } \]
And that is the answer the narrator gives; but we went through the details, 
and found the various intermediate results, which have their own interest.
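
For readers who want to replay the arithmetic, here is a minimal sketch (mine, not the narrator's) that carries out the same two-step computation, first $P(E)$ and then Bayes-0, with exact rationals:

```python
from fractions import Fraction as F

def posterior(P_H, P_E_given_H, P_E_given_notH):
    """P(H|E): compute P(E) over the partition H, not-H, then apply Bayes-0."""
    P_notH = 1 - P_H
    P_E = P_E_given_H * P_H + P_E_given_notH * P_notH   # all positives
    return P_E_given_H * P_H / P_E                       # valid positives / all positives

print(posterior(F(1, 1000), F(99, 100), F(1, 100)))      # 11/122 (= 99/1098), about 9%
```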
-----------------
Let us put some of the above information in a diagram, where the information provided by the narrator is given in boxes:

\[ \begin{array} {} && \text{valid positives} && \leftadj{ \text{all positives} } && \text{false positives} \\ && H\land E \atop P(H\land E) = {99\over100,000} \sim {1\over1,000} & \xrightarrow [{99\over\leftadj{1,098}} \sim {1\over\leftadj{11}}] { H\land E \over \leftadj E} & \leftadj{ E \atop P(E) = {(\blue{99}+\blue{999})=1,098\over100,000} \sim {11\over 1,000} } & \xleftarrow [{999\over\leftadj{1,098}} \sim {10\over\leftadj{11}}] { \lnot H\land E \over \leftadj E} & \lnot H \land E \atop P(\lnot H \land E) = {999\over100,000} \sim {1\over100} = {10\over1,000} \\  && \llap{\text{rate of valid positives} \; \boxed{ {H\land E \over H} = {99\over100} } } \Bigg\downarrow &&&& \Bigg\downarrow \rlap{ \boxed{ {\lnot H \land E \over \lnot H} = {1\over100} } \; \text{rate of false positives} } \\  && \boxed{ H \atop P(H) = .001 = {1\over1000} = {100\over100,000} } &&&& \lnot H \atop P(\lnot H) = .999 = {999\over1000} = {99,900\over100,000} \\ && \text{have the disease} &&&& \text{don't have the disease} \\  \end{array} \]

Here the input parameters are varied to consider another case, where the test is much more successful:
it is 100% successful on those who have the disease,
while for those who don't have the disease, it gives a false positive only $1\over1K$ of the time:


\[ \begin{array} {} && \text{valid positives} && \leftadj{ \text{all positives} } && \text{false positives} \\ && H\land E \atop P(H\land E) = {1\over 1K} = {1K\over 1M} & \xrightarrow [{1K\over \leftadj{1,999}} \sim {1\over{\leftadj2}}] {H\land E \over \leftadj E} & \leftadj{ E \atop { P(E) = {(\blue{1K}+\blue{999} = 1,999)\over1M} \sim {2K\over1M} = {2\over1K} } } & \xleftarrow { \lnot H\land E \over \leftadj E} & \lnot H \land E \atop P(\lnot H \land E) = {999\over1M} \sim {1K\over 1M} \\ && \llap{ \text{rate of valid positives} \; \boxed{ {H\land E \over H} = 1 } } \Bigg\downarrow &&&& \Bigg\downarrow   \rlap{ \boxed{ {\lnot H \land E \over \lnot H} = {1\over1K} = .001 } \; \text{rate of false positives} } \\ && \boxed{ H \atop P(H) = .001 = {1\over 1K} } &&&& \lnot H \atop P(\lnot H) = .999 = {999\over 1K} \sim 1 \\ && \text{have the disease} &&&& \text{don't have the disease} \\ \end{array} \]
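
The sketch from above reproduces these numbers directly: under these assumptions `posterior(F(1, 1000), F(1, 1), F(1, 1000))` returns $1000/1999 \approx 1/2$. Even a test that never misses the disease and gives only one false positive per thousand leaves only about a 50% chance that a positive tester actually has this rare disease.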

Let us now tackle the issues of independence and correlation, whether positive or negative.
Here are several equivalent formulations of <I>positive correlation</I>, showing the symmetry of the formulations.
For <I>negative correlation</I> or <I>independence</I>, 
merely replace $\lt$, shown here in vertical mode as $\vee$, 
by $\gt$ or $=$.

\[ \boxed{ \begin{array} {ccccc|ccccccccc|ccccc} {H\cap D  = HD \over H} & = & {8\over10} & = & {4\over5} & \bA \cap \gB \over \bA &&  {\bA \cap \gB \over \bA} \times {\bA \over X} & = & \bA \cap \gB \over X & = &  {\bA \cap \gB \over \gB} \times {\gB \over X} &&  \bA \cap \gB \over \gB & {H\cap D = HD \over D} & = & {8\over100} & = & {2\over25} \\ \vee & \iff & \vee & \iff & \vee & \vee & \iff & \vee & \iff & \vee & \iff & \vee & \iff & \vee & \vee & \iff & \vee & \iff & \vee \\ {D\over X} & = & {100\over200} & = & {1\over 2} & \gB \over X && {\bA \over X} \times {\gB \over X} & = & {\bA \over X} \times {\gB \over X} & = &   {\bA \over X} \times {\gB \over X} &&  \bA \over X & {H\over X} & = & 10\over200 & = & 1\over20 \\ \hline &&&&& {(\bA\cap\gB)/\bA} \over \gB/X &&&&  (\bA\cap\gB)/X \over (\bA/X) \times (\gB/X) &&&& {(\bA\cap\gB)/\gB} \over \bA/X &&&&& 2/25 \over 1/20 \\ &&&& 8\over5 &&&&& (\bA\cap\gB) \times X \over \bA \times \gB &&&&&&&&& {40\over25} = {8\over5} \\ \end{array} } \]

In the above box, note that
the leftmost general inequality means a positive correlation between $B$ and $A$, while
the rightmost general inequality means a positive correlation between $A$ and $B$.
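
As a quick check that the formulations in the box really do stand or fall together, here is a sketch of my own using the $H$, $D$ sizes from earlier in the post:

```python
from fractions import Fraction as F

# Sizes used earlier: |X| = 200, |H| = 10, |D| = 100, |H ∩ D| = 8.
X, H, D, HD = 200, 10, 100, 8

left   = F(HD, H) > F(D, X)               # P(D|H) > P(D)
middle = F(HD, X) > F(H, X) * F(D, X)     # P(H ∩ D) > P(H) P(D)
right  = F(HD, D) > F(H, X)               # P(H|D) > P(H)

assert left == middle == right            # all three are True here: positive correlation
```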