Saturday, May 24, 2014

Ramsey theory

Ramsey theory is a fascinating subfield of combinatorics.  It deals with the idea that you can find always find regular substructures in large enough arbitrary, not necessarily regular, structures.  It can be considered a generalization of the Pigeonhole Principle.  A simple case of the general Ramsey's theorem is the following statement: among any group of 6 people, there is either a group of 3 people who mutually know each other, or there is a group of 3 people who mutually do not know each other.  Couched in graph theory, this is equivalent to the statement that for any 2-coloring of of the complete graph on 6 vertices $K_6$, there exists a subgraph of 3 vertices that is monochromatic, i.e. all the edges in the subgraph are colored with the same color.

For this simple case, the proof is quite easy. Pick a person $A$ from the group.  If $A$ has $3$ or more friends, pick $3$ of them, say $B$,$C$ and $D$.  If $2$ people among these friends are friends with each other, then together with $A$ they would form a group of 3 mutual friends.  Otherwise $B$,$C$ and $D$ mutually do not know each other.
If $A$ has $2$ or less friends, then there are at least $3$ people who $A$ does not know and you have a similar argument.  Pick $3$ among them, if they are not mutual friends, then there are $2$ of them who are not friends, then these $2$ people along with $A$ form a group who mutually do not know each other.

Saturday, May 10, 2014

Triangle theorems

Most of the theorems about triangles that I read about date from antiquity.  For instance, Euclid's Elements contains many propositions such as
  • The sum of the length of two sides of a triangle is larger than the length of the third side.
  • An isosceles triangle has the two angles at the base equal to each other.
  • Pythagoras Theorem
An interesting theorem is Heron's formula (probably known in Archimedes' time) for the area $A$ of a triangle with sides of lengths a, b,c:
\[ A = \sqrt{s(s-a)(s-b)(s-c)}\]
where $s$ is the semiperimeter defined as $s = \frac{1}{2}(a+b+c)$.  This theorem allows you to calculate the area of a triangle using solely the lengths of the 3 sides.  A similar theorem was discovered by the Chinese independently from the Greeks.

Of more recent vintage is Routh's theorem, published in 1896, that describes the ratio between a triangle and another triangle generated via the intersection of 3 cevians.

I was pleasantly surprised to hear about 2 new theorems about triangles that have been proved only relatively recently.

  1. Conway's little theorem (1976): A triangle is equilateral if and only if the ratio between the length of any 2 sides is rational and the ratio between any 2 angles is rational.
  2. Grieser and Maronna 2013: Up to congruence, a triangle is determined uniquely by its area, perimeter and the sum of the reciprocals of its angles. 
  • J. Conway, "A Characterization of the Equilateral Triangles and Some Consequences," The Mathematical Intelligencer, 20 March 2014.
  • D. Grieser and S. Maronna, "Hearing the Shape of a Triangle," Notices of AMS, vol. 60, no. 11, pp. 1440-1447, 2013. 

Sunday, May 4, 2014

Causality versus correlation

One of the challenges in data analysis is to determine whether correlated data implies causation.  For instance, sales of sunscreen and ice cream may be correlated during the year, but that does not mean that purchase of sunscreen leads to purchase of ice cream or vice versa.

Correlated data means then when one type of data occurs, so does another type of data and when one type of data is missing, so does the other type of data. From a logic point of view, consider two statements $P$ and $Q$ and we look at when they are true or false from observational data.  The presence or absence of these 2 statements being true can be collected in a truth table.  Let us assume that correlated data means the truth table has the following form (although the case of tautology below shows that this is not always the best assumption):

$P$$Q$$P$ and $Q$ are Correlated
i.e. we observe that $P$ and $Q$ are either both false or either both true.  There are 4 ways to fill in the 2 question marks above.

In mathematical logic, the notation $P\rightarrow Q$ is interpreted as P implies Q.  The only time this statement is false is if P is true, and Q is false.  Logically it is equivalent to $\neg P \vee Q$ and to its contrapositive $\neg Q \rightarrow \neg P$ and has the truth table:
$P$ $Q$ $P\rightarrow Q$
00 1
0 1 1
1 00
1 1 1

However, this must not be interpreted as P causing Q, as $P\rightarrow Q$ is still true if $P$ is false and $Q$ is true.  Just because the P being true resulted in Q being true, does not mean that Q cannot be true independent of P.  The lesson here is that the definition of "P implies Q" is different from our definition of "P causes Q".

The following truth table of $\neg(P\oplus Q)$, the exclusive-nor of P and Q, is a more appropriate way to capture the meaning of "$P$ causes $Q$".  This is also equivalent to the statement $P$ if and only if $Q$, i.e. $(P\rightarrow Q) \wedge (Q\rightarrow P)$, also written as $P\equiv Q$.

$P$$Q$$\neg(P\oplus Q)$

However, we have a dilemma here.  The same argument can be used to say that this truth table is appropriate to indicate that $Q$ causes $P$. This shows the difficulty of deriving causation solely from observational data.  To determine causality we need a method to apply control, say change the truth value of P and see how it affects the truth value of Q.

What about the other 2 ways to fill in the question marks? The following truth table corresponds to
$Q\rightarrow P$:

$P$$Q$$Q\rightarrow P$

Shows that $Q$ implies $P$, but again not necessary that $Q$ causes $P$.

The following truth table corresponding to a tautology, and indicates that P and Q are uncorrelated since
any combination of the truth values of P and Q are possible.


Thus just the fact that we always observe either $P$ and $Q$ both being true or both being false does not mean that they are correlated, it could be that we haven't observed the other combinations yet.

Saturday, May 3, 2014

Odd days in May

During lunch today, we were talking about how tomorrow, May 4, is Star Wars day, a play on words on the classical phrase in Star Wars, "May the force be with you" (May the fourth be with you), and is a day when Star Wars fans celebrate all things Star Wars.

As a kid, I have always been a fan of the Star Wars movies.  My brother and I were very much into special effects and very often we would go to the book store to buy magazines dedicated to the art of special effects in movies and we were members of the Star Wars fan club. We learned how the third movie was initially called "Revenge of the Jedi", but was changed later to "Return of the Jedi" because revenge seems uncharacteristic for a Jedi.  The very first movie that I saw in a movie theater in the US was "Return of the Jedi" (The original theatrical run, not the special edition in the late 1990's).  Many years later I introduced my wife to Star Wars by watching the special edition trilogy with her in the movie theater.

While pondering all this, I turned to my son and said, "Shouldn't any odd day in May (May 1, May 3, etc.) be called 'The Hunger Games Day'?".  There are 16 such days in May, so that is a lot of celebration.