Accidental Desultory Cogitations

Saturday, October 26, 2013

Theorem of the alternative and logical equivalence

In the theory of linear equalities and inequalities, several fundamental results are formulated as a theorem of the alternative: two alternatives are given, and exactly one of the alternatives is satisfied at any one time. A well known example is Farkas' lemma (1894):

For a real matrix $A$ and real vector $b$, exactly one of the following conditions is true:

There exists a real nonnegative vector $x$ such that $Ax = b$
There exists a real row vector y such that $yA$ is nonnegative and $yb$ is negative.

where we say a real vector is (non)negative if all its entries are (non)negative. Consider a theorem of the alternative given in the following form: given two alternatives $P$ or $Q$, only one of them is true at any time, no more and no less. The truth table for this conclusion (denoted as $Z$) is then:

P	Q	Z
0	0	0
1	0	1
0	1	1
1	1	0

This means that $Z$ is equal to $P\oplus Q$ which is equivalent to $P\Leftrightarrow \neg Q$ which is equivalent to $\neg P\Leftrightarrow Q$.
Thus the theorem

Exactly one of the two statements $P$ and $Q$ is true

can also be restated as:

P is true if and only if Q is false

or as:

P is false if and only if Q is true

Setting the 2 statements in Farkas' Lemma above as $P$ and $Q$, then
$P\Leftrightarrow \neg Q$ is written as

Let $A$ be a real matrix and let $b$ be a real vector. Then there exists a nonnegative real vector $x$ such that $Ax=b$ if and only if for each row vector $y$ either $yA$ is not a nonnegative vector or $yb$ is nonnegative.

Given that $P\Rightarrow Q$ is logically equivalent to $\neg P \vee Q$, this can be rewritten as

Let $A$ be a real matrix and let $b$ be a real vector. Then there exists a nonnegative real vector $x$ such that $Ax=b$ if and only if $yb$ is nonnegative for each row vector $y$ such that $yA$ is a nonnegative vector.

which is the form of Farkas' Lemma as stated in A. Schrijver, Theory of Linear and Integer Programming, Wiley, 1998.

Wednesday, October 16, 2013

Gravity

Over the weekend, my family and I went to see the movie "Gravity" in IMAX 3D after hearing all the buzz about it. Admittedly there are some scientific errors in the movie (my son was whispering to me during the movie why Clooney needs to detach himself from Bullock to save her), the movie was a thrilling ride and made you feel what it is like to be adrift in space. I have always been fascinated by spaceflight ever since I read the books "The right stuff" and "2001: a space odyssey" (I have also seen the movies, of course) as a boy. The movie also reminds me of a common misconception we have when we see footage of astronauts in space: that there is no gravity in space (hence the term "zero gravity"). We see that as soon as the spacecraft escapes Earth's atmosphere, things like pens start to float around in the capsule. While is it true that in deep space, far away from other celestial bodies, the effect of their gravitational pull can be minuscule, all manned spaceflights are still relatively close to Earth and will feel its gravitational pull.
Consider astronauts at the International Space Station (ISS). They experience a gravitational pull similar in strength to what we experience on the ground. Recall Sir Isaac Newton's celebrated equation describing the forces between two bodies:
$$ F = \frac{Gm_1m_2}{r^2}$$
where $G\approx 6.67384 \times 10^{-14}m^3g^{-1}s^{-2}$.
Using Newton's second law of motion $F=ma$, the acceleration an object is subjected to due to another object with mass $m$ and a distance $r$ away is $\frac{Gm}{r^2}$. To calculate the acceleration induced by the gravitational force you feel from the Earth, if we assume the Earth is a sphere with a spherically symmetric density, then $r$ is the distance between you and the center of the Earth, approximately 6.371 Mm. The mass of the Earth $M_e$ is approximately $5.972\times 10^{27}g$, resulting in a constant $ \frac{G M_e}{r^2} \approx 9.82 ms^{-2}$.
The ISS is about 0.37 Mm above the ground, increasing $r$ to be about 6.731 Mm. The resulting constant is then approximately $8.8$, a 10% reduction in weight for everyone on board. However, the astronauts do not feel this force as they are orbiting the Earth, as they are essentially in continuous free fall. In other words, without gravity, the spacecraft would move in a straight line, and the speed of the spacecraft is such that the acceleration needed to change the trajectory to an orbit around the Earth is exactly equal to the acceleration induced by the gravitational force. It is a nice exercise in calculus (which by the way was also invented by Isaac Newton through his introduction of Fluxions and Fluents) to show this.

Since everything is relative, we must also note that we (along with the Earth) are orbiting the Sun as well at a speed of over 100Mm/hr. The distance from the Earth to the Sun is approximately $1.496\times 10^{11}m$ and the mass of the Sun is approximately $1.989\times 10^{33}g$. Applying Newton's formula above, we get $ \frac{G M_s}{r^2} \approx 0.00593$, which is much smaller than the constant for Earth. Since we and the Earth are in the same orbit (and thus in free fall) around the Sun, we do not notice the effects of the Sun's gravitational pull.

When I told my wife that we are moving around the Sun at high speed, she quipped "So we don't have a sedentary lifestyle then."

Saturday, September 28, 2013

A short tutorial of nonlinear circuit theory - part 2

In a previous post, we look at Kirchoff's laws and device constituency relations to derive state equations of an electrical circuit. Today we will look at some simple circuits and their corresponding state equations. The simplest dynamic circuit must have at least one capacitor (or inductor). Let us consider a circuit with one linear capacitor and one nonlinear resistor.

Case 1: Assume that the resistor is voltage-controlled ($i = g(v)$)

The state equations can then be written as

\begin{equation} \frac{dq}{dt} = -g(v) = -g(q/C) \end{equation}

The trajectories can be visualized on the dynamic route, where we plot $\frac{dq}{dt}$ against $q$ and the arrows indicate how the trajectory will evolve:

Case 2: assume that the resistor is current controlled ($v = f(i)$).

We have $v = f(i) = f(-dq/dt)$. Since $v = q/C$ we get $q = Cf(-dq/dt)$. Using the dynamic route interpretation and depending on the shape of $f$, it is possible to find impasse points, where the trajectory does not know what to do.

How do we resolve such nonphysical situations? By adding parasitic elements. Suppose we add a parasitic inductor L in series to the nonlinear resistor.

The resulting state equations will be given by: \[ \begin{array}{lcl}\frac{dq}{dt} & = & i_L = \frac{\phi}{L} \\ \frac{d\phi}{dt} & = & v_L = v_R - v_C = f(i_R) - q/c = f(\phi/L) - q/c \end{array} \]

By introducing the variable $i_C = -i_L = -\phi/L$ we get the state equations: \[ \begin{array}{lcl}\frac{dq}{dt} & = & i_C \\ \frac{di_C}{dt} & = & -\frac{1}{L}(q/C - f(-i_C)) \end{array} \] By drawing the phase portrait, we see that at the curve $q = Cf(-i_C)$, $di_C/dt = 0$. In fact, we have an oscillator.

For the nonlinear resistor with a constituency relation as shown above, the existence of the limit cycle can be proved more rigorously by using the Poincaré -Bendixson theorem and the fact that the system has a bounded trapping region and a single equilibrium point that is unstable.

For $f$ an odd function, a second order system in this form is a Liénard equation [5] and the Levinson-Smith theorem [6] allows us to prove the existence of a stable limit cycle when the nonlinear resistor is of a certain form:

Theorem [Levinson-Smith 1942]: Consider the second order equations
\[ \begin{array}{lcl} dx/dt & = & y - F(x) \\ dy/dt & = & -g(x) \end{array} \] Suppose that

$F$ is an odd $C^1$ function (i.e. $F(-x) = -F(x)$),
$g$ is an odd $C^0$ function with $xg(x)>0$ for all $x\neq 0$,
There is a constant $a>0$ such that $F(x) < 0$ on $0 < x < a$, $F(x) > 0$ on $x > a$, and $F'(x) > 0$ on $x>a$,
$\int_0^x g(s)ds \rightarrow\infty$ as $|x|\rightarrow \infty$ and $F(x) \rightarrow\infty$ as $x\rightarrow \infty$.

Then there is a unique stable limit cycle around the origin in the system.

Note that our circuit above satisfies the hypothesis of the Levinson-Smith theorem if our nonlinear resistor has the form $v = f(i)$ where $f$ is a cubic polynomial of the form $x^3-ax$ for $a>0$. In particular, we get exactly the Van der Pol equation $x'' + \alpha (x^2-1)x' + x = 0$ [8], an equation used to model oscillations occurring in a circuit involving vacuum tubes.

Another condition for guaranteeing the existence of a unique periodic solution is the following result [7]:

Theorem: Consider the second order equations
\[ \begin{array}{lcl} dx/dt & = & y - F(x) \\ dy/dt & = & -g(x) \end{array} \] Suppose that

$F$ is an odd $C^1$ function and is zero only at $x=0$,$x=a$ and $x=-a$ for some $a>0$,

$F(x) \rightarrow \infty$ monotonically as $x\rightarrow \infty$ for $x >a$.

$g$ is an odd $C^0$ function with $xg(x)>0$ for all $x\neq 0$,

Then the system has a unique periodic solution.

References:

[1] L. O. Chua, Introduction to nonlinear network theory, McGraw-Hill, 1969.
[2] L. O. Chua, C. A. Desoer, E. S. Kuh, Linear and nonlinear circuits, McGraw-Hill, 1987.
[3] L. O. Chua, Memristor-The missing circuit element, IEEE Transactions on Circuit Theory, Sept. 1971, vol. 18, no. 5, pp. 507-519.
[4] D. B. Strukov, G. S. Snider, D. R. Stewart1 & R. S. Williams, The missing memristor found, Nature 453, 1 May 2008, pp. 80-83.
[5] A. Liénard, Étude des oscillations entretenues, Revue génér. de l'électr. 23, (1928), 901 - 902; 906 - 954.
[6] N. Levinson, O. Smith, A general equation for relaxation oscillations Duke Math. J. 9, (1942), 382 - 403.
[7] D. W. Jordan and P. Smith, Nonlinear Ordinary Differential Equations, Oxford University Press, 1977.
[8] Van der Pol, B. and Van der Mark, J., Frequency demultiplication, Nature, vol. 120, pp. 363-364, 1927.

Sunday, September 15, 2013

How to eat a meal

Many times my family and I face a dilemma when we go to a restaurant for a meal. We would order a few dishes and when the food arrives we need to solve the following problem: there are several types of food on the table, each with a different preference in our mind. The question then is: which food should we eat first, which food should we eat second, etc. My strategy is to eat my favorite food first. My wife has a different philosophy; she saves the best for last. I don't agree with her as I believe that your favorite food will not taste as good at the end of a meal, much more so than for food that you do not like as much anyway. Attempting to formulate this as a mathematical problem, we are essentially maximizing our enjoyment of the meal:
$$ \max_S E(S)$$
where $F$ is the set of food items on the table, $S = (s_1,s_2,s_3,...)$ is the sequence of food to eat subject to the constraint that $s_i\in F$ and $E$ is your enjoyment of your meal based on the sequence $S$. The function $E$ can have very different forms depending on the individual and there are additional constraints depending on the situation. Do you have to eat every items in $F$, or do you eat until you are full and put the leftovers in a doggie bag? Is there a time constraint, in which case certain foods can be more easily digested and thus enjoyed more? This problem can also be defined in more complicated ways by allowing you to eat part of one food item before eating part of another food item, etc. $E(S)$ can simply be $E(S) = \sum_i E_i(s_i)$ where $E_i$ becomes smaller for larger $i$ as you are getting full, i.e. for all $x\in F$, $E_i(x)< E_j(x)$ if $i>j$. My argument at the beginning of this post was that $E_i$ decreases faster proportionally as $i$ increases for my favorite food than for foods I dislike. To complicate things further, the ordering of the food items can influence your enjoyment not simply because your stomach is getting fuller, but could also depend on the flavor involved. Eating something sour might increase your appetite, whereas according to my wife eating something sweet after eating something salty might make you want to eat more salty foods. The use of sorbet to clean your palate between courses is another indication of this phenomenon. Similarly, eating protein or eating food that requires you to chew more will make you satisfied sooner. When presented with a plate of various fruits, she will eat the most sour fruits first, since she believes eating a sweet fruit first will make eating the sour fruit taste even more sour.
How different is the function $E$ different from person to person? Can we use our eating strategy $S$ over various meals to deduce the (presumably hidden) utility function $E$? (assuming that we are mentally optimizing $E$.) Can optimizing $E$ be used to satisfy weight-loss goals?

Saturday, September 14, 2013

Fireworks and the speed of sound

We were watching a fireworks display during our spring break vacation this year, and my son remarked that there is a delay in the sound that we hear from the fireworks that we see in the air. Since the fireworks typically go off quite high in the sky, the delay can be a second or more. (This is the same reason why we see lightning before hearing the thunder and we can use the delay to estimate how far the lightning strike is.) Recall that the speed of sound is about 340 m/s and the speed of light is so high (about 300 Mm/s) that the delay is essentially the time it takes for the sound to reach our ears. I have never thought of that before, mainly because in a fireworks display many fireworks are going off at different times close to each other and this masks the delay phenomenon. After he mentioned it, I notice it now every time I watch fireworks (and there are many opportunities during the year) and makes me realize that the perfectly synchronized sound effects you hear in a movie when they have a fireworks display may not be accurate.

Saturday, September 7, 2013

A short tutorial of nonlinear circuit theory - part 1

Today we look at applications of dynamical system theory in electrical engineering. How is dynamical systems important for electrical engineering? At the most fundamental level, electrical engineering deals with the physics of electricity. We can describe electrical and magnetic phenomena via Maxwell's equations and the interaction of electrons, ions, and protons in physical matter. Such mathematical models in general give us partial differential equations. However, if we describe things in such detail, the resulting equations will be so complicated that either it takes a long time to simulate the system or very little useful information can be extracted from the equations. Therefore in many electrical circuit applications, we will assume the lumped circuit approximation. In a lumped circuit, electrical signals are propagated instantaneously through wires. Exceptions to this approximation occur in applications such as high speed electronics and wireless electronics. What follows is a brief introduction to nonlinear circuit theory. A more in depth study can be found in Ref. [1] and Ref. [2]. In general an electrical circuit consists of devices connected with wires. Because of the lumped circuit approximation, the lengths of the wires are irrelevant. Each device can be thought of as a black box with n terminals, in which case it is called an n-terminal device. For example, a resistor or a capacitor is a 2-terminal device and a transistor is a 3-terminal device. Depending on our level of abstraction, a device could be a transistor, an amplifier consisting of many transistors, or a complicated device such as a microprocessor with billions of transistors. Each n -terminal device can be represented by a digraph with n branches connected at the datum node (Fig. 1).

Figure 1

Resistive and dynamic 2-terminal devices
In this short introduction we will only focus on electrical circuits composed of 2-terminal devices. The two most important physical quantities in electrical circuits are voltages (v) and currents (i). Two related quantities are the electrical flux linkage ($\phi$) and charge (q) which are related to v and i via:

\[ i = \frac{dq}{dt},\qquad v = \frac{d\phi}{dt}
\]

A 2-terminal device has two terminals and can be represented abstractly as a digraph with one branch and an associated reference direction:

A device which relates the voltage across it and the current through it is called a resistor. In particular, it is described by the constituency relation $R(v,i) = 0$. A device which relates the charge on it with the voltage across it is called a capacitor and is described by the constituency relation $C(q,v) = 0$. A device which relates the flux linkage across it with the current through it is called an inductor and is described by $L(\phi,i) = 0$. A resistor is called voltage-controlled if the constituency relation can be written as a function of voltage, i.e. $i = f(v)$. Similar definitions exist for current-controlled resistors, charge-controlled capacitors etc. Resistors are static devices while capacitors and inductors are dynamic devices. As we will see shortly, a circuit without dynamic devices has no dynamics. The introduction of dynamic devices allow us to describe the behavior of the circuit as a continuous-time dynamical system.

What about a device which relates flux linkage and charge via a relation $M(\phi, q) = 0$? Such devices are called memristors (MEMory ResISTOR) as postulated by Leon Chua in 1971 (Ref. [3]). Even though such devices can be synthesized using transistors and amplifiers, an elementary implementation did not exist until Stan Williams and his team from HP developed memristors using titanium dioxide in 2008 (Ref. 4).

Kirchoff's Laws
A electrical circuit can be represented abstractly as a digraph with a branch current and branch voltage associated to each branch. We assume that the circuit is connected, i.e. the corresponding digraph is (weakly) connected (though not necessarily strongly connected). Consider a connected digraph with n nodes and b branches. Since the digraph is connected, $b\geq n-1$. The node-edge incidence matrix (or just incidence matrix) associated to a digraph is an n by b matrix A such that $A_{jk} = 1$ if branch k leaves node j, $A_{jk} = -1$ if branch k enters node j, and $0$ otherwise.

Kirchoff's laws describe the relationships which these currents and voltages need to satisfy due to physical constraints such as charge conservation.

Kirchoff's current law (KCL) states that the sum of the currents entering a node is equal to the sum of the currents leaving the node. A moment's thought will convince you that this requirement for the j-th node corresponds to the product between the j-th row of A and the current vector i be equal to 0. Thus Kirchoff's current law can be stated as Ai = 0.

Theorem: the matrix A has rank n-1.

Proof: the i-th column of A consist of exactly one entry 1 and one entry -1 corresponding to the nodes that branch i is connected to. Consider k rows of A with k. Suppose a linear combination of these equations with nonzero coefficients sums to zero. Since the graph is connected, there is a node m which is connected by a branch to a node in these k nodes. The corresponding column in A has a 1 (or -1) in the m-th row and a -1 (or 1) in a row belonging to the k rows. Therefore a linear combination of these equations with nonzero coefficients cannot sums to zero. This show that the rank of A is at least n-1. The sum of the rows of A is zero, so A cannot be full rank, and thus the rank of A is exactly n-1. $\blacksquare$

Thus we can delete one row from A without losing any information since this row is always a linear combination of the other rows. In the following we assume that A is a n-1 by b matrix with rank n-1.

For a branch k which goes between nodes $j_1$ and $j_2$, Kirchoff's voltage law (KVL) states that the voltage $v_k$ across branch k is equal to the difference $e_{j_1}-e_{j_2}$ where $e_{j}$ is the voltage between the j-th node and the datum node. It is easy to see that Kirchoff's voltage law can be expressed as $A^T$e-v = 0 or $A^T$e = v.

Thus $v$ is in the range of the matrix $A^T$. Since $A^T$ has rank n-1, another way to say this is that v lies in a n-1-dimensional subspace of ${\Bbb R}^b$. Thus we can also express KVL using the orthogonality conditon Bv = 0 where B is a b-(n-1) by b matrix. One such matrix B is the fundamental loop matrix and describes a set of b-(n-1) loops in the circuit from which all loops in the circuit can be derived. This is another way to view KVL: the algebraic sum of branch voltages in a loop is 0.

The complete description of the electronic circuit can be given by KVL, KCL, the constituency relations of the elements and the defining equations $i = \frac{dq}{dt}$ and $v = \frac{d\phi}{dt}$. This gives us a set of constrained differential equations which describes the system. It is not always possible to obtain a set of ordinary differential equations. In the case that we can, it will be given in the form

\begin{equation}\label{eqn:state_circuit} \begin{array}{lclr}
\frac{dq}{dt} & =& f_1(q,\phi) & \qquad \mbox{Eq.} (1)\\
\frac{d\phi}{dt} & = & f_2(q,\phi) & \end{array}
\end{equation}

where q is the charge vector corresponding to the charges of all the capacitors and $\phi$ is the flux vector corresponding to the flux of the inductors. The state vector is $(q,\phi)$. In the case where the capacitors and inductors are linear, we have q = Cv and $\phi = Li$, so for $C\neq 0$ and $L\neq 0$, we can also choose $(v,i)$ as the state vector. This is more desirable since voltages and currents can be more easily measured. Furthermore, this change of variables can be also be done if there is a diffeomorphism between q and v and between $\phi$ and i. The system defined by (v,i) is topologically conjugate to Eq. (1) via the diffeomorphism. The number of equations in Eq. (1) will correspond the the total number of capacitors and inductors since we only have 2-terminal capacitors and inductors.

When there are no dynamic elements the circuit is called a resistive circuit and the equations of the circuit are given by Ai=0, Bv = 0 and R(i,v) = 0. This is a set of nonlinear equations, the solutions of which define the operating voltages and currents of the circuit. There is no dynamics in this system. A solution of these equations is an operating point of the circuit and corresponds to the actual voltages and currents we observe in the circuit.

Now a circuit with more than one operating point presents somewhat of a problem. What does the physical circuit choose as the operating point? In practice, the circuit can be at different operating points depending on when or how you power on the system. To resolve this problem we postulate that parasitic capacitances and inductors are present in parallel and series respectively which results in a dynamic circuit. Consider the state equations (Eq. (1)). The equilibrium points correspond to points when $dq/dt = 0$ for the capacitors and $d\phi/dt = 0$ for the inductors. This corresponds to the capacitor currents equal to zero and inductor voltages equal to zero. Thus we can replace capacitors by a open circuit and inductors by a short circuit which results in the original resistive circuit.

Thus the operating points of the resistive circuit corresponds to equilibrium points of the dynamic circuit and initial conditions play a role in what equilibrium point we will see at steady state (assuming that the state trajectory of the dynamic circuit does converge to an equilibrium point) which corresponds to the operating point we observe.

In part 2, we will look at some simple circuits and their corresponding state equations.

Saturday, March 2, 2013

WYSINWYG (What you see is NOT what you get)

The old adage of appearances being deceptive is true on many different levels. More abstractly consider the problem of communicating the properties of an object or of a concept C from point A to point B. There are many reasons why this communication is imperfect. In fact, there could be a problem at every stage of the communication process.

First of all not all of the properties of C are transmitted to B. This could be deliberate, such as in the game poker or the game Mastermind, where after each trial, only the number of in-location matches and out-location matches are revealed, not their positions. Or it could be limitations of the physical world, for instance, we can only see the outside of an object, (i.e. you can not judge a book by its cover).

Secondly, the transmission medium could distort the information. For instance, we don't see as well at a foggy night. In communication theory, noise in the transmission channel can make the information at the receiving end different than at the transmitting end. Typically, information is lost in this process, i.e. it is not possible to recover all the information transmitted from point A from the information received at point B. Or this could be a fundamental consequence of quantum physics, where the act of observation influences the object being observed.

Thirdly, the sensor apparatus used to receive the information can be deficient. Our eyes can only register lights in the visible spectrum, our ears can only hear in the range of approximately 20Hz-20kHz. This means that something that is invisible (or inaudible) to one animal (or machine), may not be to another. For instance, our warm bodies emit infrared light, but we cannot see it in the dark. However, night vision goggles allow us to see in the dark by using an infrared sensor to capture that radiation and convert it into visible signals that we can see. I have always wondered about the invisibility cloak in the Harry Potter novels or the invisibility serum in H.G. Wells' The Invisible Man, whether they work across the entire electromagnetic spectrum. If they only work in the visible light spectrum, then they could be easily defeated with modern technology by using either an infrared light source (a warm object does by itself emit infrared radiation) or ultraviolet light source (a black light) and using an infrared or ultraviolet sensor to "see" the invisible person. In addition, there are other sensor modalities, e.g. sonar which uses (ultra)sound waves or radar which uses electromagnetic radiation at the radio waves range, that need to addressed for such invisibility cloaks to be truly invisible.

Monday, October 1, 2012

Music, death-rays and the queen of mathematics

Thirty years ago today, the first compact disc (CD) music album was sold in Japan. I have always been fascinated with the CD as it encompasses several technologies that was never meant for entertainment and show how scientific discovery can lead to unexpected applications.

For instance, the pits on the CD is detected by shining laser light onto them. I am quite certain that Theodore Maiman, who made the first working laser (an acronym for Light Amplification by Stimulated Emission of Radiation) in 1960 would have never thought that one of the most common uses of lasers today is in music players. When I was growing up, the laser is viewed as a death-ray in science fiction stories, capable of destroying spaceships and enemies.

To encode music into pits on the surface of the CD, the sound waves have to be sampled and quantized into a digital format, converting sound patterns into bits of 1's and 0's and then manipulated in a computer. Computing in binary was initially meant for doing mathematics on a machine, but the idea that all information, including audio information, can be encoded in binary form and thus can be manipulated by a machine is what makes making music using digital technology possible.

Finally, to prevent scratches on the CD from causing undesirable distortion in the music that is being played back, error correction codes are used to add redundancy to the bits of data. The type of error correcting codes used on audio CD are CIRC (Cross Interleaved Reed-Solomon Codes) and is based on branches of mathematics dealing with abstract algebra and more generally number theory. Some consider number theory (what Carl Friedrich Gauss called the queen of mathematics) to be the purest form of mathematics because it didn't appear to have much of an application, but who knew that it turns out to be so useful in providing us entertainment (not to mention the fact that number theory is used extensively in cryptography which is a necessary technology for virtually all electronic commerce and online shopping)!

Sunday, September 30, 2012

Tennis lessons

My family has recently started taking tennis lessons. In one of the lessons there were 6 students and we got to play on 3 courts. The instructor wanted each of us to play with each other so we had 3 singles matches going simultaneously on 3 courts and we switched players after 2 games. He had us switch the players on one side of the court to the other courts. In other words, the first pairing was:

1 -- 2 Court 1
3 -- 4 Court 2
5 -- 6 Court 3

and the second pairing was:

1 -- 6 Court 1
3 -- 2 Court 2
5 -- 4 Court 3

and the third pairing was:

1 -- 4 Court 1
3 -- 6 Court 2
5 -- 2 Court 3

Since there are a total of $\left(\begin{array}{c}n\\2\end{array}\right)$ = n(n-1)/2 pairs and we can play n/2 games at a time on n/2 courts (let us consider the case of even n for the moment), we should be able to have different pairings at most n-1 times. In the above case of 6 students (n=6), we should be able to have 5 such rotations. But having the first 3 rotations as above, it is easy to see that it is not possible to have a 4th set of pairings without some of the pairs being repeated since players 1, 3, 5 have each played 2,4,6 respectively and thus needed to play each other which cannot be done simultaneously. So we asked ourselves the question, is it possible to have n-1 rotations where each person with play another person exactly once?

My son looked at the configurations and came up with the following algorithm to pair players. Fix one player and rotate the other players around a loop. In other words, keeping player 1 fixed, and rotating the remaining players clockwise, the pairings are:

Court 1 1 -- 2 1 -- 3 1 -- 5 1 -- 6 1 -- 4
Court 2 3 -- 4 5 -- 2 6 -- 3 4 -- 5 2 -- 6
Court 3 5 -- 6 6 -- 4 4 -- 2 2 -- 3 3 -- 5

It is easy to see that this is equivalent to the well-known polygon method of assigning round robin tournaments. Since my son is a fan of Minecraft, he also created a machine in Minecraft that illustrated this process using pistons to move colored blocks around. He made one piston to pull out one of the blocks and then rotate the remaining blocks around a loop. Finally the block that was pull out is reinserted into the group. Pictorially, it looks like this:

Starting configuration:

   1 2
   3 4
   5 6

One block is pulled out (where x represent an empty slot):

1    x 2
3 4
      5 6

The remaining 5 blocks (with the empty slot) are rotated clockwise:

1    3 x
      5 2
      6 4

Block 1 is inserted back (and pushing block 3 across to the right) resulting in the first rotation:

     1 3
     5 2
     6 4

Sunday, August 5, 2012

Zermelo's winning strategy

Zermelo's theorem is a fundamental result in game theory that states that in a two-player game where

each player has perfect information about the state of the game
players take turns placing their move
no chance is involved
games end after a finite number of moves
games do not end in a draw

one of the two players will have a winning strategy. I first encounter this theorem in Hugo Steinhaus' Mathematical Snapshots (a book that I cherish as I received it as a prize from participating in the first LVAIC Mathematics Competition). This result basically states that one of the two players can always win if he or she plays appropriately. On the surface this is easy to understand. If player A does not have a winning strategy, then this means that whatever moves player A makes, player B has a move that prevent A from winning. Since the games do not end in a draw, this means that player B will win eventually, i.e. B has a winning strategy. This is an very interesting result since it implies that these games are inherently not fair; one player always has the advantage. The problem is that in complicated games we don't know which player has this advantage. When applied to chess this implies that either black or white can always force a win or a draw; we just don't know which color that is. Since the game is finite, given enough computational power and time, it is possible to find out which player has the winning strategy by exploring all possible scenarios. Or we may find out that by playing appropriately, all chess games will end in a draw. Once such exhaustive search is done, computers playing chess would lose its excitement and allure, since we know apriori which color will win (or draw) and the winning strategy can be implemented via table lookup. Fortunately, the number of moves in a chess game and the range of moves result in a search space so large that its complete exploration is out of reach for many years to come.