Friday, June 6, 2014

The Hoffman-Wielandt theorem

The following interesting theorem bounds the "distance" between the eigenvalues of normal matrices using the Frobenius matrix norm.

Theorem [Hoffman-Wielandt 1953]: Let $A$ and $B$ be normal matrices with eigenvalues \(a_1,\cdots, a_n\) and \(b_1,\cdots, b_n\), respectively.  There exists an ordering of the eigenvalues such that
$$ \sum_i |a_i-b_i|^2 \leq \|A-B\|_{F}^2$$
where $\|\cdot\|_{F}$ is the Frobenius matrix norm.

Let us look at the proof of this result from Ref. [1].  Note that \(\|A\|_{F}=tr(AA^{*})= tr(A^*A)\) is unitarily invariant, i.e. if $V$ and $U$ are unitary matrices, then $\|UAV\|_F = \|A\|_F$.  Since both $A$ and $B$ are normal, $A = VA_0V^*$ and $B = WB_0W^*$ where $V$ and $W$ are unitary matrices and $A_0$ and $B_0$ are diagonal matrices of the eigenvalues of $A$ and $B$ respectively.  By the unitarily invariant property of the Frobenius norm,  $$\begin{eqnarray}\|A-B\|_F &=& \|VA_0V^*-WB_0W^*\|_F \\ &=& \|A_0-V^*WB_0W^*V\|_F \\
&=& \|A_0-UB_0U^*\|_F\end{eqnarray}$$ where $U = V^*W$ is a unitary matrix.  Thus the Hoffman-Wielandt theorem states that among all unitary matrices $U$,  $\|A_0-UB_0U^*\|_F$ is minimized (not necessarily uniquely) when $U$ is equal to a permutation matrix $P$.  Now
$$ \begin{eqnarray}\|A_0-UB_0U^*\|_F &=& tr((A_0-UB_0U^*)(A_0^*-UB_0^*U^*)) \\&=&tr(A_0A_0^*+B_0B_0^*)-tr(UB_0U^*A_0^*)\\&&-tr(A_0UB_0^*U^*)
\end{eqnarray} $$
Since $UB_0U^*A_0^*$ and $A_0UB_0^*U^*$ are conjugate of each other, this means that
$$ \|A_0-UB_0U^*\|_F = \sum_i |a_i|^2+|b_i|^2-2Re(tr(UB_0U^*A_0^*)) $$
Next note that $tr(UB_0U^*A_0^*) = \sum_{ij} b_i\overline{a}_j u_{ij}\overline{u}_{ij}$ and thus
$Re(tr(UB_0U^*A_0^*)) = \sum_{ij} Re(b_i\overline{a}_j) x_{ij}$
where $x_{ij} = |u_{ij}|^2$.  Since the rows and columns of $U$ are orthonormal vectors, this implies that
$X = \{x_{ij}\}$ is a bi-stochastic matrix and minimizing  $\|A_0-UB_0U^*\|_F$ will result in a value no lower than minimizing the linear function $\sum_{ij} Re(b_i\overline{a}_j) x_{ij}$ over the convex set of all bi-stochastic matrices $X$.  By the definition of convexity, the value of a linear function of any point in the convex set is a convex combination of the values of the function evaluated at the vertices.  This implies that the minimum value of a linear function over a convex set will be attained at a vertex.  By the Birkhoff-von Neumann theorem, the set of vertices of  the convex set of bi-stochastic matrices is exactly the set of  permutation matrices and this means that the minimum occurs when $X$ is a permutation matrix, i.e. the minimum is also achieved for $\|A_0-UB_0U^*\|_F$ when $U$ is a permutation matrix.

[1] A. J. Hoffman and H. W. Wielandt, "The variation of the spectrum of a normal matrix," Duke Mathematical Journal, vol. 20, no. 1 (1953), pp. 37-40.