In problem 7.12 we are asked to find eigenvectors and eigenvalues of a matrix.
For both $\epsilon=0$ and $\epsilon\neq0$ the matrix is block diagonal: it consists of square blocks of possibly nonzero elements centered along the diagonal, with every element outside these blocks equal to zero. Pictorially, a block diagonal matrix has the form:
$$
\begin{pmatrix}
\begin{matrix} a & \cdots & b \\ \vdots & \ddots & \vdots \\ c & \cdots & d \\ \end{matrix} & & \\
& \begin{matrix} e & \cdots & f \\ \vdots & \ddots & \vdots \\ g & \cdots & h \\ \end{matrix} & \\
& & \ddots \\
\end{pmatrix}
$$
where all blank elements are zero.
In mathematical terms:
an $N \times N$ matrix $A_{mn}$ is block diagonal if and only if
there exists an increasing sequence of indices $l_1<l_2<\ldots<l_L$ with $0<l_i<N$ and $0<L<N$ such that $A_{mn}$ is necessarily zero unless
there exists a $k$ with $0 \lt k \leq L+1$ such that
both $l_{k-1} \lt n \leq l_k$ and $l_{k-1} \lt m \leq l_k$ ($l_0 = 0$ and $l_{L+1}=N$).
That is a mouthful, but if you think about it for a while you'll see it is simply the condition that the coordinate $(m,n)$ belongs to some square centered along the diagonal, just like in the picture above.
Block diagonal matrices are easier to diagonalize, because each block can be handled separately.
We begin by splitting up $A_{mn}$ into $L+1$ "submatrices" $A^{(k)}_{ij}$, $k = 1,\ldots,L+1$, where:
$$
A^{(k)}_{ij} := A_{i+l_{k-1},j+l_{k-1}}
$$
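As a concrete illustration, here is a minimal numerical sketch (my own addition, assuming `numpy` and `scipy` are available; the matrices are made up) that builds a block diagonal matrix out of two hermitean blocks and recovers the submatrices $A^{(k)}$ by slicing along the diagonal:

```python
import numpy as np
from scipy.linalg import block_diag

# Two hermitean blocks A^(1) (2x2) and A^(2) (3x3)
A1 = np.array([[2.0, 1.0],
               [1.0, 3.0]])
A2 = np.array([[1.0, 0.5, 0.0],
               [0.5, 2.0, 0.5],
               [0.0, 0.5, 1.0]])

# Full 5x5 block diagonal matrix: here N = 5, L = 1, l_1 = 2
A = block_diag(A1, A2)

# Recover A^(k)_{ij} = A_{i + l_{k-1}, j + l_{k-1}} by slicing
l = [0, 2, 5]  # l_0 = 0, l_1 = 2, l_2 = N = 5
blocks = [A[l[k-1]:l[k], l[k-1]:l[k]] for k in (1, 2)]
assert np.allclose(blocks[0], A1) and np.allclose(blocks[1], A2)
```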
Q1: a)
For an eigenvector $\left[ c_{s,1}^{(k)}, \ldots, c_{s,K}^{(k)}\right]$ of the submatrix $A^{(k)}$ ($K = l_{k}-l_{k-1}$) with eigenvalue $\lambda^{(k)}_s$ ($s = 1,\ldots,K$), check that we can make an eigenvector $\vec{d}_s^{(k)}$ of the full matrix $A$ with the same eigenvalue by the rule:
$$
d^{(k)}_{s,n} := \begin{cases} c^{(k)}_{s,n-l_{k-1}} & l_{k-1} \lt n \leq l_k \\ 0 & \text{else} \end{cases}
$$
b)
Show that the full set of $\vec{d}^{(k)}_s$, with $s = 1,\ldots,K$ and $k = 1,\ldots,L+1$, constitutes an eigenbasis of the full matrix $A$.
A1: b)
Each submatrix $A^{(k)}$ gives $K = l_k - l_{k-1}$ eigenvectors, and the total count telescopes:
$$
\sum_{k=1}^{L+1} K =
\sum_{k=1}^{L+1} \left( l_{k} - l_{k-1} \right) =
l_{L+1} - l_0 =
N - 0 =
N
$$
Eigenvectors belonging to different blocks are nonzero on disjoint index ranges, and the eigenvectors of each hermitean submatrix are linearly independent, so these $N$ vectors are linearly independent and hence form an eigenbasis of $A$.
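To make the counting argument concrete, here is a numerical check (same made-up matrices and `numpy`/`scipy` assumptions as in the earlier sketch) that the padded eigenvectors $\vec{d}^{(k)}_s$ from Q1 really do constitute an eigenbasis:

```python
import numpy as np
from scipy.linalg import block_diag

A1 = np.array([[2.0, 1.0], [1.0, 3.0]])
A2 = np.array([[1.0, 0.5, 0.0], [0.5, 2.0, 0.5], [0.0, 0.5, 1.0]])
A, l, N = block_diag(A1, A2), [0, 2, 5], 5

basis = []
for k, Ak in enumerate((A1, A2), start=1):
    lams, C = np.linalg.eigh(Ak)                 # diagonalize the submatrix A^(k)
    for s in range(Ak.shape[0]):
        d = np.zeros(N)
        d[l[k-1]:l[k]] = C[:, s]                 # the zero-padding rule from Q1
        assert np.allclose(A @ d, lams[s] * d)   # d is an eigenvector of A
        basis.append(d)

# N eigenvectors in total, and they are linearly independent
assert np.linalg.matrix_rank(np.column_stack(basis)) == N
```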
separability
If you are like me you don't really feel like you're understanding anything by shuffling a bunch of matrix indices around.
You may then like to see a more compact if perhaps more abstract presentation of the concept of "block diagonalization", where we see in the simplest possible terms the sense in which an operator may "split" a larger space into smaller subspaces.
direct sums
Given any two hilbert spaces $V_1$ and $V_2$ we may form another hilbert space $V_1 \oplus V_2$, called the direct sum of $V_1$ and $V_2$.
As a set $V_1 \oplus V_2$ is just the "cartesian product" $V_1 \times V_2$, i.e. pairs of vectors $(v_1,v_2), v_1 \in V_1, v_2 \in V_2$.
We define:
vector addition on this set according to the rule:
$$
(v_1,v_2) + (w_1, w_2) := (v_1 + w_1, v_2 + w_2)
$$
where $v_1,w_1 \in V_1$ and $v_2, w_2 \in V_2$, scalar multiplication by the rule $c(v_1,v_2) := (cv_1,cv_2)$, and the inner product according to the rule:
$$
\ip{(v_1,v_2)}{(w_1,w_2)} := \ip{v_1}{w_1} + \ip{v_2}{w_2}
$$
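For finite-dimensional spaces the direct sum is easy to model in code as a pair of arrays. The following sketch (my own illustration, assuming `numpy`) implements the addition and inner product rules above:

```python
import numpy as np

def ds_add(p, q):
    """(v1, v2) + (w1, w2) := (v1 + w1, v2 + w2)"""
    return (p[0] + q[0], p[1] + q[1])

def ds_ip(p, q):
    """<(v1, v2), (w1, w2)> := <v1, w1> + <v2, w2>"""
    # np.vdot conjugates its first argument, matching the physics convention
    return np.vdot(p[0], q[0]) + np.vdot(p[1], q[1])

v = (np.array([1.0, 2.0]), np.array([3.0]))  # an element of R^2 (+) R^1
w = (np.array([0.0, 1.0]), np.array([2.0]))
print(ds_add(v, w))   # componentwise sum
print(ds_ip(v, w))    # 1*0 + 2*1 + 3*2 = 8.0
```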
Conversely, we can always go the other way, i.e. we can "break up" a larger hilbert space $V$ into a direct sum $W \oplus W^{\perp},$ where $W$ is some subspace of $V$ and $W^{\perp}$ is its orthogonal complement.
Q2: Show that the map $\phi: V \to W \oplus W^{\perp}$ defined by the rule
$$\phi(v) := \left( Pv, \left( 1 - P \right)v \right)$$
where $P$ projects a vector $v \in V$ onto the subspace $W$, has the following properties:
a) $\phi$ is linear
b) $\phi$ is invertible
c) $\phi$ preserves the inner product, i.e.
$$
\ip{\phi(v')}{\phi(v)} = \ip{v'}{v}
$$
for any $v,v' \in V$.
With these three properties we establish that the vector $\phi(v)$ is equivalent to $v$ in all respects pertaining to its nature as an element of a hilbert space.1
A2: b)
We can construct the inverse $\phi^{-1}: W \oplus W^{\perp} \to V$ by the simple rule $\phi^{-1}( w, w_{\perp} ) := w + w_{\perp}$ and check that
$$
\phi^{-1} \circ \phi (v) =
\phi^{-1}(Pv,(1-P)v) =
Pv + (1-P)v =
(P + 1 - P)v =
v
$$
and vice versa:
$$
\phi \circ \phi^{-1}( w, w_{\perp} ) =
\phi( w + w_{\perp} ) =
\left( P(w + w_{\perp}), \left( 1 - P \right) \left( w + w_{\perp} \right) \right) =
\left( w, w + w_{\perp} - w \right) =
\left( w, w_{\perp} \right)
$$
by the definition of the projection operator.
c)
\begin{align}
& \ip{\phi(v')}{\phi(v)} \\ & =
\ip{(Pv',v' - Pv')}{(Pv,v - Pv)} \\ & =
\ip{Pv'}{Pv} + \ip{v'-Pv'}{v-Pv} \\ & =
\ip{Pv'}{Pv} + \ip{v'}{v} - \ip{Pv'}{v} - \ip{v'}{Pv} + \ip{Pv'}{Pv} \\ (!) & =
\ip{v'}{Pv} + \ip{v'}{v} - \ip{v'}{Pv} - \ip{v'}{Pv} + \ip{v'}{Pv} \\ & =
\ip{v'}{v}
\end{align}
where in the step marked with the $(!)$ we make use of the following properties of all projection operators:
$P^2 = P$ (idempotence)
$P = P^{\dagger}$ (hermiticity)
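All of A2 can also be verified numerically. Here is a minimal sketch (my own, assuming `numpy`; the subspace is random) that builds a projector $P$ onto a subspace $W$ and checks properties b) and c) of $\phi$:

```python
import numpy as np

rng = np.random.default_rng(0)
N, K = 5, 2                               # dim V = 5, dim W = 2

# Orthonormal basis of a random subspace W, and the projector P onto W
Q, _ = np.linalg.qr(rng.standard_normal((N, K)))
P = Q @ Q.T                               # idempotent and hermitean

phi     = lambda v: (P @ v, v - P @ v)    # v -> (Pv, (1 - P)v)
phi_inv = lambda w, w_perp: w + w_perp    # the inverse from A2 b)

v, vp = rng.standard_normal(N), rng.standard_normal(N)

# b) phi_inv . phi is the identity
assert np.allclose(phi_inv(*phi(v)), v)

# c) phi preserves the inner product (using the direct sum inner product)
w1, w2 = phi(v)
u1, u2 = phi(vp)
assert np.isclose(u1 @ w1 + u2 @ w2, vp @ v)
```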
Q3: Suppose for some hermitean operator $A$ we have some subspace $V_A$ which is closed under the action of $A$ so that $a \in V_A$ implies $A a \in V_A$.
Show that the orthogonal complement $V_A^{\perp}$ is also closed under the action of $A$.
A3:
For any $a_{\perp} \in V_A^{\perp}$ and any $a \in V_A$ we have:
$$
\ip{a}{A a_{\perp}} = \ip{A a}{a_{\perp}} = 0
$$
where the first equality uses the hermiticity of $A$, and the second holds since $A a \in V_A$ and $a_{\perp} \in V_A^{\perp}$ is by definition orthogonal to every element of $V_A$. So $A a_{\perp}$ is orthogonal to all of $V_A$, i.e. $A a_{\perp} \in V_A^{\perp}$.
Therefore $V^{\perp}_A$ is also closed under the action of $A$.
Q4: Show that the action of $A$ on the vector $(a,b) \in V_A \oplus V_A^{\perp}$ takes the special form:
$$
A(a,b) = \left( A_1 \oplus A_2 \right)(a,b)
$$
where $A_1$ is an operator on $V_A$ and $A_2$ is an operator on $V_A^{\perp}$, and in general for an operator $A$ on $V$ and an operator $B$ on $W$ we take $A \oplus B$ to mean:
$$
\left( A \oplus B \right)(v, w) := (Av, Bw)
$$
where $v \in V, w \in W$.
Loosely speaking we would say that $A$ then "splits" into $A_1 \oplus A_2$.
Strictly speaking, $A$ is only defined to operate on $V$, but we can "translate" it to operate on $V_A \oplus V_A^{\perp}$ as $\tilde{A} := \phi \circ A \circ \phi^{-1}$:
A4:
$$
\tilde{A}(a,b) = \phi \circ A \circ \phi^{-1}(a,b) = \phi \circ A (a + b) = \left( P A (a + b), A(a + b) - P A(a + b) \right) = ( Aa, Ab )
$$
The key here is that since the action of $A$ is closed on $V_A$ and $V_A^{\perp}$, it is perfectly valid to consider $A$ to be an operator on either space separately.
Note that an arbitrary operator $B$ on a space $V_1 \oplus V_2$ cannot in general be split into two operators $B_1$ and $B_2$ as we've done above (why?).
Only a special class of operators have this property, in the same way that only a subset of tensors can be written as a single tensor product.
In our case $A$ "splits" because of the special closure properties of $V_A$ and $V_A^{\perp}$.
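Here is a numerical sketch of the splitting (my own illustration, same `numpy` assumptions): take a real symmetric $A$, let $V_A$ be the span of some of its eigenvectors (automatically closed under $A$), and check that in any orthonormal basis adapted to $V_A \oplus V_A^{\perp}$ the matrix of $A$ is block diagonal:

```python
import numpy as np

rng = np.random.default_rng(1)
N, K = 5, 2
M = rng.standard_normal((N, N))
A = (M + M.T) / 2                          # a hermitean (real symmetric) operator

# A subspace closed under A: the span of K of its eigenvectors
_, V = np.linalg.eigh(A)
W, W_perp = V[:, :K], V[:, K:]             # orthonormal bases of V_A and V_A^perp

# Mix each basis internally so the blocks don't come out trivially diagonal
R1, _ = np.linalg.qr(rng.standard_normal((K, K)))
R2, _ = np.linalg.qr(rng.standard_normal((N - K, N - K)))
U = np.hstack([W @ R1, W_perp @ R2])       # adapted orthonormal basis of V

A_tilde = U.T @ A @ U
assert np.allclose(A_tilde[:K, K:], 0)     # A maps V_A into V_A
assert np.allclose(A_tilde[K:, :K], 0)     # and V_A^perp into V_A^perp (Q3)
A1, A2 = A_tilde[:K, :K], A_tilde[K:, K:]  # the split operators A_1 and A_2
```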
Q5: Show that any map $\phi: V \to W$ with the properties listed in question Q2 preserves eigenvectors and eigenvalues so that if an operator $A$ on $V$ has eigenvectors $\ket{a_n}$ with eigenvalues $a_n$ then the operator $\tilde{A} \equiv \phi \circ A \circ \phi^{-1}$ on $W$ has eigenvectors $\phi(\ket{a_n})$ with the same eigenvalues.
Q6: Show that the eigenvectors of $A$ on $V$ are given, under the identification $\phi$, by the union $\{(e_1,0),(e_2,0),\ldots\} \cup \{(0,f_1),(0,f_2),\ldots\}$, where the $e_i$ ($f_j$) are eigenvectors of $A$ restricted to $V_A$ ($V_A^{\perp}$).
Note that there is nothing preventing us from identifying smaller closed subspaces inside of the $V_A$ or $V_A^{\perp}$ themselves to split them as well, so that in general we can break up our original hilbert space $V \to V_1 \oplus V_2 \oplus \ldots$ into smaller and smaller subspaces.
As the subspaces become smaller it becomes in general easier to determine the eigenvectors and eigenvalues.
The decomposition can continue until each subspace $V_n$ is one-dimensional.
What can we say about elements of these $V_n$ then?
putting it together
Q7: Connect the concept of hilbert space splitting with block diagonal matrices.
Suppose that in the basis $\{e_1,e_2,\ldots,e_N\}$ the hermitean operator $A$ is block diagonal.
a)
What does this mean?
In what sense can we associate an operator with a matrix?
b)
Find a subspace that is closed under the action of $A$.
c)
Conversely given a hermitean operator $A$ and a subspace $V_A$ that is closed under $A$, find a basis in which $A$ is block diagonal.
symmetry and block diagonality
Q8: Suppose we have on a hilbert space $V$ two commuting hermitean operators $A$ and $B$, i.e. $[A,B]=0$.
Describe how, given the spectrum (eigenvectors and eigenvalues) of $A$, we might block diagonalize (or equivalently, split $V$ with respect to) $B$.
(Hint: What can you say about the action of an operator on an eigenvector of a commuting operator?)
A8:
The action of $B$ on an eigenvector $\ket{a}$ of $A$ will produce another eigenvector of $A$ with the same eigenvalue, since $A \left( B\ket{a} \right) = B A \ket{a} = a \left( B\ket{a} \right)$.
Therefore the action of $B$ is closed on any degenerate subspace of $A$!
This implies that we can split $V$ with respect to $B$ if any two of the eigenvectors of $A$ differ in eigenvalue.2
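A quick numerical demonstration (my own, with made-up operators, assuming `numpy`/`scipy`): we cook up a commuting pair $A$, $B$ by hand, then verify that in whatever eigenbasis `eigh` returns for $A$, the matrix of $B$ comes out block diagonal over the degenerate eigenspaces of $A$:

```python
import numpy as np
from scipy.linalg import block_diag

rng = np.random.default_rng(2)

# A with degenerate eigenvalues (0, 0, 1, 1, 1) in a random orthonormal basis Q
Q, _ = np.linalg.qr(rng.standard_normal((5, 5)))
A = Q @ np.diag([0.0, 0.0, 1.0, 1.0, 1.0]) @ Q.T

# A hermitean B that commutes with A: closed on each eigenspace of A
M1, M2 = rng.standard_normal((2, 2)), rng.standard_normal((3, 3))
B = Q @ block_diag(M1 + M1.T, M2 + M2.T) @ Q.T
assert np.allclose(A @ B - B @ A, 0)             # [A, B] = 0

# In the eigenbasis of A (whatever basis eigh picks inside each degenerate
# eigenspace), B is block diagonal with blocks matching the degeneracies
lam, V = np.linalg.eigh(A)                       # eigenvalues sorted: 0, 0, 1, 1, 1
B_tilde = V.T @ B @ V
assert np.allclose(B_tilde[:2, 2:], 0) and np.allclose(B_tilde[2:, :2], 0)
```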
Q9: Suppose we have three mutually commuting hermitean operators, $A$, $B$, and $C$.
How can we exploit their commutativity to help us block diagonalize $C$? What about an arbitrary number of mutually commuting operators?
A9:
As we saw in the previous problem, $C$ splits into $C_1 \oplus C_2 \oplus \ldots$ where $C_n$ is the action of $C$ on $V_A^n$, the $n^{\text{th}}$ degenerate eigenspace of $A$.
However, $B$ also commutes with $A$, and so also similarly splits into $B_1 \oplus B_2 \oplus \ldots$.
Since both $B$ and $C$ are closed on $V_A^n$, the commutator $[B_n,C_n]$ is just the restriction of $[B,C]$ to $V_A^n$, which is zero by assumption.
This means that $C_n$ splits into $C_{n1} \oplus C_{n2} \oplus \ldots$, where $C_{nm}$ is the action of $C$ on $V_{AB}^{nm}$, the $m^{\text{th}}$ degenerate eigenspace of $B$ inside $V_A^n$.
$C$ therefore block diagonalizes in a basis $\ket{ a_i b_j }$ of states that are simultaneously eigenstates of $A$ and $B$, and the blocks are composed of states that have the same eigenvalues with respect to both $A$ and $B$.
The blocks are therefore strictly smaller3, and the diagonalization problem thus easier to solve, than if we had split $C$ with respect to $A$ or $B$ alone.
If we are given, say, $N+1$ mutually commuting operators $A_1, A_2, \ldots, A_N, B$ on a hilbert space $V$ we find that we can split $B$ into a great number of operators $B_{n\ldots m}$ which act on tiny subspaces $V_{1 \ldots N}^{n \ldots m}$ obtained by:
using $A_1$ to split $V$ into $V_1^1 \oplus V_1^2 \oplus \cdots$ and then
using $A_2$ to split $V_1^n$ into $V_{12}^{n1} \oplus V_{12}^{n2} \oplus \cdots$ and then
… and then
using $A_N$ to split $V_{1 \ldots N-1}^{n\ldots n'}$ into $V_{1 \ldots (N-1) N}^{n \ldots n' 1} \oplus V_{1\ldots (N-1) N}^{n \ldots n' 2} \oplus \cdots$
In other words $B$ block diagonalizes in a basis of states that are simultaneously eigenstates of each of the $A_1,A_2,\ldots,A_N$, with the blocks consisting of states that have the same eigenvalues with respect to each of the $A_1,A_2,\ldots,A_N$.
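To see the iterated splitting in action, one more sketch (my own, with made-up operators, same `numpy`/`scipy` assumptions): three mutually commuting operators on a six-dimensional space, where splitting by $A$ alone would leave a $4 \times 4$ block but splitting by $A$ and $B$ together leaves only $2 \times 2$ blocks:

```python
import numpy as np
from scipy.linalg import block_diag

rng = np.random.default_rng(3)
Q, _ = np.linalg.qr(rng.standard_normal((6, 6)))   # a random orthonormal basis

a = np.array([0., 0., 0., 0., 1., 1.])             # eigenvalues of A (4-fold degenerate)
b = np.array([0., 0., 1., 1., 0., 0.])             # eigenvalues of B
A = Q @ np.diag(a) @ Q.T
B = Q @ np.diag(b) @ Q.T

# C commutes with both: block diagonal over the joint eigenspaces of (A, B)
S = [rng.standard_normal((2, 2)) for _ in range(3)]
C = Q @ block_diag(*[s + s.T for s in S]) @ Q.T
for X, Y in [(A, B), (A, C), (B, C)]:
    assert np.allclose(X @ Y - Y @ X, 0)           # mutually commuting

# A generic combination A + pi*B has the joint eigenspaces of (A, B) as its
# eigenspaces, so eigh hands us a simultaneous eigenbasis of A and B
lam, V = np.linalg.eigh(A + np.pi * B)
C_tilde = V.T @ C @ V
for i in (0, 2, 4):                                # only 2x2 blocks survive
    assert np.allclose(C_tilde[i:i+2, :i], 0)
    assert np.allclose(C_tilde[i:i+2, i+2:], 0)
```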
As an aside I should mention that if a set of mutually commuting observables $A_1,\ldots,A_N$ on a hilbert space $V$ has the property that all the degenerate subspaces $V_{1 \ldots N}^{n \ldots m}$ are one-dimensional, then they constitute what is called a complete set of commuting observables (CSCO).
From these observables a basis of $V$ can then be constructed of states $\ket{ a_{1n} \ldots a_{Nm} }$ that are simultaneous eigenstates of the $A_1,\ldots,A_N$, and their associated sequence of eigenvalues $a_{1n},\ldots,a_{Nm}$ is unique, i.e. a basis vector is identified by its eigenvalues with respect to the CSCO.
We obtain thus the appealing physical interpretation of a general quantum state as a superposition of states that return a definite and unique result upon simultaneous measurement of the CSCO.
Q10*: List a CSCO for the:
a) particle in a box
b) 2-dimensional harmonic oscillator
c) non-relativistic hydrogen atom
A10*:
a)
The spectrum is nondegenerate, so the hamiltonian $H$ alone constitutes a CSCO, i.e. the energy eigenstates form a basis and are uniquely identified by their quantum number $n$.
b)
The numbers of energy quanta $n_x,n_y$ in each of two orthogonal directions uniquely identify an energy eigenstate.
The pair of number operators $\hat{a}^+_x\hat{a}^-_x$ and $\hat{a}^+_y\hat{a}^-_y$ then comprises a CSCO.
Check that these operators do commute, as required.
c)
States are uniquely defined by their $n$, $l$, and $m$ quantum numbers (ignoring spin).
So we can construct a CSCO $\{H,L^2,L_z\}$ from the hamiltonian $H$, the total angular momentum $L^2$, and the $z$-projection $L_z$.
Again, how do we know that these operators necessarily commute?
footnotes
1We would say that $\phi$ is an isomorphism between $V$ and $W \oplus W^{\perp}$.
2
The distinctness of eigenvalues is a necessary criterion; otherwise we could use the identity operator, which commutes with everything, to help us block diagonalize our matrices.
Unfortunately, the identity operator has only one eigenvalue (what is it?).
3
Or rather at most the same size. Check the case where $B = \lambda A$ for some constant $\lambda \in \C$.