L3: Orthogonal Matrices#

Conclusion 1: orthogonal subspaces#

Two subspaces are orthogonal when every vector in one is orthogonal to every vector in the other. In particular, the row space of a matrix is orthogonal to its null space.

\[Ax=0\Longrightarrow A_{i\cdot}\cdot x=0,\quad \forall i\in\{1,\dots,m\}\]

This shows that the vector \(x\) is orthogonal to every row of the matrix \(A\). The rows of the matrix span the row space, and the set of all \(x\) satisfying this equation is, by definition, the null space. Thus the row space is orthogonal to the null space.

Now we have another interesting intuition about \(Ax=0\). Suppose \(A\) has \(m\) rows and \(n\) columns with \(m\leq n\), so \(x\) lives in \(\mathbb{R}^n\). If \(A\) has full rank, i.e. all of its rows are independent, then the row space is an \(m\)-dimensional subspace of \(\mathbb{R}^n\); in the extreme case \(m=n\) it fills all of \(\mathbb{R}^n\), so no nonzero vector can be perpendicular to it and the only solution is \(x=0\). In general, the number of independent vectors \(x\) satisfying the equation, i.e. the dimension of the null space, equals \(n-\operatorname{rank}(A)\).
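As a quick numerical check of this fact (a minimal sketch; numpy/scipy and the example matrix below are my own, not from the lecture), we can compute a basis of the null space and verify that every row of \(A\) is orthogonal to it:

import numpy as np
from scipy.linalg import null_space

A = np.array([[1., 2., 3.],
              [4., 5., 6.]])   # m = 2 rows, n = 3 columns, rank(A) = 2
N = null_space(A)              # columns of N form a basis of the null space
print(N.shape[1])              # 1, i.e. n - rank(A) = 3 - 2
print(np.allclose(A @ N, 0))   # True: every row of A is orthogonal to the null space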

Understanding some geometric meanings of matrices#

A diagonal matrix stretches each axis, since it only changes the scale of each axis.

from IPython.display import Image
Image('./imgs/l3-m1.png', retina=True)
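A minimal numpy sketch of this idea (the diagonal entries and vectors below are just arbitrary examples):

import numpy as np

D = np.diag([2.0, 0.5])        # stretch the x-axis by 2, shrink the y-axis by 0.5
e1 = np.array([1.0, 0.0])
e2 = np.array([0.0, 1.0])
print(D @ e1)                  # [2. 0.]  -> scaled, direction unchanged
print(D @ e2)                  # [0. 0.5] -> scaled, direction unchanged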

An orthogonal matrix is a rotation (or reflection), since it does not change the length of any basis vector.

from IPython.display import Image
Image('imgs/l3-m2.png', width=300)
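A minimal numpy sketch (the rotation angle and vector are arbitrary examples): an orthogonal matrix satisfies \(Q^TQ=I\) and preserves lengths.

import numpy as np

theta = np.pi / 6                                  # rotate by 30 degrees
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])    # a 2x2 rotation (orthogonal) matrix
v = np.array([3.0, 4.0])
print(np.linalg.norm(v), np.linalg.norm(Q @ v))    # both 5.0: the length is preserved
print(np.allclose(Q.T @ Q, np.eye(2)))             # True: Q^T Q = I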

Fun fact#

For an orthogonal matrix \(Q\), its inverse is \(Q^{-1}=Q^T\). If a matrix \(A\) is symmetric, then its eigenvectors can be chosen to form an orthogonal matrix. Then the interesting thing happens. First of all, \(A\) can be considered as a transformation that stretches the space along the directions of the eigenvectors. Moreover, since the eigenvectors are perpendicular to each other, they can serve as a new set of axes, so the effect of \(A\) can also be achieved by rotating into those axes, stretching, and rotating back.

from IPython.display import Image
Image('imgs/l3-m3.png', retina=True)
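As a quick check of the fun fact (a minimal numpy sketch; the symmetric matrix is just an example), the eigenvector matrix returned by np.linalg.eigh is orthogonal, so its transpose is its inverse:

import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])                          # a symmetric matrix
eigvals, Q = np.linalg.eigh(A)                      # columns of Q are orthonormal eigenvectors
print(np.allclose(Q.T @ Q, np.eye(2)))              # True: Q is orthogonal, so Q^{-1} = Q^T
print(np.allclose(Q @ np.diag(eigvals) @ Q.T, A))   # True: A = Q Lambda Q^T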

Important

Thus, \(S=Q\Lambda Q^T\) can be understood as: any symmetric matrix can be considered as a transformation made of three steps (see the numpy sketch after this list):

  1. Rotate the axes by \(Q^T\); this transforms a vector from the old coordinates to the new (eigenvector) coordinates.

  2. Stretch by \(\Lambda\); each new axis is scaled by the corresponding eigenvalue.

  3. Rotate back by \(Q\); this transforms the vector from the new coordinates back to the old coordinates.
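A minimal numpy sketch of these three steps (the symmetric matrix and the vector are arbitrary examples):

import numpy as np

S = np.array([[2.0, 1.0],
              [1.0, 2.0]])                 # a symmetric matrix
eigvals, Q = np.linalg.eigh(S)             # S = Q Lambda Q^T
x = np.array([1.0, 3.0])

x_new = Q.T @ x                            # step 1: rotate into eigenvector coordinates
x_stretched = eigvals * x_new              # step 2: stretch each new axis by its eigenvalue
x_back = Q @ x_stretched                   # step 3: rotate back to the old coordinates
print(np.allclose(x_back, S @ x))          # True: the three steps reproduce S @ x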

This is awesome: a complicated transformation can be achieved by a few simple rotations and stretches. Specifically, the conditions for such a decomposition are:

The sum of the dimensions of its eigenspaces equals the dimension \(n\) of the space, and the matrix is symmetric.

What’s the meaning of the first condition? In my understanding, it is the same as saying the matrix has \(n\) independent eigenvectors, i.e. it is diagonalizable. But ChatGPT does not agree with that, and I still have not understood it completely.

If the matrix is not symmetric but still satisfies the first part of the above condition, we can still decompose it, just in a less beautiful way: \(S=X\Lambda X^{-1}\), where the columns of \(X\) are the eigenvectors.
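A minimal numpy sketch of this case (the non-symmetric but diagonalizable matrix below is just an example):

import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])                            # not symmetric, but has 2 independent eigenvectors
eigvals, X = np.linalg.eig(A)                         # columns of X are the eigenvectors
Lam = np.diag(eigvals)
print(np.allclose(X @ Lam @ np.linalg.inv(X), A))     # True: A = X Lambda X^{-1}
print(np.allclose(X.T @ X, np.eye(2)))                # False: here X is not orthogonal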

Decomposition with fewer conditions#

First, let’s have a look at a rectangular matrix of size \(2\times 3\).

from IPython.display import Image
Image('imgs/l3-m4.png', retina=True)

You can see that the dimension of the vector has been reduced from \(3\) to \(2\). Where does the third dimension go? In fact, the value of the third dimension contributes to the values of the result vector; it does not just disappear. Instead, the information from the third dimension is mixed into the two dimensions of the result. (In this particular picture, though, the "dimension eraser" assigns weight zero to the third dimension, which means it simply discards it.)

from IPython.display import Image
Image('imgs/l3-m5.png', retina=True)

What happens when we apply a "dimension adder"? Where does the third dimension come from? In fact, the third dimension of the result vector is a weighted sum of the two dimensions of the original vector (which only has two dimensions). (In this particular picture, though, the dimension adder uses zero weights, so the new third dimension is simply zero.)
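A minimal numpy sketch of the zero-weight "eraser" and "adder" (the exact matrices below are my own assumption of what the figures show):

import numpy as np

eraser = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0]])        # 2x3: maps R^3 -> R^2, third dimension gets weight 0
adder = np.array([[1.0, 0.0],
                  [0.0, 1.0],
                  [0.0, 0.0]])              # 3x2: maps R^2 -> R^3, new third dimension is 0

v3 = np.array([1.0, 2.0, 3.0])
print(eraser @ v3)                          # [1. 2.]    -> the third component is discarded
v2 = np.array([1.0, 2.0])
print(adder @ v2)                           # [1. 2. 0.] -> a zero third component is added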

SVD: singular value decomposition#

Recall the definition of diagonalization, \(S=Q\Lambda Q^T\), where the columns of \(Q\) are the eigenvectors of \(S\). But how can we get eigenvectors from a rectangular matrix \(S\)? An amazing trick is to take the eigenvectors of the symmetric matrix \(SS^T\). Thus, the columns of \(Q\) are the eigenvectors of \(SS^T\).

What does the \(Q^T\) factor correspond to in this case? It comes from the eigenvectors of \(S^TS\): the rows of this factor (equivalently, the columns of its transpose) are the eigenvectors of \(S^TS\).

What does \(\Lambda\) correspond to in this case? A rectangular matrix does not have eigenvalues, but it has singular values. These are the square roots of the eigenvalues of \(SS^T\) (equivalently of \(S^TS\)), sorted in descending order; the nonzero eigenvalues of these two matrices are the same, and the remaining eigenvalues are zero (see the following pictures). Thus, the diagonal entries of \(\Lambda\) are the singular values of \(S\).

from IPython.display import Image
Image('imgs/l3-m6.png', width=800)
Image('imgs/l3-m7.png', width=300)

Important

Thus, the SVD \(S=U\Sigma V^*\) can also be considered as a sequence of simple transformations:

  1. Rotate with \(V^*\).

  2. Stretch with \(\Sigma\).

  3. Rotate with \(U\).

The only difference between the SVD and diagonalization is that the SVD might raise or reduce the dimension of the vector.

from IPython.display import Image
Image('imgs/l3-m8.png', width=800)
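A minimal numpy sketch tying these pieces together (the rectangular matrix is an arbitrary example): the singular values equal the square roots of the eigenvalues of \(SS^T\), and \(U\Sigma V^*\) reconstructs \(S\).

import numpy as np

S = np.array([[3.0, 2.0, 2.0],
              [2.0, 3.0, -2.0]])              # a 2x3 rectangular matrix
U, sigma, Vt = np.linalg.svd(S)               # S = U Sigma V^*

# the singular values are the square roots of the eigenvalues of S S^T (and of S^T S)
eigvals = np.linalg.eigvalsh(S @ S.T)[::-1]   # eigenvalues of S S^T, descending
print(np.allclose(sigma, np.sqrt(eigvals)))   # True

# rebuild Sigma as a 2x3 matrix and check the reconstruction
Sigma = np.zeros_like(S)
Sigma[:2, :2] = np.diag(sigma)
print(np.allclose(U @ Sigma @ Vt, S))         # True: S = U Sigma V^*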

Interesting problem: projected vector#

Given a vector \(x\), we want to project it onto a line; assume there is a vector \(v\) lying on that line. What is the result of the projection?

First, normalize \(v\) to get \(u=v/||v||\). Then the length of the projection of \(x\) onto the line is \(u^T x\). Finally, the projection vector is \(u\, u^T x\).
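A minimal numpy sketch of this projection (the vectors are arbitrary examples):

import numpy as np

x = np.array([3.0, 4.0])
v = np.array([1.0, 1.0])                  # any vector on the line

u = v / np.linalg.norm(v)                 # unit vector along the line
length = u @ x                            # length of the projection of x onto the line
proj = u * (u @ x)                        # the projection vector: u (u^T x)
print(length, proj)                       # 4.9497..., [3.5 3.5]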