# Matrix projection blues

Hi, this is about a lecture on the MIT OpenCourseWare site: Lecture 15: Projections onto subspaces

[Edited: the code on sagecell sagemath org was not the right one! It was the first version I wrote. When I updated the code in the cell, that did not work, so I had to open a new cell.]

code on sagecell sagemath org

Q1:
p is the projection of the vector b onto the vector a.
Below, all letters are matrices.
E=B-P with P=X*A
E=B-X*A. Since E is perpendicular to A, dot product(A,E) = 0, which means A.transpose()*E=0
Then A^T*(B-X*A)=0, so A^T*(X*A)=A^T*B (and this is wrong!)
But Gilbert Strang writes X*A^T*A=A^T*B (and this is correct!)
But we are not allowed to replace A^T*X*A with X*A^T*A if X is a matrix!
I know I made a mistake in my reasoning, but I cannot find it!

Q2: Why must I write   X=A * (A.transpose())/(( (A.transpose()) * A ).det())
instead of             X=A * (A.transpose())/(( (A.transpose()) * A ))?
Why do I need to add .det()?


I know the first question is a basic math question, not a SageMath question, sorry. Tell me if I should post it on a math forum instead. The code in the link is now my latest code.

Q1. In his lecture, Prof. Strang begins with the simplest case: the orthogonal projection of a vector $\mathbf{b}$ onto a one-dimensional subspace $S=\mathrm{span}\langle\mathbf{a}\rangle$, $\mathbf{a}$ being a nonzero vector in $\mathbb{R}^n$. The projection vector $\mathbf{p}$ should be a multiple of $\mathbf{a}$, so Prof. Strang writes it as $\mathbf{p}=x\mathbf{a}$. Here $x$ is a scalar, not a matrix, and $\mathbf{a}$ is a vector, identified with an $n\times 1$ matrix. He wants to deduce that $\mathbf{p}=P\mathbf{b}$, where $P$ is the projection matrix $P=\frac{1}{\mathbf{a}^T\mathbf{a}}\mathbf{a}\mathbf{a}^T,$ which is an $n\times n$ matrix. To this end, Prof. Strang writes the projection vector as $\mathbf{p}=\mathbf{a}\,x$. Now, you should see $\mathbf{p}$ and $\mathbf{a}$ as $n\times1$ matrices and $x$ as a one-component vector or a $1\times1$ matrix, so that $\mathbf{a}$ and $x$ can be multiplied. Thus, we have $x\,\mathbf{a}=\mathbf{a}\,x.$ This identity does not mean that $\mathbf{a}$ and $x$ are two matrices that commute: the left side is a scalar times a vector, while the right side is an $n\times1$ matrix times a $1\times1$ matrix. The notation can admittedly be a bit misleading. Perhaps it would be better to write $(x)$ instead of $x$ when $x$ should be seen as a $1\times 1$ matrix (parentheses are used to delimit matrices). For example, $4\begin{pmatrix} 1 \\ 2 \\ 3\end{pmatrix} =\begin{pmatrix} 4 \\ 8 \\ 12\end{pmatrix} =\begin{pmatrix} 1 \\ 2 \\ 3\end{pmatrix}(4).$

The advantage of expressing $\mathbf{p}$ as $\mathbf{a}x$ is that generalization is then possible. If $S$ is spanned by the linearly independent vectors $\mathbf{a}_1,\ldots,\mathbf{a}_k$, the orthogonal projection $\mathbf{p}$ of a vector $\mathbf{b}$ onto $S$ should be a linear combination of $\mathbf{a}_1,\ldots,\mathbf{a}_k$, that is, $\mathbf{p}=x_1\mathbf{a}_1+\cdots+x_k\mathbf{a}_k=AX$ with $A=\left(\mathbf{a}_1 \vert\ldots \vert\mathbf{a}_k\right)$ and $X=\begin{pmatrix}x_1 \\ \vdots \\ x_k\end{pmatrix}.$ Since the error $\mathbf{b}-AX$ must be orthogonal to every column of $A$, we get $A^T(\mathbf{b}-AX)=0$, that is, $A^TAX=A^T\mathbf{b}$, so $X=(A^TA)^{-1}A^T\mathbf{b}$. The reasoning in Prof. Strang’s lecture would then show that $\mathbf{p}=P\mathbf{b}$, where $P$ is now the projection matrix $P=A(A^TA)^{-1}A^T.$
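To make the general case concrete, here is a minimal sketch in plain Python (not SageMath) that solves $A^TAX=A^T\mathbf{b}$ for a subspace spanned by two vectors. The matrix `A`, the vector `b`, and the helper names `transpose`, `matmul` and `solve_2x2` are illustrative choices of mine, not taken from the lecture or from the linked code, and `solve_2x2` only covers the case $k=2$.

```python
from fractions import Fraction

def transpose(M):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*M)]

def matmul(M, N):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(m * n for m, n in zip(row, col)) for col in zip(*N)]
            for row in M]

def solve_2x2(M, rhs):
    """Solve M*X = rhs by Cramer's rule (M is 2x2, rhs is 2x1)."""
    (a, b), (c, d) = M
    det = a * d - b * c
    r1, r2 = rhs[0][0], rhs[1][0]
    return [[Fraction(r1 * d - b * r2, det)],
            [Fraction(a * r2 - r1 * c, det)]]

# Columns of A are the spanning vectors a_1, a_2 (assumed example data).
A = [[1, 0],
     [1, 1],
     [1, 2]]
b = [[6], [0], [0]]

At = transpose(A)
# Normal equations: (A^T A) X = A^T b. Here X is a 2x1 matrix, not a scalar.
X = solve_2x2(matmul(At, A), matmul(At, b))
p = matmul(A, X)                                  # projection p = A X
e = [[bi[0] - pi[0]] for bi, pi in zip(b, p)]     # error e = b - p
residual = matmul(At, e)                          # should be the zero vector

print("X =", X)
print("p =", p)
print("A^T e =", residual)
```

Note that $X$ multiplies $A$ on the right ($AX$), which is why no commutation is ever needed in the derivation.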

Returning to your question, you write P=X*A where, in fact, you should write $\mathbf{p}=x\mathbf{a}$ and consider $x$ to be a scalar, not a matrix. Consequently, since $x$ is a scalar, from $\mathbf{a}^T(\mathbf{b}-\mathbf{p})=0$, one has $\mathbf{a}^T\mathbf{b}=\mathbf{a}^T\mathbf{p}=\mathbf{a}^T(x\mathbf{a})=x\mathbf{a}^T\mathbf{a}.$ Hence $x=\frac{\mathbf{a}^T\mathbf{b}}{\mathbf{a}^T\mathbf{a}}.$
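As a quick numerical check of this formula, the following plain-Python sketch (again not SageMath) computes the scalar $x$, the projection $\mathbf{p}=x\mathbf{a}$, and the matrix $P=\frac{1}{\mathbf{a}^T\mathbf{a}}\mathbf{a}\mathbf{a}^T$, and verifies that the error is orthogonal to $\mathbf{a}$. The vectors `a` and `b` and the helper `dot` are assumptions chosen only for illustration.

```python
from fractions import Fraction

def dot(u, v):
    """Dot product of two vectors stored as flat lists."""
    return sum(ui * vi for ui, vi in zip(u, v))

# Assumed example data: project b onto the line spanned by a.
a = [1, 2, 3]
b = [5, 4, 5]

# x is a scalar: x = (a^T b) / (a^T a).
x = Fraction(dot(a, b), dot(a, a))

p = [x * ai for ai in a]                 # projection p = x a
e = [bi - pi for bi, pi in zip(b, p)]    # error e = b - p

# Projection matrix P = a a^T / (a^T a), an n x n matrix.
P = [[Fraction(ai * aj, dot(a, a)) for aj in a] for ai in a]
Pb = [dot(row, b) for row in P]          # P b should equal p

print("x =", x)
print("p =", p)
print("a . e =", dot(a, e))
print("P b =", Pb)
```

Because $x$ is an ordinary number here, moving it in front of $\mathbf{a}^T\mathbf{a}$ is just scalar arithmetic.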

Q2. In your code, A.transpose()*A is a $1\times 1$ matrix. You cannot divide by a matrix. Thus you need to extract the unique element of this matrix, either with the .det() method or simply writing (A.transpose()*A)[0,0]. By the way, note that, in the video, the projection matrix is denoted by $P$, not $X$.

In SageMath, I think that it is better to use vector instead of matrix to represent $\mathbf{a}$, $\mathbf{b}$ and $\mathbf{p}$, as shown here.


Thanks a lot, Juanjo. I understand my mistakes now. I'm a bit ashamed of the confusion I caused, sorry.