Why should we add $\lambda I$ to the matrix $X^{T}X$?
1. To shrink the values of the weights $\omega$.
2. To make $X^{T}X$ invertible.
3. To prevent overfitting.
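Concretely, adding $\lambda$ replaces the ordinary least-squares solution (given below) with the standard ridge closed form:
$$\omega_{ridge}=(X^{T}X+\lambda I)^{-1}X^{T}y$$
Since $X^{T}X$ is positive semidefinite, $X^{T}X+\lambda I$ is positive definite for any $\lambda>0$, and therefore always invertible.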
Why might $X^{T}X$ not be invertible?
Recall that the least-squares solution requires inverting it:
$$\omega_{LMS}=(X^{T}X)^{-1}X^{T}y$$
1. The number of data points (rows of $X$) is less than the dimension of $\omega$.
2. The columns of $X$ are not linearly independent, e.g. one column is a duplicate of another feature, or a scaled version of another column. Such columns are linearly dependent on each other. (Case 1 is demonstrated in the sketch below; case 2 is worked out in the examples that follow.)
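Case 1 can be checked with a minimal numpy sketch. The data here is random and purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

# Case 1: fewer data points than features (3 samples, 5 features).
X = rng.standard_normal((3, 5))
G = X.T @ X                          # 5x5 matrix, but rank(G) <= 3
print(np.linalg.matrix_rank(G))      # 3 -> rank < 5, so G is singular
```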
Example:
The matrix $A$ below contains a duplicated feature column:
$$A=\begin{bmatrix}x & x & y\end{bmatrix},\:x,y\in \mathbb{R}^{5\times1}$$
$$A^{'}=A^{T}A=\begin{bmatrix}1.0608&1.0608&0.9883\\1.0608&1.0608&0.9883\\0.9883&0.9883&1.8941\end{bmatrix}$$
$$rank(A^{'})=2,\:det(A^{'})=0$$
So $A^{'}$ is not invertible.
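We can verify this numerically. The vectors below are hypothetical stand-ins for $x$ and $y$ (the actual values are not given in the post), so the entries of $A^{'}$ will differ from those above, but the rank and determinant behave identically:

```python
import numpy as np

# Random stand-ins for the x and y used in the example above.
rng = np.random.default_rng(1)
x = rng.standard_normal((5, 1))
y = rng.standard_normal((5, 1))

A = np.hstack([x, x, y])                 # duplicated first column
A_prime = A.T @ A
print(np.linalg.matrix_rank(A_prime))    # 2
print(np.linalg.det(A_prime))            # ~0 (up to floating-point noise)
```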
Here is another matrix $A$, whose third column is the sum of the first two:
$$A=\begin{bmatrix}x & y & x+y\end{bmatrix}$$ $$A^{'}=A^{T}A=\begin{bmatrix}2.2051&1.0310&3.2361\\1.0310&1.6292&2.6602\\3.2361&2.6602&5.8963\end{bmatrix}$$ $$rank(A^{'})=2,\:det(A^{'})=0$$
Let's look at the matrix $A^{'}$ above and find its eigenvalues:
$$eig(A')=0,0.8550,8.8756$$
By adding $\lambda=0.01$ to the diagonal elements:
$$eig(A'+\lambda I)=0.01,0.8650,8.8856$$
Every eigenvalue is shifted up by exactly $\lambda$, so the zero eigenvalue becomes $0.01>0$. Since the determinant is the product of the eigenvalues, it is now nonzero: the matrix becomes invertible!
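The shift can be verified numerically. Again, $x$ and $y$ are random stand-ins, so the eigenvalues differ from the ones above, but the shift-by-$\lambda$ behavior is identical:

```python
import numpy as np

# Random stand-ins for x and y from the example above.
rng = np.random.default_rng(2)
x = rng.standard_normal((5, 1))
y = rng.standard_normal((5, 1))

A = np.hstack([x, y, x + y])             # third column = sum of first two
A_prime = A.T @ A

lam = 0.01
print(np.linalg.eigvalsh(A_prime))                     # smallest eigenvalue ~0
print(np.linalg.eigvalsh(A_prime + lam * np.eye(3)))   # each shifted up by lam
```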