Why should we add λ to the matrix $X^T X$?
1. To reduce the magnitude of the weights ω.
2. To make $X^T X$ invertible.
3. To prevent overfitting.
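Concretely, adding λ to the diagonal corresponds to the standard ridge (regularized) least-squares solution:

$\omega_{\text{ridge}} = (X^T X + \lambda I)^{-1} X^T y$

The λI term both shrinks ω and makes the matrix being inverted full rank, which is exactly the situation examined below.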
Why might $X^T X$ not be invertible?
$\omega_{LMS} = (X^T X)^{-1} X^T y$
1. The number of data points in X is smaller than the dimension of ω.
2. The columns of X are not linearly independent, e.g. one column is a duplicate of another feature, or one column is a scaled version of another; such columns are linearly dependent on each other.
Example:
There is a duplicated feature column in A.
$A = \begin{bmatrix} x & x & y \end{bmatrix}, \quad x, y \in \mathbb{R}^{5 \times 1}$

$A' = A^T A = \begin{bmatrix} 1.0608 & 1.0608 & 0.9883 \\ 1.0608 & 1.0608 & 0.9883 \\ 0.9883 & 0.9883 & 1.8941 \end{bmatrix}$

$\operatorname{rank}(A') = 2, \quad \det(A') = 0$
So A′ is not invertible.
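This is easy to check numerically. The sketch below uses NumPy with randomly drawn x and y (the post's actual vectors are not given, so the entries of A' will differ), but the rank deficiency caused by the duplicated column is the same:

```python
import numpy as np

# Randomly drawn x, y in R^{5x1}; placeholders for the post's
# (unspecified) vectors, so the numbers will differ from the post.
rng = np.random.default_rng(0)
x = rng.normal(size=(5, 1))
y = rng.normal(size=(5, 1))

A = np.hstack([x, x, y])        # duplicated feature column
A_prime = A.T @ A               # A' = A^T A, a 3x3 matrix

print(np.linalg.matrix_rank(A_prime))  # 2 -> rank deficient
print(np.linalg.det(A_prime))          # ~0 up to floating-point error
```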
Here is another matrix A, whose third column is the sum of the first two:

$A = \begin{bmatrix} x & y & x+y \end{bmatrix}$

$A' = A^T A = \begin{bmatrix} 2.2051 & 1.0310 & 3.2361 \\ 1.0310 & 1.6292 & 2.6602 \\ 3.2361 & 2.6602 & 5.8963 \end{bmatrix}$

$\operatorname{rank}(A') = 2, \quad \det(A') = 0$
Let's take the matrix $A'$ above and find its eigenvalues:

$\operatorname{eig}(A') = 0,\ 0.8550,\ 8.8756$

By adding λ = 0.01 to the diagonal elements:

$\operatorname{eig}(A' + \lambda I) = 0.01,\ 0.8650,\ 8.8856$
It becomes invertible!
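The eigenvalue shift is just as easy to verify. Again this is a sketch with randomly drawn x and y (so the eigenvalues differ from the ones above), but the pattern is the same: one eigenvalue is zero before the shift and moves to λ afterwards, and the shifted matrix can be inverted:

```python
import numpy as np

# Placeholder vectors; the post's actual x and y are not given.
rng = np.random.default_rng(0)
x = rng.normal(size=(5, 1))
y = rng.normal(size=(5, 1))

A = np.hstack([x, y, x + y])    # third column is a linear combination
A_prime = A.T @ A

lam = 0.01
print(np.linalg.eigvalsh(A_prime))                    # smallest eigenvalue ~0
print(np.linalg.eigvalsh(A_prime + lam * np.eye(3)))  # smallest eigenvalue ~lam

# The shifted matrix is now safely invertible:
A_inv = np.linalg.inv(A_prime + lam * np.eye(3))
```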