Computing Parameters Analytically

Normal Equation

We have already seen one way to compute the parameter vector \(\theta\): gradient descent.
Gradient descent finds \(\theta\) iteratively, so we have to pick a learning rate and run many iterations, and in some cases we also need feature scaling and mean normalization.
There is also an analytical way to compute the parameters: the normal equation.
In the normal equation method, we minimize J by explicitly taking its derivatives with respect to each \(\theta_j\) and setting them to zero. This lets us find the optimal \(\theta\) without iterating. The normal equation is:
\(\theta = (X^TX)^{-1}X^Ty\)
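For readers who want the intermediate step, here is a short sketch of where the formula comes from. It assumes the usual squared-error cost for linear regression, written in matrix form with m training examples. That cost is not restated in this post, so treat its exact form as an assumption.

\(J(\theta) = \frac{1}{2m}(X\theta - y)^T(X\theta - y)\)
\(\nabla_\theta J(\theta) = \frac{1}{m}X^T(X\theta - y) = 0\)
\(X^TX\theta = X^Ty\)

Multiplying both sides of the last line by \((X^TX)^{-1}\), when that inverse exists, gives the formula above.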
There is no need to do feature scaling with the normal equation.
In Octave, compute it with pinv(X'*X)*X'*y.
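To see the formula at work outside Octave, here is a minimal pure-Python sketch on a tiny made-up data set. The helper names (transpose, matmul, solve, normal_equation) and the data are invented for this example. Note one deliberate difference: it solves the linear system \(X^TX\theta = X^Ty\) by elimination instead of forming the inverse, and it assumes \(X^TX\) is invertible, so it is not a replacement for pinv.

```python
from fractions import Fraction


def transpose(A):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]


def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    B_cols = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in B_cols] for row in A]


def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination (A must be invertible)."""
    n = len(A)
    M = [list(A[i]) + [b[i]] for i in range(n)]  # augmented matrix [A | b]
    for col in range(n):
        # Pick the row with the largest entry in this column as the pivot.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if M[pivot][col] == 0:
            raise ValueError("X^T X is singular; this sketch cannot handle it")
        M[col], M[pivot] = M[pivot], M[col]
        # Clear this column in every other row.
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [x - factor * p for x, p in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def normal_equation(X, y):
    """Return theta satisfying (X^T X) theta = X^T y."""
    Xt = transpose(X)
    XtX = matmul(Xt, X)
    Xty = [sum(x * t for x, t in zip(row, y)) for row in Xt]
    return solve(XtX, Xty)


# Toy data generated from y = 1 + 2x; the first column of X is the intercept term.
# Fractions keep the arithmetic exact for this illustration.
X = [[Fraction(1), Fraction(x)] for x in (0, 1, 2, 3)]
y = [Fraction(v) for v in (1, 3, 5, 7)]

theta = normal_equation(X, y)
print([float(t) for t in theta])  # [1.0, 2.0]
```

No learning rate and no loop over iterations appear anywhere: the parameters come out of a single linear solve.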

Gradient Descent	Normal Equation
Need to choose alpha	No need to choose alpha
Needs many iterations	No need to iterate
\(O(kn^2)\)	\(O(n^3)\), from computing the inverse of \(X^TX\)
Works well when n is large	Slow if n is large

With the normal equation, computing the inverse of \(X^TX\) has complexity \(O(n^3)\), so the method becomes slow when the number of features is very large. In practice, once n exceeds about 10,000, it may be a good time to switch from the normal equation to an iterative method such as gradient descent.
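For contrast, here is a minimal batch gradient descent sketch in pure Python. The function name, the toy data, and the values alpha=0.1 and iterations=2000 are assumptions chosen for this example, not values from the course. The data are generated exactly from y = 1 + 2x, so the least-squares solution is \(\theta = (1, 2)\), and the iterative method should end up very close to it.

```python
def gradient_descent(X, y, alpha, iterations):
    """Batch gradient descent for linear regression with a squared-error cost."""
    m, n = len(X), len(X[0])
    theta = [0.0] * n
    for _ in range(iterations):
        # Prediction error for every training example under the current theta.
        errors = [sum(t * x for t, x in zip(theta, row)) - target
                  for row, target in zip(X, y)]
        # Gradient component j is the average of error * feature j.
        gradient = [sum(e * row[j] for e, row in zip(errors, X)) / m
                    for j in range(n)]
        # Simultaneous update of all parameters.
        theta = [t - alpha * g for t, g in zip(theta, gradient)]
    return theta


# Toy data generated from y = 1 + 2x; the first column of X is the intercept term.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [1.0, 3.0, 5.0, 7.0]

theta = gradient_descent(X, y, alpha=0.1, iterations=2000)
print([round(t, 4) for t in theta])
```

Both rows of the comparison table show up here: alpha has to be chosen by hand, and the answer only arrives after many passes over the data.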

Normal Equation Noninvertibility

When implementing the normal equation in Octave, we want to use the ‘pinv’ function rather than ‘inv’. The ‘pinv’ function will give you a value of \(\theta\) even if \(X^TX\) is not invertible.
If \(X^TX\) is noninvertible, the common causes are:

Redundant features, where two features are very closely related (i.e. they are linearly dependent)
- for example, size in \(\text{ft}^2\) and size in \(\text{m}^2\)
Too many features (e.g. m ≤ n). In this case, delete some features or use “regularization” (to be explained in a later lesson).

Solutions to the above problems include deleting a feature that is linearly dependent on another, or deleting one or more features when there are too many.
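The redundant-feature case can be checked on a small made-up example. The sketch below builds a design matrix with an intercept, a size in square feet, and the same size in square metres, then computes the rank of \(X^TX\) exactly with fractions. The helper names (gram, rank), the house sizes, and the conversion factor 10.7639 are assumptions for illustration only.

```python
from fractions import Fraction


def gram(X):
    """Return X^T X for a matrix stored as a list of rows."""
    cols = list(zip(*X))
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in cols] for ci in cols]


def rank(A):
    """Return the rank of A using exact Gaussian elimination."""
    M = [list(row) for row in A]
    rows, cols = len(M), len(M[0])
    r = 0  # number of pivots found so far
    for c in range(cols):
        # Find a row at or below r with a nonzero entry in column c.
        pivot = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        # Eliminate column c from the rows below the pivot.
        for i in range(r + 1, rows):
            factor = M[i][c] / M[r][c]
            M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        r += 1
    return r


SQFT_PER_M2 = Fraction("10.7639")  # approximate conversion factor, assumed here
sizes_m2 = [Fraction(s) for s in (50, 80, 120, 200)]

# Columns: intercept, size in ft^2, size in m^2 (the last two are proportional).
X_redundant = [[Fraction(1), s * SQFT_PER_M2, s] for s in sizes_m2]
# Same data with the ft^2 column deleted.
X_reduced = [[row[0], row[2]] for row in X_redundant]

rank_redundant = rank(gram(X_redundant))
rank_reduced = rank(gram(X_reduced))
print(rank_redundant, rank_reduced)  # 2 2
```

With the redundant column, \(X^TX\) is a 3 by 3 matrix of rank 2, so it has no inverse. After deleting that column, it is a 2 by 2 matrix of rank 2 and can be inverted.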

MikeChen's Blog

[MLE] W2 Computing Parameters Analytically