[MLE] Linear Regression
Linear Regression with one variable
So what is univariate linear regression? Let's first look at one slide from the class. In my own words, let's first imagine: what is a linear regression problem?
House price prediction!
Yes, it is a typical regression problem, and if we predict the house price with a linear function, like y = ax + b, it becomes a linear regression problem.
To apply our learning algorithm, we first propose a function for the house price and then adjust it to fit the actual data (the training set). This h(x) is what we call the hypothesis function, because it is just a hypothesis. What we do is modify its intrinsic parameters w, or you can simply say modify the weights.
Now we need a way to modify w so that our hypothesis function better matches the actual function, which brings us to a new term: the cost function.
Cost function
The cost function, in brief, sums up the differences between the actual values of all training examples (points) and their predicted values.
For example, in the image above, when the size is 750 the actual price is 200 but the predicted value is 150. The cost, or difference, at this point is therefore 50, and we sum these costs over all points.
Tricky point: why do we have a \(\frac{1}{2n}\) before our \(\sum\) (sum)?
The \(\frac{1}{n}\) is there to normalize the sum, which makes the cost function independent of the training set size. The \(\frac{1}{2}\) is there because we will use Gradient Descent to find the minimum of the cost: when taking partial derivatives, the square brings down a factor of 2 that cancels this \(\frac{1}{2}\), making the calculation simpler.
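For reference, written out explicitly (this is the standard squared-error cost from Andrew Ng's course, with n training examples):

\(J(w) = \frac{1}{2n} \sum_{i=1}^{n} \left( h(x^{(i)}, w) - y^{(i)} \right)^2\)

Taking the partial derivative, the 2 from the square cancels the \(\frac{1}{2}\):

\(\frac{\partial J}{\partial w_j} = \frac{1}{n} \sum_{i=1}^{n} \left( h(x^{(i)}, w) - y^{(i)} \right) x_j^{(i)}\)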
Other cost functions
Well, this cost function is based on the L2-norm (squared error). There is also the L1-norm (absolute error), but I think it works worse here.
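For comparison, an L1 cost (not the exact formula from the slides, just the usual definition) would be:

\(J_{L1}(w) = \frac{1}{n} \sum_{i=1}^{n} \left| h(x^{(i)}, w) - y^{(i)} \right|\)

The L2 cost penalizes large errors more strongly and is differentiable everywhere, which is convenient for gradient descent.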
Multivariate Linear Regression
In its simplest form, multivariate regression is simply the linear sum of each feature multiplied by its corresponding weight term. Multivariate linear regression with a first-order polynomial:
\(\hat{y} = h(x, w) = w_0 + w_1 x_1 + \dots + w_j x_j + \dots + w_d x_d\)
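As a quick sketch (my own illustration, not code from the course), this hypothesis is just a dot product plus a bias term:

```python
import numpy as np

def hypothesis(x, w):
    """First-order hypothesis h(x, w) = w_0 + w_1*x_1 + ... + w_d*x_d.

    x: feature vector of shape (d,)
    w: weight vector of shape (d + 1,), where w[0] is the bias w_0.
    """
    return w[0] + np.dot(w[1:], x)

# Example: two features (e.g. size and number of rooms)
x = np.array([750.0, 3.0])
w = np.array([10.0, 0.2, 5.0])   # w_0, w_1, w_2
print(hypothesis(x, w))          # 10 + 0.2*750 + 5*3 = 175.0
```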
Higher-order polynomial:
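The slide for this is missing here; as an illustration, with a single feature x a higher-order hypothesis could look like

\(\hat{y} = h(x, w) = w_0 + w_1 x + w_2 x^2 + \dots + w_k x^k\)

which is still linear in the weights w, so it can be fit with exactly the same machinery.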
Actually, I think the explanation by Andrew Ng on Coursera is better; see here.
Gradient Descent
Cool, here comes my favorite part: gradient descent. It is all about the math of finding the minimum of the cost function. We want to minimise J(w0, w1, ..., wk):
- First, start with some initial values for w0, w1, ..., wk
- Keep updating w to reduce J(w0, w1, ..., wk), hoping to end up at the minimum
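The update rule (the standard one from the course) is, for every j simultaneously:

\(w_j := w_j - \alpha \frac{\partial}{\partial w_j} J(w_0, w_1, \dots, w_k)\)

where \(\alpha\) is the learning rate.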
Take care! The updates must be simultaneous, otherwise it would be wrong: you would be using the already-updated w0 when computing the new temp1.
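A minimal sketch of batch gradient descent for this cost (my own Python, assuming the squared-error cost above; the temp variable just mirrors the temp0/temp1 trick from the slide):

```python
import numpy as np

def gradient_descent(X, y, alpha=0.01, num_iters=1000):
    """Batch gradient descent for linear regression with the 1/(2n) squared-error cost.

    X: (n, d) feature matrix, y: (n,) targets.
    Returns the learned weights (d + 1,), where w[0] is the bias w_0.
    """
    n, d = X.shape
    Xb = np.hstack([np.ones((n, 1)), X])   # prepend a column of 1s for w_0
    w = np.zeros(d + 1)
    for _ in range(num_iters):
        error = Xb @ w - y                 # h(x, w) - y for every example
        grad = (Xb.T @ error) / n          # partial derivatives dJ/dw_j
        temp = w - alpha * grad            # compute ALL new values first ...
        w = temp                           # ... then update simultaneously
    return w

# Tiny example: data from y = 2x + 1
X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([3.0, 5.0, 7.0, 9.0])
print(gradient_descent(X, y, alpha=0.05, num_iters=5000))  # approx [1., 2.]
```

Because the gradient is computed from the old w before any weight is overwritten, every w_j is updated simultaneously, exactly as the warning above requires.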