- This is part of probstat.
In this section, we discuss linear regression, focusing on the one-variable case.
Model
We consider two variables $x$ and $y$, where $y$ is a function of $x$. We refer to $x$ as the independent or input variable, and $y$ as the dependent variable. We consider a linear relationship between the independent variable and the dependent variable. We assume that there exist hidden parameters $\alpha$ and $\beta$ such that
$$y = \alpha + \beta x + e,$$
where $e$ is a random error. We further assume that the error is unbiased, i.e., $E[e] = 0$, and is independent of $x$.
Input: As an input to the regression process, we are given a set of $n$ data points: $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$, generated from the previous equation.
Goal: We want to estimate $\alpha$ and $\beta$.
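The data-generating process described in the model can be sketched in a few lines of Python. This is only an illustration: the parameter values, the Gaussian noise, and the function name `simulate_data` are assumptions made here, since the model itself only requires the error to have mean zero and be independent of the input.

```python
import random

def simulate_data(alpha, beta, xs, noise_sd=1.0, seed=0):
    """Generate (x, y) pairs from y = alpha + beta * x + e.

    Assumption for this sketch: e is Gaussian with mean 0 and
    standard deviation noise_sd, drawn independently of x.
    """
    rng = random.Random(seed)
    return [(x, alpha + beta * x + rng.gauss(0.0, noise_sd)) for x in xs]

# Hypothetical parameter values, chosen only for illustration.
points = simulate_data(alpha=2.0, beta=0.5, xs=[1.0, 2.0, 3.0, 4.0, 5.0])
for x, y in points:
    print(f"x = {x:.1f}, y = {y:.3f}")
```

In a real regression problem the parameters are hidden; only the list of data points is observed.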
The least squares estimators
Denote our estimate for $\alpha$ as $A$ and for $\beta$ as $B$. Using both as estimators, the error at data point $(x_i, y_i)$ is
$$y_i - A - B x_i.$$
We focus on the sum of squared errors, i.e.,
$$SS = \sum_{i=1}^{n} (y_i - A - B x_i)^2.$$
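As a small sketch of the quantity being minimized, the following Python function evaluates the sum of squared errors for a candidate intercept `a` and slope `b`. The function name and the sample points are hypothetical, chosen so the arithmetic is easy to check by hand.

```python
def sum_squared_errors(points, a, b):
    """Sum over all data points of (y - a - b * x) squared."""
    return sum((y - a - b * x) ** 2 for x, y in points)

# Illustrative points lying exactly on the line y = 1 + 2x.
points = [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]

print(sum_squared_errors(points, 1.0, 2.0))  # 0.0, the line fits exactly
print(sum_squared_errors(points, 0.0, 2.0))  # 3.0, each point is off by 1
```

A worse candidate line gives a larger value, which is what makes this sum a natural objective to minimize.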
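The method of least squares uses the parameters that minimize the sum of squared errors as estimators. Therefore, we want to find the estimates $A$ and $B$ that minimize $SS = \sum_{i=1}^{n} (y_i - A - B x_i)^2$. To do so, we differentiate $SS$ with respect to $A$ and $B$:
$$\frac{\partial SS}{\partial A} = -2 \sum_{i=1}^{n} (y_i - A - B x_i), \qquad \frac{\partial SS}{\partial B} = -2 \sum_{i=1}^{n} x_i (y_i - A - B x_i).$$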
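Setting both partial derivatives of the sum of squared errors to zero gives the normal equations $\sum y_i = nA + B \sum x_i$ and $\sum x_i y_i = A \sum x_i + B \sum x_i^2$. Writing $\bar{x}$ and $\bar{y}$ for the sample means, and assuming the $x_i$ are not all equal, these solve to $B = \frac{\sum x_i y_i - n \bar{x} \bar{y}}{\sum x_i^2 - n \bar{x}^2}$ and $A = \bar{y} - B \bar{x}$. The sketch below computes these closed-form estimates in Python; the function name `least_squares` and the sample points are hypothetical.

```python
def least_squares(points):
    """Return (a, b) minimizing the sum of (y - a - b * x)^2 over points."""
    n = len(points)
    x_bar = sum(x for x, _ in points) / n
    y_bar = sum(y for _, y in points) / n
    # Numerator and denominator of the slope estimate.
    s_xy = sum(x * y for x, y in points) - n * x_bar * y_bar
    s_xx = sum(x * x for x, _ in points) - n * x_bar * x_bar
    if s_xx == 0:
        raise ValueError("all x values are equal; the slope is not determined")
    b = s_xy / s_xx
    a = y_bar - b * x_bar
    return a, b

# Illustrative points lying exactly on the line y = 1 + 2x.
points = [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
a, b = least_squares(points)
print(a, b)
```

For data with noise the recovered values will differ from the hidden parameters; how they are distributed is a separate question.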
Distribution of regression parameters
Statistical tests on regression parameters