Probstat/notes/regression

This is part of probstat.

In this section, we shall discuss linear regression, focusing on one-variable linear regression.

Model

We consider two variables $x$ and $y$, where $y$ is a function of $x$. We refer to $x$ as the independent (or input) variable, and to $y$ as the dependent variable. We consider a linear relationship between the independent variable and the dependent variable. We assume that there exist hidden parameters $\beta_0$ and $\beta_1$ such that

$Y = \beta_0 + \beta_1 x + \epsilon,$

where $\epsilon$ is a random error. We further assume that the error is unbiased, i.e., $E[\epsilon] = 0$, and that $\epsilon$ is independent of $x$.

Input: As an input to the regression process, we are given a set of $n$ data points $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ generated from the previous equation.

Goal: We want to estimate $\beta_0$ and $\beta_1$.
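
To make the setup concrete, here is a minimal sketch that generates a data set from this model. The parameter values, sample size, and noise distribution below are assumptions chosen for illustration, not part of the notes.

```python
import numpy as np

# Sketch: generate n data points from y = beta0 + beta1 * x + eps,
# where eps has mean zero and is independent of x.
# The values of beta0, beta1, n, and the noise scale are assumed for the example.
rng = np.random.default_rng(0)

beta0, beta1 = 2.0, 0.5        # hidden parameters (illustrative values)
n = 50                         # number of data points
x = rng.uniform(0.0, 10.0, n)  # independent (input) variable
eps = rng.normal(0.0, 1.0, n)  # unbiased random error, E[eps] = 0
y = beta0 + beta1 * x + eps    # dependent variable

# The regression procedure sees only the pairs (x_i, y_i).
data = list(zip(x, y))
```

The estimators discussed below are computed from these pairs alone; the hidden parameters are used only to simulate the data.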

The least squares estimators

Denote our estimate for $\beta_0$ by $b_0$ and our estimate for $\beta_1$ by $b_1$. Using these two estimates, the error at data point $i$ is

$e_i = y_i - (b_0 + b_1 x_i).$

We focus on the sum of squared errors, i.e.,

$SS_E = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} \left(y_i - b_0 - b_1 x_i\right)^2.$
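
As a quick illustration of this objective, the sketch below evaluates $SS_E$ for two candidate pairs $(b_0, b_1)$ on a tiny made-up data set; the numbers are placeholders, not data from the notes.

```python
# Sum of squared errors SS_E = sum_i (y_i - b0 - b1 * x_i)^2
# for a candidate pair (b0, b1).  The data values are made up.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 2.9, 4.2, 4.8]

def sum_of_squared_errors(b0, b1, xs, ys):
    return sum((y - b0 - b1 * x) ** 2 for x, y in zip(xs, ys))

print(sum_of_squared_errors(1.0, 1.0, xs, ys))  # SS_E of one candidate line
print(sum_of_squared_errors(1.2, 0.9, xs, ys))  # SS_E of another candidate
```

The least squares method picks the pair that makes this quantity as small as possible.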

The method of least squares uses the parameters that minimize the sum of squared errors as estimators. Therefore, we want to find $b_0$ and $b_1$ that minimize $SS_E$. To do so, we partially differentiate $SS_E$ with respect to $b_0$ and $b_1$:

$\frac{\partial SS_E}{\partial b_0} = -2\sum_{i=1}^{n}\left(y_i - b_0 - b_1 x_i\right),$

$\frac{\partial SS_E}{\partial b_1} = -2\sum_{i=1}^{n} x_i\left(y_i - b_0 - b_1 x_i\right).$

We set these two derivatives to zero to find the minimum and obtain the two normal equations we have to solve:

$n b_0 + b_1\sum_{i=1}^{n} x_i = \sum_{i=1}^{n} y_i,$

$b_0\sum_{i=1}^{n} x_i + b_1\sum_{i=1}^{n} x_i^2 = \sum_{i=1}^{n} x_i y_i.$

Before solving these two equations, let's define

$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad \bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i,$

$S_{xx} = \sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2, \qquad S_{xy} = \sum_{i=1}^{n}\left(x_i - \bar{x}\right)\left(y_i - \bar{y}\right).$

With this notation, solving the normal equations gives the least squares estimators

$b_1 = \frac{S_{xy}}{S_{xx}}, \qquad b_0 = \bar{y} - b_1\bar{x}.$
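
The sketch below computes $b_0$ and $b_1$ from these formulas on synthetic data and cross-checks the result against numpy.polyfit. The parameter values used to generate the data are assumptions for the example.

```python
import numpy as np

# Least squares estimators: b1 = S_xy / S_xx, b0 = ybar - b1 * xbar.
# The data are synthetic; the generating parameters (2.0, 0.5) are
# illustrative assumptions, not values from the notes.
rng = np.random.default_rng(1)
x = rng.uniform(0.0, 10.0, 100)
y = 2.0 + 0.5 * x + rng.normal(0.0, 1.0, 100)

xbar, ybar = x.mean(), y.mean()
S_xx = np.sum((x - xbar) ** 2)
S_xy = np.sum((x - xbar) * (y - ybar))

b1 = S_xy / S_xx        # slope estimate
b0 = ybar - b1 * xbar   # intercept estimate
print(b0, b1)

# Cross-check: numpy.polyfit with degree 1 solves the same least squares problem.
slope, intercept = np.polyfit(x, y, 1)
assert np.allclose([b0, b1], [intercept, slope])
```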

Distribution of regression parameters

Statistical tests on regression parameters