Probstat/notes/regression

This is part of probstat.

This section discusses linear regression, focusing on the one-variable case.

Model

We consider two variables $x$ and $y$, where $y$ is a function of $x$. We refer to $x$ as the independent (or input) variable and to $y$ as the dependent variable. We assume a linear relationship between the independent variable and the dependent variable: there exist hidden parameters $\alpha$ and $\beta$ such that

$$y = \alpha + \beta x + e,$$

where $e$ is a random error. We further assume that the error is unbiased, i.e., $\mathrm{E}[e] = 0$, and that it is independent of $x$.

Input: As an input to the regression process, we are given a set of $n$ data points $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ generated from the previous equation.
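
To make the input concrete, the following is a minimal sketch that simulates data points from a linear model with additive error. The function name `generate_data`, the parameter values, the uniform range for the input variable, and the Gaussian noise are all assumptions made for illustration; the model itself only requires a zero-mean error that is independent of the input.

```python
import random

def generate_data(alpha, beta, n, noise_sd=1.0, seed=0):
    """Simulate n points (x_i, y_i) with y_i = alpha + beta * x_i + e_i."""
    rng = random.Random(seed)
    points = []
    for _ in range(n):
        x = rng.uniform(0.0, 10.0)    # input variable; the range is arbitrary
        e = rng.gauss(0.0, noise_sd)  # zero-mean error, drawn independently of x
        points.append((x, alpha + beta * x + e))
    return points

# Hypothetical parameter values, chosen only for this example.
data = generate_data(alpha=2.0, beta=0.5, n=5)
for x, y in data:
    print(f"x = {x:.3f}, y = {y:.3f}")
```

In a real regression problem the parameters are unknown and only the data points are observed; the simulation is useful mainly for checking an estimator against known values.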

Goal: We want to estimate $\alpha$ and $\beta$.

The least squares estimators

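Denote our estimate for $\alpha$ by $\hat{\alpha}$ and our estimate for $\beta$ by $\hat{\beta}$. Using these two estimates, the error at data point $(x_i, y_i)$ is

$$e_i = y_i - \hat{\alpha} - \hat{\beta} x_i.$$

We focus on the sum of squared errors, i.e.,

$$SS = \sum_{i=1}^{n} (y_i - \hat{\alpha} - \hat{\beta} x_i)^2.$$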
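
As a small illustration, the sum of squared errors for a candidate intercept and slope can be computed directly from the data. This is only a sketch; the function name `sum_squared_errors`, the argument names `a` and `b` (standing for the two estimates), and the three sample points are assumptions chosen for the example.

```python
def sum_squared_errors(points, a, b):
    """Return the sum over all points of (y_i - a - b * x_i)^2."""
    return sum((y - a - b * x) ** 2 for x, y in points)

# Three points that lie exactly on the line y = 1 + 2x.
points = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]

print(sum_squared_errors(points, 1.0, 2.0))  # 0.0: the line fits every point
print(sum_squared_errors(points, 0.0, 2.0))  # 3.0: each point is off by 1
```

Estimates that fit the data better give a smaller sum, which is why this quantity is a natural objective to minimize.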

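The method of least squares uses the parameters that minimize the sum of squared errors as estimators. Therefore, we want to find $\hat{\alpha}$ and $\hat{\beta}$ that minimize the sum of squared errors $SS$. To do so, we differentiate $SS$ with respect to $\hat{\alpha}$ and $\hat{\beta}$:

$$\frac{\partial SS}{\partial \hat{\alpha}} = -2 \sum_{i=1}^{n} (y_i - \hat{\alpha} - \hat{\beta} x_i), \qquad \frac{\partial SS}{\partial \hat{\beta}} = -2 \sum_{i=1}^{n} x_i (y_i - \hat{\alpha} - \hat{\beta} x_i).$$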
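
Setting both partial derivatives to zero and solving the resulting pair of linear equations gives the standard closed form: the slope estimate is $\sum_i (x_i - \bar{x})(y_i - \bar{y}) / \sum_i (x_i - \bar{x})^2$, and the intercept estimate is $\bar{y}$ minus the slope estimate times $\bar{x}$, where $\bar{x}$ and $\bar{y}$ are the sample means. The sketch below computes these estimates; the function name `least_squares` and the sample data are assumptions for illustration, not part of the original notes.

```python
def least_squares(points):
    """Return (a, b): the least squares intercept and slope for the given points."""
    n = len(points)
    x_bar = sum(x for x, _ in points) / n
    y_bar = sum(y for _, y in points) / n
    # Centered sums that appear in the closed-form solution.
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    s_xx = sum((x - x_bar) ** 2 for x, _ in points)
    if s_xx == 0:
        raise ValueError("all x values are identical; the slope is not determined")
    b = s_xy / s_xx        # slope estimate
    a = y_bar - b * x_bar  # intercept estimate
    return a, b

# Points lying exactly on y = 1 + 2x, so the fit should recover that line.
a, b = least_squares([(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)])
print(a, b)
```

The guard on `s_xx` covers the degenerate case where every input value is the same, in which the data cannot determine a slope.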

Distribution of regression parameters

Statistical tests on regression parameters