This page contains a list of topics, definitions, and results from the Machine Learning course at the University of Chicago.
Week 1: Introduction and OLS
Learning problem
Given a distribution $p$ on $X \times Y$, we want to learn the objective function $f_p$ (with respect to the distribution $p$): $f_p(x) = \mathbb{E}_p[y \mid x]$.
Learning Algorithms
Let $Z = X \times Y$ be the set of possible samples. A learning algorithm $A$ is a function that maps a finite number of samples to a measurable function (here $F$ denotes the class of all measurable functions). Sometimes we consider the class of computable functions instead.
$$A : \bigcup_{n \ge 1} Z^n \to F$$
Learning errors
Suppose the learning algorithm outputs $h$. The learning error can be measured by the expected squared error
$$\iint (y-h(x))^2\,p(x,y)\,dy\,dx$$
One can prove that minimizing this quantity reduces to minimizing the following quantity.
$$\int (f_p(x)-h(x))^2\,p(x)\,dx$$
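This is why we try to learn $f_p$.
In other words, we claim that
$$\iint (y-h(x))^2\,p(x,y)\,dy\,dx = \iint (y-f_p(x))^2\,p(x,y)\,dy\,dx + \int (f_p(x)-h(x))^2\,p(x)\,dx$$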
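The proof is easy. Write $y - h(x) = (y - f_p(x)) + (f_p(x) - h(x))$ and expand the square:
$$(y-h(x))^2 = (y-f_p(x))^2 + (f_p(x)-h(x))^2 + 2\,(y-f_p(x))(f_p(x)-h(x))$$
We get
$$\iint (y-h(x))^2\,p(x,y)\,dy\,dx = \iint (y-f_p(x))^2\,p(x,y)\,dy\,dx + \iint (f_p(x)-h(x))^2\,p(x,y)\,dy\,dx + 2\iint (y-f_p(x))(f_p(x)-h(x))\,p(x,y)\,dy\,dx$$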
Then observe that,
- The first term depends only on the distribution $p$, not on $h$.
- The third term is zero:
$$\iint (y-f_p(x))(f_p(x)-h(x))\,p(x,y)\,dy\,dx = \int p(x)\,(f_p(x)-h(x)) \left[\int (y-f_p(x))\,p(y \mid x)\,dy\right] dx$$
Observe that the inner term is $\int (y-f_p(x))\,p(y \mid x)\,dy = \mathbb{E}_p[y \mid x] - f_p(x)$, which is zero.
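- The second term is equal to the following, since integrating $p(x,y)$ over $y$ gives $p(x)$:
$$\int (f_p(x)-h(x))^2\,p(x)\,dx$$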
Example 1
When $Y = \{-1, +1\}$, i.e. a binary classification problem, our objective $f_p$ reduces to
$$f_p(x) = \mathbb{E}_p[y \mid x] = \Pr[y=1 \mid x] - \Pr[y=-1 \mid x]$$
One can show that the function
$$h_*(x) = \operatorname{sgn} \mathbb{E}_p[y \mid x]$$
minimizes the loss (proof omitted).
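The claim can be checked by brute force on a tiny example, under the assumption that "the loss" means the probability of misclassification (the notes do not say which loss is meant). The table `p` and the helper names below are hypothetical.

```python
from itertools import product

# Made-up joint distribution p(x, y) on X = {0, 1, 2}, Y = {-1, +1}; entries sum to 1.
p = {
    (0, -1): 0.25, (0, 1): 0.05,
    (1, -1): 0.10, (1, 1): 0.20,
    (2, -1): 0.15, (2, 1): 0.25,
}
xs = sorted({x for x, _ in p})


def f_p(x):
    """E_p[y | x] = Pr[y = 1 | x] - Pr[y = -1 | x]."""
    p_x = p[(x, -1)] + p[(x, 1)]
    return (p[(x, 1)] - p[(x, -1)]) / p_x


def sgn(t):
    # Ties are broken toward +1; this is an arbitrary choice for the sketch.
    return 1 if t >= 0 else -1


def misclassification(h):
    """Pr[h(x) != y] for a classifier given as a dict x -> label."""
    return sum(q for (x, y), q in p.items() if h[x] != y)


h_star = {x: sgn(f_p(x)) for x in xs}

# Enumerate every classifier h: X -> {-1, +1} and find the smallest error.
all_classifiers = [dict(zip(xs, labels)) for labels in product([-1, 1], repeat=len(xs))]
best = min(misclassification(h) for h in all_classifiers)

print(h_star)
print(abs(misclassification(h_star) - best) < 1e-12)
```

For labels in $\{-1, +1\}$ we have $(y - h(x))^2 = 4 \cdot 1[h(x) \ne y]$, so the same $h_*$ also minimizes the squared error among $\pm 1$-valued classifiers.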
Example 2
When $Y = \mathbb{R}$, the problem is just regression, where we try to estimate the regression function $f_p(x) = \mathbb{E}_p[y \mid x]$.
Ordinary Least Squares
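If the relation is linear,
$$f_p(x) = w^\top x \quad \text{for some weight vector } w,$$
OLS provably gives the minimum squared error.
Consider the error on $n$ samples $(x_1, y_1), \dots, (x_n, y_n)$:
$$\sum_{i=1}^{n} (y_i - w^\top x_i)^2$$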
Tikhonov Regularization
Week 2