ML: Evaluating Hypotheses

1 Error

There are two types of error:

true error: $\operatorname{error}_{\mathcal{D}}(h) \equiv \Pr_{x \in \mathcal{D}} [ f(x) \neq h(x) ]$
where $\mathcal{D}$ is the distribution from which instances $x$ are drawn, $f$ is the target function, and $h$ is the hypothesis
sample error: $\operatorname{error}_S(h) \equiv \frac{1}{n} \sum_{x \in S} \delta( f(x) \neq h(x) )$
where $n$ is the size of the sample $S$, and $\delta( f(x) \neq h(x) ) = 1$ if $f(x) \neq h(x)$ and $0$ otherwise
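The sample error definition above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `sample_error` and the toy functions `f` (target) and `h` (hypothesis) are made up for the example.

```python
from typing import Callable, Iterable, TypeVar

X = TypeVar("X")


def sample_error(f: Callable[[X], object],
                 h: Callable[[X], object],
                 sample: Iterable[X]) -> float:
    """Fraction of instances in the sample on which h disagrees with f."""
    instances = list(sample)
    if not instances:
        raise ValueError("sample must contain at least one instance")
    # delta(f(x) != h(x)) contributes 1 for each misclassified instance.
    mistakes = sum(1 for x in instances if f(x) != h(x))
    return mistakes / len(instances)


# Hypothetical target function and hypothesis, for illustration only.
def f(x: int) -> bool:
    return x % 2 == 0   # "x is even"


def h(x: int) -> bool:
    return x % 4 == 0   # "x is divisible by 4"


S = list(range(1, 11))          # the sample: integers 1..10
err = sample_error(f, h, S)     # h disagrees with f on 2, 6 and 10
print(err)                      # 0.3
```

The true error cannot be computed this way in general, because it is a probability over the whole distribution rather than an average over a finite sample.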

How well does the sample error estimate the true error?

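We can check two properties of the estimate:

Bias: whether the expected value of the sample error equals the true error
Variance: how much the sample error fluctuates from one sample to another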

2 Estimators

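Choose a sample $S$ of size $n$ according to $\mathcal{D}$, independently of $h$
measure $\operatorname{error}_S(h)$
$\to$ the sample error is an unbiased estimator of the true error, i.e. $E[\operatorname{error}_S(h)] = \operatorname{error}_{\mathcal{D}}(h)$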
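The unbiasedness claim can be checked numerically with a small simulation. The sketch below assumes a simplified model in which each sampled instance is misclassified independently with probability equal to the true error; the function name `simulate_sample_errors` and the values 0.3, 40 and 2000 are arbitrary choices for illustration.

```python
import random
from typing import List


def simulate_sample_errors(true_error: float, n: int,
                           trials: int, seed: int = 0) -> List[float]:
    """Draw `trials` independent samples of size n and return each sample error.

    Assumption: every instance is misclassified independently with
    probability `true_error`.
    """
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    return [
        sum(rng.random() < true_error for _ in range(n)) / n
        for _ in range(trials)
    ]


errors = simulate_sample_errors(true_error=0.3, n=40, trials=2000)
mean_error = sum(errors) / len(errors)
print(f"average sample error over {len(errors)} samples: {mean_error:.3f}")
```

Individual sample errors scatter around 0.3 (that scatter is the variance), while their average over many samples should land close to the true error (no bias).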

e.g. with approximately $95\%$ probability, the true error lies in the interval
$$
\operatorname{error}_S(h) \pm 1.96 \sqrt{ \frac{ \operatorname{error}_S(h) \left( 1 - \operatorname{error}_S(h) \right) }{ n } }
$$
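
As a worked example of the interval above, the following sketch computes it for a given sample error and sample size. The function name `error_confidence_interval` and the numbers (sample error 0.30, $n = 40$) are illustrative assumptions. The formula is a normal approximation, commonly considered reasonable when $n$ is at least about 30 and $S$ is drawn independently of $h$.

```python
import math
from typing import Tuple


def error_confidence_interval(sample_err: float, n: int,
                              z: float = 1.96) -> Tuple[float, float]:
    """Approximate confidence interval for the true error.

    z = 1.96 corresponds to roughly 95% confidence under the
    normal approximation.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 <= sample_err <= 1.0:
        raise ValueError("sample_err must lie in [0, 1]")
    half_width = z * math.sqrt(sample_err * (1.0 - sample_err) / n)
    return sample_err - half_width, sample_err + half_width


# Hypothetical numbers: 12 mistakes on a sample of 40 instances.
low, high = error_confidence_interval(sample_err=0.30, n=40)
print(f"[{low:.3f}, {high:.3f}]")   # [0.158, 0.442]
```

The interval narrows as $n$ grows, since the half-width scales with $1/\sqrt{n}$.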