Bias
 Symptom $ J_{cv}(\theta) $ and $ J_{train}(\theta) $ are both high.
 Prescription
 Getting additional features
 Adding polynomial features ($x_{1}^{2}, x_{2}^{2}, x_1x_2$, etc.)
 Decreasing $\lambda $
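One of the prescriptions above, adding polynomial features, can be sketched with NumPy. This is a minimal hand-rolled expansion for two input features; the function name and the exact set of terms are illustrative (a library such as scikit-learn's `PolynomialFeatures` would generate them more generally):

```python
import numpy as np

def add_polynomial_features(X):
    """Expand columns [x1, x2] into [x1, x2, x1^2, x2^2, x1*x2]."""
    x1, x2 = X[:, 0], X[:, 1]
    return np.column_stack([x1, x2, x1**2, x2**2, x1 * x2])

X = np.array([[1.0, 2.0],
              [3.0, 4.0]])
print(add_polynomial_features(X))
```

A richer feature set lets the hypothesis fit more complex decision boundaries, which lowers both $J_{train}$ and (if not overdone) $J_{cv}$ in the high-bias case.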
Variance
 Symptom $ J_{cv}(\theta) \gg J_{train}(\theta) $ and $J_{train}(\theta) $ is low.
 Prescription
 Getting more training samples
 Getting rid of some features
 Increasing $\lambda$
Regularization
Very large $\lambda \rightarrow $ Bias (underfitting); very small $\lambda \rightarrow $ Variance (overfitting).
$\lambda $ selection: train models with a range of $\lambda $ values on the same training set, pick the $\lambda $ that gives the smallest CV error, then report the Test error.
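The $\lambda$ selection procedure can be sketched as follows. This is a minimal example using ridge regression on synthetic data; the split sizes and $\lambda$ grid are illustrative, and scikit-learn's `alpha` parameter plays the role of $\lambda$:

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5))
y = X @ rng.normal(size=5) + 0.1 * rng.normal(size=300)

# Fixed train / cross-validation / test split (60/20/20).
X_tr, y_tr = X[:180], y[:180]
X_cv, y_cv = X[180:240], y[180:240]
X_te, y_te = X[240:], y[240:]

lambdas = [0.01, 0.03, 0.1, 0.3, 1, 3, 10]
cv_errors = []
for lam in lambdas:
    model = Ridge(alpha=lam).fit(X_tr, y_tr)  # alpha acts as lambda
    cv_errors.append(mean_squared_error(y_cv, model.predict(X_cv)))

# Select the lambda with the smallest CV error, then check the test error.
best_lam = lambdas[int(np.argmin(cv_errors))]
final = Ridge(alpha=best_lam).fit(X_tr, y_tr)
test_error = mean_squared_error(y_te, final.predict(X_te))
print(best_lam, test_error)
```

The test set is touched only once, after $\lambda$ has been chosen, so the reported test error is an unbiased estimate of generalization error.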
Learning Curve
 High Bias: $J_{train}(\theta) $ is close to $J_{cv}(\theta) $, and both are high. Getting more data is useless!
 High Variance: there is a gap between $J_{train}(\theta) $ and $J_{cv}(\theta) $. Getting more data may give a better result.
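A learning curve plots $J_{train}$ and $J_{cv}$ against the number of training examples $m$. A minimal sketch on synthetic linear data (the helper name and the sizes grid are illustrative):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

def learning_curves(X_tr, y_tr, X_cv, y_cv, sizes):
    """Return (J_train, J_cv) lists for models fit on the first m examples."""
    j_train, j_cv = [], []
    for m in sizes:
        model = LinearRegression().fit(X_tr[:m], y_tr[:m])
        j_train.append(mean_squared_error(y_tr[:m], model.predict(X_tr[:m])))
        j_cv.append(mean_squared_error(y_cv, model.predict(X_cv)))
    return j_train, j_cv

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.2 * rng.normal(size=200)

sizes = [10, 30, 60, 100, 150]
j_train, j_cv = learning_curves(X[:150], y[:150], X[150:], y[150:], sizes)
for m, jt, jc in zip(sizes, j_train, j_cv):
    print(f"m={m:3d}  J_train={jt:.3f}  J_cv={jc:.3f}")
```

Reading the curves: if the two errors converge to a high plateau, that is high bias; if a large gap persists as $m$ grows, that is high variance and more data should help close it.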
Neural networks and overfitting
 Using a “large” neural network with good regularization to address overfitting usually works better than a “small” neural network, but the computational cost is higher.
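A quick illustration of this trade-off, assuming scikit-learn's `MLPClassifier` (its `alpha` parameter is an L2 regularization penalty); the layer sizes, penalty values, and dataset are illustrative, not a benchmark:

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = make_moons(n_samples=400, noise=0.25, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# A large, well-regularized network vs. a tiny under-capacity one.
large = MLPClassifier(hidden_layer_sizes=(64, 64), alpha=1e-2,
                      max_iter=2000, random_state=0).fit(X_tr, y_tr)
small = MLPClassifier(hidden_layer_sizes=(2,), alpha=1e-4,
                      max_iter=2000, random_state=0).fit(X_tr, y_tr)

acc_large = large.score(X_te, y_te)
acc_small = small.score(X_te, y_te)
print(acc_large, acc_small)
```

The large network has the capacity to fit the nonlinear boundary, while the L2 penalty keeps it from overfitting; the cost is more parameters to train.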