Logistic Regression

Don’t be confused by the name “Logistic Regression”; it is named that way for historical reasons and is actually an approach to classification problems, not regression problems.

Binary Classification

The hypothesis should satisfy:

$0 \le h_\theta(x) \le 1$

Our new hypothesis uses the “Sigmoid Function,” also called the “Logistic Function”:

$h_\theta(x) = g(\theta^T x), \quad g(z) = \dfrac{1}{1 + e^{-z}}$

$h_\theta(x)$ gives us the probability that our output is 1; the probability that it is 0 is the complement: $P(y = 0 \mid x; \theta) = 1 - P(y = 1 \mid x; \theta)$.
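The sigmoid hypothesis can be sketched in NumPy as follows (`sigmoid` and `hypothesis` are illustrative names, not from the notes):

```python
import numpy as np

def sigmoid(z):
    """Logistic (sigmoid) function: maps any real value into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def hypothesis(theta, x):
    """h_theta(x) = g(theta^T x): estimated probability that y = 1."""
    return sigmoid(np.dot(theta, x))
```

Because the output lies strictly between 0 and 1, it can be read directly as a probability and thresholded at 0.5 to classify.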

Cost Function

The cost function for logistic regression looks like:

$\mathrm{Cost}(h_\theta(x), y) = \begin{cases} -\log(h_\theta(x)) & \text{if } y = 1 \\ -\log(1 - h_\theta(x)) & \text{if } y = 0 \end{cases}$

The more our hypothesis is off from $y$, the larger the cost function’s output. If our hypothesis is equal to $y$, then our cost is 0.
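This behavior is easy to check numerically; a minimal sketch (`point_cost` is a hypothetical helper name):

```python
import numpy as np

def point_cost(h, y):
    """Per-example cost: -log(h) if y == 1, -log(1 - h) if y == 0.
    Zero when the prediction matches the label exactly; grows without
    bound as the prediction approaches the wrong extreme."""
    return -np.log(h) if y == 1 else -np.log(1 - h)
```

For example, with $y = 1$, a confident correct prediction ($h \approx 1$) costs almost nothing, while a confident wrong prediction ($h \approx 0$) is penalized heavily.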

Simplified Cost Function and Gradient Descent

Simplified Cost Function

We can compress our cost function’s two conditional cases into one case:

$\mathrm{Cost}(h_\theta(x), y) = - y \, \log(h_\theta(x)) - (1 - y) \log(1 - h_\theta(x))$

$J(\theta) = - \frac{1}{m} \displaystyle \sum_{i=1}^m \left[ y^{(i)}\log (h_\theta (x^{(i)})) + (1 - y^{(i)})\log (1 - h_\theta(x^{(i)})) \right]$

A vectorized implementation is:

$h = g(X\theta), \quad J(\theta) = \frac{1}{m} \left( -y^{T} \log(h) - (1 - y)^{T} \log(1 - h) \right)$
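In NumPy this vectorized cost might look like the sketch below, assuming `X` is the $m \times n$ design matrix with a leading column of ones and `y` is the 0/1 label vector:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(theta, X, y):
    """J(theta) = -(1/m) [y' log(h) + (1 - y)' log(1 - h)], h = g(X theta)."""
    m = len(y)
    h = sigmoid(X @ theta)
    return -(y @ np.log(h) + (1 - y) @ np.log(1 - h)) / m
```

A sanity check: with $\theta = 0$ every prediction is 0.5, so the cost is $\log 2 \approx 0.693$ regardless of the labels.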

We can work out the derivative part using calculus to get the gradient descent update rule:

$\theta_j := \theta_j - \frac{\alpha}{m} \displaystyle \sum_{i=1}^m \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}$

A vectorized implementation is:

$\theta := \theta - \frac{\alpha}{m} X^{T} \left( g(X\theta) - \vec{y} \right)$

Partial derivative of $J(\theta)$:

$\frac{\partial J(\theta)}{\partial \theta_j} = \frac{1}{m} \displaystyle \sum_{i=1}^m \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}$

The vectorized version:

$\nabla J(\theta) = \frac{1}{m} X^{T} \left( g(X\theta) - \vec{y} \right)$
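Putting the vectorized gradient into a batch gradient descent loop gives a minimal sketch like the following (`alpha` and `iters` are illustrative defaults, not values from the notes):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gradient_descent(X, y, alpha=0.1, iters=1000):
    """Batch gradient descent: theta := theta - (alpha/m) X' (g(X theta) - y)."""
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(iters):
        grad = X.T @ (sigmoid(X @ theta) - y) / m  # vectorized gradient of J
        theta -= alpha * grad
    return theta
```

Note that the update rule has the same form as for linear regression; only the hypothesis $h_\theta(x) = g(\theta^T x)$ differs.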

Regularization

Regularization is designed to address the problem of overfitting. There are two main options:

1. Reduce the number of features:

• Manually select which features to keep.
• Use a model selection algorithm.

2. Regularization:

• Keep all the features, but reduce the magnitude of the parameters $\theta_j$.
• Regularization works well when we have a lot of slightly useful features.
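The notes do not spell out the regularized cost; the standard L2-penalized form adds $\frac{\lambda}{2m} \sum_{j=1}^{n} \theta_j^2$ to $J(\theta)$, leaving the bias term $\theta_0$ unpenalized. A sketch under that assumption:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost_regularized(theta, X, y, lam):
    """Logistic cost plus an L2 penalty (lam / 2m) * sum(theta_j^2), j >= 1.
    By convention the bias parameter theta_0 is not penalized."""
    m = len(y)
    h = sigmoid(X @ theta)
    unreg = -(y @ np.log(h) + (1 - y) @ np.log(1 - h)) / m
    penalty = lam * np.sum(theta[1:] ** 2) / (2 * m)
    return unreg + penalty
```

Larger `lam` shrinks the learned parameters toward zero, trading a slightly worse fit on the training set for a smoother, less overfit hypothesis.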