## Logistic Regression

Don’t be confused by the name “Logistic Regression”: it is named that way for historical reasons, and it is actually an approach to classification problems, not regression problems.

### Binary Classification

Our hypothesis should satisfy:

$0 \le h_\theta(x) \le 1$

To achieve this, we plug $\theta^T x$ into the “Sigmoid Function,” also called the “Logistic Function”:

$h_\theta(x) = g(\theta^T x), \qquad g(z) = \dfrac{1}{1 + e^{-z}}$

$h_\theta(x)$ will give us the probability that our output is 1; the probability that it is 0 is the complement, $1 - h_\theta(x)$.
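The sigmoid hypothesis can be sketched in a few lines of NumPy (the function names here are illustrative, not from the original notes):

```python
import numpy as np

def sigmoid(z):
    """Logistic (sigmoid) function g(z) = 1 / (1 + e^{-z})."""
    return 1.0 / (1.0 + np.exp(-z))

def h(theta, x):
    """Hypothesis h_theta(x) = g(theta^T x); returns P(y = 1 | x; theta)."""
    return sigmoid(np.dot(theta, x))

# The output is always squashed into (0, 1):
print(sigmoid(0.0))  # 0.5
```

Note that $g(0) = 0.5$, so the decision boundary $h_\theta(x) = 0.5$ corresponds to $\theta^T x = 0$.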

### Cost Function

The cost function for logistic regression looks like:

$J(\theta) = \dfrac{1}{m} \displaystyle\sum_{i=1}^m \mathrm{Cost}(h_\theta(x^{(i)}), y^{(i)})$

$\mathrm{Cost}(h_\theta(x), y) = -\log(h_\theta(x)) \quad \text{if } y = 1$

$\mathrm{Cost}(h_\theta(x), y) = -\log(1 - h_\theta(x)) \quad \text{if } y = 0$

The more our hypothesis is off from $y$, the larger the cost function output. If our hypothesis is equal to $y$, then the cost is 0.
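This behavior is easy to check numerically; a minimal sketch (the `cost` helper is an illustrative name, assuming the per-example cost above):

```python
import numpy as np

def cost(h, y):
    """Per-example cost: -log(h) if y = 1, -log(1 - h) if y = 0."""
    return -np.log(h) if y == 1 else -np.log(1.0 - h)

# Correct, confident prediction -> cost near 0:
print(cost(0.99, 1))  # ~0.01
# Wrong, confident prediction -> cost blows up:
print(cost(0.01, 1))  # ~4.6
```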

### Simplified Cost Function and Gradient Descent

#### Simplified Cost Function

We can compress our cost function’s two conditional cases into one case:

$\mathrm{Cost}(h_\theta(x),y) = - y \, \log(h_\theta(x)) - (1 - y) \log(1 - h_\theta(x))$

We can then fully write out the cost function as:

$J(\theta) = - \dfrac{1}{m} \displaystyle \sum_{i=1}^m \left[ y^{(i)}\log (h_\theta (x^{(i)})) + (1 - y^{(i)})\log (1 - h_\theta(x^{(i)})) \right]$

A vectorized implementation is:

$h = g(X\theta)$

$J(\theta) = \dfrac{1}{m} \left( -y^{T}\log(h) - (1 - y)^{T}\log(1 - h) \right)$

#### Gradient Descent

Remember that the general form of gradient descent is:

$\theta_j := \theta_j - \alpha \dfrac{\partial}{\partial \theta_j} J(\theta)$

We can work out the derivative part using calculus to get:

$\theta_j := \theta_j - \dfrac{\alpha}{m} \displaystyle\sum_{i=1}^m \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}$

A vectorized implementation is:

$\theta := \theta - \dfrac{\alpha}{m} X^{T} \left( g(X\theta) - \vec{y} \right)$

#### Partial derivative of J(θ)

$\dfrac{\partial J(\theta)}{\partial \theta_j} = \dfrac{1}{m} \displaystyle\sum_{i=1}^m \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}$

The vectorized version:

$\nabla J(\theta) = \dfrac{1}{m} X^{T} \left( g(X\theta) - \vec{y} \right)$
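The vectorized cost and gradient can be sketched in NumPy as follows (the helper names and the default learning-rate/iteration values are illustrative assumptions):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost_and_grad(theta, X, y):
    """Vectorized J(theta) and gradient (1/m) X^T (g(X theta) - y)."""
    m = len(y)
    h = sigmoid(X @ theta)
    J = (-y @ np.log(h) - (1 - y) @ np.log(1 - h)) / m
    grad = X.T @ (h - y) / m
    return J, grad

def gradient_descent(X, y, alpha=0.1, iters=1000):
    """Repeatedly apply theta := theta - (alpha/m) X^T (g(X theta) - y)."""
    theta = np.zeros(X.shape[1])
    for _ in range(iters):
        _, grad = cost_and_grad(theta, X, y)
        theta -= alpha * grad
    return theta
```

For example, on a tiny dataset `X = [[1, 0], [1, 1], [1, 2], [1, 3]]` (first column is the intercept feature) with labels `y = [0, 0, 1, 1]`, a few thousand iterations drive the cost down from $\log 2 \approx 0.693$ and classify all four points correctly.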

## Regularization

Regularization is designed to address the problem of overfitting. There are two main options for addressing it:

1. Reduce the number of features:
   - Manually select which features to keep.
   - Use a model selection algorithm.
2. Regularization:
   - Keep all the features, but reduce the magnitude of the parameters $\theta_j$.
   - Regularization works well when we have a lot of slightly useful features.
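As an illustration of the second option, here is a minimal sketch of the usual L2-penalized logistic cost. The regularization parameter `lam` and the convention of not penalizing the intercept $\theta_0$ are standard choices assumed here, not defined in this section:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def regularized_cost(theta, X, y, lam):
    """Logistic cost plus an L2 penalty (lam / 2m) * sum(theta_j^2),
    skipping the intercept theta[0] (assumed convention)."""
    m = len(y)
    h = sigmoid(X @ theta)
    J = (-y @ np.log(h) - (1 - y) @ np.log(1 - h)) / m
    J += lam / (2 * m) * np.sum(theta[1:] ** 2)
    return J
```

With `lam = 0` this reduces to the unregularized cost; larger `lam` penalizes large parameter values more heavily, which shrinks the $\theta_j$ and discourages overfitting.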