Machine Learning Week 5

Cost Function

$L$ = total number of layers in the network
$s_l$ = number of units (not counting bias unit) in layer $l$
$K$ = number of output units/classes

We denote $h_\Theta(x)_k$ as a hypothesis that results in the $k^{\text{th}}$ output. Our cost function for neural networks is going to be a generalization of the one we used for logistic regression.
Recall that the cost function for regularized logistic regression was:

$$J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log\big(h_\theta(x^{(i)})\big) + (1 - y^{(i)}) \log\big(1 - h_\theta(x^{(i)})\big) \right] + \frac{\lambda}{2m} \sum_{j=1}^{n} \theta_j^2$$
For neural networks, it is going to be slightly more complicated:

$$J(\Theta) = -\frac{1}{m} \sum_{i=1}^{m} \sum_{k=1}^{K} \left[ y_k^{(i)} \log\big((h_\Theta(x^{(i)}))_k\big) + (1 - y_k^{(i)}) \log\big(1 - (h_\Theta(x^{(i)}))_k\big) \right] + \frac{\lambda}{2m} \sum_{l=1}^{L-1} \sum_{i=1}^{s_l} \sum_{j=1}^{s_{l+1}} \big(\Theta_{j,i}^{(l)}\big)^2$$

Note that the triple sum simply adds up the squares of all the individual $\Theta$s in the entire network (excluding the bias weights), and the $i$ in the triple sum does not refer to training example $i$.
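
To make the formula concrete, here is a minimal NumPy sketch of this cost for a three-layer network (one hidden layer). The names `sigmoid`, `nn_cost`, `Theta1`, and `Theta2` are illustrative assumptions, not identifiers from the course:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def nn_cost(Theta1, Theta2, X, Y, lam):
    """Regularized cost J(Theta) for a three-layer network.

    X : (m, n) inputs, Y : (m, K) one-hot labels,
    Theta1 : (s_2, n+1), Theta2 : (K, s_2+1).
    """
    m = X.shape[0]
    # Forward propagation, prepending bias units.
    a1 = np.hstack([np.ones((m, 1)), X])
    a2 = np.hstack([np.ones((m, 1)), sigmoid(a1 @ Theta1.T)])
    h = sigmoid(a2 @ Theta2.T)                  # (m, K) = h_Theta(x)
    # Cross-entropy term: the double sum over examples i and outputs k.
    J = -np.sum(Y * np.log(h) + (1 - Y) * np.log(1 - h)) / m
    # Regularization: skip the first column (bias weights, j = 0).
    reg = (lam / (2 * m)) * (np.sum(Theta1[:, 1:] ** 2)
                             + np.sum(Theta2[:, 1:] ** 2))
    return J + reg
```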
Backpropagation Algorithm

Given training set $\{(x^{(1)}, y^{(1)}), \dots, (x^{(m)}, y^{(m)})\}$

Set $\Delta^{(l)}_{i,j} := 0$ for all $(l, i, j)$ (so we start with matrices of zeros)
For training example t = 1 to m:

  • Set $a^{(1)} := x^{(t)}$
  • Perform forward propagation to compute $a^{(l)}$ for $l = 2, 3, \dots, L$
  • Using $y^{(t)}$, compute $\delta^{(L)} = a^{(L)} - y^{(t)}$
  • Compute $\delta^{(L-1)}, \delta^{(L-2)}, \dots, \delta^{(2)}$ using $\delta^{(l)} = \big((\Theta^{(l)})^T \delta^{(l+1)}\big) \;.\!*\; a^{(l)} \;.\!*\; (1 - a^{(l)})$
  • $\Delta^{(l)}_{i,j} := \Delta^{(l)}_{i,j} + a^{(l)}_j \delta^{(l+1)}_i$, or with vectorization, $\Delta^{(l)} := \Delta^{(l)} + \delta^{(l+1)} (a^{(l)})^T$
  • $D^{(l)}_{i,j} := \frac{1}{m}\big(\Delta^{(l)}_{i,j} + \lambda \Theta^{(l)}_{i,j}\big)$ if $j \neq 0$
  • $D^{(l)}_{i,j} := \frac{1}{m}\Delta^{(l)}_{i,j}$ if $j = 0$

The $D$ matrices act as accumulators for our partial derivatives: $\frac{\partial}{\partial \Theta^{(l)}_{i,j}} J(\Theta) = D^{(l)}_{i,j}$ (a NumPy sketch of the full loop follows this list).
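
Here is a minimal sketch of these steps for a three-layer network, using the same assumed names as the cost sketch above (`backprop_gradients` and its signature are illustrative, not course identifiers):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def backprop_gradients(Theta1, Theta2, X, Y, lam):
    """Gradients D1, D2 of J(Theta) for a three-layer network.

    X : (m, n) inputs, Y : (m, K) one-hot labels,
    Theta1 : (s_2, n+1), Theta2 : (K, s_2+1).
    """
    m = X.shape[0]
    Delta1 = np.zeros_like(Theta1)        # Delta accumulators, set to 0
    Delta2 = np.zeros_like(Theta2)
    for t in range(m):
        # Steps 1-2: forward propagation, prepending bias units.
        a1 = np.concatenate(([1.0], X[t]))
        a2 = np.concatenate(([1.0], sigmoid(Theta1 @ a1)))
        a3 = sigmoid(Theta2 @ a2)         # output layer, a^(L)
        # Step 3: output error, delta^(L) = a^(L) - y^(t).
        d3 = a3 - Y[t]
        # Step 4: hidden error; g'(z) written as a .* (1 - a).
        d2 = ((Theta2.T @ d3) * a2 * (1.0 - a2))[1:]   # drop bias entry
        # Step 5: Delta := Delta + delta^(l+1) (a^(l))^T.
        Delta1 += np.outer(d2, a1)
        Delta2 += np.outer(d3, a2)
    # D matrices; the bias column (j = 0) is not regularized.
    D1, D2 = Delta1 / m, Delta2 / m
    D1[:, 1:] += (lam / m) * Theta1[:, 1:]
    D2[:, 1:] += (lam / m) * Theta2[:, 1:]
    return D1, D2
```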

Gradient Checking

We can approximate the derivative of our cost function with:

$$\frac{\partial}{\partial \Theta} J(\Theta) \approx \frac{J(\Theta + \epsilon) - J(\Theta - \epsilon)}{2\epsilon}$$

With multiple theta matrices, we approximate the derivative with respect to $\Theta_j$ as follows:

$$\frac{\partial}{\partial \Theta_j} J(\Theta) \approx \frac{J(\Theta_1, \dots, \Theta_j + \epsilon, \dots, \Theta_n) - J(\Theta_1, \dots, \Theta_j - \epsilon, \dots, \Theta_n)}{2\epsilon}$$

A small value for $\epsilon$ such as $\epsilon = 10^{-4}$ guarantees that the math works out properly.
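
A minimal sketch of this two-sided check, assuming the parameters have been unrolled into a single 1-D vector (`numerical_gradient` is a hypothetical helper, not course code):

```python
import numpy as np

def numerical_gradient(J, theta, eps=1e-4):
    """Two-sided difference approximation of the gradient of J at theta.

    J     : function mapping a 1-D parameter vector to a scalar cost
    theta : 1-D NumPy array of (unrolled) parameters
    """
    grad = np.zeros_like(theta)
    perturb = np.zeros_like(theta)
    for i in range(theta.size):
        perturb[i] = eps
        # (J(theta + eps*e_i) - J(theta - eps*e_i)) / (2*eps)
        grad[i] = (J(theta + perturb) - J(theta - perturb)) / (2 * eps)
        perturb[i] = 0.0
    return grad
```

In practice we compare this estimate against the backpropagation gradient (for example with `np.allclose`) and then turn checking off, since it is far too slow to run on every iteration of learning.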

Random Initialization

Initializing all theta weights to zero does not work with neural networks: when we backpropagate, all nodes update to the same value repeatedly, so symmetry is never broken. Instead we initialize each $\Theta^{(l)}_{i,j}$ to a random value in $[-\epsilon, \epsilon]$.
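
A minimal NumPy sketch of this symmetry breaking; the name `rand_initialize_weights` and the default `eps_init = 0.12` are assumptions for illustration (any suitably small $\epsilon$ works):

```python
import numpy as np

def rand_initialize_weights(L_in, L_out, eps_init=0.12):
    """Weight matrix for a layer with L_in inputs and L_out outputs,
    drawn uniformly from [-eps_init, eps_init]; +1 adds the bias column."""
    return np.random.rand(L_out, 1 + L_in) * 2.0 * eps_init - eps_init
```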

Reference

Machine Learning by Stanford University
