# The performance is really bad: the cost does not really decrease, and the algorithm performs no better than random guessing. Why? Let's look at the details of the predictions and the decision boundary:

# In[6]:

print ("predictions_train = " + str(predictions_train))
print ("predictions_test = " + str(predictions_test))


# In[7]:

plt.title("Model with Zeros initialization")
axes = plt.gca()
axes.set_xlim([-1.5,1.5])
axes.set_ylim([-1.5,1.5])
plot_decision_boundary(lambda x: predict_dec(parameters, x.T), train_X, train_Y)


# The model is predicting 0 for every example. 
# 
# In general, initializing all the weights to zero results in the network failing to break symmetry. This means that every neuron in each layer will learn the same thing, so you might as well be training a neural network with $n^{[l]}=1$ for every layer; the network ends up no more powerful than a linear classifier such as logistic regression. 

# <font color='blue'>
# **What you should remember**:
# - The weights $W^{[l]}$ should be initialized randomly to break symmetry. 
# - It is however okay to initialize the biases $b^{[l]}$ to zeros. Symmetry is still broken so long as $W^{[l]}$ is initialized randomly. 
# 
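
# To see why symmetry fails to break, here is a minimal standalone sketch (illustrative, not part of the assignment's code): when every hidden unit starts with identical weights, each one computes the same activation and receives the same gradient, so a gradient step keeps them identical.

# In[ ]:

import numpy as np

np.random.seed(1)
X = np.random.randn(2, 5)    # 5 examples, 2 features
Y = np.random.randn(1, 5)

W1 = np.full((3, 2), 0.5)    # every hidden unit starts with the same weights
b1 = np.zeros((3, 1))
W2 = np.full((1, 3), 0.5)
b2 = np.zeros((1, 1))

# Forward pass: all three hidden units compute the same function of x
Z1 = np.dot(W1, X) + b1
A1 = np.maximum(0, Z1)
Z2 = np.dot(W2, A1) + b2

# Backward pass (squared-error loss): the rows of dW1 are identical too,
# so after the update the hidden units are still clones of each other
dZ2 = (Z2 - Y) / X.shape[1]
dA1 = np.dot(W2.T, dZ2)
dZ1 = dA1 * (Z1 > 0)
dW1 = np.dot(dZ1, X.T)

print("A1 rows identical: " + str(np.allclose(A1[0], A1[1]) and np.allclose(A1[1], A1[2])))
print("dW1 rows identical: " + str(np.allclose(dW1[0], dW1[1]) and np.allclose(dW1[1], dW1[2])))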

# ## 3 - Random initialization
# 
# To break symmetry, let's initialize the weights randomly. Following random initialization, each neuron can then proceed to learn a different function of its inputs. In this exercise, you will see what happens if the weights are initialized randomly, but to very large values. 
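
# As a rough sketch of what such an initialization could look like (the function name and the scaling factor of 10 below are illustrative choices for "very large", not a prescribed implementation):

# In[ ]:

import numpy as np

def initialize_parameters_random(layers_dims):
    np.random.seed(3)
    parameters = {}
    L = len(layers_dims)

    for l in range(1, L):
        # Deliberately large random weights (scaled by 10) and zero biases
        parameters['W' + str(l)] = np.random.randn(layers_dims[l], layers_dims[l-1]) * 10
        parameters['b' + str(l)] = np.zeros((layers_dims[l], 1))

    return parameters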
# ## 4 - He initialization
# 
# Finally, try "He initialization": draw the weights randomly, then scale those of layer $l$ by $\sqrt{2/n^{[l-1]}}$, where $n^{[l-1]}$ is the number of units in the previous layer. This is what the function below implements.

import numpy as np

def initialize_parameters_he(layers_dims):
    np.random.seed(3)
    parameters = {}
    L = len(layers_dims)

    for l in range(1, L):
        # He scaling: multiply by sqrt(2 / fan_in); biases start at zero
        parameters['W' + str(l)] = np.random.randn(layers_dims[l], layers_dims[l-1]) * np.sqrt(2 / layers_dims[l-1])
        parameters['b' + str(l)] = np.zeros((layers_dims[l], 1))

        # Assert that the parameter shapes are correct
        assert(parameters['W' + str(l)].shape == (layers_dims[l], layers_dims[l-1]))
        assert(parameters['b' + str(l)].shape == (layers_dims[l], 1))

    return parameters
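
# A quick sanity check of the shapes this produces (the layer sizes [2, 4, 1] are arbitrary, chosen only for illustration):

parameters = initialize_parameters_he([2, 4, 1])
print("W1 = " + str(parameters["W1"]))
print("b1 = " + str(parameters["b1"]))
print("W2 = " + str(parameters["W2"]))
print("b2 = " + str(parameters["b2"]))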

parameters = model(train_X, train_Y, initialization="he", is_plot=True)
print("Train set:")
predictions_train = init_utils.predict(train_X, train_Y, parameters)
print("Test set:")
predictions_test = init_utils.predict(test_X, test_Y, parameters)

plt.title("Model with He initialization")
axes = plt.gca()
axes.set_xlim([-1.5, 1.5])
axes.set_ylim([-1.5, 1.5])
init_utils.plot_decision_boundary(lambda x: init_utils.predict_dec(parameters, x.T), train_X, train_Y)

# Different initialization methods can lead to different final performance.
#
# Random initialization helps break symmetry, so that different hidden units can learn different parameters.
#
# When initializing, the initial values should not be too large.
#
# He initialization paired with the ReLU activation often gives good results.
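
# To make the last two points concrete, here is a standalone sketch (illustrative, not part of the assignment) that pushes random inputs through a few ReLU layers and compares He scaling against a large fixed scale of 10: He scaling keeps the activation variance roughly constant, while the large weights make it explode.

import numpy as np

np.random.seed(1)
n = 100                          # units per layer (arbitrary for illustration)
A_he = np.random.randn(n, 1000)  # 1000 random input examples
A_big = A_he.copy()

for l in range(5):
    W_he = np.random.randn(n, n) * np.sqrt(2 / n)  # He scaling
    W_big = np.random.randn(n, n) * 10             # overly large weights
    A_he = np.maximum(0, np.dot(W_he, A_he))
    A_big = np.maximum(0, np.dot(W_big, A_big))
    print("layer %d: var(He) = %.3f, var(large) = %.3e" % (l + 1, A_he.var(), A_big.var()))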
import matplotlib.pyplot as plt
import init_utils

def model(X, Y, learning_rate=0.01, num_iterations=15000, print_cost=True,
          initialization="he", is_plot=True):
    costs = []
    layers_dims = [X.shape[0], 10, 5, 1]

    # Choose the initialization scheme
    if initialization == "zeros":
        parameters = initialize_parameters_zeros(layers_dims)
    elif initialization == "random":
        parameters = initialize_parameters_random(layers_dims)
    else:
        parameters = initialize_parameters_he(layers_dims)

    for i in range(num_iterations):
        # Forward pass, loss, and backward pass (helpers from init_utils)
        a3, cache = init_utils.forward_propagation(X, parameters)
        cost = init_utils.compute_loss(a3, Y)
        grads = init_utils.backward_propagation(X, Y, cache)

        # Gradient descent update
        parameters = init_utils.update_parameters(parameters, grads,
                                                  learning_rate)

        # Record (and optionally print) the cost every 1000 iterations
        if i % 1000 == 0:
            costs.append(cost)
            if print_cost:
                print("Iteration " + str(i) + ", cost: " + str(cost))

    # Training finished; plot the cost curve
    if is_plot:
        plt.plot(costs)
        plt.ylabel('cost')
        plt.xlabel('iterations (per thousands)')
        plt.title("Learning rate =" + str(learning_rate))
        plt.show()

    # Return the learned parameters
    return parameters


parameters = model(train_X, train_Y, initialization="he", is_plot=False)

plt.title("Model with He initialization")
axes = plt.gca()
axes.set_xlim([-1.5, 1.5])
axes.set_ylim([-1.5, 1.5])
init_utils.plot_decision_boundary(
    lambda x: init_utils.predict_dec(parameters, x.T), train_X, train_Y)