    return probs, preds


def getAccuracy(someX, someY, w):
    prob, prede = getProbsAndPreds(someX, w)
    accuracy = sum(prede == someY) / float(len(someY))
    return accuracy


def predict(x, w):
    # argmax over the softmax scores gives each sample's predicted class
    out = np.array([np.argmax(row) for row in softmax(x.dot(w))])
    return out


def score(x, y, w):
    prediction = predict(x, w)
    acc = sum(prediction[i] == y[i] for i in range(len(y))) / len(y)
    return acc


if __name__ == "__main__":
    data, label = read_data.read_data_from_resource()
    # min-max normalization: without it the training does not converge well
    # and the loss blows up to nan
    min1 = np.min(data[:, 1])
    max1 = np.max(data[:, 1])
    min2 = np.min(data[:, 2])
    max2 = np.max(data[:, 2])
    m, n = np.shape(data)
    for i in range(m):
        data[i][1] = (data[i][1] - min1) / (max1 - min1)
        data[i][2] = (data[i][2] - min2) / (max2 - min2)
    w, losses, thetaList = softmaxRegressionSGD(data, label, 5000)
    print('Training Accuracy: ', getAccuracy(data, label, w))
    # plt.plot(losses)
    # plt.legend()
    # visualize the samples
    # plt.show()
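The `softmax` function called by `predict` is defined outside this excerpt. A minimal, numerically stable sketch (the subtract-the-row-max trick is an assumption about the implementation, not taken from the source) might look like:

```python
import numpy as np

def softmax(z):
    # subtract the row-wise max before exponentiating so large scores
    # do not overflow np.exp
    z = z - np.max(z, axis=1, keepdims=True)
    exp_z = np.exp(z)
    return exp_z / np.sum(exp_z, axis=1, keepdims=True)

scores = np.array([[2.0, 1.0, 0.1],
                   [0.5, 2.5, 1.0]])
probs = softmax(scores)           # each row sums to 1
preds = np.argmax(probs, axis=1)  # predicted class per row
```

The shift by the row maximum does not change the result, since softmax is invariant to adding a constant to every score in a row.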
        error = classLabels[random] - h
        new_weights = weights + alpha * error * dataMatIn[random]
        weights = new_weights
        # for i in range(m):
        #     h = sigmoid(np.sum(new_weights * dataMatIn[i]))
        # loss = loss_funtion(dataMatIn, classLabels, weights)
        h = sigmoid(np.sum(new_weights * dataMatIn[random]))
        loss = loss_funtion(dataMatIn, classLabels, weights)
        loss_array.append(loss)
        theta_array.append(weights)
        iter_count += 1
    return weights, loss_array, theta_array


data, labels = read_data_from_resource("dataset1")
# min-max normalization: without it the training does not converge well
# and the loss blows up to nan
min1 = np.min(data[:, 1])
max1 = np.max(data[:, 1])
min2 = np.min(data[:, 2])
max2 = np.max(data[:, 2])
m, n = np.shape(data)
for i in range(m):
    data[i][1] = (data[i][1] - min1) / (max1 - min1)
    data[i][2] = (data[i][2] - min2) / (max2 - min2)
max_iters = 10000
weight, loss_array, theta_array = stocGradDescent(data, labels, max_iters)
print("theta is:", theta_array[-1])
print("loss:", loss_array[-1])
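`sigmoid` and `loss_funtion` are defined outside this excerpt; the signatures below mirror the calls above, but the bodies are a plausible sketch, not the author's code. A common choice is the logistic function with mean binary cross-entropy:

```python
import numpy as np

def sigmoid(z):
    # logistic function: maps any real input into (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def loss_funtion(dataMatIn, classLabels, weights):
    # assumed form: mean binary cross-entropy over all samples
    h = sigmoid(np.asarray(dataMatIn).dot(weights))
    eps = 1e-12  # keep log() away from zero
    y = np.asarray(classLabels)
    return -np.mean(y * np.log(h + eps) + (1 - y) * np.log(1 - h + eps))

X = np.array([[10.0], [-10.0]])
y = np.array([1.0, 0.0])
w = np.array([1.0])
loss = loss_funtion(X, y, w)  # near zero: predictions match the labels
```

Note the identifier keeps the original spelling `loss_funtion` so it matches the calls in the training loop above.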
    cost_array = []
    weight_array = []
    for epoch in range(epochs):
        z = np.dot(X, weight)
        activation = softmax(z)
        diff = activation - y
        grad = np.dot(X.T, diff)
        weight -= learning_rate * grad
        cost = compute_cost(X, y, weight)
        cost_array.append(cost)
        weight_array.append(np.array(weight).squeeze())
    return weight, cost_array, weight_array


# changed here: data.insert(0, 1.0)
data, label = read_data_from_resource("dataset2")
# min-max normalization: without it the training does not converge well
# and the loss blows up to nan
min1 = np.min(data[:, 1])
max1 = np.max(data[:, 1])
min2 = np.min(data[:, 2])
max2 = np.max(data[:, 2])
m, n = np.shape(data)
for i in range(m):
    data[i][1] = (data[i][1] - min1) / (max1 - min1)
    data[i][2] = (data[i][2] - min2) / (max2 - min2)
labels = one_hot_encoder(label, m)
learned_w, cost_array, weight_array = softmax_gradient_descent(data, labels)
print("Learned weights:")
print(learned_w)
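`one_hot_encoder` is called above but not shown. A minimal sketch matching the `(label, m)` call signature, assuming `label` holds integer class ids for `m` samples:

```python
import numpy as np

def one_hot_encoder(label, m):
    # label: integer class ids, m: number of samples
    label = np.asarray(label, dtype=int)
    num_classes = int(label.max()) + 1
    onehot = np.zeros((m, num_classes))
    # set a single 1 per row at the column of that sample's class
    onehot[np.arange(m), label] = 1.0
    return onehot

y = one_hot_encoder([0, 2, 1], 3)
```

The one-hot matrix is what makes `activation - y` in the gradient step above a valid per-class error term.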
        else:
            xcord2.append(dataArr[i, 1])
            ycord2.append(dataArr[i, 2])
    fig = plt.figure()
    ax = fig.add_subplot(111)
    ax.scatter(xcord1, ycord1, s=30, c='red', marker='s')
    ax.scatter(xcord2, ycord2, s=30, c='green')
    x = np.arange(-0.1, 1.0, 0.01)
    # decision boundary: w0 + w1*x + w2*y = 0  =>  y = (-w0 - w1*x) / w2
    y = (-weights[0] - weights[1] * x) / weights[2]
    ax.plot(x, y)
    plt.xlabel('X')
    plt.ylabel('Y')
    plt.show()


dataMat, labelMat = read_data_from_resource()
# min-max normalization: without it the training does not converge well
# and the loss blows up to nan
min1 = np.min(dataMat[:, 1])
max1 = np.max(dataMat[:, 1])
min2 = np.min(dataMat[:, 2])
max2 = np.max(dataMat[:, 2])
m, n = np.shape(dataMat)
for i in range(m):
    dataMat[i][1] = (dataMat[i][1] - min1) / (max1 - min1)
    dataMat[i][2] = (dataMat[i][2] - min2) / (max2 - min2)
dataArr = np.array(dataMat)
weights, losst, loss_array, theta_array = gradDescent(dataArr, labelMat)
print("weight:", weights)
print("loss:", losst)
# visualize the data
plotBestFit(dataArr, labelMat, weights)
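The same element-wise min-max loop appears in every script above. A vectorized helper (the name `minmax_normalize` and the default columns are assumptions; the scripts normalize columns 1 and 2, leaving the bias column 0 alone) would avoid the repetition:

```python
import numpy as np

def minmax_normalize(data, cols=(1, 2)):
    # scale each selected column into [0, 1]; without this the gradient
    # updates can overflow and the loss becomes nan
    data = np.asarray(data, dtype=float).copy()
    for c in cols:
        lo, hi = data[:, c].min(), data[:, c].max()
        data[:, c] = (data[:, c] - lo) / (hi - lo)
    return data

raw = np.array([[1.0, 10.0, 100.0],
                [1.0, 20.0, 300.0],
                [1.0, 30.0, 200.0]])
scaled = minmax_normalize(raw)
```

Returning a copy keeps the raw data intact, unlike the in-place loops above.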