def check_rtol(self, x, y):
    x_cpu = cuda.to_cpu(x)
    y_cpu = cuda.to_cpu(y)
    max_ratio = numpy.max(numpy.abs(x_cpu - y_cpu) / y_cpu)
    with self.assertRaises(AssertionError):
        testing.assert_allclose(x, y, atol=0, rtol=max_ratio - 1)
    testing.assert_allclose(x, y, atol=0, rtol=max_ratio + 1)
def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('chainermodel')
    parser.add_argument('--gpu', type=int, default=0)
    args = parser.parse_args()
    chainermodel = args.chainermodel
    gpu = args.gpu

    save_dir = 'test_result'
    if not osp.exists(save_dir):
        os.makedirs(save_dir)

    dataset = apc2016.APC2016Dataset()
    n_class = len(dataset.target_names)
    model = VGG16(n_class=n_class)
    S.load_hdf5(chainermodel, model)
    if gpu != -1:
        model.to_gpu(gpu)

    batch_size = 25
    sum_accuracy = 0
    label_true_all = []
    label_pred_all = []
    for index_start in range(0, len(dataset.test), batch_size):
        indices = range(index_start,
                        min(len(dataset.test), index_start + batch_size))
        x, t = dataset.next_batch(batch_size, type='test',
                                  type_indices=indices)
        if gpu != -1:
            x = cuda.to_gpu(x, gpu)
            t = cuda.to_gpu(t, gpu)
        x = Variable(x, volatile=True)
        t = Variable(t, volatile=True)
        model(x, t)

        x_data = cuda.to_cpu(x.data)
        accuracy = float(cuda.to_cpu(model.acc.data))
        sum_accuracy += accuracy * len(x_data)
        label_true = cuda.to_cpu(t.data)
        label_pred = cuda.to_cpu(model.pred.data).argmax(axis=1)
        label_true_all.extend(label_true.tolist())
        label_pred_all.extend(label_pred.tolist())

        fname = '{0}_{1}-{2}_{3:.2}.png'.format(
            osp.basename(chainermodel), indices[0], indices[-1], accuracy)
        fname = osp.join(save_dir, fname)
        draw_test_result(dataset, fname, x_data, label_true, label_pred,
                         n_class)
        print('Saved {0}.'.format(fname))

    mean_accuracy = sum_accuracy / len(dataset.test)
    print('Accuracy: {0}'.format(mean_accuracy))
    print(classification_report(
        y_true=label_true_all,
        y_pred=label_pred_all,
        labels=np.arange(len(dataset.target_names)),
        target_names=dataset.target_names,
    ))
def check_backward(update, atom_data, adj_data, y_grad):
    """Check the gradient of GGNNUpdate.

    This function differs from other backward tests: because of the GRU,
    the ``reset_state`` method has to be called explicitly before each
    gradient calculation.

    Args:
        update (callable): GGNNUpdate link under test.
        atom_data (numpy.ndarray): Input atom features.
        adj_data (numpy.ndarray): Adjacency matrix.
        y_grad (numpy.ndarray): Gradient with respect to the output.
    """
    atom = chainer.Variable(atom_data)
    adj = chainer.Variable(adj_data)
    update.reset_state()
    y = update(atom, adj)
    y.grad = y_grad
    y.backward()

    def f():
        update.reset_state()
        return update(atom_data, adj_data).data,

    gx, = gradient_check.numerical_grad(f, (atom.data,), (y.grad,))
    numpy.testing.assert_allclose(
        cuda.to_cpu(gx), cuda.to_cpu(atom.grad), atol=1e-3, rtol=1e-3)
def generate_sample(state, primetext):
    prev_char = np.array([0], dtype=np.int32)
    if args.gpu >= 0:
        prev_char = cuda.to_gpu(prev_char)

    if len(primetext) > 0:
        for i in primetext:
            sys.stdout.write(i)
            prev_char = np.ones((1,)).astype(np.int32) * vocab[i]
            if args.gpu >= 0:
                prev_char = cuda.to_gpu(prev_char)
            state, prob = model.predict(prev_char, state)

    for i in range(args.length):
        state, prob = model.predict(prev_char, state)
        if args.sample > 0:
            probability = cuda.to_cpu(prob.data)[0].astype(np.float64)
            probability /= np.sum(probability)
            index = np.random.choice(range(len(probability)), p=probability)
        else:
            index = np.argmax(cuda.to_cpu(prob.data))
        sys.stdout.write(ivocab[index])
        prev_char = np.array([index], dtype=np.int32)
        if args.gpu >= 0:
            prev_char = cuda.to_gpu(prev_char)
    return state
def check_atol(self, x, y):
    x_cpu = cuda.to_cpu(x)
    y_cpu = cuda.to_cpu(y)
    max_abs_diff = numpy.max(numpy.abs(x_cpu - y_cpu))
    with self.assertRaises(AssertionError):
        testing.assert_allclose(x, y, atol=max_abs_diff - 1, rtol=0)
    testing.assert_allclose(x, y, atol=max_abs_diff + 1, rtol=0)
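# The tolerance-boundary pattern used by check_atol can be sketched with
# plain NumPy (hypothetical arrays; np.allclose stands in for chainer's
# testing.assert_allclose):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([1.5, 2.0, 2.2])
max_abs_diff = np.max(np.abs(x - y))  # largest element-wise difference

# Just below the largest difference the comparison must fail ...
assert not np.allclose(x, y, atol=max_abs_diff - 1e-6, rtol=0)
# ... and at (or above) it the comparison must pass.
assert np.allclose(x, y, atol=max_abs_diff, rtol=0)
```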
def check_backward(self, gpu):
    gx1, gx2 = self.f.backward((self.x1, self.x2), (self.gy1, self.gy2))
    self.assertEqual(self._get_method('backward', not gpu).call_count, 0)
    self._get_method('backward', gpu).assert_called_once_with(
        (self.x1, self.x2), (self.gy1, self.gy2))
    self.assertTrue((cuda.to_cpu(gx1) == cuda.to_cpu(self.gx1)).all())
    self.assertIsNone(gx2)
def train_epoch(train_data, train_labels, model, optimizer, batchsize,
                transformations, silent, gpu=0, finetune=False):
    N = train_data.shape[0]
    pbar = ProgressBar(0, N)
    perm = np.random.permutation(N)
    sum_accuracy = 0
    sum_loss = 0

    for i in range(0, N, batchsize):
        x_batch = train_data[perm[i:i + batchsize]]
        y_batch = train_labels[perm[i:i + batchsize]]

        if transformations is not None:
            if "rotation" == transformations:
                x_batch = rotate_transform_batch(x_batch,
                                                 rotation=2 * np.pi)

        if gpu >= 0:
            x_batch = cuda.to_gpu(x_batch.astype(np.float32))
            y_batch = cuda.to_gpu(y_batch.astype(np.int32))

        optimizer.zero_grads()
        x = Variable(x_batch)
        t = Variable(y_batch)

        loss, acc = model(x, t, train=True, finetune=finetune)

        if not finetune:
            loss.backward()
            optimizer.update()

        sum_loss += float(cuda.to_cpu(loss.data)) * y_batch.size
        sum_accuracy += float(cuda.to_cpu(acc.data)) * y_batch.size
        if not silent:
            pbar.update(i + y_batch.size)

    return sum_loss, sum_accuracy
def check_non_maximum_suppression(self, bbox, threshold, expect):
    selec = non_maximum_suppression(bbox, threshold)
    self.assertIsInstance(selec, type(bbox))
    self.assertEqual(selec.dtype, np.int32)
    np.testing.assert_equal(
        cuda.to_cpu(selec), cuda.to_cpu(expect))
def check_proposal_target_creator(
        self, bbox, label, roi, proposal_target_creator):
    xp = cuda.get_array_module(roi)
    sample_roi, gt_roi_loc, gt_roi_label = \
        proposal_target_creator(roi, bbox, label)

    # Test types
    self.assertIsInstance(sample_roi, xp.ndarray)
    self.assertIsInstance(gt_roi_loc, xp.ndarray)
    self.assertIsInstance(gt_roi_label, xp.ndarray)

    sample_roi = cuda.to_cpu(sample_roi)
    gt_roi_loc = cuda.to_cpu(gt_roi_loc)
    gt_roi_label = cuda.to_cpu(gt_roi_label)

    # Test shapes
    self.assertEqual(sample_roi.shape, (self.n_sample, 4))
    self.assertEqual(gt_roi_loc.shape, (self.n_sample, 4))
    self.assertEqual(gt_roi_label.shape, (self.n_sample,))

    # Test foreground and background labels
    np.testing.assert_equal(np.sum(gt_roi_label >= 0), self.n_sample)
    n_pos = np.sum(gt_roi_label >= 1)
    n_neg = np.sum(gt_roi_label == 0)
    self.assertLessEqual(n_pos, self.n_sample * self.pos_ratio)
    self.assertLessEqual(n_neg, self.n_sample - n_pos)
def forward(x_data, y_data, print_conf_matrix=False):
    '''
    Forward pass of the neural net.
    :param x_data: input batch
    :param y_data: target labels
    :param print_conf_matrix: if True, print a confusion matrix
    :return: (softmax cross-entropy loss, accuracy)
    '''
    x, t = Variable(x_data), Variable(y_data)
    h1 = F.relu(model.l1(x))
    h1 = F.max_pooling_2d(h1, max_pool_window_1, stride=max_pool_stride_1)
    h2 = F.dropout(F.relu(model.l2(h1)))
    h2 = F.average_pooling_2d(h2, avg_pool_window_2, stride=avg_pool_stride_2)
    h2 = F.max_pooling_2d(h2, max_pool_window_2, stride=max_pool_stride_2)
    y = model.l3(h2)

    # display confusion matrix
    if print_conf_matrix:
        print(confusion_matrix(cuda.to_cpu(t.data),
                               cuda.to_cpu(y.data).argmax(axis=1)))

    return F.softmax_cross_entropy(y, t), F.accuracy(y, t)
def train_loop():
    while True:
        while data_q.empty():
            time.sleep(0.1)
        inp = data_q.get()
        if inp == 'end':  # quit
            res_q.put('end')
            break
        elif inp == 'train':  # restart training
            res_q.put('train')
            train = True
            continue
        elif inp == 'val':  # start validation
            res_q.put('val')
            pickle.dump(model, open('model', 'wb'), -1)
            train = False
            continue

        x, y = inp
        if args.gpu >= 0:
            x = cuda.to_gpu(x)
            y = cuda.to_gpu(y)

        if train:
            optimizer.zero_grads()
            loss, accuracy = model.forward(x, y)
            loss.backward()
            optimizer.update()
        else:
            loss, accuracy = model.forward(x, y, train=False)

        res_q.put((float(cuda.to_cpu(loss.data)),
                   float(cuda.to_cpu(accuracy.data))))
        del loss, accuracy, x, y
def check_forward(self, y_data, z_data):
    y = chainer.Variable(y_data)
    z = chainer.Variable(z_data)
    loss = functions.cross_covariance(y, z)
    self.assertEqual(loss.data.shape, ())
    self.assertEqual(loss.data.dtype, numpy.float32)
    loss_value = float(cuda.to_cpu(loss.data))

    # Compute expected value
    y_data, z_data = cuda.to_cpu(y_data), cuda.to_cpu(z_data)
    y_mean = y_data.mean(axis=0)
    z_mean = z_data.mean(axis=0)
    N = y_data.shape[0]

    loss_expect = 0
    for i in six.moves.xrange(y_data.shape[1]):
        for j in six.moves.xrange(z_data.shape[1]):
            ij_loss = 0.
            for n in six.moves.xrange(N):
                ij_loss += (y_data[n, i] - y_mean[i]) * (
                    z_data[n, j] - z_mean[j])
            ij_loss /= N
            loss_expect += ij_loss ** 2
    loss_expect *= 0.5

    self.assertAlmostEqual(loss_expect, loss_value, places=5)
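# The triple loop above computes 0.5 * ||C||_F^2, where C is the
# cross-covariance matrix of y and z. A vectorized NumPy sketch of the same
# expected value (hypothetical random inputs, not the test's fixtures):

```python
import numpy as np

rng = np.random.RandomState(0)
y = rng.rand(8, 3)   # (N, dim_y)
z = rng.rand(8, 4)   # (N, dim_z)
N = y.shape[0]

# Cross-covariance matrix C[i, j] = E[(y_i - E y_i)(z_j - E z_j)]
C = (y - y.mean(axis=0)).T.dot(z - z.mean(axis=0)) / N
loss_vec = 0.5 * np.sum(C ** 2)

# Loop version, mirroring the test's expected-value computation
loss_loop = 0.0
for i in range(y.shape[1]):
    for j in range(z.shape[1]):
        ij = np.mean((y[:, i] - y[:, i].mean()) * (z[:, j] - z[:, j].mean()))
        loss_loop += ij ** 2
loss_loop *= 0.5

assert np.isclose(loss_vec, loss_loop)
```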
def check_backward(self, y_grad):
    self.func.gW.fill(0)
    y = self.func()
    y.grad = y_grad
    y.backward()
    self.assertTrue(
        (cuda.to_cpu(y_grad) == cuda.to_cpu(self.func.gW)).all())
def check_concat_tuples_padding(self, xp):
    tuples = [
        (xp.random.rand(3, 4), xp.random.rand(2, 5)),
        (xp.random.rand(4, 4), xp.random.rand(3, 4)),
        (xp.random.rand(2, 5), xp.random.rand(2, 6)),
    ]
    arrays = dataset.concat_examples(tuples, padding=0)
    self.assertEqual(len(arrays), 2)
    self.assertEqual(arrays[0].shape, (3, 4, 5))
    self.assertEqual(arrays[1].shape, (3, 3, 6))
    self.assertEqual(type(arrays[0]), type(tuples[0][0]))
    self.assertEqual(type(arrays[1]), type(tuples[0][1]))

    for i in range(len(tuples)):
        tuples[i] = cuda.to_cpu(tuples[i][0]), cuda.to_cpu(tuples[i][1])
    arrays = tuple(cuda.to_cpu(array) for array in arrays)
    numpy.testing.assert_array_equal(arrays[0][0, :3, :4], tuples[0][0])
    numpy.testing.assert_array_equal(arrays[0][0, 3:, :], 0)
    numpy.testing.assert_array_equal(arrays[0][0, :, 4:], 0)
    numpy.testing.assert_array_equal(arrays[0][1, :4, :4], tuples[1][0])
    numpy.testing.assert_array_equal(arrays[0][1, :, 4:], 0)
    numpy.testing.assert_array_equal(arrays[0][2, :2, :5], tuples[2][0])
    numpy.testing.assert_array_equal(arrays[0][2, 2:, :], 0)
    numpy.testing.assert_array_equal(arrays[1][0, :2, :5], tuples[0][1])
    numpy.testing.assert_array_equal(arrays[1][0, 2:, :], 0)
    numpy.testing.assert_array_equal(arrays[1][0, :, 5:], 0)
    numpy.testing.assert_array_equal(arrays[1][1, :3, :4], tuples[1][1])
    numpy.testing.assert_array_equal(arrays[1][1, 3:, :], 0)
    numpy.testing.assert_array_equal(arrays[1][1, :, 4:], 0)
    numpy.testing.assert_array_equal(arrays[1][2, :2, :6], tuples[2][1])
    numpy.testing.assert_array_equal(arrays[1][2, 2:, :], 0)
def check_concat_dicts_padding(self, xp):
    dicts = [
        {'x': xp.random.rand(3, 4), 'y': xp.random.rand(2, 5)},
        {'x': xp.random.rand(4, 4), 'y': xp.random.rand(3, 4)},
        {'x': xp.random.rand(2, 5), 'y': xp.random.rand(2, 6)},
    ]
    arrays = dataset.concat_examples(dicts, padding=0)
    self.assertIn('x', arrays)
    self.assertIn('y', arrays)
    self.assertEqual(arrays['x'].shape, (3, 4, 5))
    self.assertEqual(arrays['y'].shape, (3, 3, 6))
    self.assertEqual(type(arrays['x']), type(dicts[0]['x']))
    self.assertEqual(type(arrays['y']), type(dicts[0]['y']))

    for d in dicts:
        d['x'] = cuda.to_cpu(d['x'])
        d['y'] = cuda.to_cpu(d['y'])
    arrays = {'x': cuda.to_cpu(arrays['x']),
              'y': cuda.to_cpu(arrays['y'])}
    numpy.testing.assert_array_equal(arrays['x'][0, :3, :4], dicts[0]['x'])
    numpy.testing.assert_array_equal(arrays['x'][0, 3:, :], 0)
    numpy.testing.assert_array_equal(arrays['x'][0, :, 4:], 0)
    numpy.testing.assert_array_equal(arrays['x'][1, :4, :4], dicts[1]['x'])
    numpy.testing.assert_array_equal(arrays['x'][1, :, 4:], 0)
    numpy.testing.assert_array_equal(arrays['x'][2, :2, :5], dicts[2]['x'])
    numpy.testing.assert_array_equal(arrays['x'][2, 2:, :], 0)
    numpy.testing.assert_array_equal(arrays['y'][0, :2, :5], dicts[0]['y'])
    numpy.testing.assert_array_equal(arrays['y'][0, 2:, :], 0)
    numpy.testing.assert_array_equal(arrays['y'][0, :, 5:], 0)
    numpy.testing.assert_array_equal(arrays['y'][1, :3, :4], dicts[1]['y'])
    numpy.testing.assert_array_equal(arrays['y'][1, 3:, :], 0)
    numpy.testing.assert_array_equal(arrays['y'][1, :, 4:], 0)
    numpy.testing.assert_array_equal(arrays['y'][2, :2, :6], dicts[2]['y'])
    numpy.testing.assert_array_equal(arrays['y'][2, 2:, :], 0)
def check_forward(self, x_data):
    x = chainer.Variable(x_data)
    y = functions.expand_dims(x, self.axis)
    self.assertEqual(y.data.shape, self.out_shape)
    y_expect = numpy.expand_dims(cuda.to_cpu(x_data), self.axis)
    self.assertEqual(y.data.dtype, self.dtype)
    numpy.testing.assert_array_equal(cuda.to_cpu(y.data), y_expect)
def test_nn_regression(model, x_test, y_test):
    """
    Evaluate a regression-type NN.
    @param model: NN model object
    @param x_test: test-data features
    @param y_test: test-data targets
    @return: list of predicted values, list of losses
    """
    # number of test samples
    test_sample_size = len(x_test)
    sum_loss = 0

    # predicted values and their losses
    predicted_value_list = []
    loss_list = []

    for i in range(test_sample_size):
        x_batch = x_test[i:i + 1]
        y_batch = y_test[i:i + 1]

        # forward pass to compute the loss and the predicted value
        loss, predicted = forward_data_regression(
            model, x_batch, y_batch, train=False)

        # store the results
        predicted_value_list.append(float(cuda.to_cpu(predicted.data)))
        loss_list.append(float(cuda.to_cpu(loss.data)))
        sum_loss += float(cuda.to_cpu(loss.data))

    return predicted_value_list, loss_list
def check_equivariance(im, layers, input_array, output_array, point_group):
    # Transform the image
    f = input_array(im)
    g = point_group.rand()
    gf = g * f
    im1 = gf.v

    # Apply layers to both images
    im = Variable(cuda.to_gpu(im))
    im1 = Variable(cuda.to_gpu(im1))

    fmap = im
    fmap1 = im1
    for layer in layers:
        layer.to_gpu()
        fmap = layer(fmap)
        fmap1 = layer(fmap1)

    # Transform the computed feature maps
    fmap1_garray = output_array(cuda.to_cpu(fmap1.data))
    r_fmap1_data = (g.inv() * fmap1_garray).v

    fmap_data = cuda.to_cpu(fmap.data)
    assert np.allclose(fmap_data, r_fmap1_data, rtol=1e-5, atol=1e-3)
def check_bbox_loc_conversions_consistency(
        self, src_bbox, dst_bbox):
    bbox = bbox2loc(src_bbox, dst_bbox)
    out_raw_bbox = loc2bbox(src_bbox, bbox)
    np.testing.assert_almost_equal(
        cuda.to_cpu(out_raw_bbox), cuda.to_cpu(dst_bbox), decimal=5)
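# The test above relies on loc2bbox inverting bbox2loc. A self-contained
# NumPy sketch of that round-trip property, using the standard Faster R-CNN
# box parameterization (my own minimal encode/decode, not the library's
# implementation):

```python
import numpy as np

def bbox2loc(src, dst):
    # Encode dst boxes as offsets/scales relative to src boxes.
    # Boxes are rows of (y_min, x_min, y_max, x_max).
    sh, sw = src[:, 2] - src[:, 0], src[:, 3] - src[:, 1]
    sy, sx = src[:, 0] + 0.5 * sh, src[:, 1] + 0.5 * sw
    dh, dw = dst[:, 2] - dst[:, 0], dst[:, 3] - dst[:, 1]
    dy, dx = dst[:, 0] + 0.5 * dh, dst[:, 1] + 0.5 * dw
    return np.stack([(dy - sy) / sh, (dx - sx) / sw,
                     np.log(dh / sh), np.log(dw / sw)], axis=1)

def loc2bbox(src, loc):
    # Decode: the exact inverse of bbox2loc above.
    sh, sw = src[:, 2] - src[:, 0], src[:, 3] - src[:, 1]
    sy, sx = src[:, 0] + 0.5 * sh, src[:, 1] + 0.5 * sw
    cy = loc[:, 0] * sh + sy
    cx = loc[:, 1] * sw + sx
    h = np.exp(loc[:, 2]) * sh
    w = np.exp(loc[:, 3]) * sw
    return np.stack([cy - 0.5 * h, cx - 0.5 * w,
                     cy + 0.5 * h, cx + 0.5 * w], axis=1)

src = np.array([[0., 0., 10., 10.]])
dst = np.array([[2., 3., 8., 9.]])
loc = bbox2loc(src, dst)
np.testing.assert_almost_equal(loc2bbox(src, loc), dst, decimal=5)
```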
def xcode(model, train, encode=True):
    loss = 0
    if not train:
        ys, ts = [], []
    last_w = None
    for i in range(len(batch[0]) - 1):
        w, next_w = Variable(batch[:, i]), Variable(batch[:, i + 1])
        if encode:
            y = model.encode(w, train=train)
        else:
            y = model.decode(w, train=train)
        loss += F.softmax_cross_entropy(y, next_w)
        if not train:
            ys.append(xp.argmax(y.data, axis=1))
            ts.append(next_w.data)
        last_w = next_w

    # process last words
    if encode:
        model.encode(last_w, train=train)
    else:
        model.decode(last_w, train=train)

    if train:
        return loss
    else:
        ys = xp.vstack(ys).T
        ts = xp.vstack(ts).T
        if use_gpu:
            ys = cuda.to_cpu(ys)
            ts = cuda.to_cpu(ts)
        return loss, ys, ts
def check_call(self, x_data):
    x = chainer.Variable(x_data)
    actual = self.mlp(x)
    act = functions.sigmoid
    expect = self.mlp[2](act(self.mlp[1](act(self.mlp[0](x)))))
    numpy.testing.assert_array_equal(
        cuda.to_cpu(expect.data), cuda.to_cpu(actual.data))
def check_anchor_target_creator(
        self, anchor_target_layer, bbox, anchor, img_size):
    xp = cuda.get_array_module(bbox)
    loc, label = self.anchor_target_layer(bbox, anchor, img_size)

    # Test types
    self.assertIsInstance(loc, xp.ndarray)
    self.assertIsInstance(label, xp.ndarray)

    # Test shapes
    self.assertEqual(loc.shape, (self.n_anchor, 4))
    self.assertEqual(label.shape, (self.n_anchor,))

    # Test dtypes
    self.assertEqual(loc.dtype, np.float32)
    self.assertEqual(label.dtype, np.int32)

    # Test ratio of foreground and background labels
    np.testing.assert_equal(
        cuda.to_cpu(utils.force_array(xp.sum(label >= 0))),
        self.n_sample)
    n_pos = cuda.to_cpu(utils.force_array(xp.sum(label == 1)))
    n_neg = cuda.to_cpu(utils.force_array(xp.sum(label == 0)))
    self.assertLessEqual(n_pos, self.n_sample * self.pos_ratio)
    self.assertLessEqual(n_neg, self.n_sample - n_pos)
def r2_score(pred, true, sample_weight=None, multioutput="uniform_average",
             ignore_nan=False):
    pred = cuda.to_cpu(pred)
    true = cuda.to_cpu(true)
    diff = pred - true
    dev = true - numpy.mean(true, axis=0)
    if ignore_nan:
        diff[numpy.isnan(diff)] = 0.
        dev[numpy.isnan(dev)] = 0.
    SS_res = numpy.asarray(numpy.sum(diff ** 2, axis=0))
    SS_tot = numpy.asarray(numpy.sum(dev ** 2, axis=0))

    if multioutput == 'uniform_average':
        if numpy.any(SS_tot == 0):
            return 0.0
        else:
            return (1 - SS_res / SS_tot).mean()
    elif multioutput == 'raw_values':
        if numpy.any(SS_tot == 0):
            # Assign a dummy value to avoid zero division
            SS_tot_iszero = SS_tot == 0
            SS_tot[SS_tot_iszero] = 1
            return numpy.where(SS_tot_iszero, 0.0,
                               1 - SS_res / SS_tot)
        else:
            return 1 - SS_res / SS_tot
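# The core of r2_score is the standard coefficient-of-determination formula
# R^2 = 1 - SS_res / SS_tot. A minimal NumPy sanity check with hypothetical
# data (cuda.to_cpu is a no-op for NumPy input, so it is omitted here):

```python
import numpy as np

true = np.array([1.0, 2.0, 3.0, 4.0])
pred = np.array([1.1, 1.9, 3.2, 3.8])

ss_res = np.sum((pred - true) ** 2)         # residual sum of squares
ss_tot = np.sum((true - true.mean()) ** 2)  # total sum of squares
r2 = 1 - ss_res / ss_tot

assert np.isclose(r2, 0.98)
```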
def __forward_word(self, trg_batch, encdec, is_training, generation_limit):
    trg_stoi = self.trg_vocab.stoi
    trg_itos = self.trg_vocab.itos
    t = XP.iarray([trg_stoi('<s>') for _ in range(self.batch_size)])
    hyp_batch = [[] for _ in range(self.batch_size)]
    trg_len = len(trg_batch[0]) if trg_batch else 0

    if is_training:
        loss = XP.fzeros(())
        for l in range(trg_len):
            y = encdec.decode(t)
            t = XP.iarray([trg_stoi(trg_batch[k][l])
                           for k in range(self.batch_size)])
            loss += functions.softmax_cross_entropy(y, t)
            output = cuda.to_cpu(y.data.argmax(1))
            for k in range(self.batch_size):
                hyp_batch[k].append(trg_itos(output[k]))
        return loss, hyp_batch
    else:
        while len(hyp_batch[0]) < generation_limit:
            y = encdec.decode(t)
            output = cuda.to_cpu(y.data.argmax(1))
            t = self.common_function.my_array(output, np.int32)
            for k in range(self.batch_size):
                hyp_batch[k].append(trg_itos(output[k]))
            if all(hyp_batch[k][-1] == '</s>'
                   for k in range(self.batch_size)):
                break
        return hyp_batch
def process_state():
    # Build up the state from the prime text
    for index, term in [(vocab.get(term), term) for term in TextList.data]:
        print(term, end="")
        char = np.array([index], dtype=np.int32)
        TextList.state, prob = model.forward_one_step(
            char, char, TextList.state, train=False)
        probability = cuda.to_cpu(prob.data)[0].astype(np.float64)
        probability /= np.sum(probability)
        prob_with_term = [[p, ivocab[e]] for e, p in enumerate(probability)]
        prob_with_term = sorted(prob_with_term, key=lambda x: -x[0])[:10]
        prob, term = prob_with_term[0]

    # Generate predictions from the state
    for _ in range(500):
        index = vocab.get(term)
        char = np.array([index], dtype=np.int32)
        TextList.state, prob = model.forward_one_step(
            char, char, TextList.state, train=False)
        probability = cuda.to_cpu(prob.data)[0].astype(np.float64)
        probability /= np.sum(probability)
        prob_with_term = [[p, ivocab[e]] for e, p in enumerate(probability)]
        prob_with_term = sorted(prob_with_term, key=lambda x: -x[0])[:10]
        # Update term with the most probable next token
        prob, term = prob_with_term[0]
        print(term, end="")

    return TextList.state
def check_forward(self, gpu):
    y1, y2 = self.f.forward((self.x1, self.x2))
    self.assertEqual(self.f.check_type_forward.call_count, 0)
    self.assertEqual(self._get_method("forward", not gpu).call_count, 0)
    self._get_method("forward", gpu).assert_called_once_with(
        (self.x1, self.x2))
    self.assertTrue((cuda.to_cpu(y1) == cuda.to_cpu(self.y1)).all())
    self.assertTrue((cuda.to_cpu(y2) == cuda.to_cpu(self.y2)).all())
def evaluate_results(x_test, y_test, N_test, batchsize, max_len,
                     print_conf_matrix=False):
    '''
    Evaluate model on test set.
    :param x_test: test features
    :param y_test: test labels
    :param N_test: number of test samples
    :param batchsize: minibatch size
    :param max_len: maximum sentence length
    :return: mean test accuracy
    '''
    # reshape data to match chainer format
    x_test = np.reshape(x_test,
                        (x_test.shape[0], 1, word_vector_size, max_len))

    # evaluation
    sum_accuracy = 0
    sum_loss = 0
    for i in range(0, N_test, batchsize):
        x_batch = x_test[i:i + batchsize]
        y_batch = y_test[i:i + batchsize]

        if args.gpu >= 0:
            x_batch = cuda.to_gpu(x_batch)
            y_batch = cuda.to_gpu(y_batch)

        loss, acc = forward(x_batch, y_batch,
                            print_conf_matrix=print_conf_matrix)
        sum_loss += float(cuda.to_cpu(loss.data)) * batchsize
        sum_accuracy += float(cuda.to_cpu(acc.data)) * batchsize

    print('test mean loss={}, accuracy={}'.format(
        sum_loss / N_test, sum_accuracy / N_test))
    return sum_accuracy / N_test
def check_forward(self, x_data, t_data):
    x = chainer.Variable(x_data)
    t = chainer.Variable(t_data)
    y = functions.select_item(x, t)
    y_exp = cuda.to_cpu(x_data)[range(t_data.size), cuda.to_cpu(t_data)]
    numpy.testing.assert_equal(cuda.to_cpu(y.data), y_exp)
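# select_item picks one element per row; the NumPy equivalent used as y_exp
# above is plain row-wise fancy indexing (hypothetical small arrays):

```python
import numpy as np

x = np.array([[0.1, 0.9, 0.0],
              [0.3, 0.2, 0.5]], dtype=np.float32)
t = np.array([1, 2], dtype=np.int32)

# Pick x[i, t[i]] for each row i
y = x[np.arange(t.size), t]
assert np.allclose(y, [0.9, 0.5])
```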
def test(self, x_l, y_l):
    y = F.softmax(self.mlp_enc(x_l, test=True))
    y_argmax = F.argmax(y, axis=1)
    acc = F.accuracy(y, y_l)
    y_l_cpu = cuda.to_cpu(y_l.data)
    y_argmax_cpu = cuda.to_cpu(y_argmax.data)

    # Confusion matrix
    cm = confusion_matrix(y_l_cpu, y_argmax_cpu)
    print(cm)

    # Wrongly classified samples
    idx = np.where(y_l_cpu != y_argmax_cpu)[0]

    # Generate and save
    x_rec = self.mlp_dec(y, test=True)
    save_incorrect_info(x_rec.data[idx, ], x_l.data[idx, ],
                        y.data[idx, ], y_l.data[idx, ])

    # Save model
    serializers.save_hdf5("./model/mlp_encdec.h5py", self.model)

    loss = self.forward_for_losses(x_l, y_l, None, test=True)  # only measure
    supervised_loss = loss
    return acc, supervised_loss
def put(self, data, label):
    if self.data is None:
        self.data = cuda.to_cpu(data)
        self.label = cuda.to_cpu(label)
    else:
        self.data = np.vstack([self.data, cuda.to_cpu(data)])
        self.label = np.hstack(
            [self.label, cuda.to_cpu(label)]).reshape((self.data.shape[0]))
def convert_to_NumpyArray(self, data):
    if self.xp == np:
        return data
    else:
        return cuda.to_cpu(data)
def __call__(self, gt_segments, anchor, sequence_length, seg_info):
    """Assign ground-truth supervision to a sampled subset of anchors.

    The types of the input arrays and the output arrays are the same.

    Here are the notations.

    * :math:`S` is the number of anchors.
    * :math:`R` is the number of bounding boxes.

    Args:
        gt_segments (array): Coordinates of ground-truth segments.
            Its shape is :math:`(B, R, 2)`.
        anchor (array): Coordinates of anchors. Its shape is
            :math:`(K*A, 2)`.
        sequence_length (int): Length :obj:`W` of the video frames.
        seg_info (np.ndarray): Array of shape :math:`(B, 2)`; its columns
            are the AU group id and the ground-truth box count per timeline.

    Returns:
        (array, array):

        * **loc**: Offsets and scales to match the anchors to \
          the ground-truth segments. Its shape is :math:`(S, 2)`.
        * **label**: Labels of anchors with values \
          :obj:`(1=positive, 0=negative, -1=ignore)`. Its shape \
          is :math:`(S,)`.
    """
    xp = cuda.get_array_module(gt_segments)
    gt_segments = cuda.to_cpu(gt_segments)
    anchor = cuda.to_cpu(anchor)
    seg_info = cuda.to_cpu(seg_info)
    assert seg_info.ndim == 2
    mini_batch_size = gt_segments.shape[0]

    n_anchor = len(anchor)  # W x A
    # Every batch element must share the same sequence_length.
    inside_index = _get_inside_index(anchor, sequence_length)
    anchor = anchor[inside_index]  # shape = (inside_num, 2)
    batch_labels = []
    batch_locs = []
    for b_id in range(mini_batch_size):
        _, gt_seg_num = seg_info[b_id]
        gt_seg = gt_segments[b_id][:gt_seg_num]
        # label is 1/0 depending on whether an object lies inside;
        # argmax_ious is, for each anchor, the index of the gt segment
        # with the highest IoU.
        argmax_ious, label = self._create_label(
            inside_index, anchor, gt_seg)

        # Compute the segment regression targets (encoded offsets).
        loc = encode_segment_target(anchor, gt_seg[argmax_ious, :])

        # Map back up to the original set of anchors (length W x A);
        # inside_index shortened the arrays above.
        label = _unmap(label, n_anchor, inside_index, fill=-1)
        loc = _unmap(loc, n_anchor, inside_index, fill=0)
        if xp != np:
            loc = chainer.cuda.to_gpu(loc)
            label = chainer.cuda.to_gpu(label)
        batch_labels.append(label)
        batch_locs.append(loc)

    # shape = (S, 2); S is the total ground-truth segment count
    # across the batch.
    batch_locs = xp.concatenate(batch_locs, axis=0)
    # shape = (S,)
    batch_labels = xp.concatenate(batch_labels, axis=0)
    return batch_locs, batch_labels
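# The _unmap helper used above scatters the subsampled labels/locs back to
# the full anchor length. A minimal NumPy sketch of such a helper (my own,
# assuming the usual Faster R-CNN-style semantics, not the project's code):

```python
import numpy as np

def unmap(data, count, index, fill=0):
    # Scatter `data` (computed for a subset of rows) back into an array of
    # `count` rows, filling the untouched rows with `fill`.
    shape = (count,) + data.shape[1:]
    ret = np.full(shape, fill, dtype=data.dtype)
    ret[index] = data
    return ret

label = np.array([1, 0], dtype=np.int32)  # labels for the "inside" anchors
inside = np.array([1, 3])                 # their positions among 5 anchors
full = unmap(label, 5, inside, fill=-1)
assert full.tolist() == [-1, 1, -1, 0, -1]
```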
def fit(self, images):
    """
    Train PCANet

    Parameters
    ----------
    images: np.ndarray
        | Color / grayscale images of shape
        | (n_images, height, width, n_channels) or
        | (n_images, height, width)
    """
    images_trans = images
    images = self.process_input(images)
    # images.shape == (n_images, n_channels, y, x)

    for image in images:
        X = []
        for channel in image:
            patches = image_to_patch_vectors(
                channel, self.filter_shape_l1, self.step_shape_l1)
            X.append(patches)
        patches = np.hstack(X)
        # patches.shape == (n_patches, n_channels * vector length)
        self.pca_l1.partial_fit(patches)

    filters_l1 = components_to_filters(
        self.pca_l1.components_,
        n_channels=images.shape[1],
        filter_shape=self.filter_shape_l1,
    )

    if gpu_enabled():
        images = to_gpu(images)
        filters_l1 = to_gpu(filters_l1)

    images_l1 = convolution_2d(
        images.astype('float64'), filters_l1,
        stride=self.step_shape_l1).data

    if gpu_enabled():
        images_l1 = to_cpu(images_l1)
        filters_l1 = to_cpu(filters_l1)

    # images_l1.shape == (n_images, L1, y, x)
    images_l1 = images_l1.reshape(-1, *images_l1.shape[2:4])

    images_trans = xp.resize(
        images_trans,
        (images_trans.shape[0], images_l1.shape[1],
         images_l1.shape[2], 1))
    images_trans = images_trans.reshape(-1, *images_trans.shape[1:3])
    images = xp.vstack((images_l1, images_trans))

    for image in images:
        patches = image_to_patch_vectors(
            image, self.filter_shape_l2, self.step_shape_l2)
        self.pca_l2.partial_fit(patches)

    filters_l2 = components_to_filters(
        self.pca_l2.components_,
        n_channels=1,
        filter_shape=self.filter_shape_l2,
    )

    if gpu_enabled():
        images = to_gpu(images)
        filters_l2 = to_gpu(filters_l2)

    images_l2 = convolution_2d(
        images.reshape(images.shape[0], 1,
                       images.shape[1], images.shape[2]).astype('float64'),
        filters_l2, stride=self.step_shape_l2).data

    if gpu_enabled():
        images_l2 = to_cpu(images_l2)
        filters_l2 = to_cpu(filters_l2)

    images_l2 = images_l2.reshape(-1, *images_l2.shape[2:4])
    images_l1_trans = xp.resize(
        images_l1,
        (images_l1.shape[0], images_l2.shape[1], images_l2.shape[2]))
    images = xp.vstack((images_l2, images_l1_trans))

    for image in images:
        patches = image_to_patch_vectors(
            image, self.filter_shape_l3, self.step_shape_l3)
        self.pca_l3.partial_fit(patches)
    return self
def main():
    ###########################
    #### create dictionary ####
    ###########################
    if os.path.exists('./data/corpus/dictionary.dict'):
        corpus = JaConvCorpus(file_path=None, batch_size=batchsize,
                              size_filter=True)
        corpus.load(load_dir='./data/corpus/')
    else:
        corpus = JaConvCorpus(file_path=data_file, batch_size=batchsize,
                              size_filter=True)
        corpus.save(save_dir='./data/corpus/')
    print('Vocabulary Size (number of words) :', len(corpus.dic.token2id))

    ##################################
    #### create model (copy data) ####
    ##################################
    rough_model = './data/199_rough.model'
    model = Seq2Seq(len(corpus.dic.token2id), feature_num=feature_num,
                    hidden_num=hidden_num, batch_size=batchsize,
                    gpu_flg=args.gpu)
    serializers.load_hdf5(rough_model, model)
    if args.gpu >= 0:
        model.to_gpu()
    optimizer = optimizers.Adam(alpha=0.001)
    optimizer.setup(model)
    optimizer.add_hook(chainer.optimizer.GradientClipping(5))
    # optimizer.add_hook(chainer.optimizer.WeightDecay(0.0001))

    ##########################
    #### create ID corpus ####
    ##########################
    input_mat = []
    output_mat = []
    max_input_ren = max_output_ren = 0
    for input_text, output_text in zip(corpus.fine_posts, corpus.fine_cmnts):
        # convert to list
        input_text.reverse()  # encode words in a reverse order
        input_text.insert(0, corpus.dic.token2id["<eos>"])
        output_text.append(corpus.dic.token2id["<eos>"])

        # update max sentence length
        max_input_ren = max(max_input_ren, len(input_text))
        max_output_ren = max(max_output_ren, len(output_text))

        input_mat.append(input_text)
        output_mat.append(output_text)

    # padding
    for li in input_mat:
        insert_num = max_input_ren - len(li)
        for _ in range(insert_num):
            li.insert(0, corpus.dic.token2id['<pad>'])
    for li in output_mat:
        insert_num = max_output_ren - len(li)
        for _ in range(insert_num):
            li.append(corpus.dic.token2id['<pad>'])

    # create batch matrix
    input_mat = np.array(input_mat, dtype=np.int32).T
    output_mat = np.array(output_mat, dtype=np.int32).T

    # separate corpus into train and test
    perm = np.random.permutation(len(corpus.fine_posts))
    test_input_mat = input_mat[:, perm[0:testsize]]
    test_output_mat = output_mat[:, perm[0:testsize]]
    train_input_mat = input_mat[:, perm[testsize:]]
    train_output_mat = output_mat[:, perm[testsize:]]

    list_of_references = []
    for text_ndarray in test_output_mat.T:
        reference = text_ndarray.tolist()
        references = [[w_id for w_id in reference if w_id != -1]]
        list_of_references.append(references)

    #############################
    #### train seq2seq model ####
    #############################
    accum_loss = 0
    train_loss_data = []
    test_loss_data = []
    bleu_score_data = []
    wer_score_data = []
    for num, epoch in enumerate(range(n_epoch)):
        total_loss = test_loss = 0
        batch_num = 0
        perm = np.random.permutation(len(corpus.fine_posts) - testsize)

        # for training
        for i in range(0, len(corpus.fine_posts) - testsize, batchsize):
            # select batch data
            input_batch = train_input_mat[:, perm[i:i + batchsize]]
            output_batch = train_output_mat[:, perm[i:i + batchsize]]

            # Encode a sentence
            model.initialize()  # initialize cell
            model.encode(input_batch, train=True)

            # Decode from encoded context
            end_batch = xp.array(
                [corpus.dic.token2id["<start>"] for _ in range(batchsize)])
            first_words = output_batch[0]
            loss, predict_mat = model.decode(end_batch, first_words,
                                             train=True)
            next_ids = first_words
            accum_loss += loss
            for w_ids in output_batch[1:]:
                loss, predict_mat = model.decode(next_ids, w_ids, train=True)
                next_ids = w_ids
                accum_loss += loss

            # learn model
            model.cleargrads()     # set all gradients to zero
            accum_loss.backward()  # back propagation
            optimizer.update()
            total_loss += float(accum_loss.data)
            batch_num += 1
            print('Epoch: ', num, 'Batch_num', batch_num,
                  'batch loss: {:.2f}'.format(float(accum_loss.data)))
            accum_loss = 0

        # for testing
        list_of_hypotheses = []
        for i in range(0, testsize, batchsize):
            # select test batch data
            input_batch = test_input_mat[:, i:i + batchsize]
            output_batch = test_output_mat[:, i:i + batchsize]

            # Encode a sentence
            model.initialize()  # initialize cell
            model.encode(input_batch, train=True)

            # Decode from encoded context
            end_batch = xp.array(
                [corpus.dic.token2id["<start>"] for _ in range(batchsize)])
            first_words = output_batch[0]
            loss, predict_mat = model.decode(end_batch, first_words,
                                             train=True)
            next_ids = xp.argmax(predict_mat.data, axis=1)
            test_loss += loss.data
            if args.gpu >= 0:
                hypotheses = [cuda.to_cpu(next_ids)]
            else:
                hypotheses = [next_ids]
            for w_ids in output_batch[1:]:
                loss, predict_mat = model.decode(next_ids, w_ids, train=True)
                next_ids = xp.argmax(predict_mat.data, axis=1)
                test_loss += loss.data
                if args.gpu >= 0:
                    hypotheses.append(cuda.to_cpu(next_ids))
                else:
                    hypotheses.append(next_ids)

            # collect hypotheses for calculating BLEU score
            hypotheses = np.array(hypotheses).T
            for hypothesis in hypotheses:
                text_list = hypothesis.tolist()
                list_of_hypotheses.append(
                    [w_id for w_id in text_list if w_id != -1])

        # calculate BLEU score from test (develop) data
        bleu_score = nltk.translate.bleu_score.corpus_bleu(
            list_of_references, list_of_hypotheses,
            weights=(0.25, 0.25, 0.25, 0.25))
        bleu_score_data.append(bleu_score)
        print('Epoch: ', num, 'BLEU SCORE: ', bleu_score)

        # calculate WER score from test (develop) data
        wer_score = 0
        for index, references in enumerate(list_of_references):
            wer_score += wer(references[0], list_of_hypotheses[index])
        wer_score /= len(list_of_references)
        wer_score_data.append(wer_score)
        print('Epoch: ', num, 'WER SCORE: ', wer_score)

        # save model and optimizer
        if (epoch + 1) % 10 == 0:
            print('-----', epoch + 1, ' times -----')
            print('save the model and optimizer')
            serializers.save_hdf5('data/' + str(epoch) + '_fine.model', model)
            serializers.save_hdf5('data/' + str(epoch) + '_fine.state',
                                  optimizer)

        # display the on-going status
        print('Epoch: ', num,
              'Train loss: {:.2f}'.format(total_loss),
              'Test loss: {:.2f}'.format(float(test_loss)))
        train_loss_data.append(float(total_loss / batch_num))
        test_loss_data.append(float(test_loss))

        # evaluate the test loss: stop when every one of the last ten
        # losses increased over its predecessor
        check_loss = test_loss_data[-10:]
        end_flg = [j for j in range(len(check_loss) - 1)
                   if check_loss[j] < check_loss[j + 1]]
        if len(end_flg) >= 9:
            print('Probably it is over-fitting. So stop to learn...')
            break

    # save loss data
    with open('./data/fine_loss_train_data.pkl', 'wb') as f:
        pickle.dump(train_loss_data, f)
    with open('./data/fine_loss_test_data.pkl', 'wb') as f:
        pickle.dump(test_loss_data, f)
    with open('./data/fine_bleu_score_data.pkl', 'wb') as f:
        pickle.dump(bleu_score_data, f)
    with open('./data/fine_wer_score_data.pkl', 'wb') as f:
        pickle.dump(wer_score_data, f)
def check_clip_grads(self):
    self.optimizer.clip_grads(1.0)
    g = cuda.to_cpu(self.target.param.grad)
    sqnorm = g.dot(g)
    self.assertAlmostEqual(sqnorm, 1.0, delta=1.0e-5)
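# clip_grads rescales the gradient so its joint L2 norm does not exceed the
# threshold, which is why the squared norm equals 1.0 after clipping above.
# A minimal NumPy sketch of norm-based gradient clipping (my own helper, not
# the optimizer's implementation):

```python
import numpy as np

def clip_grads(grads, maxnorm):
    # Scale every gradient by maxnorm / ||g|| when the joint L2 norm
    # over all arrays exceeds maxnorm.
    sqnorm = sum(float(g.dot(g)) for g in grads)
    rate = maxnorm / np.sqrt(sqnorm)
    if rate < 1:
        for g in grads:
            g *= rate
    return grads

g = np.array([3.0, 4.0])          # ||g|| = 5
clip_grads([g], 1.0)
assert np.isclose(g.dot(g), 1.0)  # squared norm is now exactly the threshold
```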
def check_weight_decay(self):
    self.optimizer.weight_decay(0.1)

    g = cuda.to_cpu(self.target.param.grad)
    expect = np.array([0.0, 1.1, 2.2], dtype=np.float32)
    gradient_check.assert_allclose(g, expect)
def check_accumulate_grads_from_gpu(self, src_id):
    with cuda.Device(src_id):
        self.optimizer.accumulate_grads([cuda.cupy.arange(3)])

    grad = self.target.param.grad
    self.assertTrue((cuda.to_cpu(grad) == np.arange(3) * 2).all())
def check_accumulate_grads_from_cpu(self):
    self.optimizer.accumulate_grads([np.arange(3)])

    grad = self.target.param.grad
    self.assertTrue((cuda.to_cpu(grad) == np.arange(3) * 2).all())
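Both tests above expect a parameter gradient that starts at `arange(3)` to double after accumulating another `arange(3)`. A minimal numpy-only sketch of that accumulation semantics (the helper `accumulate_grads` here is a stand-in for the optimizer method, not the Chainer implementation):

```python
import numpy as np

def accumulate_grads(param_grad, extra_grads):
    """In-place gradient accumulation, as the optimizer tests above expect:
    each extra gradient array is added onto the existing one."""
    for g in extra_grads:
        param_grad += g
    return param_grad

grad = np.arange(3, dtype=np.float64)   # existing gradient [0, 1, 2]
accumulate_grads(grad, [np.arange(3)])  # add another [0, 1, 2]
assert (grad == np.arange(3) * 2).all()  # doubled, matching the tests
```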
def forward(self, word_list, gold_op_list, unary_limit):
    is_training = gold_op_list is not None

    # check args
    if len(word_list) < 1:
        raise ValueError('Word list is empty.')
    if is_training:
        n_shift = 0
        n_binary = 0
        for op, _ in gold_op_list:
            if op == OP_SHIFT:
                n_shift += 1
            if op == OP_BINARY:
                n_binary += 1
        if n_shift != len(word_list) or n_binary != len(word_list) - 1:
            raise ValueError(
                'Invalid operation number: SHIFT=%d (required: %d), '
                'BINARY=%d (required: %d)' %
                (n_shift, len(word_list), n_binary, len(word_list) - 1))
        if gold_op_list[-1] != (OP_FINISH, None):
            raise ValueError('Last operation is not OP_FINISH.')

    # default values
    EMBED_ZEROS = XP.fzeros((1, self.n_embed))
    QUEUE_ZEROS = XP.fzeros((1, self.n_queue))
    STACK_ZEROS = XP.fzeros((1, self.n_stack))
    SRSTATE_ZEROS = XP.fzeros((1, self.n_srstate))
    QUEUE_DEFAULT = ('', EMBED_ZEROS, QUEUE_ZEROS, QUEUE_ZEROS)
    STACK_DEFAULT = (None, STACK_ZEROS, STACK_ZEROS)
    NEG_INF = -1e20

    # word embedding
    x_list = [self.net_embed(XP.iarray([wid])) for _, wid in word_list]

    # forward encoding
    a_list = []
    ac = QUEUE_ZEROS
    a = QUEUE_ZEROS
    for x in x_list:
        ac, a = self.net_forward(ac, x, a)
        a_list.append(a)

    # backward encoding
    b_list = []
    bc = QUEUE_ZEROS
    b = QUEUE_ZEROS
    for x in reversed(x_list):
        bc, b = self.net_backward(bc, x, b)
        b_list.insert(0, b)

    q_list = [(text, x, a, b)
              for (text, _), x, a, b in zip(word_list, x_list, a_list, b_list)]

    # estimate
    s_list = []
    zc = SRSTATE_ZEROS
    z = SRSTATE_ZEROS
    unary_chain = 0
    if is_training:
        loss = XP.fzeros(())

    for i in itertools.count():
        text, x, a, b = q_list[0] if q_list else QUEUE_DEFAULT
        t1, sc1, s1 = s_list[-1] if s_list else STACK_DEFAULT
        t2, sc2, s2 = s_list[-2] if len(s_list) >= 2 else STACK_DEFAULT
        t3, sc3, s3 = s_list[-3] if len(s_list) >= 3 else STACK_DEFAULT
        zc, z = self.net_sr(zc, a, b, s1, s2, z)
        o = self.net_operation(z)

        if is_training:
            loss += functions.softmax_cross_entropy(
                o, XP.iarray([gold_op_list[i][0]]))
            o_argmax = gold_op_list[i][0]
        else:
            o_filter = [0.0 for _ in range(NUM_OP)]
            filtered = 0
            if not q_list:
                o_filter[OP_SHIFT] = NEG_INF
                filtered += 1
            if not s_list or unary_chain >= unary_limit:
                o_filter[OP_UNARY] = NEG_INF
                filtered += 1
            if len(s_list) < 2:
                o_filter[OP_BINARY] = NEG_INF
                filtered += 1
            if q_list or len(s_list) > 1:
                o_filter[OP_FINISH] = NEG_INF
            if filtered == NUM_OP:
                raise RuntimeError('No possible operation!')
            o += XP.farray([o_filter])
            o_argmax = int(cuda.to_cpu(o.data.argmax(1)))

        if o_argmax == OP_SHIFT:
            t0 = Tree(None, [text])
            sc0, s0 = (STACK_ZEROS, self.net_shift(x, a, b, s1, z))
            q_list.pop(0)
            unary_chain = 0
            label = self.net_semiterminal(s0)
        elif o_argmax == OP_UNARY:
            t0 = Tree(None, [t1])
            sc0, s0 = self.net_unary(sc1, a, b, s1, s2, z)
            s_list.pop()
            unary_chain += 1
            label = self.net_phrase(s0)
        elif o_argmax == OP_BINARY:
            t0 = Tree(None, [t2, t1])
            sc0, s0 = self.net_binary(sc1, sc2, a, b, s1, s2, s3, z)
            s_list.pop()
            s_list.pop()
            unary_chain = 0
            label = self.net_phrase(s0)
        else:  # OP_FINISH
            break

        if is_training:
            loss += functions.softmax_cross_entropy(
                label, XP.iarray([gold_op_list[i][1]]))
            label_argmax = gold_op_list[i][1]
        else:
            label_argmax = int(cuda.to_cpu(label.data.argmax(1)))

        t0.set_label(label_argmax)
        s_list.append((t0, sc0, s0))
        '''
        if is_training:
            o_est = int(cuda.to_cpu(o.data.argmax(1)))
            label_est = int(cuda.to_cpu(label.data.argmax(1)))
            trace('%c %c gold=%d-%2d, est=%d-%2d, stack=%2d, queue=%2d' % (
                '*' if o_est == gold_op_list[i][0] else ' ',
                '*' if label_est == gold_op_list[i][1] else ' ',
                gold_op_list[i][0], gold_op_list[i][1],
                o_est, label_est, len(s_list), len(q_list)))
        '''

    if is_training:
        return loss
    else:
        # combine multiple trees if they exist, and return the result
        t0, _, __ = s_list.pop()
        if s_list:
            raise RuntimeError('There exist multiple subtrees!')
        return t0
        break

chainer.config.train = True
chainer.config.enable_backprop = True
batch_array = [
    convert.concat_examples([x[idx] for x in batch], args.gpu)
    for idx in data_idxs
]
model.cleargrads()
loss, pred_y, _ = model(tuple(map(Variable, batch_array)))
loss.backward()
loss.unchain_backward()
optimizer.update()
scheduler.update()
train_eval.update(cuda.to_cpu(loss.data), pred_y, batch)

# Validation & report
if (iter_cnt + 1) % args.iter_snapshot == 0:
    logger.info("Validation...")
    if args.save_model:
        serializers.save_npz(
            os.path.join(save_dir, "model_{}.npz".format(iter_cnt + 1)),
            model)
    chainer.config.train = False
    chainer.config.enable_backprop = False
    model.cleargrads()
    prediction_dict = {"arguments": vars(args), "predictions": {}}
    valid_iterator.reset()
    valid_eval.reset()
train_perp = evaluate(train_data)  # training error
valid_perp_stack = np.zeros(valid_iter)
for i in range(valid_iter):
    valid_perp = evaluate(valid_data_stack[i])  # validation error
    valid_perp_stack[i] = valid_perp
valid_perp_mean = np.mean(valid_perp_stack, axis=0)
valid_errors_mean[epoch // valid_len] = valid_perp_mean
valid_perp_se = np.std(valid_perp_stack, axis=0) / np.sqrt(valid_iter)
valid_errors_se[epoch // valid_len] = valid_perp_se
train_errors[epoch // valid_len] = train_perp

if epoch == 0:
    perp = None
else:
    perp = cuda.to_cpu(cur_log_perp) / valid_len
    perp = int(perp * 100) / 100.0  # truncate to two decimals

now = time.time()
if epoch == 0:
    throughput = 0.0
else:
    throughput = valid_len / (now - cur_at)
print('epoch {}: train perp: {} train classified {}/{}, '
      'valid classified {}/100 ({:.2f} epochs/sec)'.format(
          epoch, perp, whole_len * (1 - train_perp), whole_len,
          100 * (1 - valid_perp_mean), throughput))
cur_at = now
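The validation block above averages perplexity over `valid_iter` repeated runs and reports a standard error of the mean. A minimal numpy illustration of that statistic (the sample values below are made up for the example):

```python
import numpy as np

# Mean and standard error over repeated validation runs, as computed above:
# se = std / sqrt(n). With biased std (numpy's default, ddof=0) and these
# five synthetic perplexities the mean is exactly 100.
valid_perps = np.array([105.0, 95.0, 100.0, 110.0, 90.0])
mean = np.mean(valid_perps)
se = np.std(valid_perps) / np.sqrt(len(valid_perps))

assert mean == 100.0
assert abs(se - np.sqrt(10.0)) < 1e-9  # std = sqrt(50), n = 5
```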
model = LDA2Vec(n_documents=n_docs,
                n_document_topics=n_topics,
                n_units=n_units,
                n_vocab=n_vocab,
                counts=counts,
                n_samples=15)
if os.path.exists('lda2vec.hdf5'):
    print("Reloading from saved")
    serializers.load_hdf5("lda2vec.hdf5", model)
model.to_gpu()
optimizer = O.Adam()
optimizer.setup(model)

j = 0
epoch = 0
fraction = batchsize * 1.0 / flattened.shape[0]
for epoch in range(5000):
    data = prepare_topics(
        cuda.to_cpu(model.mixture.weights.W.data).copy(),
        cuda.to_cpu(model.mixture.factors.W.data).copy(),
        cuda.to_cpu(model.embed.W.data).copy(),
        words)
    print_top_words_per_topic(data)
    for d, f in utils.chunks(batchsize, doc_ids, flattened):
        t0 = time.time()
        l = model.fit_partial(d.copy(), f.copy())
        prior = model.prior()
        loss = l + prior * fraction * clambda
        optimizer.zero_grads()
        loss.backward()
        optimizer.update()
        msg = ("J:{j:05d} E:{epoch:05d} L:{loss:1.3e} "
               "P:{prior:1.3e} R:{rate:1.3e}")
        prior.to_cpu()
def colorize(self, filename, step='C', blur=4, s_size=128,
             colorize_format="png"):
    if self.gpu >= 0:
        cuda.get_device(self.gpu).use()

    _ = {'S': "ref/", 'L': "out_min/", 'C': "ref/"}
    dataset = ImageAndRefDataset(paths=[filename],
                                 root1=self.root,
                                 root2=self.root + _[step])
    _ = {'S': True, 'L': False, 'C': True}
    sample = dataset.get_example(0, minimize=_[step], blur=blur,
                                 s_size=s_size)
    _ = {'S': 0, 'L': 1, 'C': 0}[step]
    sample_container = np.zeros(
        (1, 4, sample[_].shape[1], sample[_].shape[2]), dtype='f')
    sample_container[0, :] = sample[_]

    if self.gpu >= 0:
        sample_container = cuda.to_gpu(sample_container)

    cnn = {'S': self.cnn_128, 'L': self.cnn_512, 'C': self.cnn_128}
    with chainer.no_backprop_mode():
        with chainer.using_config('train', False):
            image_conv2d_layer = cnn[step].calc(Variable(sample_container))
    del sample_container

    if step == 'C':
        input_bat = np.zeros(
            (1, 4, sample[1].shape[1], sample[1].shape[2]), dtype='f')
        input_bat[0, 0, :] = sample[1]
        output = cuda.to_cpu(image_conv2d_layer.data[0])
        del image_conv2d_layer  # release memory
        for channel in range(3):
            input_bat[0, 1 + channel, :] = cv2.resize(
                output[channel, :],
                (sample[1].shape[2], sample[1].shape[1]),
                interpolation=cv2.INTER_CUBIC)
        if self.gpu >= 0:
            link = cuda.to_gpu(input_bat, None)
        else:
            link = input_bat
        with chainer.no_backprop_mode():
            with chainer.using_config('train', False):
                image_conv2d_layer = self.cnn_512.calc(Variable(link))
        del link  # release memory

    image_out_path = self.outdir + filename.split(".")[0] + "_color.png"
    self.save_as_img(image_conv2d_layer.data[0], image_out_path)
    del image_conv2d_layer
    return image_out_path
def sample(self, trainer):
    x = trainer.updater.forward(test=True)
    x = x.data
    if cuda.get_array_module(x) == cuda.cupy:
        x = cuda.to_cpu(x)
    return x
def _forward(self, data, fn=None, batchsize=16,
             converter=concat_examples, retain_inputs=False,
             preprocess_fn=None, postprocess_fn=None):
    """Forward data by iterating with batch

    Args:
        data: "train_x array" or "chainer dataset"
        fn (Callable): Main function to forward. Its input argument is
            either Variable, cupy.ndarray or numpy.ndarray, and it
            returns a Variable.
        batchsize (int): batch size
        converter (Callable): convert from `data` to `inputs`
        retain_inputs (bool): If True, this instance keeps the inputs in
            `self.inputs`.
        preprocess_fn (Callable): Its input is numpy.ndarray or
            cupy.ndarray; it can return either Variable, cupy.ndarray
            or numpy.ndarray.
        postprocess_fn (Callable): Its input argument is Variable, but
            this method may return either Variable, cupy.ndarray or
            numpy.ndarray.

    Returns (tuple or numpy.ndarray): forward result
    """
    input_list = None
    output_list = None
    it = SerialIterator(data, batch_size=batchsize, repeat=False,
                        shuffle=False)
    for batch in it:
        inputs = converter(batch, self._device)
        inputs = _to_tuple(inputs)
        if preprocess_fn:
            inputs = preprocess_fn(*inputs)
            inputs = _to_tuple(inputs)
        # keep a tuple (not a generator) so the inputs can be reused below
        inputs = tuple(_to_variable(x) for x in inputs)
        outputs = fn(*inputs)
        outputs = _to_tuple(outputs)
        # Init
        if retain_inputs:
            if input_list is None:
                input_list = [[] for _ in range(len(inputs))]
            for j, input in enumerate(inputs):
                input_list[j].append(cuda.to_cpu(input.data))
        if output_list is None:
            output_list = [[] for _ in range(len(outputs))]
        if postprocess_fn:
            outputs = postprocess_fn(*outputs)
            outputs = _to_tuple(outputs)
        for j, output in enumerate(outputs):
            output_list[j].append(_extract_numpy(output))
    if retain_inputs:
        self.inputs = [
            numpy.concatenate(in_array) for in_array in input_list
        ]
    result = [_concat(output) for output in output_list]
    if len(result) == 1:
        return result[0]
    else:
        return result
def _extract_numpy(x):
    if isinstance(x, chainer.Variable):
        x = x.data
    return cuda.to_cpu(x)
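The `_forward` helper above collects per-batch outputs and concatenates them once at the end. A numpy-only sketch of that batch-then-concatenate pattern, without the Chainer plumbing (`batched_forward` and its `fn` argument are illustrative names, not part of the original API):

```python
import numpy as np

def batched_forward(data, fn, batchsize=16):
    """Apply `fn` to `data` in minibatches and concatenate the results,
    mirroring the _forward loop above without Variables or iterators."""
    outputs = []
    for start in range(0, len(data), batchsize):
        outputs.append(fn(data[start:start + batchsize]))
    return np.concatenate(outputs)

data = np.arange(100, dtype=np.float32)
result = batched_forward(data, lambda x: x * 2, batchsize=16)
assert result.shape == (100,)
assert (result == data * 2).all()
```

Collecting batch outputs in a list and concatenating once avoids the quadratic copying that repeated `np.concatenate` inside the loop would cause.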
colornet.zerograds()
loss2 = colornet(Variable(labd[:, :, 2]), 2)
loss2.backward()
optimizer.update()

if iteratecnt % args.log == 0:
    print("{0},{1},{2}".format(iteratecnt, loss1.data, loss2.data))

if not args.fast:
    pred = colornet(None, 3)
    pred_lab_a = xp.array(pred.data[0, 0, :, :])
    if args.gpu >= 0:
        pred_lab_a = cuda.to_cpu(pred_lab_a)
    pred_lab_a = cv2.resize(pred_lab_a,
                            (widthInputMovie, heightInputMovie))
    pred_lab_b = xp.array(pred.data[0, 1, :, :])
    if args.gpu >= 0:
        pred_lab_b = cuda.to_cpu(pred_lab_b)
    pred_lab_b = cv2.resize(pred_lab_b,
                            (widthInputMovie, heightInputMovie))
    frame_lab_out = np.concatenate(
        (frame_lab_l[:, :, np.newaxis],
         pred_lab_a[:, :, np.newaxis],
         pred_lab_b[:, :, np.newaxis]),
        axis=2)
    frame_rgb_out = cv2.cvtColor(frame_lab_out, cv2.COLOR_Lab2RGB)
    frame_rgb_out = (frame_rgb_out * 255).astype(np.uint8)
    cvFrame = cv2.cvtColor(frame_rgb_out, cv2.COLOR_RGB2BGR)
    out.write(cvFrame)
# Also, data['vocab'] is mostly <OoV>:
# (Pdb) print(sum(x != '<OoV>' for x in data['vocab']), 'out of',
#             len(data['vocab']), 'is NOT <OoV>')
# 27 out of 5835 is NOT <OoV>
#
# Debug>>>
# (Pdb) model.mixture.weights.W.data.shape -> (11314, 20) (weights)
# (Pdb) model.mixture.factors.W.data.shape -> (20, 300) (factors -> factor_vector)
# (Pdb) model.sampler.W.data.shape -> (5837, 300) (word_vectors)
# (Pdb) len(words) -> 5837 (vocab)
if gpu_id >= 0:
    data = prepare_topics(
        cuda.to_gpu(model.mixture.weights.W.data).copy(),
        cuda.to_gpu(model.mixture.factors.W.data).copy(),
        cuda.to_gpu(model.sampler.W.data).copy(),
        words, normalize=False)
else:
    data = prepare_topics(
        cuda.to_cpu(model.mixture.weights.W.data).copy(),
        cuda.to_cpu(model.mixture.factors.W.data).copy(),
        cuda.to_cpu(model.sampler.W.data).copy(),
        words, normalize=False)
top_words = print_top_words_per_topic(data)
if j % 100 == 0 and j > 100:
    coherence = topic_coherence(top_words)
    # use a separate loop variable: `j` is the outer iteration counter
    for k in range(n_topics):
        print(k, coherence[(k, 'cv')])
    kw = dict(top_words=top_words, coherence=coherence, epoch=epoch)
    progress[str(epoch)] = pickle.dumps(kw)
data['doc_lengths'] = doc_lengths
data['term_frequency'] = term_frequency
x_train = numpy.array(tmp)
N_test = x_test.shape[0]
N_train = x_train.shape[0]
batchsize = 22

logger.info("Applying batch normalization")
for i in xrange(0, N_train, batchsize):
    x_batch = x_train[i:i + batchsize]
    model.forward(x_batch, test=False)

logger.info("Extracting final layer")
save_to = args.save_to
X = []
for i in xrange(0, N_test):
    utt_id = utt_ids_tst[i]
    x_batch = x_test[i:i + 1]
    X.append(
        cuda.to_cpu(F.softmax(model.forward(x_batch, test=True)).data))
X = numpy.asarray(X)[:, 0, :]

logger.info("Calculating average precision")
start_time = timeit.default_timer()
labels = swbd_utts_to_labels(utt_ids_tst)
distances = pdist(X, metric="cosine")
matches = samediff.generate_matches_array(labels)
ap, prb = samediff.average_precision(distances[matches == True],
                                     distances[matches == False])
end_time = timeit.default_timer()
logger.info("Average precision: %s (processing time: %f [sec])" %
            (str(ap), end_time - start_time))

logger.info('Saving output layer to %s' % save_to + ".npz")
numpy.savez_compressed(save_to, X)
def check_forward(readout, data):
    y_actual = cuda.to_cpu(readout(*data).data)
    assert y_actual.shape == (batch_size, out_dim)
def test_call_single_return_value_cpu(self):
    self.f.forward_cpu.return_value = (cuda.to_cpu(self.y1),)
    self.check_call_single_return_value()
writer = csv.writer(f)
writer.writerows(t_array)
t_array = []
with open('./imgdatalog3/ydata_train.csv', 'a') as f:
    writer = csv.writer(f)
    writer.writerows(y_array)
y_array = []

mean_accuracy = sum_accuracy / N_train
mean_Loss = sum_Loss / N_train
# accuracy_last = evaluate(y_trainbackup, t_trainbackup)
accuracy_last = None
if use_gpu:
    mean_accuracy = cuda.to_cpu(mean_accuracy)
    mean_Loss = cuda.to_cpu(mean_Loss)
    accuracy_last = cuda.to_cpu(accuracy_last)
print("Train Epoch {} : Loss {} : Accuracy {} accuracy at t=15 : {}".format(
    epoch, mean_Loss, mean_accuracy, accuracy_last))
trainloss.append(mean_Loss)
trainaccuracy.append(mean_accuracy)
trainaccuracylast.append(accuracy_last)

with open('./imgdatalog3/train_loss.csv', 'w') as f:
    writer = csv.writer(f)
    writer.writerow(trainloss)
with open('./imgdatalog3/train_accuracy.csv', 'w') as f:
# cropped_img.fill(255)  # white version
cropped_img[crop_height_start:crop_height_end,
            crop_width_start:crop_width_end] = resized_img
top = left = (size - model.insize) / 2
bottom = model.insize + top
right = model.insize + left
cropped_img = cropped_img.astype(np.float32).swapaxes(0, 2).swapaxes(1, 2)
cropped_img = cropped_img[:, top:bottom, left:right]
cropped_img -= mean_image[:, top:bottom, left:right]
cropped_img /= 255

x = np.ndarray((1, 3, model.insize, model.insize), dtype=np.float32)
x[0] = cropped_img
x = cuda.to_gpu(x)

score = model.predict(x, train=False)
score = cuda.to_cpu(score.data)
categories = np.loadtxt("labels.txt", str, delimiter="\t")
prediction = zip(score[0].tolist(), categories)
prediction.sort(cmp=lambda x, y: cmp(x[0], y[0]), reverse=True)

top_k = 1
ys_pass = 0
for rank, (score, name) in enumerate(prediction[:top_k], start=1):
    print(args.source_dir + "/" + source_dirpath + "/" + source_imgpath +
          " " + name + " " + source_dirpath)
    if name == source_dirpath:
        ok += 1
    else:
        ng += 1
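The crop above cuts a centered `insize`-square window out of a larger `size`-square image. A self-contained numpy sketch of that center-crop arithmetic (the concrete `size`/`insize` values are just for the example; integer division keeps the indices valid):

```python
import numpy as np

# Center-crop: cut an (insize x insize) window out of a (size x size) image.
size, insize = 256, 224
top = left = (size - insize) // 2
bottom, right = top + insize, left + insize

img = np.zeros((3, size, size), dtype=np.float32)  # channel-first layout
crop = img[:, top:bottom, left:right]
assert (top, bottom) == (16, 240)
assert crop.shape == (3, insize, insize)
```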
def main():
    parser = argparse.ArgumentParser(description='')
    parser.add_argument('out')
    parser.add_argument('--gpu', '-g', type=int, default=0,
                        help='GPU device ID')
    parser.add_argument('--epoch', '-e', type=int, default=200,
                        help='# of epochs')
    parser.add_argument('--batch_size', '-b', type=int, default=10)
    parser.add_argument('--memory_size', '-m', type=int, default=500)
    parser.add_argument('--real_label', type=float, default=0.9)
    parser.add_argument('--fake_label', type=float, default=0.0)
    parser.add_argument('--block_num', type=int, default=6)
    parser.add_argument('--g_nobn', dest='g_bn', action='store_false',
                        default=True)
    parser.add_argument('--d_nobn', dest='d_bn', action='store_false',
                        default=True)
    parser.add_argument('--variable_size', action='store_true',
                        default=False)
    parser.add_argument('--lambda_dis_real', type=float, default=0)
    parser.add_argument('--size', type=int, default=128)
    parser.add_argument('--lambda_', type=float, default=10)
    # args = parser.parse_args()
    args, unknown = parser.parse_known_args()

    # log directory
    out = datetime.datetime.now().strftime('%m%d%H')
    out = out + '_' + args.out
    out_dir = os.path.abspath(os.path.join(os.path.curdir, "runs", out))
    os.makedirs(os.path.join(out_dir, 'models'), exist_ok=True)
    os.makedirs(os.path.join(out_dir, 'visualize'), exist_ok=True)

    # hyperparameters
    with open(os.path.join(out_dir, 'setting.txt'), 'w') as f:
        for k, v in args._get_kwargs():
            print('{} = {}'.format(k, v))
            f.write('{} = {}\n'.format(k, v))

    trainA = ImageDataset('horse2zebra/trainA', augmentation=True,
                          image_size=256, final_size=args.size)
    trainB = ImageDataset('horse2zebra/trainB', augmentation=True,
                          image_size=256, final_size=args.size)
    testA = ImageDataset('horse2zebra/testA', image_size=256,
                         final_size=args.size)
    testB = ImageDataset('horse2zebra/testB', image_size=256,
                         final_size=args.size)

    train_iterA = chainer.iterators.MultiprocessIterator(
        trainA, args.batch_size, n_processes=min(8, args.batch_size))
    train_iterB = chainer.iterators.MultiprocessIterator(
        trainB, args.batch_size, n_processes=min(8, args.batch_size))
    N = len(trainA)

    # genA converts B -> A, genB converts A -> B
    genA = Generator(block_num=args.block_num, bn=args.g_bn)
    genB = Generator(block_num=args.block_num, bn=args.g_bn)
    # disA discriminates real A from fake A; disB does the same for B
    disA = Discriminator(bn=args.d_bn)
    disB = Discriminator(bn=args.d_bn)

    if args.gpu >= 0:
        cuda.get_device_from_id(args.gpu).use()
        genA.to_gpu()
        genB.to_gpu()
        disA.to_gpu()
        disB.to_gpu()

    optimizer_genA = chainer.optimizers.Adam(alpha=0.0002, beta1=0.5,
                                             beta2=0.9)
    optimizer_genB = chainer.optimizers.Adam(alpha=0.0002, beta1=0.5,
                                             beta2=0.9)
    optimizer_disA = chainer.optimizers.Adam(alpha=0.0002, beta1=0.5,
                                             beta2=0.9)
    optimizer_disB = chainer.optimizers.Adam(alpha=0.0002, beta1=0.5,
                                             beta2=0.9)
    optimizer_genA.setup(genA)
    optimizer_genB.setup(genB)
    optimizer_disA.setup(disA)
    optimizer_disB.setup(disB)

    # start training
    start = time.time()
    fake_poolA = np.zeros(
        (args.memory_size, 3, args.size, args.size)).astype('float32')
    fake_poolB = np.zeros(
        (args.memory_size, 3, args.size, args.size)).astype('float32')
    lambda_ = args.lambda_
    const_realA = np.asarray([testA.get_example(i) for i in range(10)])
    const_realB = np.asarray([testB.get_example(i) for i in range(10)])
    iterations = 0

    for epoch in range(args.epoch):
        # linearly decay the learning rate after epoch 100
        if epoch > 100:
            decay_rate = 0.0002 / 100
            optimizer_genA.alpha -= decay_rate
            optimizer_genB.alpha -= decay_rate
            optimizer_disA.alpha -= decay_rate
            optimizer_disB.alpha -= decay_rate

        # train
        iter_num = N // args.batch_size
        for i in range(iter_num):
            # load a real batch
            imagesA = train_iterA.next()
            imagesB = train_iterB.next()
            if args.variable_size:
                crop_size = np.random.choice([160, 192, 224, 256])
                resize_size = np.random.choice([160, 192, 224, 256])
                imagesA = [
                    random_augmentation(image, crop_size, resize_size)
                    for image in imagesA
                ]
                imagesB = [
                    random_augmentation(image, crop_size, resize_size)
                    for image in imagesB
                ]
            realA = chainer.Variable(genA.xp.asarray(imagesA, 'float32'))
            realB = chainer.Variable(genB.xp.asarray(imagesB, 'float32'))

            # load a fake batch
            if iterations < args.memory_size:
                fakeA = genA(realB)
                fakeB = genB(realA)
                fakeA.unchain_backward()
                fakeB.unchain_backward()
            else:
                fake_imagesA = fake_poolA[np.random.randint(
                    args.memory_size, size=args.batch_size)]
                fake_imagesB = fake_poolB[np.random.randint(
                    args.memory_size, size=args.batch_size)]
                if args.variable_size:
                    fake_imagesA = [
                        random_augmentation(image, crop_size, resize_size)
                        for image in fake_imagesA
                    ]
                    fake_imagesB = [
                        random_augmentation(image, crop_size, resize_size)
                        for image in fake_imagesB
                    ]
                fakeA = chainer.Variable(genA.xp.asarray(fake_imagesA))
                fakeB = chainer.Variable(genA.xp.asarray(fake_imagesB))

            ############################
            # (1) Update D network
            ############################
            # dis A
            y_realA = disA(realA)
            y_fakeA = disA(fakeA)
            loss_disA = (F.sum((y_realA - args.real_label) ** 2) +
                         F.sum((y_fakeA - args.fake_label) ** 2)) \
                / np.prod(y_fakeA.shape)
            # dis B
            y_realB = disB(realB)
            y_fakeB = disB(fakeB)
            loss_disB = (F.sum((y_realB - args.real_label) ** 2) +
                         F.sum((y_fakeB - args.fake_label) ** 2)) \
                / np.prod(y_fakeB.shape)

            # also discriminate real A against real B, not only realA vs fakeA
            if args.lambda_dis_real > 0:
                y_realB = disA(realB)
                loss_disA += F.sum(
                    (y_realB - args.fake_label) ** 2) / np.prod(y_realB.shape)
                y_realA = disB(realA)
                loss_disB += F.sum(
                    (y_realA - args.fake_label) ** 2) / np.prod(y_realA.shape)

            # update dis
            disA.cleargrads()
            disB.cleargrads()
            loss_disA.backward()
            loss_disB.backward()
            optimizer_disA.update()
            optimizer_disB.update()

            ############################
            # (2) Update G network
            ############################
            # gan A
            fakeA = genA(realB)
            y_fakeA = disA(fakeA)
            loss_ganA = F.sum(
                (y_fakeA - args.real_label) ** 2) / np.prod(y_fakeA.shape)
            # gan B
            fakeB = genB(realA)
            y_fakeB = disB(fakeB)
            loss_ganB = F.sum(
                (y_fakeB - args.real_label) ** 2) / np.prod(y_fakeB.shape)
            # rec A
            recA = genA(fakeB)
            loss_recA = F.mean_absolute_error(recA, realA)
            # rec B
            recB = genB(fakeA)
            loss_recB = F.mean_absolute_error(recB, realB)

            # gen loss
            loss_gen = loss_ganA + loss_ganB + \
                lambda_ * (loss_recA + loss_recB)
            # loss_genB = loss_ganB + lambda_ * (loss_recB + loss_recA)

            # update gen
            genA.cleargrads()
            genB.cleargrads()
            loss_gen.backward()
            # loss_genB.backward()
            optimizer_genA.update()
            optimizer_genB.update()

            # logging
            logger.plot('loss dis A', float(loss_disA.data))
            logger.plot('loss dis B', float(loss_disB.data))
            logger.plot('loss rec A', float(loss_recA.data))
            logger.plot('loss rec B', float(loss_recB.data))
            logger.plot('loss gen A', float(loss_gen.data))
            # logger.plot('loss gen B', float(loss_genB.data))
            logger.tick()

            # save to the replay buffer; the whole index (including the
            # within-batch offset k) is taken modulo the pool size so it
            # can never run past the end of the pool
            fakeA = cuda.to_cpu(fakeA.data)
            fakeB = cuda.to_cpu(fakeB.data)
            for k in range(args.batch_size):
                fake_sampleA = fakeA[k]
                fake_sampleB = fakeB[k]
                if args.variable_size:
                    fake_sampleA = cv2.resize(
                        fake_sampleA.transpose(1, 2, 0), (256, 256),
                        interpolation=cv2.INTER_AREA).transpose(2, 0, 1)
                    fake_sampleB = cv2.resize(
                        fake_sampleB.transpose(1, 2, 0), (256, 256),
                        interpolation=cv2.INTER_AREA).transpose(2, 0, 1)
                fake_poolA[(iterations * args.batch_size + k)
                           % args.memory_size] = fake_sampleA
                fake_poolB[(iterations * args.batch_size + k)
                           % args.memory_size] = fake_sampleB
            iterations += 1
            progress_report(iterations, start, args.batch_size)

        if epoch % 5 == 0:
            logger.flush(out_dir)
            visualize(genA, genB, const_realA, const_realB, epoch=epoch,
                      savedir=os.path.join(out_dir, 'visualize'))
            serializers.save_hdf5(
                os.path.join(out_dir, "models",
                             "{:03d}.disA.model".format(epoch)), disA)
            serializers.save_hdf5(
                os.path.join(out_dir, "models",
                             "{:03d}.disB.model".format(epoch)), disB)
            serializers.save_hdf5(
                os.path.join(out_dir, "models",
                             "{:03d}.genA.model".format(epoch)), genA)
            serializers.save_hdf5(
                os.path.join(out_dir, "models",
                             "{:03d}.genB.model".format(epoch)), genB)
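The fake-image pool above is a ring buffer: new fakes overwrite slots modulo the pool size, and the discriminator trains on a random draw from it. A small numpy sketch of that indexing scheme (pool and image sizes here are toy values; the within-batch offset is kept inside the modulo so the index never runs past the pool):

```python
import numpy as np

memory_size, batch_size = 8, 2
pool = np.zeros((memory_size, 3, 4, 4), dtype=np.float32)

# Each "iteration" writes a batch of fakes into consecutive ring slots.
for iteration in range(10):
    fakes = np.full((batch_size, 3, 4, 4), iteration, dtype=np.float32)
    for k in range(batch_size):
        pool[(iteration * batch_size + k) % memory_size] = fakes[k]

# 20 writes wrapped around an 8-slot pool: only the 8 most recent
# fakes (from iterations 6..9) survive.
assert set(pool[:, 0, 0, 0].tolist()) == {6.0, 7.0, 8.0, 9.0}

# Training then samples a random batch from the pool, as in the code above.
sample = pool[np.random.randint(memory_size, size=batch_size)]
assert sample.shape == (batch_size, 3, 4, 4)
```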
def check_forward(readout, atom_data):
    # type: (Set2Set, numpy.ndarray) -> None
    readout.reset_state()
    y_actual = cuda.to_cpu(readout(atom_data).data)
    assert y_actual.shape == (batch_size, in_channels * 2)
def __call__(self, trainer):
    fn = self.filename.format(trainer)
    img = ioutil.deprocess(
        cuda.to_cpu(self.target[self.paramname].W.data)[0], self.mean)
    img.save(os.path.join(trainer.out, fn))
    '''prefix = 'tmp' + fn
def check_forward(self, x_data):
    axes = self.axes
    x = chainer.Variable(x_data)
    y = functions.transpose(x, axes)
    self.assertEqual(y.data.dtype, self.dtype)
    self.assertTrue(
        (self.x.transpose(axes) == cuda.to_cpu(y.data)).all())
def forward_one_step(self, x_seq, pos, test=True, concat_weight=True,
                     softmax=False):
    self.reset_state()
    xp = self.xp
    length = x_seq.shape[1]
    if self.gpu:
        x_seq = cuda.to_gpu(x_seq)
    if length < 1:
        if concat_weight:
            return None, None
        else:
            return None, None, None, None

    sum_loss = 0
    former = None
    latter = None
    attention_sum = 0

    if pos == 0:
        latter = x_seq[:, 1:]
    elif pos == length - 1:
        former = x_seq[:, :pos]
    else:
        former = x_seq[:, :pos]
        latter = x_seq[:, pos + 1:]

    former_context = None
    latter_context = None
    former_attention_weight = None
    latter_attention_weight = None
    if former is not None:
        former_context, former_encode = self.encode_backward(former,
                                                             test=test)
        former_attention_weight, former_attention_sum = self.attend(
            former_context, former_encode, test=test)
        attention_sum += former_attention_sum
    if latter is not None:
        latter_context, latter_encode = self.encode_forward(latter,
                                                            test=test)
        latter_attention_weight, latter_attention_sum = self.attend(
            latter_context, latter_encode, test=test)
        attention_sum += latter_attention_sum

    representation = 0
    if former_context is not None:
        for t in xrange(len(former_context)):
            representation += apply_attention(
                former_context[t],
                former_attention_weight[t] / attention_sum)
    if latter_context is not None:
        for t in xrange(len(latter_context)):
            representation += apply_attention(
                latter_context[t],
                latter_attention_weight[t] / attention_sum)

    g = self.f_rg(representation)
    predicted_char_bef_softmax = self.reader_fc(g)

    if concat_weight:
        batchsize = x_seq.shape[0]
        weight = xp.zeros((batchsize, length), dtype=xp.float32)
        index = 0
        if former_attention_weight is not None:
            f_length = len(former_attention_weight)
            for i in xrange(f_length):
                index = i
                weight[:, f_length - i - 1] = \
                    former_attention_weight[i].data.reshape(-1)
            index += 1
        if latter_attention_weight is not None:
            for i in xrange(len(latter_attention_weight)):
                weight[:, index + i + 1] = \
                    latter_attention_weight[i].data.reshape(-1)
        weight /= attention_sum.data
        if xp is not np:
            weight = cuda.to_cpu(weight)
        if softmax:
            return weight, F.softmax(predicted_char_bef_softmax)
        else:
            return weight, predicted_char_bef_softmax
    else:
        return (former_attention_weight, latter_attention_weight,
                attention_sum, predicted_char_bef_softmax)
def transform(self, images):
    """
    Parameters
    ----------
    images: np.ndarray
        | Color / grayscale images of shape
        | (n_images, height, width, n_channels) or
        | (n_images, height, width)

    Returns
    -------
    X: np.ndarray
        A set of feature vectors of shape (n_images, n_features)
        where :code:`n_features` is determined by the hyperparameters
    """
    images = self.process_input(images)
    # images.shape == (n_images, n_channels, y, x)
    N = images.shape[0]

    filters_l1 = components_to_filters(
        self.pca_l1.components_,
        n_channels=images.shape[1],
        filter_shape=self.filter_shape_l1,
    )
    filters_l2 = components_to_filters(self.pca_l2.components_,
                                       n_channels=1,
                                       filter_shape=self.filter_shape_l2)
    filters_l3 = components_to_filters(self.pca_l3.components_,
                                       n_channels=1,
                                       filter_shape=self.filter_shape_l3)

    if gpu_enabled():
        images = to_gpu(images)
        filters_l1 = to_gpu(filters_l1)
        filters_l2 = to_gpu(filters_l2)
        filters_l3 = to_gpu(filters_l3)

    images = convolution_2d(images.astype('float32'),
                            filters_l1,
                            stride=self.step_shape_l1).data
    print("Memory: {}Mb".format(memory_profiler.memory_usage()))

    images = images.reshape(-1, *images.shape[2:4])
    images = convolution_2d(
        images.reshape(images.shape[0], 1,
                       images.shape[1], images.shape[2]).astype('float32'),
        filters_l2,  # applied as 1-channel images
        stride=self.step_shape_l2).data
    print("Memory: {}Mb".format(memory_profiler.memory_usage()))

    images = images.reshape(N, self.n_l1_output * self.n_l2_output,
                            images.shape[2], images.shape[3])
    images = xp.swapaxes(images, 0, 1)
    # images.shape == (L1*L2, n_images, y, x)

    # iterate over each L1*L2 output map
    X = []
    for maps in images:
        n_images, h, w = maps.shape
        maps = convolution_2d(
            maps.reshape(n_images, 1, h, w).astype('float32'),
            filters_l3,  # applied as 1-channel images
            stride=self.step_shape_l3).data
        # maps.shape == (n_images, L3, y, x) right here
        maps = binarize(maps)
        maps = binary_to_decimal(maps)
        # maps.shape == (n_images, y, x)
        x = self.histogram(maps)
        # x is a set of feature vectors of shape (n_images, vector length)
        X.append(x)

    # concatenate over the L1*L2 maps
    X = xp.hstack(X)
    if gpu_enabled():
        X = to_cpu(X)
    X = X.astype(np.float32)
    print("Memory: {}Mb".format(memory_profiler.memory_usage()))

    # The shape of X is (n_images, L1 * L2 * vector length)
    return X
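The loop above thresholds the third-layer maps and packs them into per-pixel decimal codes before histogramming. A numpy sketch of that binarize/binary-to-decimal step, assuming the usual PCANet formulation (threshold at zero, filter axis read as binary digits); the real `binarize` and `binary_to_decimal` helpers may differ in detail:

```python
import numpy as np

def binarize(maps):
    """Threshold feature maps at zero, as in the PCANet-style pipeline."""
    return (maps > 0).astype(np.int64)

def binary_to_decimal(maps):
    """Collapse the filter axis of binary maps (n, L3, y, x) into one
    decimal code per pixel, treating the L3 bits as a binary number
    (first filter = most significant bit)."""
    n, l, h, w = maps.shape
    weights = (2 ** np.arange(l - 1, -1, -1)).reshape(1, l, 1, 1)
    return (maps * weights).sum(axis=1)

maps = np.array([[[[1.0]], [[-2.0]], [[3.0]]]])  # shape (1, 3, 1, 1)
codes = binary_to_decimal(binarize(maps))
assert codes.shape == (1, 1, 1)
assert codes[0, 0, 0] == 5  # bits (1, 0, 1) -> 4 + 0 + 1
```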
def check_forward(model, atom_data, adj_data):
    # type: (MPNN, numpy.ndarray, numpy.ndarray) -> None
    y_actual = cuda.to_cpu(model(atom_data, adj_data).data)
    assert y_actual.shape == (batch_size, out_dim)