def testShapeInferenceConvNet(self):
    model = model_helper.ModelHelper(name="convtest")
    model.NHWC2NCHW("data", "data_nchw")
    brew.conv(model, "data_nchw", 'conv1', 3, 64,
              weight_init=("MSRAFill", {}), kernel=7, stride=2, pad=3, no_bias=0)
    brew.spatial_bn(model, 'conv1', 'conv1_spatbn_relu', 64, epsilon=1e-3, is_test=False)
    brew.relu(model, 'conv1_spatbn_relu', 'conv1_spatbn_relu')
    brew.max_pool(model, 'conv1_spatbn_relu', 'pool1', kernel=3, stride=2)
    brew.fc(model, 'pool1', 'fc', dim_in=(64 * 56 * 56), dim_out=100)
    brew.dropout(model, 'fc', 'fc_drop', is_test=False)
    model.Sigmoid('fc_drop', 'fc_sigm')
    brew.softmax(model, 'fc_sigm', 'softmax')
    model.LabelCrossEntropy(['softmax', 'label'], 'xent')
    loss = model.AveragedLoss('xent', 'loss')
    model.AddGradientOperators([loss])

    LR = model.param_init_net.ConstantFill([], 'LR', shape=[1], value=0.1)
    for param in model.GetParams():
        param_grad = model.param_to_grad[param]
        param_momentum = model.param_init_net.ConstantFill(
            [param], param + '_momentum', value=0.0)
        model.net.MomentumSGDUpdate(
            [param_grad, param_momentum, LR, param],
            [param_grad, param_momentum, param],
        )

    workspace.FeedBlob(
        "data",
        np.random.rand(16, 227, 227, 3).astype(np.float32),
    )
    workspace.FeedBlob(
        "label",
        (100 * np.random.rand(16)).astype(np.int32),
    )

    # Then do automatic comparison test: run the net once to
    # initialize everything
    self.InferTensorRunAndCompare(model)
def test_dropout(self):
    p = 0.2
    X = np.ones((100, 100)).astype(np.float32) - p
    workspace.FeedBlob("x", X)
    model = ModelHelper(name="test_model")
    brew.dropout(model, "x", "out")
    workspace.RunNetOnce(model.param_init_net)
    workspace.RunNetOnce(model.net)
    out = workspace.FetchBlob("out")
    self.assertLess(abs(out.mean() - (1 - p)), 0.05)
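# Companion check (a sketch, not from the original suite; assumes the same
# imports as above: numpy as np, caffe2.python workspace/brew/ModelHelper).
# With is_test=1 the dropout op should be an identity: no units dropped and
# no rescaling applied.
def test_dropout_is_test(self):
    X = np.random.rand(10, 10).astype(np.float32)
    workspace.FeedBlob("x", X)
    model = ModelHelper(name="test_model_is_test")
    brew.dropout(model, "x", "out", ratio=0.5, is_test=1)
    workspace.RunNetOnce(model.param_init_net)
    workspace.RunNetOnce(model.net)
    out = workspace.FetchBlob("out")
    np.testing.assert_allclose(out, X)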
def AddMLP_BN(model, data):
    ''' Implement MLP model on MNIST '''
    # number of neurons in fc layer
    num_units = 4096
    # NCHW: 64 x 1 x 28 x 28 -> 64 x 1 x 28 x 28
    drop1 = brew.dropout(model, data, 'drop1', ratio=0.5, is_test=0)
    # NCHW: 64 x 1 x 28 x 28 -> 64 x 4096
    fc2 = brew.fc(model, drop1, 'fc2', dim_in=1 * 28 * 28, dim_out=num_units)
    fc2_reshaped = model.Reshape([fc2], ['fc2_reshaped'], shape=(4, 2))
    # bn2 = brew.spatial_bn(model, fc2, 'bn2', num_units, epsilon=1e-3, momentum=0.9, is_test=is_test)
    # relu2 = brew.relu(model, fc2, 'relu2')
    # fc3 = brew.fc(model, relu2, 'fc3', dim_in=num_units, dim_out=num_units)
    # bn3 = brew.spatial_bn(model, fc3, 'bn3', bn_units, epsilon=1e-3, momentum=0.9, is_test=is_test)
    # relu3 = brew.relu(model, fc3, 'relu3')
    # fc4 = brew.fc(model, relu3, 'fc4', dim_in=num_units, dim_out=num_units)
    # bn4 = brew.spatial_bn(model, fc4, 'bn4', bn_units, epsilon=1e-3, momentum=0.9, is_test=is_test)
    # relu4 = brew.relu(model, fc4, 'relu4')
    # fc5 = brew.fc(model, relu4, 'fc5', dim_in=num_units, dim_out=10)  # 10 for 10-classes
    # bn5 = brew.spatial_bn(model, fc5, 'bn5', 10, epsilon=1e-3, momentum=0.9, is_test=is_test)
    softmax = brew.softmax(model, fc2, 'softmax')
    return softmax
def add_dropout(self, prev_blob, ratio, is_test=False):
    self.prev_blob = brew.dropout(
        self.model,
        prev_blob,
        '%s_dropout_%d' % (self.block_name, self.layer_num),
        ratio=ratio,
        is_test=is_test,
    )
    self.layer_num += 1
    return self.prev_blob
def forward_pass_builder(self, model, loss_scale=1.0):
    """
    This function adds the operators, layers to the network. It should return
    a list of loss-blobs that are used for computing the loss gradient. This
    function is also passed an internally calculated loss_scale parameter that
    is used to scale your loss to normalize for the number of GPUs.
    Signature: function(model, loss_scale)
    """
    is_inference = self.phase == 'inference'
    v = 'data'
    v = brew.conv(model, v, 'conv1', 3, 96, kernel=11, stride=4)
    v = brew.relu(model, v, 'relu1')
    v = brew.lrn(model, v, 'norm1', size=5, alpha=0.0001, beta=0.75)
    v = brew.max_pool(model, v, 'pool1', kernel=3, stride=2)
    v = brew.conv(model, v, 'conv2', 96, 256, kernel=5, pad=2, group=1)
    v = brew.relu(model, v, 'relu2')
    v = brew.lrn(model, v, 'norm2', size=5, alpha=0.0001, beta=0.75)
    v = brew.max_pool(model, v, 'pool2', kernel=3, stride=2)
    v = brew.conv(model, v, 'conv3', 256, 384, kernel=3, pad=1)
    v = brew.relu(model, v, 'relu3')
    v = brew.conv(model, v, 'conv4', 384, 384, kernel=3, pad=1, group=1)
    v = brew.relu(model, v, 'relu4')
    v = brew.conv(model, v, 'conv5', 384, 256, kernel=3, pad=1, group=1)
    v = brew.relu(model, v, 'relu5')
    v = brew.max_pool(model, v, 'pool5', kernel=3, stride=2)
    v = brew.fc(model, v, 'fc6', dim_in=9216, dim_out=4096)
    v = brew.relu(model, v, 'relu6')
    v = brew.dropout(model, v, 'drop6', ratio=0.5, is_test=is_inference)
    v = brew.fc(model, v, 'fc7', dim_in=4096, dim_out=4096)
    v = brew.relu(model, v, 'relu7')
    v = brew.dropout(model, v, 'drop7', ratio=0.5, is_test=is_inference)
    return self.add_head_nodes(model, v, 4096, 'fc8', loss_scale=loss_scale)
def AddModel(model, data, device_opts, is_test=False):
    with core.DeviceScope(device_opts):
        conv1 = brew.conv(
            model, data, 'conv1', dim_in=IMAGE_CHANNELS, dim_out=6,
            weight_init=('MSRAFill', {}), kernel=5, stride=1, pad=0)
        relu1 = brew.relu(model, conv1, 'relu1')
        pool1 = brew.max_pool(model, relu1, 'pool1', kernel=2, stride=2)
        conv2 = brew.conv(
            model, pool1, 'conv2', dim_in=6, dim_out=16,
            weight_init=('MSRAFill', {}), kernel=5, stride=1, pad=0)
        relu2 = brew.relu(model, conv2, 'relu2')
        pool2 = brew.max_pool(model, relu2, 'pool2', kernel=2, stride=2)
        # Fully connected layers
        fc1 = brew.fc(model, pool2, 'fc1', dim_in=16 * 4 * 4, dim_out=120)
        relu3 = brew.relu(model, fc1, 'relu3')
        dropout1 = brew.dropout(model, relu3, 'dropout1', ratio=0.5, is_test=is_test)
        fc2 = brew.fc(model, dropout1, 'fc2', dim_in=120, dim_out=84)
        relu4 = brew.relu(model, fc2, 'relu4')
        dropout2 = brew.dropout(model, relu4, 'dropout2', ratio=0.5, is_test=is_test)
        fc3 = brew.fc(model, dropout2, 'fc3', dim_in=84, dim_out=NUM_CLASSES)
        # Softmax layer
        softmax = brew.softmax(model, fc3, 'softmax')
        return softmax
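# AddModel above references module-level constants it does not define. A
# plausible reconstruction for an MNIST-style 1 x 28 x 28 input (hypothetical
# values, inferred rather than copied from the original source): with a 28x28
# input, conv1 (5x5, no pad) -> 24x24, pool1 -> 12x12, conv2 (5x5, no pad)
# -> 8x8, pool2 -> 4x4, which matches dim_in=16 * 4 * 4 in fc1.
IMAGE_CHANNELS = 1
NUM_CLASSES = 10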
def AddMLP(model, data, batch_size):
    ''' Implement MLP model on MNIST '''
    num_units = 4096  # number of neurons in fc layer
    num_labels = 10   # for 10 classes in mnist
    drop1 = brew.dropout(model, data, 'drop1', ratio=0.5, is_test=0)
    fc2 = brew.fc(model, drop1, 'fc2', dim_in=1 * 28 * 28, dim_out=num_units)
    model.Reshape([fc2], [fc2, 'fc2_old_shape'], shape=(batch_size, num_units, 1, 1))
    bn2 = brew.spatial_bn(model, fc2, 'bn2', num_units, epsilon=1e-4, momentum=0.9)
    relu2 = brew.relu(model, bn2, 'relu2')
    fc3 = brew.fc(model, relu2, 'fc3', dim_in=num_units, dim_out=num_units)
    model.Reshape([fc3], [fc3, 'fc3_old_shape'], shape=(batch_size, num_units, 1, 1))
    bn3 = brew.spatial_bn(model, fc3, 'bn3', num_units, epsilon=1e-4, momentum=0.9)
    relu3 = brew.relu(model, bn3, 'relu3')
    fc4 = brew.fc(model, relu3, 'fc4', dim_in=num_units, dim_out=num_units)
    model.Reshape([fc4], [fc4, 'fc4_old_shape'], shape=(batch_size, num_units, 1, 1))
    bn4 = brew.spatial_bn(model, fc4, 'bn4', num_units, epsilon=1e-4, momentum=0.9)
    relu4 = brew.relu(model, bn4, 'relu4')
    fc5 = brew.fc(model, relu4, 'fc5', dim_in=num_units, dim_out=num_labels)
    model.Reshape([fc5], [fc5, 'fc5_old_shape'], shape=(batch_size, num_labels, 1, 1))
    bn5 = brew.spatial_bn(model, fc5, 'bn5', num_labels, epsilon=1e-4, momentum=0.9)
    softmax = brew.softmax(model, bn5, 'softmax')
    return softmax
def fc_module(model, inputs, dim_in, dim_out, module_seq, dropout, is_test):
    fc = brew.fc(model, inputs, 'fc{}'.format(module_seq), dim_in=dim_in, dim_out=dim_out)
    relu = brew.relu(model, fc, fc)
    dropout = brew.dropout(model, relu, 'pool{}'.format(module_seq),
                           is_test=is_test, ratio=dropout)
    return dropout
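# Sketch (not from the original source) of chaining fc_module to build a
# small classifier head: two 4096-wide fc+relu+dropout blocks followed by a
# final fc. The blob name 'pred', the widths, and the ratios are
# illustrative only.
def add_fc_head(model, blob_in, dim_in, num_classes, is_test):
    v = fc_module(model, blob_in, dim_in, 4096, module_seq=1,
                  dropout=0.5, is_test=is_test)
    v = fc_module(model, v, 4096, 4096, module_seq=2,
                  dropout=0.5, is_test=is_test)
    return brew.fc(model, v, 'pred', dim_in=4096, dim_out=num_classes)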
def create_model(m, device_opts):
    with core.DeviceScope(device_opts):
        conv1 = brew.conv(m, 'data', 'conv1', dim_in=3, dim_out=50,
                          kernel=3, pad=1, no_gradient_to_input=1)
        relu1 = brew.relu(m, conv1, 'relu1')
        conv2 = brew.conv(m, relu1, 'conv2', dim_in=50, dim_out=50, kernel=3, pad=1)
        pool1 = brew.max_pool(m, conv2, 'pool1', kernel=2, stride=2)
        relu2 = brew.relu(m, pool1, 'relu2')
        drop1 = brew.dropout(m, relu2, 'drop1', ratio=0.25)
        conv3 = brew.conv(m, drop1, 'conv3', dim_in=50, dim_out=100, kernel=3, pad=1)
        relu3 = brew.relu(m, conv3, 'relu3')
        conv4 = brew.conv(m, relu3, 'conv4', dim_in=100, dim_out=100, kernel=3, pad=1)
        pool2 = brew.max_pool(m, conv4, 'pool2', kernel=2, stride=2)
        relu4 = brew.relu(m, pool2, 'relu4')
        drop2 = brew.dropout(m, relu4, 'drop2', ratio=0.25)
        fc1 = brew.fc(m, drop2, 'fc1', dim_in=100 * 8 * 8, dim_out=512)
        relu5 = brew.relu(m, fc1, 'relu5')
        drop3 = brew.dropout(m, relu5, 'drop3', ratio=0.5)
        fc2 = brew.fc(m, drop3, 'fc2', dim_in=512, dim_out=N_CLASSES)
        softmax = brew.softmax(m, fc2, 'softmax')
        return softmax
def forward_pass_builder(self, model, loss_scale=1.0):
    """
    This function adds the operators, layers to the network. It should return
    a list of loss-blobs that are used for computing the loss gradient. This
    function is also passed an internally calculated loss_scale parameter that
    is used to scale your loss to normalize for the number of GPUs.
    Signature: function(model, loss_scale)
    """
    is_inference = self.phase == 'inference'
    layers, filters = VGG.specs[self.__model]['specs']
    v = 'data'
    dim_in = self.input_shape[0]
    for i, num in enumerate(layers):
        for j in range(num):
            v = brew.conv(model, v, 'conv%d_%d' % (i + 1, j + 1), dim_in,
                          filters[i], kernel=3, pad=1)
            v = brew.relu(model, v, 'relu%d_%d' % (i + 1, j + 1))
            dim_in = filters[i]
        v = brew.max_pool(model, v, 'pool%d' % (i + 1), kernel=2, stride=2)

    dim_in = 25088  # 512 * 7 * 7 (output tensor of previous max pool layer)
    for i in range(2):
        v = brew.fc(model, v, 'fc%d' % (6 + i), dim_in=dim_in, dim_out=4096)
        v = brew.relu(model, v, 'relu%d' % (6 + i))
        v = brew.dropout(model, v, 'drop%d' % (6 + i), ratio=0.5, is_test=is_inference)
        dim_in = 4096

    return self.add_head_nodes(model, v, 4096, 'fc8', loss_scale=loss_scale)
def MakeForwardPassOps(
    model: ModelHelper,
    model_id: str,
    input_blob: str,
    output_blob: str,
    weights: List[str],
    biases: List[str],
    activations: List[str],
    layers: List[int],
    dropout_ratio: float,
    is_test: bool = False,
) -> None:
    """
    Performs a forward pass of a multi-layer perceptron.

    :param model: The ModelHelper object whose net will execute this pass
    :param model_id: A unique string for this model that is used to hold
        activation levels
    :param input_blob: The blob containing the input data
    :param output_blob: The blob where the output data will be placed
    :param weights: A list of blobs containing the weights
    :param biases: A list of blobs containing the bias nodes
    :param activations: A list of strings describing the activation functions
        Currently only 'linear' and 'relu' are supported
    :param layers: A list of integers describing the layer sizes
    :param dropout_ratio: The fraction of nodes to drop out during training.
    :param is_test: Indicates whether or not this forward pass should skip
        node dropout.
    """
    model.net.NanCheck([input_blob], [input_blob])
    num_layer_connections = len(layers) - 1
    for x in six.moves.range(num_layer_connections):
        inputs = None
        outputs = None
        if x == 0:
            inputs = input_blob
        else:
            inputs = "ModelState_" + str(x) + "_" + model_id
        if x + 1 == num_layer_connections:
            outputs = output_blob
        else:
            outputs = "ModelState_" + str(x + 1) + "_" + model_id
        activation = activations[x]
        dim_in = layers[x]
        dim_out = layers[x + 1]
        weight_name = weights[x]
        bias_name = biases[x]
        brew.fc_explicit_param_names(  # type: ignore
            model,
            inputs,
            outputs,
            dim_in=dim_in,
            dim_out=dim_out,
            bias_name=bias_name,
            weight_name=weight_name,
            weight_init=("GivenTensorFill", {
                'values': workspace.FetchBlob(weight_name)
            }),
            bias_init=("GivenTensorFill", {
                'values': workspace.FetchBlob(bias_name)
            }))
        if activation == 'relu':
            brew.relu(model, outputs, outputs)
        elif activation == 'linear':
            pass
        else:
            raise Exception("Unknown activation function")
        if dropout_ratio > 0.01:
            brew.dropout(model, outputs, outputs,
                         ratio=dropout_ratio, is_test=is_test)
    model.net.NanCheck([output_blob], [output_blob])
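# Illustrative call sequence for MakeForwardPassOps (a sketch, not from the
# original source). It assumes the weight/bias blobs named below already
# exist in the workspace, e.g. filled by a training net, since the function
# fetches them with workspace.FetchBlob; all names are hypothetical.
model = ModelHelper(name="mlp_fwd")
MakeForwardPassOps(
    model,
    model_id="demo",
    input_blob="input",
    output_blob="output",
    weights=["w_0", "w_1"],
    biases=["b_0", "b_1"],
    activations=["relu", "linear"],
    layers=[64, 128, 10],  # 64 -> 128 -> 10
    dropout_ratio=0.2,
    is_test=True,          # skip dropout at inference time
)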
def Dropout(self, *args, **kwargs):
    return brew.dropout(
        self, *args, order=self.order, use_cudnn=self.use_cudnn, **kwargs)
def VGG_Net(model, loss_scale):
    #----- 3 x 224 x 224 --> 64 x 224 x 224 -----#
    conv1_1 = brew.conv(model, 'data', 'conv1_1', 3, 64, 3, pad=1,
                        weight_init=('GaussianFill', {'mean': 0.0, 'std': 1e-2}))
    relu1_1 = brew.relu(model, conv1_1, 'relu1_1')
    #----- 64 x 224 x 224 --> 64 x 224 x 224 -----#
    conv1_2 = brew.conv(model, relu1_1, 'conv1_2', 64, 64, 3, pad=1,
                        weight_init=('GaussianFill', {'mean': 0.0, 'std': 1e-2}))
    relu1_2 = brew.relu(model, conv1_2, 'relu1_2')
    #----- 64 x 224 x 224 --> 64 x 112 x 112 -----#
    pool1 = brew.max_pool(model, relu1_2, 'pool1', kernel=2, stride=2)
    #----- 64 x 112 x 112 --> 128 x 112 x 112 -----#
    conv2_1 = brew.conv(model, pool1, 'conv2_1', 64, 128, 3, pad=1,
                        weight_init=('GaussianFill', {'mean': 0.0, 'std': 1e-2}))
    relu2_1 = brew.relu(model, conv2_1, 'relu2_1')
    #----- 128 x 112 x 112 --> 128 x 112 x 112 -----#
    conv2_2 = brew.conv(model, relu2_1, 'conv2_2', 128, 128, 3, pad=1,
                        weight_init=('GaussianFill', {'mean': 0.0, 'std': 1e-2}))
    relu2_2 = brew.relu(model, conv2_2, 'relu2_2')
    #----- 128 x 112 x 112 --> 128 x 56 x 56 -----#
    pool2 = brew.max_pool(model, relu2_2, 'pool2', kernel=2, stride=2)
    #----- 128 x 56 x 56 --> 256 x 56 x 56 -----#
    conv3_1 = brew.conv(model, pool2, 'conv3_1', 128, 256, 3, pad=1,
                        weight_init=('GaussianFill', {'mean': 0.0, 'std': 1e-2}))
    relu3_1 = brew.relu(model, conv3_1, 'relu3_1')
    #----- 256 x 56 x 56 --> 256 x 56 x 56 -----#
    conv3_2 = brew.conv(model, relu3_1, 'conv3_2', 256, 256, 3, pad=1,
                        weight_init=('GaussianFill', {'mean': 0.0, 'std': 1e-2}))
    relu3_2 = brew.relu(model, conv3_2, 'relu3_2')
    #----- 256 x 56 x 56 --> 256 x 56 x 56 -----#
    conv3_3 = brew.conv(model, relu3_2, 'conv3_3', 256, 256, 3, pad=1,
                        weight_init=('GaussianFill', {'mean': 0.0, 'std': 1e-2}))
    relu3_3 = brew.relu(model, conv3_3, 'relu3_3')
    #----- 256 x 56 x 56 --> 256 x 28 x 28 -----#
    pool3 = brew.max_pool(model, relu3_3, 'pool3', kernel=2, stride=2)
    #----- 256 x 28 x 28 --> 512 x 28 x 28 -----#
    conv4_1 = brew.conv(model, pool3, 'conv4_1', 256, 512, 3, pad=1,
                        weight_init=('GaussianFill', {'mean': 0.0, 'std': 1e-2}))
    relu4_1 = brew.relu(model, conv4_1, 'relu4_1')
    #----- 512 x 28 x 28 --> 512 x 28 x 28 -----#
    conv4_2 = brew.conv(model, relu4_1, 'conv4_2', 512, 512, 3, pad=1,
                        weight_init=('GaussianFill', {'mean': 0.0, 'std': 1e-2}))
    relu4_2 = brew.relu(model, conv4_2, 'relu4_2')
    #----- 512 x 28 x 28 --> 512 x 28 x 28 -----#
    conv4_3 = brew.conv(model, relu4_2, 'conv4_3', 512, 512, 3, pad=1,
                        weight_init=('GaussianFill', {'mean': 0.0, 'std': 1e-2}))
    relu4_3 = brew.relu(model, conv4_3, 'relu4_3')
    #----- 512 x 28 x 28 --> 512 x 14 x 14 -----#
    pool4 = brew.max_pool(model, relu4_3, 'pool4', kernel=2, stride=2)
    #----- 512 x 14 x 14 --> 512 x 14 x 14 -----#
    conv5_1 = brew.conv(model, pool4, 'conv5_1', 512, 512, 3, pad=1,
                        weight_init=('GaussianFill', {'mean': 0.0, 'std': 1e-2}))
    relu5_1 = brew.relu(model, conv5_1, 'relu5_1')
    #----- 512 x 14 x 14 --> 512 x 14 x 14 -----#
    conv5_2 = brew.conv(model, relu5_1, 'conv5_2', 512, 512, 3, pad=1,
                        weight_init=('GaussianFill', {'mean': 0.0, 'std': 1e-2}))
    relu5_2 = brew.relu(model, conv5_2, 'relu5_2')
    #----- 512 x 14 x 14 --> 512 x 14 x 14 -----#
    conv5_3 = brew.conv(model, relu5_2, 'conv5_3', 512, 512, 3, pad=1,
                        weight_init=('GaussianFill', {'mean': 0.0, 'std': 1e-2}))
    relu5_3 = brew.relu(model, conv5_3, 'relu5_3')
    #----- 512 x 14 x 14 --> 512 x 7 x 7 -----#
    pool5 = brew.max_pool(model, relu5_3, 'pool5', kernel=2, stride=2)
    fc6 = brew.fc(model, pool5, 'fc6', 25088, 4096)
    relu6 = brew.relu(model, fc6, 'relu6')
    drop6 = brew.dropout(model, relu6, 'drop6', ratio=0.5, is_test=0)
    fc7 = brew.fc(model, drop6, 'fc7', 4096, 4096)
    relu7 = brew.relu(model, fc7, 'relu7')
    drop7 = brew.dropout(model, relu7, 'drop7', ratio=0.5, is_test=0)
    fc8 = brew.fc(model, drop7, 'fc8', 4096, 2622)
    softmax = brew.softmax(model, fc8, 'softmax')
    xent = model.LabelCrossEntropy([softmax, 'label'], 'xent')
    # compute the expected loss
    loss = model.AveragedLoss(xent, "loss")
    # track the accuracy of the model
    AddAccuracy(model, softmax)
    loss = model.Scale(loss, "loss", scale=loss_scale)
    return [loss]
def forward_pass_builder(self, model, loss_scale=1.0):
    v = 'data'  # 3x299x299

    conv1 = self.conv_factory(model, v, 3, num_filters=32, kernel=3, stride=2,
                              name='conv1')  # 32x149x149
    conv2 = self.conv_factory(model, conv1, 32, 32, kernel=3, name='conv2')  # 32x147x147
    conv3 = self.conv_factory(model, conv2, 32, 64, kernel=3, pad=1, name='conv3')  # 64x147x147
    pool1 = brew.max_pool(model, conv3, 'pool1', kernel=3, stride=2)  # 64x73x73

    conv4r = self.conv_factory(model, pool1, 64, 80, kernel=1, name='conv4_reduce')  # 80x73x73
    conv4 = self.conv_factory(model, conv4r, 80, 192, kernel=3, name='conv4')  # 192x71x71
    pool2 = brew.max_pool(model, conv4, 'pool2', kernel=3, stride=2)  # 192x35x35

    conv5 = [None, None, None, None]
    conv5[0] = self.conv_factory(model, pool2, 192, 96, kernel=1, name='conv5_1_1')  # 96x35x35
    conv5[1] = self.conv_factory(model, pool2, 192, 48, kernel=1, name='conv5_2_1')  # 48x35x35
    conv5[1] = self.conv_factory(model, conv5[1], 48, 64, kernel=5, pad=2, name='conv5_2_2')  # 64x35x35
    conv5[2] = self.conv_factory(model, pool2, 192, 64, kernel=1, name='conv5_3_1')  # 64x35x35
    conv5[2] = self.conv_factory(model, conv5[2], 64, 96, kernel=3, pad=1, name='conv5_3_2')  # 96x35x35
    conv5[2] = self.conv_factory(model, conv5[2], 96, 96, kernel=3, pad=1, name='conv5_3_3')  # 96x35x35
    conv5[3] = brew.average_pool(model, pool2, 'conv5_4_1_pool', kernel=3, stride=1, pad=1)  # 192x35x35
    conv5[3] = self.conv_factory(model, conv5[3], 192, 64, kernel=1, name='conv5_4_2')  # 64x35x35
    conv5 = brew.concat(model, conv5, blob_out='conv5')  # 320x35x35

    block35 = conv5
    for i in range(10):
        block35 = self.block35(model, block35, num_in_channels=320, scale=0.17,
                               name='inception_resnet_v2_a%d' % (i + 1))  # 320x35x35

    # ra - reduction_a
    ra = [None, None, None]
    ra[0] = self.conv_factory(model, block35, 320, 384, kernel=3, stride=2, name='ra_1_1')  # 384x17x17
    ra[1] = self.conv_factory(model, block35, 320, 256, kernel=1, name='ra_2_1')  # 256x35x35
    ra[1] = self.conv_factory(model, ra[1], 256, 256, kernel=3, pad=1, name='ra_2_2')  # 256x35x35
    ra[1] = self.conv_factory(model, ra[1], 256, 384, kernel=3, stride=2, name='ra_2_3')  # 384x17x17
    ra[2] = brew.max_pool(model, block35, 'ra_3_1_pool', kernel=3, stride=2)  # 320x17x17
    ra = brew.concat(model, ra, blob_out='ra')  # 1088x17x17

    block17 = ra
    for i in range(20):
        block17 = self.block17(model, block17, num_in_channels=1088, scale=0.1,
                               name='inception_resnet_v2_b%d' % (i + 1))  # 1088x17x17

    # rb - reduction_b
    rb = [None, None, None, None]
    rb[0] = self.conv_factory(model, block17, 1088, 256, kernel=1, name='rb_1_1')  # 256x17x17
    rb[0] = self.conv_factory(model, rb[0], 256, 384, kernel=3, stride=2, name='rb_1_2')  # 384x8x8
    rb[1] = self.conv_factory(model, block17, 1088, 256, kernel=1, name='rb_2_1')  # 256x17x17
    rb[1] = self.conv_factory(model, rb[1], 256, 288, kernel=3, stride=2, name='rb_2_2')  # 288x8x8
    rb[2] = self.conv_factory(model, block17, 1088, 256, kernel=1, name='rb_3_1')  # 256x17x17
    rb[2] = self.conv_factory(model, rb[2], 256, 288, kernel=3, pad=1, name='rb_3_2')  # 288x17x17
    rb[2] = self.conv_factory(model, rb[2], 288, 320, kernel=3, stride=2, name='rb_3_3')  # 320x8x8
    rb[3] = brew.max_pool(model, block17, 'rb_4_1_pool', kernel=3, stride=2)  # 1088x8x8
    rb = brew.concat(model, rb, blob_out='rb')  # 2080x8x8

    block8 = rb
    for i in range(9):
        block8 = self.block8(model, block8, num_in_channels=2080, scale=0.2,
                             name='inception_resnet_v2_c%d' % (i + 1))  # 2080x8x8
    block8 = self.block8(model, block8, num_in_channels=2080, relu=False,
                         name='inception_resnet_v2_c10')  # 2080x8x8

    conv6 = self.conv_factory(model, block8, 2080, 1536, kernel=1, name='conv6')  # 1536x8x8
    pool8 = brew.average_pool(model, conv6, 'pool8', kernel=8, global_pool=True)  # 1536x1x1
    drop8 = brew.dropout(model, pool8, 'drop8', ratio=0.2,  # 1536x1x1
                         is_test=(self.phase == 'inference'))

    if not self.__run_with_resnet50_trainer:
        return self.add_head_nodes(model, drop8, 1536, 'classifier',
                                   loss_scale=loss_scale)
    else:
        return brew.fc(model, drop8, 'classifier', dim_in=1536,
                       dim_out=self.num_classes)
def forward_pass_builder(self, model, loss_scale=1.0):
    """
    This function adds the operators, layers to the network. It should return
    a list of loss-blobs that are used for computing the loss gradient. This
    function is also passed an internally calculated loss_scale parameter that
    is used to scale your loss to normalize for the number of GPUs.
    Signature: function(model, loss_scale)
    """
    self.counts = defaultdict(lambda: 0)
    is_inference = self.phase == 'inference'
    v = 'data'
    # Input conv modules
    v = self.conv(model, 'conv', v, input_depth=3, num_filters=32, kernel=3,
                  stride=2, pad=0, is_inference=is_inference)
    v = self.conv(model, 'conv', v, input_depth=32, num_filters=32, kernel=3,
                  stride=1, pad=0, is_inference=is_inference)
    v = self.conv(model, 'conv', v, input_depth=32, num_filters=64, kernel=3,
                  stride=1, pad=1, is_inference=is_inference)
    # Stem modules
    v = self.inception_v4_sa(model, inputs=v, input_depth=64, is_inference=is_inference)
    v = self.inception_v4_sb(model, inputs=v, input_depth=160, is_inference=is_inference)
    v = self.inception_v4_sc(model, inputs=v, input_depth=192, is_inference=is_inference)
    # Four Type A modules
    for _ in xrange(4):
        v = self.inception_v4_a(model, inputs=v, input_depth=384, is_inference=is_inference)
    # One Type A Reduction module
    v = self.inception_v4_ra(model, inputs=v, input_depth=384, k=192, l=224,
                             m=256, n=384, is_inference=is_inference)
    # Seven Type B modules
    for _ in xrange(7):
        v = self.inception_v4_b(model, inputs=v, input_depth=1024, is_inference=is_inference)
    # One Type B Reduction module
    v = self.inception_v4_rb(model, inputs=v, input_depth=1024, is_inference=is_inference)
    # Three Type C modules
    for _ in xrange(3):
        v = self.inception_v4_c(model, inputs=v, input_depth=1536, is_inference=is_inference)
    # Final global pooling
    v = brew.average_pool(model, v, blob_out='pool', kernel=8, stride=1, pad=0)
    v = brew.dropout(model, v, 'drop', ratio=0.2, is_test=is_inference)
    # And classifier
    return self.add_head_nodes(model, v, 1536, 'classifier', loss_scale=loss_scale)
def create_r3d(
    model,
    data,
    num_input_channels,
    num_labels,
    label=None,
    is_test=False,
    use_full_ft=True,
    no_bias=0,
    final_spatial_kernel=7,
    final_temporal_kernel=1,
    model_depth=18,
    block_type='3d',
    transformation_type='simple_block',
    channel_multiplier=1.0,
    bottleneck_multiplier=1.0,
    use_dropout=False,
    conv1_temporal_stride=1,
    conv1_temporal_kernel=3,
    spatial_bn_mom=0.9,
    clip_length=8,
    use_shuffle=False,
    use_convolutional_pred=False,
    use_pool1=False,
):
    assert conv1_temporal_kernel == 3 or conv1_temporal_kernel == 5

    if not use_full_ft:
        is_test = True

    # conv1 + maxpool
    if block_type != '2.5d' and block_type != '2.5d-sep':
        model.ConvNd(
            data,
            'conv1',
            num_input_channels,
            64,
            [conv1_temporal_kernel, 7, 7],
            weight_init=("MSRAFill", {}),
            strides=[conv1_temporal_stride, 2, 2],
            pads=[1 if conv1_temporal_kernel == 3 else 2, 3, 3] * 2,
            no_bias=no_bias
        )
    else:
        model.ConvNd(
            data,
            'conv1_middle',
            num_input_channels,
            45,
            [1, 7, 7],
            weight_init=("MSRAFill", {}),
            strides=[1, 2, 2],
            pads=[0, 3, 3] * 2,
            no_bias=no_bias
        )
        model.SpatialBN(
            'conv1_middle',
            'conv1_middle_spatbn_relu',
            45,
            epsilon=1e-3,
            momentum=spatial_bn_mom,
            is_test=is_test
        )
        model.Relu('conv1_middle_spatbn_relu', 'conv1_middle_spatbn_relu')
        model.ConvNd(
            'conv1_middle_spatbn_relu',
            'conv1',
            45,
            64,
            [conv1_temporal_kernel, 1, 1],
            weight_init=("MSRAFill", {}),
            strides=[conv1_temporal_stride, 1, 1],
            pads=[1 if conv1_temporal_kernel == 3 else 2, 0, 0] * 2,
            no_bias=no_bias
        )

    model.SpatialBN(
        'conv1',
        'conv1_spatbn_relu',
        64,
        epsilon=1e-3,
        momentum=spatial_bn_mom,
        is_test=is_test
    )
    last_conv1 = model.Relu('conv1_spatbn_relu', 'conv1_spatbn_relu')
    if use_pool1:
        last_conv1 = model.MaxPool(
            'conv1_spatbn_relu',
            'pool1',
            kernels=[1, 3, 3],
            strides=[1, 2, 2],
            pads=[0, 1, 1] * 2,
        )

    (n1, n2, n3, n4) = BLOCK_CONFIG[model_depth]

    # Residual blocks...
    builder = VideoModelBuilder(model, last_conv1, no_bias=no_bias,
                                is_test=is_test, spatial_bn_mom=spatial_bn_mom)

    if transformation_type == 'simple_block':
        transformation = builder.add_simple_block
    elif transformation_type == 'bottleneck':
        transformation = builder.add_bottleneck
    else:
        raise ValueError('Unknown transformation type...')

    if model_depth <= 34:
        filter_config = SHALLOW_FILTER_CONFIG
    else:
        filter_config = DEEP_FILTER_CONFIG
    filter_config = np.multiply(
        filter_config, channel_multiplier).astype(np.int)

    # conv_2x
    transformation(
        64, filter_config[0][0],
        int(filter_config[0][1] * bottleneck_multiplier),
        block_type=block_type, use_shuffle=use_shuffle)
    for _ in range(n1 - 1):
        transformation(
            filter_config[0][0], filter_config[0][0],
            int(filter_config[0][1] * bottleneck_multiplier),
            block_type=block_type, use_shuffle=use_shuffle)

    # conv_3x
    transformation(
        filter_config[0][0], filter_config[1][0],
        int(filter_config[1][1] * bottleneck_multiplier),
        down_sampling=True, block_type=block_type, use_shuffle=use_shuffle)
    for _ in range(n2 - 1):
        transformation(
            filter_config[1][0], filter_config[1][0],
            int(filter_config[1][1] * bottleneck_multiplier),
            block_type=block_type, use_shuffle=use_shuffle)

    # conv_4x
    if clip_length < 4:
        transformation(
            filter_config[1][0], filter_config[2][0],
            int(filter_config[2][1] * bottleneck_multiplier),
            down_sampling=True, down_sampling_temporal=False,
            block_type=block_type, use_shuffle=use_shuffle)
    else:
        transformation(
            filter_config[1][0], filter_config[2][0],
            int(filter_config[2][1] * bottleneck_multiplier),
            down_sampling=True, block_type=block_type, use_shuffle=use_shuffle)
    for _ in range(n3 - 1):
        transformation(
            filter_config[2][0], filter_config[2][0],
            int(filter_config[2][1] * bottleneck_multiplier),
            block_type=block_type, use_shuffle=use_shuffle)

    # conv_5x
    if clip_length < 8:
        transformation(
            filter_config[2][0], filter_config[3][0],
            int(filter_config[3][1] * bottleneck_multiplier),
            down_sampling=True, down_sampling_temporal=False,
            block_type=block_type, use_shuffle=use_shuffle)
    else:
        transformation(
            filter_config[2][0], filter_config[3][0],
            int(filter_config[3][1] * bottleneck_multiplier),
            down_sampling=True, block_type=block_type, use_shuffle=use_shuffle)
    for _ in range(n4 - 1):
        transformation(
            filter_config[3][0], filter_config[3][0],
            int(filter_config[3][1] * bottleneck_multiplier),
            block_type=block_type, use_shuffle=use_shuffle)

    # Final layers
    model.AveragePool(
        builder.prev_blob,
        'final_avg',
        kernels=[
            final_temporal_kernel,
            final_spatial_kernel,
            final_spatial_kernel
        ],
        strides=[1, 1, 1],
    )

    if use_dropout:
        dropout = brew.dropout(model, 'final_avg', 'dropout', is_test=is_test)
    else:
        dropout = 'final_avg'

    if not use_full_ft:
        dropout = model.StopGradient(dropout, dropout)

    if use_convolutional_pred:
        assert is_test
        last_out = model.ConvNd(
            dropout,
            'last_out_L{}'.format(num_labels),
            filter_config[3][0],
            num_labels,
            [1, 1, 1],
            weight_init=("MSRAFill", {}),
            strides=[1, 1, 1],
            pads=[0, 0, 0] * 2,
            no_bias=False
        )
    else:
        last_out = model.FC(
            dropout, 'last_out_L{}'.format(num_labels),
            filter_config[3][0], num_labels
        )

    return last_out
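# create_r3d above relies on module-level tables that this excerpt does not
# define. A sketch of the shape they plausibly take, following standard
# ResNet depth configurations (the exact values are assumptions, not copied
# from the original source):
BLOCK_CONFIG = {
    18: (2, 2, 2, 2),
    34: (3, 4, 6, 3),
    50: (3, 4, 6, 3),
    101: (3, 4, 23, 3),
    152: (3, 8, 36, 3),
}
# One [out_channels, inner_channels] pair per stage; the second column is
# what the bottleneck_multiplier term in create_r3d scales.
SHALLOW_FILTER_CONFIG = [[64, 64], [128, 128], [256, 256], [512, 512]]
DEEP_FILTER_CONFIG = [[256, 64], [512, 128], [1024, 256], [2048, 512]]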
def Add_Action_Tufts_Model(model, num_classes, image_height, image_width,
                           image_channels, is_test=0):
    ################################## Block 1 ############################
    # Convolutional layer 1
    # conv1_1 = brew.conv(model, 'data', 'conv1_1', dim_in=image_channels, dim_out=64, kernel=3, stride=2, pad=0)
    # h, w = update_dims(height=image_height, width=image_width, kernel=3, stride=2, pad=0)
    # ReLU layer 1
    # relu1_1 = brew.relu(model, conv1_1, 'relu1_1')
    # Batch normalization layer 1
    # bn1_1 = brew.spatial_bn(model, relu1_1, 'bn1_1', dim_in=64, epsilon=1e-3, momentum=0.1, is_test=is_test)
    # Drop out with p=0.25
    # dropout1_1 = brew.dropout(model, bn1_1, 'dropout1_1', ratio=0.25, is_test=is_test)
    # Convolutional layer 2
    # conv1_2 = brew.conv(model, dropout1_1, 'conv1_2', dim_in=64, dim_out=64, kernel=3, stride=1, pad=0)
    # h, w = update_dims(height=h, width=w, kernel=3, stride=1, pad=0)
    # ReLU layer 2
    # relu1_2 = brew.relu(model, conv1_2, 'relu1_2')
    # Batch normalization layer 2
    # bn1_2 = brew.spatial_bn(model, relu1_2, 'bn1_2', dim_in=64, epsilon=1e-3, momentum=0.1, is_test=is_test)
    # Drop out with p=0.25
    # dropout1_2 = brew.dropout(model, bn1_2, 'dropout1_2', ratio=0.25, is_test=is_test)

    ################################## Block 2 ############################
    # Convolutional layer 3
    conv2_1 = brew.conv(model, 'data', 'conv2_1', dim_in=image_channels,
                        dim_out=128, kernel=3, stride=2, pad=0)
    h, w = update_dims(height=image_height, width=image_width, kernel=3,
                       stride=2, pad=0)
    # ReLU layer
    relu2_1 = brew.relu(model, conv2_1, 'relu2_1')
    # Batch normalization layer
    bn2_1 = brew.spatial_bn(model, relu2_1, 'bn2_1', dim_in=128, epsilon=1e-3,
                            momentum=0.1, is_test=is_test)
    # Drop out with p=0.25
    dropout2_1 = brew.dropout(model, bn2_1, 'dropout2_1', ratio=0.25, is_test=is_test)
    # Convolutional layer 4
    conv2_2 = brew.conv(model, dropout2_1, 'conv2_2', dim_in=128, dim_out=128,
                        kernel=3, stride=1, pad=0)
    h, w = update_dims(height=h, width=w, kernel=3, stride=1, pad=0)
    # ReLU layer
    relu2_2 = brew.relu(model, conv2_2, 'relu2_2')
    # Batch normalization layer
    bn2_2 = brew.spatial_bn(model, relu2_2, 'bn2_2', dim_in=128, epsilon=1e-3,
                            momentum=0.1, is_test=is_test)
    # Drop out with p=0.25
    dropout2_2 = brew.dropout(model, bn2_2, 'dropout2_2', ratio=0.25, is_test=is_test)

    ################################## Block 3 ############################
    # Convolutional layer 5
    conv3_1 = brew.conv(model, dropout2_2, 'conv3_1', dim_in=128, dim_out=256,
                        kernel=3, stride=2, pad=0)
    h, w = update_dims(height=h, width=w, kernel=3, stride=2, pad=0)
    # ReLU layer
    relu3_1 = brew.relu(model, conv3_1, 'relu3_1')
    # Batch normalization layer
    bn3_1 = brew.spatial_bn(model, relu3_1, 'bn3_1', dim_in=256, epsilon=1e-3,
                            momentum=0.1, is_test=is_test)
    # Drop out with p=0.25
    dropout3_1 = brew.dropout(model, bn3_1, 'dropout3_1', ratio=0.25, is_test=is_test)
    # Convolutional layer 6
    conv3_2 = brew.conv(model, dropout3_1, 'conv3_2', dim_in=256, dim_out=256,
                        kernel=3, stride=1, pad=0)
    h, w = update_dims(height=h, width=w, kernel=3, stride=1, pad=0)
    # ReLU layer
    relu3_2 = brew.relu(model, conv3_2, 'relu3_2')
    # Batch normalization layer
    bn3_2 = brew.spatial_bn(model, relu3_2, 'bn3_2', dim_in=256, epsilon=1e-3,
                            momentum=0.1, is_test=is_test)
    # Drop out with p=0.25
    dropout3_2 = brew.dropout(model, bn3_2, 'dropout3_2', ratio=0.25, is_test=is_test)

    # Global average pooling
    pool1 = brew.average_pool(model, dropout3_2, 'pool1', global_pooling=True)
    # Fully connected layer
    pred = brew.fc(model, pool1, 'fc1', dim_in=256, dim_out=num_classes)
    # Softmax layer
    softmax, loss = model.SoftmaxWithLoss([pred, 'label'], ['softmax', 'loss'])
    brew.accuracy(model, [softmax, 'label'], 'accuracy')
    model.net.MultiClassAccuracy([softmax, 'label'],
                                 ['accuracy_per_class', 'amount_per_class'])
    return [loss]
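# update_dims is called above but not defined in this excerpt. A minimal
# sketch (an assumption about the helper, not copied from the original
# source) using the standard convolution/pooling output-size formula
# out = (in - kernel + 2 * pad) // stride + 1:
def update_dims(height, width, kernel, stride, pad):
    new_height = (height - kernel + 2 * pad) // stride + 1
    new_width = (width - kernel + 2 * pad) // stride + 1
    return new_height, new_width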
def create_alexnet(
    model,
    data,
    num_input_channels,
    num_labels,
    is_test=False,
):
    conv1 = brew.conv(
        model,
        data,
        "conv1",
        num_input_channels,  # dim_in
        64,                  # dim_out
        11,                  # kernel
        ('XavierFill', {}),
        ('ConstantFill', {}),
        stride=4,
        pad=2)
    relu1 = brew.relu(model, conv1, "conv1")
    norm1 = brew.lrn(model, relu1, "norm1", size=5, alpha=0.0001, beta=0.75)
    pool1 = brew.max_pool(model, norm1, "pool1", kernel=3, stride=2)
    conv2 = brew.conv(model, pool1, "conv2", 64, 192, 5,
                      ('XavierFill', {}), ('ConstantFill', {}), pad=2)
    relu2 = brew.relu(model, conv2, "conv2")
    norm2 = brew.lrn(model, relu2, "norm2", size=5, alpha=0.0001, beta=0.75)
    pool2 = brew.max_pool(model, norm2, "pool2", kernel=3, stride=2)
    conv3 = brew.conv(model, pool2, "conv3", 192, 384, 3,
                      ('XavierFill', {}), ('ConstantFill', {}), pad=1)
    relu3 = brew.relu(model, conv3, "conv3")
    conv4 = brew.conv(model, relu3, "conv4", 384, 256, 3,
                      ('XavierFill', {}), ('ConstantFill', {}), pad=1)
    relu4 = brew.relu(model, conv4, "conv4")
    conv5 = brew.conv(model, relu4, "conv5", 256, 256, 3,
                      ('XavierFill', {}), ('ConstantFill', {}), pad=1)
    relu5 = brew.relu(model, conv5, "conv5")
    pool5 = brew.max_pool(model, relu5, "pool5", kernel=3, stride=2)
    fc6 = brew.fc(model, pool5, "fc6", 256 * 6 * 6, 4096,
                  ('XavierFill', {}), ('ConstantFill', {}))
    relu6 = brew.relu(model, fc6, "fc6")
    dropout1 = brew.dropout(model, relu6, 'dropout1', ratio=0.5, is_test=is_test)
    fc7 = brew.fc(model, dropout1, "fc7", 4096, 4096,
                  ('XavierFill', {}), ('ConstantFill', {}))
    relu7 = brew.relu(model, fc7, "fc7")
    dropout2 = brew.dropout(model, relu7, 'dropout2', ratio=0.5, is_test=is_test)
    fc8 = brew.fc(model, dropout2, "fc8", 4096, num_labels,
                  ('XavierFill', {}), ('ConstantFill', {}))
    # pred = brew.softmax(model, fc8, "pred")
    # xent = model.net.LabelCrossEntropy([pred, "label"], "xent")
    # model.net.AveragedLoss(xent, "loss")
    return fc8
def alexnet():
    model = ModelHelper(name="r", arg_scope={"order": "NCHW", "is_test": True})
    conv1 = brew.conv(model, "data", "conv1", 3, 64, 11,
                      ('XavierFill', {}), ('ConstantFill', {}), stride=4, pad=2)
    relu1 = brew.relu(model, conv1, "conv1")
    pool1 = brew.max_pool(model, relu1, "pool1", kernel=3, stride=2, pad=0,
                          legacy_pad=3)
    lrn1 = brew.lrn(model, pool1, "pool1_lrn", size=5, alpha=1.0e-4,
                    beta=0.75, bias=1.0)
    conv2 = brew.conv(model, lrn1, "conv2", 64, 192, 5,
                      ('XavierFill', {}), ('ConstantFill', {}), pad=2)
    relu2 = brew.relu(model, conv2, "conv2")
    pool2 = brew.max_pool(model, relu2, "pool2", kernel=3, stride=2)
    lrn2 = brew.lrn(model, pool2, "pool2_lrn", size=5, alpha=1.0e-4,
                    beta=0.75, bias=1.0)
    conv3 = brew.conv(model, lrn2, "conv3", 192, 384, 3,
                      ('XavierFill', {}), ('ConstantFill', {}), pad=1)
    relu3 = brew.relu(model, conv3, "conv3")
    conv4 = brew.conv(model, relu3, "conv4", 384, 256, 3,
                      ('XavierFill', {}), ('ConstantFill', {}), pad=1)
    relu4 = brew.relu(model, conv4, "conv4")
    conv5 = brew.conv(model, relu4, "conv5", 256, 256, 3,
                      ('XavierFill', {}), ('ConstantFill', {}), pad=1)
    relu5 = brew.relu(model, conv5, "conv5")
    pool5 = brew.max_pool(model, relu5, "pool5", kernel=3, stride=2)
    fc6 = brew.fc(model, pool5, "fc6", 256 * 6 * 6, 4096,
                  ('XavierFill', {}), ('ConstantFill', {}))
    relu6 = brew.relu(model, fc6, "fc6")
    fc7 = brew.fc(model, relu6, "fc7", 4096, 4096,
                  ('XavierFill', {}), ('ConstantFill', {}))
    relu7 = brew.relu(model, fc7, "fc7")
    drop7 = brew.dropout(model, relu7, "fc7_dropout", is_test=1, ratio=0.5)
    fc8 = brew.fc(model, drop7, "fc8", 4096, 1000,
                  ('XavierFill', {}), ('ConstantFill', {}))
    relu8 = brew.relu(model, fc8, "fc8")
    _ = brew.dropout(model, relu8, "fc8_dropout", is_test=1, ratio=0.5)
    return model, [(1, 3, 224, 224)]
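# Usage sketch (not part of the original snippet; assumes
# `from caffe2.python import workspace`): instantiate the model and check
# blob shapes with Caffe2's shape inference, feeding the input shape the
# function returns.
model, shapes = alexnet()
blob_shapes, _ = workspace.InferShapesAndTypes(
    [model.param_init_net, model.net],
    {'data': shapes[0]},
)
print(blob_shapes['fc8'])  # expected: [1, 1000]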
def forward_pass_builder(self, model, loss_scale=1.0):
    """
    This function adds the operators, layers to the network. It should return
    a list of loss-blobs that are used for computing the loss gradient. This
    function is also passed an internally calculated loss_scale parameter that
    is used to scale your loss to normalize for the number of GPUs.
    Signature: function(model, loss_scale)
    """
    v = 'data'
    v = conv_factory(model, v, self.input_shape[0], 64, kernel=7, stride=2,
                     pad=3, name="conv1/7x7_s2")
    v = brew.max_pool(model, v, 'pool1/3x3_s2', kernel=3, stride=2)
    v = brew.lrn(model, v, 'pool1/norm1', size=5, alpha=0.0001, beta=0.75)
    v = conv_factory(model, v, 64, 64, kernel=1, stride=1, name="conv2/3x3_reduce")
    v = conv_factory(model, v, 64, 192, kernel=3, stride=1, pad=1, name="conv2/3x3")
    v = brew.lrn(model, v, 'conv2/norm2', size=5, alpha=0.0001, beta=0.75)
    v = brew.max_pool(model, v, 'pool2/3x3_s2', kernel=3, stride=2)
    v = inception_factory(model, v, 192, 64, 96, 128, 16, 32, 32, name="inception_3a")
    v = inception_factory(model, v, 256, 128, 128, 192, 32, 96, 64, name="inception_3b")
    v = brew.max_pool(model, v, 'pool3/3x3_s2', kernel=3, stride=2)
    v = inception_factory(model, v, 480, 192, 96, 208, 16, 48, 64, name="inception_4a")
    v = inception_factory(model, v, 512, 160, 112, 224, 24, 64, 64, name="inception_4b")
    v = inception_factory(model, v, 512, 128, 128, 256, 24, 64, 64, name="inception_4c")
    v = inception_factory(model, v, 512, 112, 144, 288, 32, 64, 64, name="inception_4d")
    v = inception_factory(model, v, 528, 256, 160, 320, 32, 128, 128, name="inception_4e")
    v = brew.max_pool(model, v, 'pool4/3x3_s2', kernel=3, stride=2, pad=1)
    v = inception_factory(model, v, 832, 256, 160, 320, 32, 128, 128, name="inception_5a")
    v = inception_factory(model, v, 832, 384, 192, 384, 48, 128, 128, name="inception_5b")
    v = brew.average_pool(model, v, 'pool5/7x7_s1', kernel=7, stride=1)
    v = brew.dropout(model, v, 'pool5/drop_7x7_s1', ratio=0.5,
                     is_test=(self.phase == 'inference'))
    return self.add_head_nodes(model, v, 1024, 'classifier', loss_scale=loss_scale)
def create_av_resnet(
    model,
    data,
    num_input_channels,
    num_labels,
    acoustic_data="logmels",
    label=None,
    is_test=False,
    use_full_ft=True,
    no_bias=0,
    final_spatial_kernel=7,
    final_temporal_kernel=1,
    model_depth=18,
    block_type='3d',
    transformation_type='simple_block',
    bottleneck_multiplier=1.0,
    channel_multiplier=1.0,
    use_dropout=False,
    conv1_temporal_stride=1,
    conv1_temporal_kernel=3,
    spatial_bn_mom=0.9,
    clip_length=8,
    use_convolutional_pred=False,
    use_pool1=False,
    audio_input_3d=False,
    g_blend=False,
):
    # sanity checking of model params
    assert conv1_temporal_kernel == 3 or conv1_temporal_kernel == 5

    # conv1 + maxpool for visual model
    if block_type != '2.5d' and block_type != '2.5d-sep':
        model.ConvNd(
            data,
            'v_conv1',
            num_input_channels,
            64,
            [conv1_temporal_kernel, 7, 7],
            weight_init=("MSRAFill", {}),
            strides=[conv1_temporal_stride, 2, 2],
            pads=[1 if conv1_temporal_kernel == 3 else 2, 3, 3] * 2,
            no_bias=no_bias
        )
    else:
        model.ConvNd(
            data,
            'v_conv1_middle',
            num_input_channels,
            45,
            [1, 7, 7],
            weight_init=("MSRAFill", {}),
            strides=[1, 2, 2],
            pads=[0, 3, 3] * 2,
            no_bias=no_bias
        )
        model.SpatialBN(
            'v_conv1_middle',
            'v_conv1_middle_spatbn_relu',
            45,
            epsilon=1e-3,
            momentum=spatial_bn_mom,
            is_test=is_test
        )
        model.Relu('v_conv1_middle_spatbn_relu', 'v_conv1_middle_spatbn_relu')
        model.ConvNd(
            'v_conv1_middle_spatbn_relu',
            'v_conv1',
            45,
            64,
            [conv1_temporal_kernel, 1, 1],
            weight_init=("MSRAFill", {}),
            strides=[conv1_temporal_stride, 1, 1],
            pads=[1 if conv1_temporal_kernel == 3 else 2, 0, 0] * 2,
            no_bias=no_bias
        )

    model.SpatialBN(
        'v_conv1',
        'v_conv1_spatbn_relu',
        64,
        epsilon=1e-3,
        momentum=spatial_bn_mom,
        is_test=is_test
    )
    v_conv1 = model.Relu('v_conv1_spatbn_relu', 'v_conv1_spatbn_relu')
    if use_pool1:
        v_conv1 = model.MaxPool(
            'v_conv1_spatbn_relu',
            'pool1',
            kernels=[1, 3, 3],
            strides=[1, 2, 2],
            pads=[0, 1, 1] * 2,
        )

    # conv1 equivalent of audio model
    # it is approximated by a bunch of 3x3 kernels
    if audio_input_3d:
        acoustic_data_swap = model.NCHW2NHWC(
            acoustic_data, acoustic_data + "_NHWC")
        acoustic_data_greyscale = model.ReduceBackMean(
            acoustic_data_swap, acoustic_data_swap + '_c_pool', num_reduce_dim=1)
        acoustic_data = acoustic_data_greyscale
    a_conv1 = model.Conv(acoustic_data, 'a_conv1', 1, 16, kernel=3, pad=1)
    a_builder = AudioModelBuilder(
        model, a_conv1, no_bias=no_bias,
        is_test=is_test, spatial_bn_mom=spatial_bn_mom, prefix='a_')

    a_builder.add_simple_block(16, 32, down_sampling=True)
    a_builder.add_simple_block(32, 32)
    a_builder.add_simple_block(32, 32)

    (n1, n2, n3, n4) = BLOCK_CONFIG[model_depth]

    # Residual blocks...
    v_builder = VideoModelBuilder(
        model, v_conv1, no_bias=no_bias,
        is_test=is_test, spatial_bn_mom=spatial_bn_mom, prefix='v_')

    if transformation_type == 'simple_block':
        v_transformation = v_builder.add_simple_block
        a_transformation = a_builder.add_simple_block
    elif transformation_type == 'bottleneck':
        v_transformation = v_builder.add_bottleneck
        a_transformation = a_builder.add_bottleneck
    else:
        raise ValueError('Unknown transformation type...')

    if model_depth <= 34:
        filter_config = SHALLOW_FILTER_CONFIG
    else:
        filter_config = DEEP_FILTER_CONFIG
    filter_config = np.multiply(
        filter_config, channel_multiplier).astype(np.int)

    # conv_2x
    v_transformation(
        64, filter_config[0][0],
        int(bottleneck_multiplier * filter_config[0][1]),
        block_type=block_type)
    a_transformation(32, filter_config[0][0], filter_config[0][1],
                     down_sampling=True)
    for _ in range(n1 - 1):
        v_transformation(
            filter_config[0][0], filter_config[0][0],
            int(bottleneck_multiplier * filter_config[0][1]),
            block_type=block_type)
        a_transformation(
            filter_config[0][0], filter_config[0][0], filter_config[0][1])

    # conv_3x
    v_transformation(
        filter_config[0][0], filter_config[1][0],
        int(bottleneck_multiplier * filter_config[1][1]),
        down_sampling=True, block_type=block_type)
    a_transformation(
        filter_config[0][0], filter_config[1][0], filter_config[1][1],
        down_sampling=True)
    for _ in range(n2 - 1):
        v_transformation(
            filter_config[1][0], filter_config[1][0],
            int(bottleneck_multiplier * filter_config[1][1]),
            block_type=block_type)
        a_transformation(
            filter_config[1][0], filter_config[1][0], filter_config[1][1])

    # conv_4x
    v_transformation(
        filter_config[1][0], filter_config[2][0],
        int(bottleneck_multiplier * filter_config[2][1]),
        down_sampling=True, block_type=block_type)
    a_transformation(
        filter_config[1][0], filter_config[2][0], filter_config[2][1],
        down_sampling=True)
    for _ in range(n3 - 1):
        v_transformation(
            filter_config[2][0], filter_config[2][0],
            int(bottleneck_multiplier * filter_config[2][1]),
            block_type=block_type)
        a_transformation(
            filter_config[2][0], filter_config[2][0], filter_config[2][1])

    # conv_5x
    if clip_length < 8:
        v_transformation(
            filter_config[2][0], filter_config[3][0],
            int(bottleneck_multiplier * filter_config[3][1]),
            down_sampling=True, down_sampling_temporal=False,
            block_type=block_type)
    else:
        v_transformation(
            filter_config[2][0], filter_config[3][0],
            int(bottleneck_multiplier * filter_config[3][1]),
            down_sampling=True, block_type=block_type)
    a_transformation(
        filter_config[2][0], filter_config[3][0], filter_config[3][1],
        down_sampling=True)
    for _ in range(n4 - 1):
        v_transformation(
            filter_config[3][0], filter_config[3][0],
            int(bottleneck_multiplier * filter_config[3][1]),
            block_type=block_type)
        a_transformation(
            filter_config[3][0], filter_config[3][0], filter_config[3][1])

    # Final layers
    # final pool for visual model
    v_builder.prev_blob = model.AveragePool(
        v_builder.prev_blob,
        'v_final_avg',
        kernels=[
            final_temporal_kernel,
            final_spatial_kernel,
            final_spatial_kernel
        ],
        strides=[1, 1, 1],
    )
    # final pool for audio model
    a_builder.prev_blob = model.MaxPool(
        a_builder.prev_blob,
        'a_final_avg',
        kernels=[4, 2],
        stride=1
    )
    last_a_layer = a_builder.prev_blob
    last_v_layer = v_builder.prev_blob

    if use_convolutional_pred:
        assert is_test
        a_last_3D = model.ExpandDims(
            last_a_layer, last_a_layer + '_3D', dims=[4]
        )
        a_last_tile_1 = model.Tile(
            a_last_3D, a_last_3D + '_tiled_1', tiles=2, axis=3
        )
        a_last_tile_2 = model.Tile(
            a_last_tile_1, a_last_3D + '_tiled_2', tiles=2, axis=4
        )
        av_concat = model.Concat(
            [last_v_layer, a_last_tile_2],
            'av_concat',
            axis=1
        )
        if use_dropout:
            dropout = brew.dropout(
                model, av_concat, 'dropout', is_test=is_test
            )
        else:
            dropout = av_concat
        if not use_full_ft:
            dropout = model.StopGradient(dropout, dropout)
        dim = 2 * filter_config[3][0]
        fc_dim = int(dim / 2)
        fc1 = model.ConvNd(
            dropout,
            'av_fc1',
            dim,
            fc_dim,
            [1, 1, 1],
            weight_init=("MSRAFill", {}),
            strides=[1, 1, 1],
            pads=[0, 0, 0] * 2,
            no_bias=False
        )
        relu1 = brew.relu(model, fc1, fc1)
        fc2 = model.ConvNd(
            relu1,
            'av_fc2',
            fc_dim,
            fc_dim,
            [1, 1, 1],
            weight_init=("MSRAFill", {}),
            strides=[1, 1, 1],
            pads=[0, 0, 0] * 2,
            no_bias=False
        )
        relu2 = brew.relu(model, fc2, fc2)
        last_out = model.ConvNd(
            relu2,
            'last_out_L{}'.format(num_labels),
            fc_dim,
            num_labels,
            [1, 1, 1],
            weight_init=("MSRAFill", {}),
            strides=[1, 1, 1],
            pads=[0, 0, 0] * 2,
            no_bias=False
        )
        return last_out
    else:
        # reduce to 4D tensor
        v_builder.prev_blob = model.Squeeze(
            v_builder.prev_blob, 'v_final_avg_squeezed', dims=[4])
        last_v_layer = v_builder.prev_blob
        av_concat = model.Concat(
            [last_v_layer, last_a_layer],
            'av_concat',
            axis=1
        )
        if use_dropout:
            dropout = brew.dropout(
                model, av_concat, 'dropout', is_test=is_test
            )
        else:
            dropout = av_concat
        dim = 2 * filter_config[3][0]
        fc_dim = int(dim / 2)
        fc1 = brew.fc(model, dropout, 'av_fc1', dim, fc_dim)
        relu1 = brew.relu(model, fc1, fc1)
        fc2 = brew.fc(model, relu1, 'av_fc2', fc_dim, fc_dim)
        relu2 = brew.relu(model, fc2, fc2)
        last_out = brew.fc(
            model, relu2, 'last_out_L{}'.format(num_labels),
            fc_dim, num_labels)

        if g_blend:
            a_last_out = brew.fc(
                model,
                last_a_layer,
                'a_last_out_L{}'.format(num_labels),
                filter_config[3][0],
                num_labels,
            )
            v_last_out = brew.fc(
                model,
                last_v_layer,
                'v_last_out_L{}'.format(num_labels),
                filter_config[3][0],
                num_labels,
            )
            return [a_last_out, v_last_out, last_out]

        return last_out
def create_unet_model(m, device_opts, is_test):
    base_n_filters = 16
    kernel_size = 3
    pad = (kernel_size - 1) // 2  # integer division: pad must be an int
    do_dropout = True
    num_classes = 3
    weight_init = ("MSRAFill", {})

    with core.DeviceScope(device_opts):
        contr_1_1 = brew.spatial_bn(m, brew.relu(m, brew.conv(
            m, 'data', 'conv_1_1', dim_in=num_classes, dim_out=base_n_filters,
            kernel=kernel_size, pad=pad, weight_init=weight_init), 'nonl_1_1'),
            'contr_1_1', dim_in=base_n_filters, epsilon=1e-3, momentum=0.1,
            is_test=is_test)
        contr_1_2 = brew.spatial_bn(m, brew.relu(m, brew.conv(
            m, contr_1_1, 'conv_1_2', dim_in=base_n_filters,
            dim_out=base_n_filters, kernel=kernel_size, pad=pad,
            weight_init=weight_init), 'nonl_1_2'), 'contr_1_2',
            dim_in=base_n_filters, epsilon=1e-3, momentum=0.1, is_test=is_test)
        pool1 = brew.max_pool(m, contr_1_2, 'pool1', kernel=2, stride=2)

        contr_2_1 = brew.spatial_bn(m, brew.relu(m, brew.conv(
            m, pool1, 'conv_2_1', dim_in=base_n_filters,
            dim_out=base_n_filters * 2, kernel=kernel_size, pad=pad,
            weight_init=weight_init), 'nonl_2_1'), 'contr_2_1',
            dim_in=base_n_filters * 2, epsilon=1e-3, momentum=0.1,
            is_test=is_test)
        contr_2_2 = brew.spatial_bn(m, brew.relu(m, brew.conv(
            m, contr_2_1, 'conv_2_2', dim_in=base_n_filters * 2,
            dim_out=base_n_filters * 2, kernel=kernel_size, pad=pad,
            weight_init=weight_init), 'nonl_2_2'), 'contr_2_2',
            dim_in=base_n_filters * 2, epsilon=1e-3, momentum=0.1,
            is_test=is_test)
        pool2 = brew.max_pool(m, contr_2_2, 'pool2', kernel=2, stride=2)

        contr_3_1 = brew.spatial_bn(m, brew.relu(m, brew.conv(
            m, pool2, 'conv_3_1', dim_in=base_n_filters * 2,
            dim_out=base_n_filters * 4, kernel=kernel_size, pad=pad,
            weight_init=weight_init), 'nonl_3_1'), 'contr_3_1',
            dim_in=base_n_filters * 4, epsilon=1e-3, momentum=0.1,
            is_test=is_test)
        contr_3_2 = brew.spatial_bn(m, brew.relu(m, brew.conv(
            m, contr_3_1, 'conv_3_2', dim_in=base_n_filters * 4,
            dim_out=base_n_filters * 4, kernel=kernel_size, pad=pad,
            weight_init=weight_init), 'nonl_3_2'), 'contr_3_2',
            dim_in=base_n_filters * 4, epsilon=1e-3, momentum=0.1,
            is_test=is_test)
        pool3 = brew.max_pool(m, contr_3_2, 'pool3', kernel=2, stride=2)

        contr_4_1 = brew.spatial_bn(m, brew.relu(m, brew.conv(
            m, pool3, 'conv_4_1', dim_in=base_n_filters * 4,
            dim_out=base_n_filters * 8, kernel=kernel_size, pad=pad,
            weight_init=weight_init), 'nonl_4_1'), 'contr_4_1',
            dim_in=base_n_filters * 8, epsilon=1e-3, momentum=0.1,
            is_test=is_test)
        contr_4_2 = brew.spatial_bn(m, brew.relu(m, brew.conv(
            m, contr_4_1, 'conv_4_2', dim_in=base_n_filters * 8,
            dim_out=base_n_filters * 8, kernel=kernel_size, pad=pad,
            weight_init=weight_init), 'nonl_4_2'), 'contr_4_2',
            dim_in=base_n_filters * 8, epsilon=1e-3, momentum=0.1,
            is_test=is_test)
        pool4 = brew.max_pool(m, contr_4_2, 'pool4', kernel=2, stride=2)
        if do_dropout:
            pool4 = brew.dropout(m, pool4, 'drop', ratio=0.4)

        encode_5_1 = brew.spatial_bn(m, brew.relu(m, brew.conv(
            m, pool4, 'conv_5_1', dim_in=base_n_filters * 8,
            dim_out=base_n_filters * 16, kernel=kernel_size, pad=pad,
            weight_init=weight_init), 'nonl_5_1'), 'encode_5_1',
            dim_in=base_n_filters * 16, epsilon=1e-3, momentum=0.1,
            is_test=is_test)
        encode_5_2 = brew.spatial_bn(m, brew.relu(m, brew.conv(
            m, encode_5_1, 'conv_5_2', dim_in=base_n_filters * 16,
            dim_out=base_n_filters * 16, kernel=kernel_size, pad=pad,
            weight_init=weight_init), 'nonl_5_2'), 'encode_5_2',
            dim_in=base_n_filters * 16, epsilon=1e-3, momentum=0.1,
            is_test=is_test)
        upscale5 = brew.conv_transpose(
            m, encode_5_2, 'upscale5', dim_in=base_n_filters * 16,
            dim_out=base_n_filters * 16, kernel=2, stride=2,
            weight_init=weight_init)

        concat6 = brew.concat(m, [upscale5, contr_4_2], 'concat6')  # , axis=1)
        expand_6_1 = brew.spatial_bn(m, brew.relu(m, brew.conv(
            m, concat6, 'conv_6_1', dim_in=base_n_filters * 8 * 3,
            dim_out=base_n_filters * 8, kernel=kernel_size, pad=pad,
            weight_init=weight_init), 'nonl_6_1'), 'expand_6_1',
            dim_in=base_n_filters * 8, epsilon=1e-3, momentum=0.1,
            is_test=is_test)
        expand_6_2 = brew.spatial_bn(m, brew.relu(m, brew.conv(
            m, expand_6_1, 'conv_6_2', dim_in=base_n_filters * 8,
            dim_out=base_n_filters * 8, kernel=kernel_size, pad=pad,
            weight_init=weight_init), 'nonl_6_2'), 'expand_6_2',
            dim_in=base_n_filters * 8, epsilon=1e-3, momentum=0.1,
            is_test=is_test)
        upscale6 = brew.conv_transpose(
            m, expand_6_2, 'upscale6', dim_in=base_n_filters * 8,
            dim_out=base_n_filters * 8, kernel=2, stride=2,
            weight_init=weight_init)

        concat7 = brew.concat(m, [upscale6, contr_3_2], 'concat7')
        expand_7_1 = brew.spatial_bn(m, brew.relu(m, brew.conv(
            m, concat7, 'conv_7_1', dim_in=base_n_filters * 4 * 3,
            dim_out=base_n_filters * 4, kernel=kernel_size, pad=pad,
            weight_init=weight_init), 'nonl_7_1'), 'expand_7_1',
            dim_in=base_n_filters * 4, epsilon=1e-3, momentum=0.1,
            is_test=is_test)
        expand_7_2 = brew.spatial_bn(m, brew.relu(m, brew.conv(
            m, expand_7_1, 'conv_7_2', dim_in=base_n_filters * 4,
            dim_out=base_n_filters * 4, kernel=kernel_size, pad=pad,
            weight_init=weight_init), 'nonl_7_2'), 'expand_7_2',
            dim_in=base_n_filters * 4, epsilon=1e-3, momentum=0.1,
            is_test=is_test)
        upscale7 = brew.conv_transpose(
            m, expand_7_2, 'upscale7', dim_in=base_n_filters * 4,
            dim_out=base_n_filters * 4, kernel=2, stride=2,
            weight_init=weight_init)

        concat8 = brew.concat(m, [upscale7, contr_2_2], 'concat8')
        expand_8_1 = brew.spatial_bn(m, brew.relu(m, brew.conv(
            m, concat8, 'conv_8_1', dim_in=base_n_filters * 2 * 3,
            dim_out=base_n_filters * 2, kernel=kernel_size, pad=pad,
            weight_init=weight_init), 'nonl_8_1'), 'expand_8_1',
            dim_in=base_n_filters * 2, epsilon=1e-3, momentum=0.1,
            is_test=is_test)
        expand_8_2 = brew.spatial_bn(m, brew.relu(m, brew.conv(
            m, expand_8_1, 'conv_8_2', dim_in=base_n_filters * 2,
            dim_out=base_n_filters * 2, kernel=kernel_size, pad=pad,
            weight_init=weight_init), 'nonl_8_2'), 'expand_8_2',
            dim_in=base_n_filters * 2, epsilon=1e-3, momentum=0.1,
            is_test=is_test)
        upscale8 = brew.conv_transpose(
            m, expand_8_2, 'upscale8', dim_in=base_n_filters * 2,
            dim_out=base_n_filters * 2, kernel=2, stride=2,
            weight_init=weight_init)

        concat9 = brew.concat(m, [upscale8, contr_1_2], 'concat9')
        expand_9_1 = brew.spatial_bn(m, brew.relu(m, brew.conv(
            m, concat9, 'conv_9_1', dim_in=base_n_filters * 3,
            dim_out=base_n_filters, kernel=kernel_size, pad=pad,
            weight_init=weight_init), 'nonl_9_1'), 'expand_9_1',
            dim_in=base_n_filters, epsilon=1e-3, momentum=0.1, is_test=is_test)
        expand_9_2 = brew.spatial_bn(m, brew.relu(m, brew.conv(
            m, expand_9_1, 'conv_9_2', dim_in=base_n_filters,
            dim_out=base_n_filters, kernel=kernel_size, pad=pad,
            weight_init=weight_init), 'nonl_9_2'), 'expand_9_2',
            dim_in=base_n_filters, epsilon=1e-3, momentum=0.1, is_test=is_test)

        output_segmentation = brew.conv(
            m, expand_9_2, 'output_segmentation', dim_in=base_n_filters,
            dim_out=num_classes, kernel=1, pad=0, stride=1,
            weight_init=weight_init)
        m.net.AddExternalOutput(output_segmentation)
        output_sigmoid = m.Sigmoid(output_segmentation, 'output_sigmoid')
        m.net.AddExternalOutput(output_sigmoid)
        return output_segmentation
def create_alexnetv0(
    model,
    data,
    num_input_channels,
    num_labels,
    is_test=False,
):
    conv1 = brew.conv(
        model,
        data,
        "conv1",
        num_input_channels,  # dim_in
        96,                  # dim_out
        11,                  # kernel
        ('XavierFill', {}),
        ('ConstantFill', {}),
        stride=4,
        pad=2)
    relu1 = brew.relu(model, conv1, "conv1")
    norm1 = brew.lrn(model, relu1, "norm1", size=5, alpha=0.0001, beta=0.75)
    pool1 = brew.max_pool(model, norm1, "pool1", kernel=3, stride=2)
    conv2 = brew.conv(model, pool1, "conv2", 96, 256, 5,
                      ('XavierFill', {}), ('ConstantFill', {}), pad=2)
    relu2 = brew.relu(model, conv2, "conv2")
    norm2 = brew.lrn(model, relu2, "norm2", size=5, alpha=0.0001, beta=0.75)
    pool2 = brew.max_pool(model, norm2, "pool2", kernel=3, stride=2)
    conv3 = brew.conv(model, pool2, "conv3", 256, 384, 3,
                      ('XavierFill', {}), ('ConstantFill', {}), pad=1)
    relu3 = brew.relu(model, conv3, "conv3")
    conv4 = brew.conv(model, relu3, "conv4", 384, 384, 3,
                      ('XavierFill', {}), ('ConstantFill', {}), pad=1)
    relu4 = brew.relu(model, conv4, "conv4")
    conv5 = brew.conv(model, relu4, "conv5", 384, 256, 3,
                      ('XavierFill', {}), ('ConstantFill', {}), pad=1)
    relu5 = brew.relu(model, conv5, "conv5")
    pool5 = brew.max_pool(model, relu5, "pool5", kernel=3, stride=2)
    fc6 = brew.fc(model, pool5, "fc6", 256 * 6 * 6, 4096,
                  ('XavierFill', {}), ('ConstantFill', {}))
    relu6 = brew.relu(model, fc6, "fc6")
    dropout1 = brew.dropout(model, relu6, 'dropout1', ratio=0.5, is_test=is_test)
    fc7 = brew.fc(model, dropout1, "fc7", 4096, 4096,
                  ('XavierFill', {}), ('ConstantFill', {}))
    relu7 = brew.relu(model, fc7, "fc7")
    dropout2 = brew.dropout(model, relu7, 'dropout2', ratio=0.5, is_test=is_test)
    fc8 = brew.fc(model, dropout2, "fc8", 4096, num_labels,
                  ('XavierFill', {}), ('ConstantFill', {}))
    return fc8
def Dropout(self, *args, **kwargs):
    return brew.dropout(self, *args, **kwargs)
def make_forward_pass_ops(
    self,
    model: ModelHelper,
    input_blob: str,
    output_blob: str,
    is_test: bool = False,
) -> None:
    """
    Performs a forward pass of a multi-layer perceptron.

    :param model: The ModelHelper object whose net will execute this pass
    :param input_blob: The blob containing the input data
    :param output_blob: The blob where the output data will be placed
    :param is_test: Indicates whether or not this forward pass should skip
        node dropout.
    """
    model.net.NanCheck([input_blob], [input_blob])
    model_states = []
    num_layer_connections = len(self.layers) - 1
    for x in range(num_layer_connections + 1):
        if x == 0:
            model_states.append(input_blob)
        elif x == num_layer_connections:
            model_states.append(output_blob)
        else:
            model_states.append(
                model.net.NextBlob("ModelState_" + str(x) + "_" + self.model_id))
    for x in range(num_layer_connections):
        inputs = model_states[x]
        outputs = model_states[x + 1]
        activation = self.activations[x]
        dim_in = self.layers[x]
        dim_out = self.layers[x + 1]
        weight_name = self.weights[x]
        bias_name = self.biases[x]
        brew.fc_explicit_param_names(  # type: ignore
            model,
            inputs,
            outputs,
            dim_in=dim_in,
            dim_out=dim_out,
            bias_name=bias_name,
            weight_name=weight_name,
            weight_init=(
                "GivenTensorFill",
                {"values": workspace.FetchBlob(weight_name)},
            ),
            bias_init=(
                "GivenTensorFill",
                {"values": workspace.FetchBlob(bias_name)},
            ),
        )
        if activation == "relu":
            brew.relu(model, outputs, outputs)
        elif activation == "linear":
            pass
        else:
            raise Exception("Unknown activation function")
        if self.dropout_ratio > 0.01:
            brew.dropout(model, outputs, outputs,
                         ratio=self.dropout_ratio, is_test=is_test)
    model.net.NanCheck([output_blob], [output_blob])
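# Hypothetical usage sketch (not from the original source): `mlp` stands for
# whatever object defines make_forward_pass_ops and carries self.layers,
# self.activations, self.weights, self.biases, self.dropout_ratio, and
# self.model_id. Assumes the same caffe2 ModelHelper/workspace imports used
# above, and that the weight/bias blobs already exist in the workspace.
def build_eval_net(mlp):
    model = ModelHelper(name="mlp_eval")
    mlp.make_forward_pass_ops(
        model,
        input_blob="input",
        output_blob="output",
        is_test=True,  # skip dropout at evaluation time
    )
    workspace.RunNetOnce(model.param_init_net)
    workspace.CreateNet(model.net)
    return model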