def normals_initializer(parameter: Parameter):
    if len(parameter.data.squeeze().shape) == 1:
        # Zero bias initializer: 1-D parameters (after squeeze) are treated as biases.
        # Use zeros_like so the parameter keeps its shape and dtype.
        parameter.data = np.zeros_like(parameter.data)
    else:
        # Normal distribution for weight tensors.
        parameter.data = np.random.normal(0, 0.1, parameter.data.shape)
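# A minimal, self-contained sketch of how the initializer above is expected to behave.
# The FakeParam class below is hypothetical (just a `data` holder), not the repo's real
# Parameter; it only exists to make the example runnable on its own.
import numpy as np

class FakeParam:
    def __init__(self, data):
        self.data = data

def sketch_normals_initializer(parameter):
    if len(parameter.data.squeeze().shape) == 1:
        parameter.data = np.zeros_like(parameter.data)                    # 1-D after squeeze -> bias, zeroed
    else:
        parameter.data = np.random.normal(0, 0.1, parameter.data.shape)   # weights ~ N(0, 0.1)

bias = FakeParam(np.ones((1, 8), dtype=np.float32))
weight = FakeParam(np.empty((16, 8), dtype=np.float32))
sketch_normals_initializer(bias)
sketch_normals_initializer(weight)
print(bias.data.shape, bias.data.sum())   # (1, 8) 0.0 -> zero-initialized bias
print(weight.data.std())                  # roughly 0.1 -> normal-initialized weights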
def __init__(self, input_size: int, output_size: int, parent=None):
    super(LinearLayer, self).__init__(parent)
    self.bias = Parameter(np.zeros((1, output_size), dtype=np.float32))
    # Small random weights; cast the array itself to float32 to match the bias dtype.
    self.weight = Parameter((0.01 * np.random.randn(input_size, output_size)).astype(np.float32))
    self.initialize()
    self.data = None
def __init__(self, input_size: int, output_size: int, parent=None):
    super(LinearLayer, self).__init__(parent)
    self.bias = Parameter(np.zeros((1, output_size), dtype=np.float32))
    # TODO create the weight parameter (zeros placeholder; overwritten by initialize())
    self.weight = Parameter(np.zeros((input_size, output_size), dtype=np.float32))
    self.initialize()
def __init__(self, input_size: int, output_size: int, parent=None):
    super(LinearLayer, self).__init__(parent)
    self.bias = Parameter(np.zeros((1, output_size), dtype=np.float32))
    self.weight = Parameter(np.ones((input_size, output_size), dtype=np.float32))
    self.input = None
    self.input_size = input_size
    self.output_size = output_size
    self.initialize()
def __init__(self, input_channels, output_channels, kernel_size=3, stride=1, parent=None):
    super(ConvNumbaLayer, self).__init__(parent)
    self.weight = Parameter(
        np.zeros((input_channels, output_channels, kernel_size, kernel_size), dtype=np.float32))
    self.bias = Parameter(np.zeros(output_channels, dtype=np.float32))
    self.kernel_size = kernel_size
    self.padding = (kernel_size - 1) // 2
    self.stride = stride
    self.initialize()
    self.conv_layer_input = np.array([])
    self.conv_layer_reshaped = np.array([])
    self.newwidth = 0
    self.newheight = 0
    self.initial = 0
def __init__(self, input_features: int, output_features: int, is_bias=True, parent=None):
    """
    :param input_features: Size of each input feature.
    :param output_features: Size of each output feature.
    :param is_bias: If set to False, the layer will not learn an additive bias. Default: ``True``
    :param parent: Parent layer, if any.
    """
    super(Linear, self).__init__(parent)
    self.data = None
    self.is_bias = is_bias
    if is_bias:
        self.bias = Parameter(np.zeros((1, output_features), dtype=np.float64))
    # np.random.randn already returns float64, so no explicit cast is needed.
    self.weight = Parameter(np.random.randn(input_features, output_features))
    self.initialize()
def __init__(self, input_channels, output_channels, kernel_size=3, stride=1, parent=None):
    super(ConvLayer, self).__init__(parent)
    self.weight = Parameter(
        np.zeros((input_channels, output_channels, kernel_size, kernel_size), dtype=np.float32))
    self.bias = Parameter(np.zeros(output_channels, dtype=np.float32))
    self.kernel_size = kernel_size
    self.padding = (kernel_size - 1) // 2
    self.stride = stride
    self.initialize()
def __init__(self, size: int, initial_slope: float = 0.1, parent=None):
    super(PReLULayer, self).__init__(parent)
    self.slope = Parameter(np.full(size, initial_slope))
    self.momentum = 0.9
    self.learning_rate = 0.01
    self.delta_slope = np.full(size, 0.0)  # momentum buffer for the slope update
    self.input_data = 0
    self.counter = 0
def forward(self, data: np.ndarray) -> np.ndarray:
    '''
    Linear layer (fully connected) forward pass.
    :param data: n x d array (batch x features)
    :return: n x c array (batch x channels)
    '''
    self.input = Parameter(data)
    return np.matmul(self.input.data, self.weight.data) + self.bias.data
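# A quick, standalone shape check of the forward pass above using plain NumPy arrays
# (no Parameter/Layer classes): an (n x d) batch times a (d x c) weight, plus a
# (1 x c) bias that broadcasts over the batch dimension.
import numpy as np

n, d, c = 4, 3, 2                      # batch, input features, output channels
data = np.random.randn(n, d)
weight = np.random.randn(d, c)
bias = np.zeros((1, c))

out = np.matmul(data, weight) + bias   # (n, d) @ (d, c) -> (n, c); bias broadcasts row-wise
print(out.shape)                       # (4, 2)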
def forward(self, data):
    '''
    ReLU layer forward pass.
    :param data: n x d array (batch x features)
    :return: n x d array, ReLU activated (batch x channels)
    '''
    self.data = Parameter(data)
    return np.clip(data, 0, None)
class PReLULayer(Layer):
    def __init__(self, size: int, initial_slope: float = 0.1, parent=None):
        super(PReLULayer, self).__init__(parent)
        self.slope = Parameter(np.full(size, initial_slope))

    def forward(self, data):
        # Move the channel axis to the end and flatten everything else so the
        # per-channel slope broadcasts over a (batch * spatial, channels) matrix.
        dataMove = np.moveaxis(data, 1, -1)
        batch = int(np.prod(dataMove.shape) // data.shape[1])
        dataReshape = dataMove.reshape(batch, data.shape[1])
        self.dataReshape = dataReshape
        output = np.maximum(dataReshape, 0) + np.minimum(dataReshape, 0) * self.slope.data
        outputReshape = output.reshape(dataMove.shape)
        outputMove = np.moveaxis(outputReshape, -1, 1)
        return outputMove

    def backward(self, previous_partial_gradient):
        # dL/dY arrives in (batch, channels, ...) layout; reshape it the same way
        # as the cached input so the two line up element-wise.
        dyMove = np.moveaxis(previous_partial_gradient, 1, -1)
        batch = int(np.prod(dyMove.shape) // previous_partial_gradient.shape[1])
        dyReshape = dyMove.reshape(batch, previous_partial_gradient.shape[1])

        # dL/d(slope): only negative inputs contribute, since y = slope * x there.
        self.slope.zero_grad()
        slopeGrad = np.sum((self.dataReshape < 0) * self.dataReshape * dyReshape, axis=0)
        if self.slope.grad.shape[0] > 1:
            self.slope.grad = slopeGrad
        else:
            self.slope.grad = np.sum(slopeGrad)

        # dL/dX: pass the gradient through where x > 0, scale it by the slope where x < 0.
        dx = (self.dataReshape > 0) * dyReshape + (self.dataReshape < 0) * dyReshape * self.slope.data

        # Restore the original (batch, channels, ...) layout.
        dxReshape = dx.reshape(dyMove.shape)
        dxMove = np.moveaxis(dxReshape, -1, 1)
        return dxMove
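# A standalone sanity check (assumptions: plain NumPy, no Layer/Parameter classes) of the
# PReLU gradients used above: for y = max(x, 0) + slope * min(x, 0),
# dL/d(slope) = sum over x < 0 of x * dL/dy, and dL/dx = dL/dy where x > 0 and
# slope * dL/dy where x < 0.
import numpy as np

x = np.random.randn(6, 3)              # (batch, channels)
slope = np.full(3, 0.1)
dy = np.random.randn(6, 3)             # upstream gradient dL/dy

# Analytic gradients, matching the backward() above.
dslope = np.sum((x < 0) * x * dy, axis=0)
dx = (x > 0) * dy + (x < 0) * dy * slope

# Finite-difference check on the slope of channel 0.
eps = 1e-6
def loss(s0):
    s = slope.copy(); s[0] = s0
    y = np.maximum(x, 0) + np.minimum(x, 0) * s
    return np.sum(y * dy)              # linear "loss", so dL/dy == dy

numeric = (loss(slope[0] + eps) - loss(slope[0] - eps)) / (2 * eps)
print(np.allclose(numeric, dslope[0], atol=1e-4))   # True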
def __init__(self, input_channels, output_channels, kernel_size=3, stride=1, parent=None):
    '''
    Each output channel corresponds to a filter. The first layer has 1 input channel
    (the original image) and we apply a number of filters equal to the number of
    output channels. Each filter consists of several kernels, one per input channel.
    - 32 x 32 raw image: 1 input channel, 3 1x1x1 filters (R, G, B).
    EXAMPLE:
    - 32 x 32 RGB image. The first layer has 3 input channels (RGB), then height and width.
    - A filter is then a stack of 3 kernels, one for each input channel, say 5x5.
    - If we have 7 output channels, our weight matrix is 3 x 7 x 5 x 5:
      - 3 x 5 x 5 is one filter (a stack of 3 kernels),
      - 7 output channels, one for each filter.
    A worked sketch of this weight layout follows below.
    '''
    super(ConvLayer, self).__init__(parent)
    self.weight = Parameter(
        np.zeros((input_channels, output_channels, kernel_size, kernel_size), dtype=np.float32))
    self.bias = Parameter(np.zeros(output_channels, dtype=np.float32))
    self.input_channels = input_channels
    self.output_channels = output_channels
    self.kernel_size = kernel_size
    self.padding = (kernel_size - 1) // 2
    self.stride = stride
    self.X_col = None
    self.height = None
    self.width = None
    self.initialize()
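# A minimal, loop-based sketch (assumptions: plain NumPy, a single image, no Layer/Parameter
# machinery; naive_conv2d is a hypothetical helper) showing how the
# (input_channels, output_channels, k, k) weight layout from the docstring is applied.
# With 3 input channels, 7 output channels and 5x5 kernels, the weight tensor is 3 x 7 x 5 x 5.
import numpy as np

def naive_conv2d(image, weight, bias, stride=1):
    in_c, out_c, k, _ = weight.shape
    pad = (k - 1) // 2
    padded = np.pad(image, ((0, 0), (pad, pad), (pad, pad)))
    h, w = image.shape[1], image.shape[2]
    out_h = (h + 2 * pad - k) // stride + 1
    out_w = (w + 2 * pad - k) // stride + 1
    out = np.zeros((out_c, out_h, out_w), dtype=np.float32)
    for oc in range(out_c):
        for i in range(out_h):
            for j in range(out_w):
                patch = padded[:, i * stride:i * stride + k, j * stride:j * stride + k]
                # One filter = a stack of in_c kernels applied to the (in_c, k, k) patch.
                out[oc, i, j] = np.sum(patch * weight[:, oc]) + bias[oc]
    return out

image = np.random.randn(3, 32, 32).astype(np.float32)      # RGB image
weight = np.random.randn(3, 7, 5, 5).astype(np.float32)    # 3 x 7 x 5 x 5
bias = np.zeros(7, dtype=np.float32)
print(naive_conv2d(image, weight, bias).shape)              # (7, 32, 32)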
def forward(self, data):
    self.data = Parameter(data)
    # self.slope is a Parameter, so use its underlying array for the product.
    return np.where(data > 0, data, self.slope.data * data)
def __init__(self, size: int, initial_slope: float = 0.1, parent=None):
    super(PReLULayer, self).__init__(parent)
    self.slope = Parameter(np.full(size, initial_slope))
    self.data = None
    self.initialize()
def __init__(self, size: int, initial_slope: float = 0.1, parent=None):
    super(PReLULayer, self).__init__(parent)
    self.slope = Parameter(np.full(size, initial_slope))
    self.activation = []
    self.size = size