def ncac(target, embedding):
    """Return the sample-wise NCA for classification loss.

    This corresponds to the negative of the probability that a point is
    correctly classified with a soft kNN classifier using leave-one-out. Each
    neighbour is weighted according to an exponential of its negative
    Euclidean distance. Afterwards, a probability is calculated for each class
    depending on the weights of the neighbours.

    For details, we refer you to

    'Neighbourhood Components Analysis' by
    J Goldberger, S Roweis, G Hinton, R Salakhutdinov (2004).

    :param target: An array of shape `(n,)` where `n` is the number of
        samples. Each entry of the array should be an integer between `0` and
        `k-1`, where `k` is the number of classes.
    :param embedding: An array of shape `(n, d)` where each row represents
        a point in `d`-dimensional space.
    :returns: Array of shape `(n, 1)`.
    """
    # Matrix of the distances of points.
    dist = distance_matrix(embedding)
    thisid = T.identity_like(dist)

    # Probability that a point is neighbour of another point based on
    # the distances.
    top = T.exp(-dist) + 1e-8  # Add a small constant for stability.
    bottom = (top - thisid * top).sum(axis=0)
    p = top / bottom

    # Create a matrix that matches same classes; the identity is subtracted
    # so that a point never counts as its own neighbour.
    sameclass = T.eq(distance_matrix(target), 0) - thisid
    loss_vector = -(p * sameclass).sum(axis=1)

    # To be compatible with the API, we make this a (n, 1) matrix.
    return T.shape_padright(loss_vector)
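

# Illustrative sketch only, not part of the original API: a NumPy version of
# the leave-one-out soft kNN probability described in the ``ncac`` docstring.
# Assumptions: plain (non-squared) Euclidean distances, row-wise
# normalisation of the neighbour weights (which may differ from how the
# Theano code above normalises them), and the hypothetical name
# ``soft_knn_loo_probability``. It returns the probability itself, whereas
# ``ncac`` returns a negated loss of shape (n, 1).
import numpy as np


def soft_knn_loo_probability(target, embedding):
    target = np.asarray(target)
    embedding = np.asarray(embedding, dtype=float)

    # Pairwise Euclidean distances between all rows of the embedding.
    diff = embedding[:, None, :] - embedding[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    # Neighbour weights; the diagonal is zeroed for leave-one-out.
    weights = np.exp(-dist)
    np.fill_diagonal(weights, 0.0)

    # Normalise each row so that p[i, j] is the probability that point i
    # picks point j as its neighbour.
    p = weights / weights.sum(axis=1, keepdims=True)

    # Sum the probabilities of the neighbours that share the label of point i.
    same_class = target[:, None] == target[None, :]
    return (p * same_class).sum(axis=1)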
def ncar(target, embedding):
    """Return the NCA for regression loss.

    This is similar to NCA for classification, except that regression
    performance, rather than soft kNN classification performance, is
    maximized. (Actually, the negative performance is minimized.)

    For details, we refer you to

    'Pose-sensitive embedding by nonlinear NCA regression' by
    Taylor, G. and Fergus, R. and Williams, G. and Spiro, I. and Bregler, C.
    (2010).

    Parameters
    ----------

    target : Theano variable
        An array of shape ``(n, d)`` where ``n`` is the number of samples and
        ``d`` the dimensionality of the target space.

    embedding : Theano variable
        An array of shape ``(n, d)`` where each row represents a point in
        ``d``-dimensional space.

    Returns
    -------

    res : Theano variable
        Array of shape ``(n, 1)``.
    """
    # Matrix of the squared distances of points.
    dist = distance_matrix(embedding) ** 2
    thisid = T.identity_like(dist)

    # Probability that a point is neighbour of another point based on
    # the distances.
    top = T.exp(-dist) + 1E-8  # Add a small constant for stability.
    bottom = (top - thisid * top).sum(axis=0)
    p = top / bottom

    # Create matrix of distances in target space.
    target_distance = distance_matrix(target, target, 'soft_l1')
    # Set diagonal to 0.
    target_distance -= target_distance * T.identity_like(target_distance)

    loss_vector = (p * target_distance ** 2).sum(axis=1)

    # To be compatible with the API, we make this a (n, 1) matrix.
    return T.shape_padright(loss_vector)
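

# Illustrative sketch only, not part of the original API: a NumPy version of
# the idea behind ``ncar``. Each point's loss is the expected target-space
# discrepancy to a neighbour drawn with probability proportional to
# exp(-squared embedding distance), leaving the point itself out.
# Assumptions: row-wise normalisation of the neighbour weights, squared
# Euclidean distance in target space swapped in for the 'soft_l1' distance
# used above, and the hypothetical name ``nca_regression_loss_sketch``.
import numpy as np


def nca_regression_loss_sketch(target, embedding):
    target = np.asarray(target, dtype=float)
    embedding = np.asarray(embedding, dtype=float)

    def squared_distances(x):
        # Pairwise squared Euclidean distances between the rows of x.
        diff = x[:, None, :] - x[None, :, :]
        return (diff ** 2).sum(axis=-1)

    # Neighbour weights; the diagonal is zeroed for leave-one-out.
    weights = np.exp(-squared_distances(embedding))
    np.fill_diagonal(weights, 0.0)
    p = weights / weights.sum(axis=1, keepdims=True)

    # Expected squared target distance to the sampled neighbour, per point.
    loss = (p * squared_distances(target)).sum(axis=1)

    # Mirror the (n, 1) output shape of the function above.
    return loss[:, None]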