def agreement(ci, buffsz=None):
    '''
    Takes as input a set of vertex partitions CI of dimensions
    [vertex x partition]. Each column in CI contains the assignments of each
    vertex to a class/community/module. This function aggregates the
    partitions in CI into a square [vertex x vertex] agreement matrix D,
    whose elements indicate the number of times any two vertices were
    assigned to the same class.

    In the case that the number of nodes and partitions in CI is large
    (greater than ~1000 nodes or greater than ~1000 partitions), the script
    can be made faster by computing D in pieces. The optional input BUFFSZ
    determines the size of each piece. Trial and error has found that
    BUFFSZ ~ 150 works well.

    Parameters
    ----------
    ci : NxM np.ndarray
        set of M (possibly degenerate) partitions of N nodes, one partition
        per column
    buffsz : int | None
        sets buffer size (number of partitions per piece). If not specified,
        defaults to 1000

    Returns
    -------
    D : NxN np.ndarray
        agreement matrix, with zeros on the diagonal
    '''
    ci = np.array(ci)
    # Rows are vertices, columns are partitions.
    n_nodes, n_partitions = ci.shape

    if buffsz is None:
        buffsz = 1000

    if n_partitions <= buffsz:
        # Few enough partitions to process in a single pass.
        ind = dummyvar(ci)
        D = np.dot(ind, ind.T)
    else:
        # Accumulate D over consecutive blocks of at most buffsz partitions.
        a = np.arange(0, n_partitions, buffsz)
        b = np.arange(buffsz, n_partitions, buffsz)
        if len(a) != len(b):
            b = np.append(b, n_partitions)
        D = np.zeros((n_nodes, n_nodes))
        for i, j in zip(a, b):
            # Half-open slice i:j, so adjacent blocks do not overlap.
            y = ci[:, i:j]
            ind = dummyvar(y)
            D += np.dot(ind, ind.T)

    np.fill_diagonal(D, 0)
    return D


def agreement_weighted(ci, wts):
    '''
    D = AGREEMENT_WEIGHTED(CI,WTS) is identical to AGREEMENT, with the
    exception that each partition's contribution is weighted according to
    the corresponding scalar value stored in the vector WTS. As an example,
    suppose CI contained partitions obtained using some heuristic for
    maximizing modularity. A possible choice for WTS might be the Q metric
    (Newman's modularity score). Such a choice would add more weight to
    higher modularity partitions.

    NOTE: Unlike AGREEMENT, this script does not have the input argument
    BUFFSZ.

    Parameters
    ----------
    ci : MxN np.ndarray
        set of M (possibly degenerate) partitions of N nodes, one partition
        per row
    wts : Mx1 np.ndarray
        relative weight of each partition

    Returns
    -------
    D : NxN np.ndarray
        weighted agreement matrix
    '''
    ci = np.array(ci)
    m, n = ci.shape
    # Normalize the weights so that they sum to 1.
    wts = np.array(wts) / np.sum(wts)

    D = np.zeros((n, n))
    for i in range(m):
        # Pass partition i as an n x 1 column so that dummyvar returns one
        # row per node and the product d * d.T is n x n.
        d = dummyvar(ci[i, :].reshape(n, 1))
        D += np.dot(d, d.T) * wts[i]
    return D
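

# ---------------------------------------------------------------------------
# Illustrative sketch (not part of the original module).
#
# The docstrings above describe the agreement matrix as a count of how often
# two vertices are assigned to the same community. The hypothetical helper
# below is a minimal sketch, assuming `ci` is laid out [vertex x partition],
# that computes the same count by direct label comparison instead of going
# through `dummyvar`. It may be useful as a cross-check on small inputs. The
# name `_agreement_by_comparison` is an assumption made for illustration.
# ---------------------------------------------------------------------------
import numpy as np


def _agreement_by_comparison(ci):
    ci = np.asarray(ci)
    n_nodes, n_partitions = ci.shape
    D = np.zeros((n_nodes, n_nodes))
    for k in range(n_partitions):
        labels = ci[:, k]
        # same[i, j] is True when vertices i and j share a label in
        # partition k.
        same = labels[:, None] == labels[None, :]
        D += same
    # Zero the diagonal, as a vertex trivially agrees with itself.
    np.fill_diagonal(D, 0)
    return D


# Example call: three vertices, two partitions (one partition per column).
# D = _agreement_by_comparison([[1, 1], [1, 2], [2, 2]])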