def kmeans(data, K, clusterType="MeansPointRepresentative",
           distancePointToPoint=L2NormDistance,
           distancePointToSet=AveragePointSetDistance):
    # could also input a tolerance
    clusters = []
    if clusterType == 'MeansPointRepresentative':
        for k in range(K):
            # initialize the clusters as MeansPointRepresentative
            clusters.append(MeansCluster(distancePointToPoint, data))
    elif clusterType == 'SetRepresentative':
        for k in range(K):
            # initialize the clusters as SetRepresentative
            clusters.append(SetCluster(distancePointToPoint, distancePointToSet, data))
    else:
        print "Unknown type of cluster"
        return None

    hasConverged = False
    iterations = 0
    while not hasConverged:  # continue to run until the clusters converge
        conv = []
        for d in data:
            distanceFromCluster = scipy.array([c.distanceToPointOrSet(d) for c in clusters])
            # NOTE: this occasionally misbehaves when using SetCluster
            indexCluster = scipy.argmin(distanceFromCluster)
            clusters[indexCluster].assign(d)
        for c in clusters:
            c.update()
            conv.append(c.reachedTolerance())  # test whether epsilon is below tolerance
        iterations = iterations + 1
        hasConverged = all(conv)
    print "The number of iterations is: ", iterations

    clusterID = []
    for d in data:  # create cluster IDs
        distanceFromCluster = scipy.array([c.distanceToPointOrSet(d) for c in clusters])
        indexCluster = scipy.argmin(distanceFromCluster)
        clusterID.append(indexCluster)
    return [clusterID, clusters]

def relevant(self, state):
    """Return the non-buffer (thus *relevant*) part of the concentration vector of the state."""
    return state.y[scipy.argmin(abs(state.x - self.bounds[0])):
                   scipy.argmin(abs(state.x - self.bounds[1])) + 1]

def crossvalidate(self, y, alphas, n_splits=10):
    """lmmlasso cross-validation to get the optimal alpha.

    alphas = list of alphas to perform cross-validation over
    y = phenotype
    """
    lasso = lmmlasso.LmmLasso(warm_start=True, fit_intercept=False, tol=0.5)
    X = self.E
    K = self.K
    assert K is not None, 'no kinship matrix defined'
    MSE_train, MSE_test, W_nonzero, rsquared = lmmlasso.runCrossValidation(
        lasso, self.E, y, alphas, n_splits=n_splits, K=K, verbose=True)

    # Interpolate the mean MSE curves over the alpha range and take the
    # second derivative.
    train_inter = sp.interpolate.UnivariateSpline(
        x=alphas, y=MSE_train.mean(axis=0)).derivative(n=2)
    test_inter = sp.interpolate.UnivariateSpline(
        x=alphas, y=MSE_test.mean(axis=0)).derivative(n=2)
    alphas_inter = sp.linspace(min(alphas), max(alphas), 100)
    idx_train = sp.argmin(train_inter(alphas_inter))
    idx_test = sp.argmin(test_inter(alphas_inter))
    alpha_cv = (float(alphas_inter[idx_train]) + float(alphas_inter[idx_test])) / 2
    self.alpha = alpha_cv
    return self.alpha

def cosine_coefficient(self, target_FV):
    # Dot product of every node with the target feature vector.
    temp = (self.nodes * target_FV).sum(axis=2)
    # Squared norm of every node (NOTE: not square-rooted, so this is not the
    # exact cosine denominator; kept as in the original).
    temp_2 = (self.nodes ** 2).sum(axis=2)
    # Norm of the target feature vector.
    temp_3 = ((target_FV ** 2).sum()) ** 0.5
    temp_f = temp / (temp_2 * temp_3)
    return scipy.argmin(temp_f)

def calc_probability_matrix(trains_a, trains_b, metric, tau, z):
    """ Calculates the probability matrix that one spike train from stimulus X
    will be classified as spike train from stimulus Y.

    :param list trains_a: Spike trains of stimulus A.
    :param list trains_b: Spike trains of stimulus B.
    :param str metric: Metric to base the classification on. Has to be a key in
        :const:`metrics.metrics`.
    :param tau: Time scale parameter for the metric.
    :type tau: Quantity scalar.
    :param float z: Exponent parameter for the classifier.
    """
    with warnings.catch_warnings():
        warnings.filterwarnings("ignore", "divide by zero")
        dist_mat = calc_single_metric(trains_a + trains_b, metric, tau) ** z
        dist_mat[sp.diag_indices_from(dist_mat)] = 0.0

    assert len(trains_a) == len(trains_b)
    l = len(trains_a)
    classification_of_a = sp.argmin(sp.vstack((
        sp.sum(dist_mat[:l, :l], axis=0) / (l - 1),
        sp.sum(dist_mat[l:, :l], axis=0) / l)) ** (1.0 / z), axis=0)
    classification_of_b = sp.argmin(sp.vstack((
        sp.sum(dist_mat[:l, l:], axis=0) / l,
        sp.sum(dist_mat[l:, l:], axis=0) / (l - 1))) ** (1.0 / z), axis=0)

    confusion = sp.empty((2, 2))
    confusion[0, 0] = sp.sum(classification_of_a == 0)
    confusion[1, 0] = sp.sum(classification_of_a == 1)
    confusion[0, 1] = sp.sum(classification_of_b == 0)
    confusion[1, 1] = sp.sum(classification_of_b == 1)
    return confusion / 2.0 / l

def heuristic_atmosphere(RT, instrument, x_RT, x_instrument, meas, geom):
    '''From a given radiance, estimate atmospheric state with band ratios.
    Used to initialize gradient descent inversions.'''

    # Identify the latest instrument wavelength calibration (possibly
    # state-dependent) and identify channel numbers for the band ratio.
    wl, fwhm = instrument.calibration(x_instrument)
    b865 = s.argmin(abs(wl - 865))
    b945 = s.argmin(abs(wl - 945))
    b1040 = s.argmin(abs(wl - 1040))
    if not (any(RT.wl > 850) and any(RT.wl < 1050)):
        return x_RT
    x_new = x_RT.copy()

    # Band ratio retrieval of H2O. Depending on the radiative transfer
    # model we are using, this state parameter could go by several names.
    for h2oname in ['H2OSTR', 'h2o']:

        if h2oname not in RT.statevec:
            continue  # ignore unused names
        if h2oname not in RT.lut_names:
            continue

        # Find the index in the lookup table associated with water vapor.
        ind_lut = RT.lut_names.index(h2oname)
        ind_sv = RT.statevec.index(h2oname)
        h2os, ratios = [], []

        # We iterate through every possible grid point in the lookup table,
        # calculating the band ratio that we would see if this were the
        # atmospheric H2O content. It assumes defaults for all other
        # atmospheric parameters (such as aerosol, if it is there).
        for h2o in RT.lut_grids[ind_lut]:

            # Get atmospheric terms at high spectral resolution.
            x_RT_2 = x_RT.copy()
            x_RT_2[ind_sv] = h2o
            rhoatm_hi, sphalb_hi, transm_hi, transup_hi = RT.get(x_RT_2, geom)
            rhoatm = instrument.sample(x_instrument, RT.wl, rhoatm_hi)
            transm = instrument.sample(x_instrument, RT.wl, transm_hi)
            solar_irr = instrument.sample(x_instrument, RT.wl, RT.solar_irr)

            # Assume no surface emission. "Correct" the at-sensor radiance
            # using this presumed amount of water vapor, and measure the
            # resulting residual (as measured from linear interpolation
            # across the absorption feature).
            r = (meas * s.pi / (solar_irr * RT.coszen) - rhoatm) / (transm + 1e-8)
            ratios.append((r[b945] * 2.0) / (r[b1040] + r[b865]))
            h2os.append(h2o)

        # Finally, interpolate to determine the actual water vapor level that
        # would optimize the continuum-relative correction.
        p = interp1d(h2os, ratios)
        bounds = (h2os[0] + 0.001, h2os[-1] - 0.001)
        best = min1d(lambda h: abs(1 - p(h)), bounds=bounds, method='bounded')
        x_new[ind_sv] = best.x

    return x_new

def shift_minimums(data1, data2):
    """Receives two graphs, calculates the shift between the y-axis minimums
    along the x-axis, and returns the shift in index units and in x-axis units."""
    pos1 = scp.argmin(data1[1])
    pos2 = scp.argmin(data2[1])
    value1 = data1[0][pos1]
    value2 = data2[0][pos2]
    return [pos2 - pos1, value2 - value1]

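# Usage sketch (added; not from the original source), assuming `scp` is the
# SciPy module the function imports and that each "graph" is an [x, y] pair:
import numpy as np
x = np.linspace(0., 10., 101)
g1 = [x, (x - 3.) ** 2]   # minimum at x = 3
g2 = [x, (x - 5.) ** 2]   # minimum at x = 5
print(shift_minimums(g1, g2))   # -> [20, 2.0]: shift of 20 samples, 2.0 x-units
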
def __convolveSphinx(self, star):
    '''
    Convolve the Sphinx output with the SPIRE resolution. The convolution is
    done in wave number (cm^-1).

    @param star: The Star() object for which Sphinx profiles are loaded
    @type star: Star()

    '''
    #-- Get the Sphinx model output and merge, for all star models in star_grid
    if not self.resolution:
        print '* Resolution is undefined. Cannot convolve Sphinx.'
        return
    print '* Reading Sphinx model and merging.'
    sphinx_wav, sphinx_flux = star['LAST_GASTRONOOM_MODEL'] \
                              and self.mergeSphinx(star) \
                              or [[], []]
    if not sphinx_wav:
        print '* No Sphinx data found.'
        return
    sphinx_wav = 1. / array(sphinx_wav) * 10**4
    sphinx_flux = array(sphinx_flux)
    sphinx_wav = sphinx_wav[::-1]
    sphinx_flux = sphinx_flux[::-1]

    #-- Eliminate some of the zeroes in the grid to reduce calculation time
    #   (can reduce the array by a factor of up to 100!)
    s = self.sigma
    lcs = array(sorted([1. / line.wavelength for line in star['GAS_LINES']]))
    new_wav, new_flux = [sphinx_wav[0]], [sphinx_flux[0]]
    for w, f in zip(sphinx_wav[1:], sphinx_flux[1:]):
        if f != 0 or (w < 5*s + lcs[argmin(abs(lcs - w))]
                      and w > lcs[argmin(abs(lcs - w))] - 5*s):
            new_wav.append(w)
            new_flux.append(f)
    new_wav, new_flux = array(new_wav), array(new_flux)

    #-- Convolve the model fluxes with a Gaussian and constant sigma (SPIRE)
    print '* Convolving Sphinx model for SPIRE.'
    convolution = Data.convolveArray(new_wav, new_flux, s)

    for data_wav, fn in zip(self.data_wave_list, self.data_filenames):
        rebinned = []
        #-- Convert wavelengths to wave number for integration, and reverse
        data_cm = data_wav[::-1]
        data_cm = 1. / data_cm * 10**4
        rebinned = [trapz(y=convolution[abs(new_wav - wavi) <= self.resolution / self.oversampling],
                          x=new_wav[abs(new_wav - wavi) <= self.resolution / self.oversampling])
                    / (self.resolution / self.oversampling)
                    for wavi in data_cm]
        #-- Reverse the rebinned fluxes so they match up with the
        #   wavelength grid.
        rebinned = array(rebinned)[::-1]
        self.sphinx_convolution[star['LAST_SPIRE_MODEL']][fn] = rebinned

def getModel(self, teff, logg):
    """
    Return the model atmosphere for a given effective temperature and log g.

    Not yet scaled to the distance!

    Units returned are (micron, Jy).

    @param teff: the stellar effective temperature
    @type teff: float
    @param logg: the log g value
    @type logg: float

    @return: The model spectrum in (micron, Jy)
    @rtype: recarray

    """
    c = 2.99792458e18  # in angstrom/s

    if self.modelgrid is None:
        self.readModelGrid()
    mg = self.modelgrid
    #- Find the closest temperature in the grid
    teff_prox = mg['TEFF'][argmin(abs(mg['TEFF'] - teff))]
    #- Select all models with that temperature
    mgsel = mg[mg['TEFF'] == teff_prox]
    #- Select the closest log g in the selection
    logg_prox = mgsel['LOGG'][argmin(abs(mgsel['LOGG'] - logg))]
    #- Get the index of the model closest to teff and logg
    imodel = mgsel[mgsel['LOGG'] == logg_prox]['INDEX'][0]
    self.teff_actual = teff_prox
    self.logg_actual = logg_prox
    wave = self.ff[imodel].data.field('wavelength')
    flux = self.ff[imodel].data.field('flux')
    if self.header['FLXUNIT'] == 'erg/s/cm2/A':
        #- Go to erg/s/cm2/Hz, lFl = nFn, then to Jy (factor 10**23)
        flux = flux * wave**2 / c * 10**23
    else:
        raise Error('Flux unit unknown in atmosphere model fits file.')
    if self.header['WAVUNIT'] == 'angstrom':
        wave = wave * 10**(-4)
    else:
        raise Error('Wavelength unit unknown in atmosphere model fits file.')

    model = rec.fromarrays([wave, flux], names=['wave', 'flux'])
    return model

def parse_elem(self, etype, srange):
    zvec = self.t.zaxis
    istrt = np.argmin(np.absolute(zvec - srange[0]))
    istop = np.argmin(np.absolute(zvec - srange[1]))
    if etype in self.support.keys():
        zax = self.t.zaxis[istrt:istop + 1]
        fzax = self.support[etype][istrt:istop + 1]
        pt, ps = self.fit_ttf(zax, fzax)
        v0, pos = self.get_v0pos(zax, fzax)
        attr = list(pt) + list(ps)
        if etype == 'AccGap':
            attr += list(self.get_psync())
        return [pos, v0, attr]

def __call__(self, use_all=False, nsigmas=[0.5, 1.5, 2.]):
    for nsigma in nsigmas:
        self.logger.info('Finding points for nsigmas = {:.2f}.'.format(nsigma))
        self._find_xy_points_(nsigmas=nsigma)
        self._find_diag_points_(nsigmas=nsigma)
    self.logger.info('Fitting Gaussian...')
    if use_all:
        xp = self._xp
        yp = self._yp
    else:
        keys = self.xp.keys()
        xp = [self.xp[key] for key in keys]
        yp = [self.yp[key] for key in keys]
    pini = []
    nsigmas = nsigmas[scipy.argmin(scipy.array(nsigmas) - 1)]
    for iaxis1 in range(self.ndim):
        for iaxis2 in range(iaxis1, self.ndim):
            if iaxis1 == iaxis2:
                pini.append(1. / (self.xp['u_%s-%s' % (iaxis1, nsigmas)][iaxis1]
                                  - self.likelihood.argmax[iaxis1])**2.)
            else:
                pini.append(0.)
    precision = self._fit_gaussian_(xp, yp, pini=pini)
    return Likelihood.Gaussian(mean=self.likelihood.argmax, precision=precision)

def fit_best_init(self, x, C, n_init=10, M="M2", th=0.1, criteria='icl', n_jobs=-1):
    """Fit the given model for n_init initializations, given the number of
    clusters and the parameter th, and keep the best model in terms of BIC or ICL.

    Input:
        -x: input data
        -C: number of clusters
        -M: model name (default M2)
        -th: threshold parameter (default 0.1)
        -criteria: model selection criterion (default: icl)
        -n_jobs (int): number of jobs to run in parallel (default -1)
    """
    param = {'init': 'kmeans', 'tol': 0.00001, 'C': C, 'th': th}
    CRIT = Parallel(n_jobs=n_jobs, verbose=False)(
        delayed(worker_init)(i, x, M, param, criteria) for i in range(n_init))

    if criteria == "bic":
        t = sp.argmin(CRIT)
    elif criteria == "icl":
        t = sp.argmax(CRIT)

    ## Refit and keep the best model
    param['init'] = 'kmeans'
    param['random_state'] = t
    self.fit(x, param=param)

def automatch(self):
    axy = array([self.ax, self.ay]).T
    pxy = self.peaks[:, :2]
    sf = sqrt(axy.var(0) + pxy.var(0))
    nindex = [nonzero((abs(pxy - axy[i, :]) / sf < 0.1).all(1))[0]
              for i in range(axy.shape[0])]
    d2 = [sorted(((axy - x)**2).sum(1))[1] for x in axy]
    dindex = [x[(((pxy[x] - axy[i])**2).sum(1) < 0.25 * d2[i])]
              for i, x in enumerate(nindex)]
    qindex = [(i, x, ((pxy[x] - axy[i])**2).sum(1))
              for i, x in enumerate(dindex) if len(x) > 1]
    d2cutoff = 10
    for i, x in [(i, [x[argmin(ds)]]) for i, x, ds in qindex
                 if sorted(ds / min(ds))[1] > d2cutoff]:
        dindex[i] = x
    self.mpeaks = [pxy[x[0]].tolist() + [self.alabels[i].get_text()]
                   for i, x in enumerate(dindex) if len(x) == 1]
    print "Matched %d peaks out of %d/%d" % (len(self.mpeaks), len(axy), len(pxy))

def heuristic_surface(self, rfl_meas, Ls, geom):
    '''Given a reflectance estimate and one or more emissive parameters,
    fit a state vector.'''

    glint_band = s.argmin(abs(900 - self.wl))
    glint = s.mean(rfl_meas[(glint_band - 2):glint_band + 2])
    water_band = s.argmin(abs(400 - self.wl))
    water = s.mean(rfl_meas[(water_band - 2):water_band + 2])
    if glint > 0.05 or water < glint:
        glint = 0
    glint = max(self.bounds[self.glint_ind][0] + eps,
                min(self.bounds[self.glint_ind][1] - eps, glint))
    lrfl_est = rfl_meas - glint
    x = MultiComponentSurface.heuristic_surface(self, lrfl_est, Ls, geom)
    x[self.glint_ind] = glint
    return x

def dtw_path(s1, s2):
    l1 = s1.shape[0]
    l2 = s2.shape[0]
    cum_sum = sp.zeros((l1 + 1, l2 + 1))
    cum_sum[1:, 0] = sp.inf
    cum_sum[0, 1:] = sp.inf
    predecessors = [([None] * l2) for i in range(l1)]

    for i in range(l1):
        for j in range(l2):
            if sp.isfinite(cum_sum[i + 1, j + 1]):
                dij = sp.linalg.norm(s1[i] - s2[j]) ** 2
                pred_list = [cum_sum[i, j + 1], cum_sum[i + 1, j], cum_sum[i, j]]
                argmin_pred = sp.argmin(pred_list)
                cum_sum[i + 1, j + 1] = pred_list[argmin_pred] + dij
                if i + j > 0:
                    if argmin_pred == 0:
                        predecessors[i][j] = (i - 1, j)
                    elif argmin_pred == 1:
                        predecessors[i][j] = (i, j - 1)
                    else:
                        predecessors[i][j] = (i - 1, j - 1)

    i = l1 - 1
    j = l2 - 1
    best_path = [(i, j)]
    while predecessors[i][j] is not None:
        i, j = predecessors[i][j]
        best_path.insert(0, (i, j))
    return best_path

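# Usage sketch (added; not from the original source), assuming an older SciPy
# where the top-level aliases (sp.zeros, sp.inf, sp.argmin, ...) exist and
# scipy.linalg has been imported:
import scipy as sp
import scipy.linalg
s1 = sp.array([[0.], [1.], [2.]])
s2 = sp.array([[0.], [2.]])
print(dtw_path(s1, s2))   # -> [(0, 0), (1, 1), (2, 1)]
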
def predict_gmm(self, testSamples, tau=0):
    """ Function that predicts the label for testSamples using the learned model
        Inputs:
            testSamples: the samples to be classified
            tau: regularization parameter
        Outputs:
            predLabels: the class
            scores: the decision value for each class
    """
    # Get information from the data
    nbTestSpl = testSamples.shape[0]  # Number of testing samples

    # Initialization
    scores = sp.empty((nbTestSpl, self.C))

    # Start the prediction for each class
    for c in xrange(self.C):
        testSamples_c = testSamples - self.mean[c, :]
        regvp = self.vp[c, :] + tau
        logdet = sp.sum(sp.log(regvp))
        cst = logdet - 2 * sp.log(self.prop[c])  # Pre-compute the constant term

        # Compute ||lambda^{-0.5} q^T (x - mu)||^2 + cst for all samples
        scores[:, c] = sp.sum(
            sp.square(sp.dot((self.Q[c, :, :] / sp.sqrt(regvp)).T, testSamples_c.T)),
            axis=0) + cst

        del testSamples_c

    # Assign the label to the minimum value of scores
    predLabels = sp.argmin(scores, 1) + 1
    return predLabels, scores

def brute_force_2ref(ref1, ref2, data, res):
    [a, b, c] = data.shape[0], data.shape[1], data.shape[2]
    print a, b, c
    matrix_ref1 = np.copy(data)
    matrix_ref2 = np.copy(data)
    for i in range(c):
        matrix_ref1[:, :, i] = ref1[i]
        matrix_ref2[:, :, i] = ref2[i]
    total = 100 / res + 1
    total = int(total)
    factor = np.linspace(0, 1, total)
    fRGB = np.zeros((3, total), dtype=np.float16)
    fRGB[0, :] = factor
    fRGB[1, :] = 1 - factor
    sum_sqdata = np.sum(np.square(data), axis=2)
    R_ref = np.empty((a, b, total), dtype=np.float16)
    for i in range(total):
        print i
        matrix_ref_com = fRGB[0, i] * matrix_ref1 + fRGB[1, i] * matrix_ref2
        sqr = np.square(data - matrix_ref_com)
        R_ref[:, :, i] = np.sum(sqr, axis=2) / sum_sqdata
    min_R = np.amin(R_ref, axis=2)
    index = scipy.argmin(R_ref, axis=2)
    return min_R, index, fRGB

def decode(file_name):
    border.rotate(file_name)
    image = Image.open("temp.png")
    q = border.find("temp.png")
    ind = sp.argmin(sp.sum(q, 1), 0)
    up_left = q[ind, 0] + 2
    up_top = q[ind, 1] + 2
    d_right = q[ind + 1, 0] - 3
    d_bottom = q[ind - 1, 1] - 3
    box = (up_left, up_top, d_right, d_bottom)
    region = image.crop(box)
    h_sum = sp.sum(region, 0)
    m = argrelmax(sp.correlate(h_sum, h_sum, 'same'))
    s = sp.average(sp.diff(m))
    m = int(round(d_right - up_left) / s)
    if m % 3 != 0:
        m += 3 - m % 3
    n = int(round(d_bottom - up_top) / s)
    if n % 4 != 0:
        n += 4 - n % 4
    s = int(round(s)) + 1
    region = region.resize((s * m, s * n), PIL.Image.ANTIALIAS)
    region.save("0.png")
    pix = region.load()
    matrix = mix.off(rec.matrix(pix, s, m, n))
    str2 = hamming.decode(array_to_str(matrix))
    return hamming.bin_to_str(str2)

def getclosest(self, coords, timelist=None):
    """This method will get the closest set of parameters in the coordinate space.
    It will return the parameters from all times.

    Input
        coords - A list of x, y and z coordinates.
    Output
        paramout - A Nt x Np array of the closest output params.
        sphereout - A Nc length array of the spherical coordinates of the closest point.
        cartout - Cartesian coordinates of the closest point.
    """
    X_vec = self.Cart_Coords[:, 0]
    Y_vec = self.Cart_Coords[:, 1]
    Z_vec = self.Cart_Coords[:, 2]

    xdiff = X_vec - coords[0]
    ydiff = Y_vec - coords[1]
    zdiff = Z_vec - coords[2]
    distall = xdiff**2 + ydiff**2 + zdiff**2
    minidx = np.argmin(distall)
    paramout = self.Param_List[minidx]
    velout = self.Velocity[minidx]
    datatime = self.Time_Vector
    if sp.ndim(self.Time_Vector) > 1:
        datatime = datatime[:, 0]

    if timelist is not None:
        timeindx = []
        for itime in timelist:
            timeindx.append(sp.argmin(sp.absolute(itime - datatime)))
        paramout = paramout[timeindx]
        velout = velout[timeindx]
    sphereout = self.Sphere_Coords[minidx]
    cartout = self.Cart_Coords[minidx]
    return (paramout, velout, sphereout, cartout, np.sqrt(distall[minidx]))

def _init_params(self, X):
    init = self.init
    n_samples, n_features = X.shape
    n_components = self.n_components

    if init == 'kmeans':
        km = Kmeans(n_components)
        clusters, mean, cov = km.cluster(X)
        coef = sp.array([c.shape[0] / n_samples for c in clusters])
        comps = [multivariate_normal(mean[i], cov[i], allow_singular=True)
                 for i in range(n_components)]
    elif init == 'rand':
        coef = sp.absolute(sprand.randn(n_components))
        coef = coef / coef.sum()
        means = X[sprand.permutation(n_samples)[0:n_components]]
        clusters = [[] for i in range(n_components)]
        for x in X:
            idx = sp.argmin([spla.norm(x - mean) for mean in means])
            clusters[idx].append(x)
        comps = []
        for k in range(n_components):
            mean = means[k]
            cov = sp.cov(clusters[k], rowvar=0, ddof=0)
            comps.append(multivariate_normal(mean, cov, allow_singular=True))

    self.coef = coef
    self.comps = comps

def fit(self, X):
    n_samples, n_features = X.shape
    n_classes = self.n_classes
    max_iter = self.max_iter
    tol = self.tol

    rand_center_idx = sprand.permutation(n_samples)[0:n_classes]
    center = X[rand_center_idx].T
    responsibility = sp.zeros((n_samples, n_classes))

    for iter in range(max_iter):
        # E step: assign each sample to its nearest center
        dist = sp.expand_dims(X, axis=2) - sp.expand_dims(center, axis=0)
        dist = spla.norm(dist, axis=1)**2
        min_idx = sp.argmin(dist, axis=1)
        responsibility.fill(0)
        responsibility[sp.arange(n_samples), min_idx] = 1

        # M step: recompute the centers
        center_new = sp.dot(X.T, responsibility) / sp.sum(responsibility, axis=0)
        diff = center_new - center
        print('K-Means: {0:5d} {1:4e}'.format(iter, spla.norm(diff) / spla.norm(center)))
        if spla.norm(diff) < tol * spla.norm(center):
            break
        center = center_new

    self.center = center.T
    self.responsibility = responsibility
    return self

def onMousePointViewer(event, _x, _y, flags, param):
    x = _x / SHOW_IMAGE_SCALE
    y = _y / SHOW_IMAGE_SCALE
    if event == cv2.EVENT_MOUSEMOVE:
        return
    elif event == cv2.EVENT_RBUTTONDOWN:
        return
    elif event == cv2.EVENT_LBUTTONDOWN:
        global curKeypoint2DLists
        global curKeypointIdLists
        if len(curKeypoint2DLists) > 0:
            query = np.array([x, y])
            nearestIndex = scipy.argmin([scipy.inner(query - point, query - point)
                                         for point in curKeypoint2DLists])
            nearest = curKeypoint2DLists[nearestIndex]
            if (math.sqrt(scipy.inner(query - nearest, query - nearest))
                    < THRESHOLD_FIND_CLICKED_KEYPOINT):
                print "selected point index : " + str(nearestIndex)
                print "selected point id : " + str(curKeypointIdLists[nearestIndex])
                print "selected point : " + str(nearest[0]) + "," + str(nearest[1])
                showPointViewer(curKeypointIdLists[nearestIndex])
                return
        print "There is no keypoint around clicked points. Clicked point is (" + str(x) + "," + str(y) + ")"
        return

def fit_params(self, rfl_meas, geom, *args):
    """Given a reflectance estimate and one or more emissive parameters,
    fit a state vector."""

    glint_band = s.argmin(abs(900 - self.wl))
    glint = s.mean(rfl_meas[(glint_band - 2):glint_band + 2])
    water_band = s.argmin(abs(400 - self.wl))
    water = s.mean(rfl_meas[(water_band - 2):water_band + 2])
    if glint > 0.05 or water < glint:
        glint = 0
    glint = max(self.bounds[self.glint_ind][0] + eps,
                min(self.bounds[self.glint_ind][1] - eps, glint))
    lamb_est = rfl_meas - glint
    x = ThermalSurface.fit_params(self, lamb_est, geom)
    x[self.glint_ind] = glint
    return x

def main(infile):
    inhdr = infile + ".hdr"
    outhdr = os.path.basename(infile)[0:-11] + "_glint900metric.hdr"
    img = envi.open(inhdr, infile)
    wl = s.array([float(w) for w in img.metadata['wavelength']])
    if wl[0] < 100:
        wl = wl * 1000
    fwhm = s.array([float(w) for w in img.metadata['fwhm']])
    b900 = s.argmin(abs(wl - 900.0))
    ## b1050 = s.argmin(abs(wl - 1050.0))
    b900imgs = img.read_bands([b900 - 1, b900, b900 + 1])
    ## b1050imgs = img.read_bands([b1050 - 1, b1050, b1050 + 1])
    glintm900 = np.median(b900imgs, axis=2).reshape(
        (b900imgs.shape[0], 1, b900imgs.shape[1]))
    ## glintm1050 = np.median(b1050imgs, axis=2).reshape((b1050imgs.shape[0], 1, b1050imgs.shape[1]))

    # make the output glint metric file and open a memmap
    metadata = img.metadata.copy()
    metadata['bands'] = '%i' % 1
    metadata['interleave'] = 'bil'
    metadata['data type'] = '4'
    out = envi.create_image(outhdr, metadata, ext='', force=True)
    outmm = out.open_memmap(interleave='source', writable=True)
    outmm[:, :, :] = glintm900
    del outmm, out, glintm900, img

def waterFraction1StepProfiler(model_id, path_gastronoom, fraction, rfrac):
    '''
    Create a 1-step fractional profile for water.

    The original water abundance profile is taken from the output of the
    original model without fractional abundances.

    These fraction profiles can be used for CHANGE_ABUNDANCE_FRACTION in mline.

    @param model_id: The model id of the original cooling model
    @type model_id: string
    @param path_gastronoom: The model subfolder in ~/GASTRoNOoM/
    @type path_gastronoom: string
    @param fraction: the fraction used
    @type fraction: float
    @param rfrac: the radius at the step to the fractional abundance [cm]
    @type rfrac: float

    '''
    rfrac = float(rfrac)
    fraction = float(fraction)
    filename = os.path.join(cc.path.gastronoom, path_gastronoom, 'models',
                            model_id, 'coolfgr_all%s.dat' % model_id)
    rad = Gastronoom.getGastronoomOutput(filename=filename, keyword='RADIUS',
                                         return_array=1)
    fraction_profile = scipy.ones(len(rad))
    step_index = scipy.argmin(abs(rad - rfrac))
    fraction_profile[step_index:] = fraction
    output_filename = os.path.join(cc.path.gastronoom, path_gastronoom,
                                   'profiles',
                                   'water_fractions_%s_%.2f_r%.3e.dat'
                                   % (model_id, fraction, rfrac))
    DataIO.writeCols(output_filename, [rad, fraction_profile])

def lplot(beta, info, cols=range(X.shape[1])):
    # Find the best-fitting model
    bestAIC, bestIdx = min(info[3].AIC), argmin(info[3].AIC)
    best_s = info[5].s[bestIdx]
    best_beta = beta[:, bestIdx]
    xx = np.array([info[5].s[i][0] for i in range(len(info[5].s))])
    x = xx.reshape(len(info[5].s), 1)
    beta = beta.T

    print('-----------------------')
    print('Feature importance')
    print('-----------------------')
    for col, coef in zip(cols, best_beta.tolist()):
        print('{}: {}'.format(col, coef))

    # Plot results
    f, ax = plt.subplots(figsize=(6, 4))
    ax.plot(x, beta, '.-')
    plt.xlabel(r"$s$", fontsize=18)
    plt.ylabel(r"$\beta$", fontsize=18, rotation=90)
    plt.xticks(color='k', fontsize=18)
    plt.yticks(color='k', fontsize=18)
    ax.legend(list(range(1, len(beta) + 1)))
    plt.axvline(best_s, -6, 14, linewidth=0.25, color='r', linestyle=':')
    #plt.show()
    plt.savefig('larsplot')

def fit(proba, labels, beta=4, th=0.000001):
    """ """
    # Get some parameters and do initialization
    diff = [1]
    niter = 0
    [nl, nc, C] = proba.shape

    # Iterate until convergence
    while (diff[-1] > th) and (niter < 100):
        old_labels = labels.copy()  # Make a copy of the old labels
        for i in xrange(1, nl - 1):  # Scan each line
            for j in xrange(1, nc - 1):  # Scan each column
                energy = []
                labels_ = old_labels[i - 1:i + 2, j - 1:j + 2].copy()
                for c in xrange(C):  # Compute the energy for the different classes
                    labels_[1, 1] = c + 1
                    energy.append(compute_energy(proba[i, j, c], labels_, beta))
                # Get the minimum-energy class for the local configuration
                arg = sp.argmin(energy)
                labels[i, j] = arg + 1
        # Compute the changes
        diff.append(1 - sp.sum(old_labels == labels).astype(float) / nc / nl)
        niter += 1

    # Clean data
    del old_labels
    return diff

def set_timestamp(self, ts=None):
    if ts is None:
        i = 0
    else:
        i = argmin(abs(self.times - ts))
    self.times_index = i

def run_halo(file):
    halo = [int(x) for x in file.readline().split()]
    ra = scipy.zeros(halo[1])
    dec = scipy.zeros(halo[1])
    z = scipy.zeros(halo[1])
    mag = scipy.zeros(halo[1])
    for i in xrange(halo[1]):
        line = file.readline().split()
        ra[i] = float(line[0])
        dec[i] = float(line[1])
        z[i] = float(line[2])
        mag[i] = float(line[3])
    ra *= 180 / scipy.pi
    dec *= 180 / scipy.pi
    z /= c
    zo = stattools.Cbi(z)
    bcg = scipy.argmin(mag)
    center = (ra[bcg], dec[bcg])
    dist = astCoords.calcAngSepDeg(center[0], center[1], ra, dec)
    rproj = 1e3 * scipy.array(map(cosmology.dProj, zo * scipy.ones(len(z)), dist))
    failed = (63, 86, 202, 276, 396, 401, 558, 625, 653, 655, 665, 676, 693, 836)
    if halo[0] in failed:
        print ''
        print 'Halo %d' % halo[0]
        verbose = True
    else:
        verbose = False
    zo, s, m200, r200, n200 = pytools.M200(rproj, z, errors=True,
                                           converge=False, verbose=verbose)
    outpath = 'phase1/plots/true-members/rv/'
    if n200 == -1:
        outpath += 'failed/'
    plot_rv(halo[0], rproj, z, zo, n200, m200, r200, outpath)
    return halo[0], center, halo[1], n200, zo, s, m200, r200

def __init__(self, func, pop0, args=(), crossover_rate=0.5, scale=None,
             strategy=("rand", 2, "bin"), eps=1e-6):
    self.func = func
    self.population = sp.array(pop0)

    # added by Minh-Tri Pham
    for n in xrange(len(self.population)):
        self.refine(self.population[n])

    self.npop, self.ndim = self.population.shape
    self.args = args
    self.crossover_rate = crossover_rate
    self.strategy = strategy
    self.eps = eps

    self.pop_values = [self.func(m, *args) for m in self.population]
    bestidx = sp.argmin(self.pop_values)
    self.best_vector = self.population[bestidx]
    self.best_value = self.pop_values[bestidx]

    if scale is None:
        self.scale = self.calculate_scale()
    else:
        self.scale = scale

    self.generations = 0
    self.best_val_history = []
    self.best_vec_history = []

    self.jump_table = {
        ("rand", 1, "bin"): (self.choose_rand, self.diff1, self.bin_crossover),
        ("rand", 2, "bin"): (self.choose_rand, self.diff2, self.bin_crossover),
        ("best", 1, "bin"): (self.choose_best, self.diff1, self.bin_crossover),
        ("best", 2, "bin"): (self.choose_best, self.diff2, self.bin_crossover),
        ("rand-to-best", 1, "bin"):
            (self.choose_rand_to_best, self.diff1, self.bin_crossover),
    }

def check_if_click_is_on_an_existing_point(mouse_x_coord, mouse_y_coord):
    # First, figure out how many points we have.
    # Each point is one row in the coords_array,
    # so we count the number of rows, which is dimension-0 for Python
    number_of_points = scipy.shape(coords_array)[0]
    this_coord = scipy.array([[mouse_x_coord, mouse_y_coord]])
    # The double square brackets above give the this_coord array
    # an explicit structure of having rows and also columns
    if number_of_points > 0:
        # If there are some points, we want to calculate the distance
        # of the new mouse-click location from every existing point.
        # One way to do this is to make an array which is the same size
        # as coords_array, and which contains the mouse x,y-coords on every row.
        # Then we can subtract that xy_coord_matching_matrix from coords_array
        ones_vec = scipy.ones((number_of_points, 1))
        xy_coord_matching_matrix = scipy.dot(ones_vec, this_coord)
        distances_from_existing_points = (coords_array - xy_coord_matching_matrix)
        squared_distances_from_existing_points = distances_from_existing_points**2
        sum_sq_dists = scipy.sum(squared_distances_from_existing_points, axis=1)
        # The axis=1 means "sum over dimension 1", which is columns for Python
        euclidean_dists = scipy.sqrt(sum_sq_dists)
        distance_threshold = 0.5
        within_threshold_points = scipy.nonzero(euclidean_dists < distance_threshold)
        num_within_threshold_points = scipy.shape(within_threshold_points)[1]
        if num_within_threshold_points > 0:
            # We only want one matching point.
            # It's possible that more than one might be within threshold.
            # So, we take the unique smallest distance
            point_to_be_deleted = scipy.argmin(euclidean_dists)
            return point_to_be_deleted
    else:
        # If there are zero points, then we are not deleting any
        point_to_be_deleted = -1
        return point_to_be_deleted

def assign(points, means):
    """Return a 1-d array assigning each point to the nearest mean
    (by Euclidean distance)."""
    cd = cdist(points, means, 'euclidean')  # each row has the distance to all means
    # get the index of the closest mean for each point
    return scipy.argmin(cd, axis=1)

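# Usage sketch (added; not from the original source), assuming the same
# scipy / scipy.spatial.distance imports the function relies on:
import scipy
from scipy.spatial.distance import cdist
points = scipy.array([[0., 0.], [1., 1.], [9., 9.]])
means = scipy.array([[0.5, 0.5], [10., 10.]])
print(assign(points, means))   # -> [0 0 1]
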
def component(self, x_surface, geom):
    """We pick a surface model component using the Mahalanobis distance.
    This always uses the Lambertian (non-specular) version of the surface
    reflectance."""

    if len(self.components) <= 1:
        return 0

    # Get the (possibly normalized) reflectance
    lrfl = self.calc_lrfl(x_surface, geom)
    ref_lrfl = lrfl[self.refidx]
    ref_lrfl = ref_lrfl / self.norm(ref_lrfl)

    # Mahalanobis or Euclidean distances
    mds = []
    for ci in range(self.ncomp):
        ref_mu = self.mus[ci]
        ref_Cinv = self.Cinvs[ci]
        if self.selection_metric == 'Mahalanobis':
            md = (ref_lrfl - ref_mu).T.dot(ref_Cinv).dot(ref_lrfl - ref_mu)
        else:
            md = sum(pow(ref_lrfl - ref_mu, 2))
        mds.append(md)
    closest = s.argmin(mds)
    return closest

def __init__(self, geneseq, *, seed=1, wt_latent=5,
             norm_weights=((0.4, -0.7, 1.5), (0.6, -7, 3.5)),
             stop_effect=-15, min_observed_enrichment=0.001):
    """See main class docstring for how to initialize."""
    self.wt_latent = wt_latent

    if not (0 <= min_observed_enrichment < 1):
        raise ValueError('not 0 <= `min_observed_enrichment` < 1')
    self.min_observed_enrichment = min_observed_enrichment

    # simulate mutational effects from a compound normal distribution
    self.muteffects = {}
    if seed is not None:
        random.seed(seed)
    weights, means, sds = zip(*norm_weights)
    cumweights = scipy.cumsum(weights)
    for icodon in range(len(geneseq) // 3):
        wt_aa = CODON_TO_AA[geneseq[3 * icodon: 3 * icodon + 3]]
        for mut_aa in AAS_WITHSTOP:
            if mut_aa != wt_aa:
                if mut_aa == '*':
                    muteffect = stop_effect
                else:
                    # choose the Gaussian from the compound normal
                    i = scipy.argmin(cumweights < random.random())
                    # draw the mutational effect from the chosen Gaussian
                    muteffect = random.gauss(means[i], sds[i])
                self.muteffects[f"{wt_aa}{icodon + 1}{mut_aa}"] = muteffect

def makepulse(ptype, plen, ts):
    """ This will make the pulse array.

    Inputs
        ptype - The type of pulse used.
        plen - The length of the pulse in seconds.
        ts - The sampling rate of the pulse.
    Output
        pulse - The pulse array that will be used as the window in the data formation.
        plen - The length of the pulse with the sampling time taken into account.
    """
    nsamps = int(sp.round_(plen / ts))

    if ptype.lower() == 'long':
        pulse = sp.ones(nsamps)
        plen = nsamps * ts
    elif ptype.lower() == 'barker':
        blen = sp.array([1, 2, 3, 4, 5, 7, 11, 13])
        nsampsarg = sp.argmin(sp.absolute(blen - nsamps))
        nsamps = blen[nsampsarg]
        pulse = GenBarker(nsamps)
        plen = nsamps * ts
    #elif ptype.lower()=='ac':
    else:
        raise ValueError('The pulse type %s is not a valid pulse type.' % (ptype))

    return (pulse, plen)

def makepulse(ptype, plen, ts):
    """ This will make the pulse array.

    Inputs
        ptype - The type of pulse used.
        plen - The length of the pulse in seconds.
        ts - The sampling rate of the pulse.
    Output
        pulse - The pulse array that will be used as the window in the data formation.
        plen - The length of the pulse with the sampling time taken into account.
    """
    nsamps = int(sp.floor(plen / ts))

    if ptype.lower() == 'long':
        pulse = sp.ones(nsamps)
        plen = nsamps * ts
    elif ptype.lower() == 'barker':
        blen = sp.array([1, 2, 3, 4, 5, 7, 11, 13])
        nsampsarg = sp.argmin(sp.absolute(blen - nsamps))
        nsamps = blen[nsampsarg]
        pulse = GenBarker(nsamps)
        plen = nsamps * ts
    #elif ptype.lower()=='ac':
    else:
        raise ValueError('The pulse type %s is not a valid pulse type.' % (ptype))

    return (pulse, plen)

def compute_loocv_gmm(variable, model, x, y, ids, K_u, alpha, beta, log_prop_u):
    """ Function that computes the estimation of the LOOCV for the GMM model
    with variables ids + variable(i)

    Inputs:
        model : the GMM model
        x, y : the training samples and the corresponding labels
        ids : the pool of selected variables
        variable : the variable to be tested from the set of available variables
        K_u : the initial prediction values computed with all the samples
        alpha, beta and log_prop_u : constants computed outside of the loop
            to increase speed
    Outputs:
        loocv_temp : the LOOCV

    Used in GMM.forward_selection()
    """
    n = x.shape[0]
    ids.append(variable)                   # Iteratively add one of the remaining variables
    Kp = model.predict_gmm(x, ids=ids)[1]  # Predict with all the samples with ids
    loocv_temp = 0.0                       # Initialization of the temporary loocv
    for j in range(n):                     # Predict the class with the model ids_t
        Kloo = Kp[j, :] + K_u              # Initialization of the decision rule for sample "j"

        #--- Change for only not C ---#
        c = int(y[j] - 1)
        # Update of the parameters of class c
        m = (model.ni[c] * model.mean[c, ids] - x[j, ids]) * alpha[c]  # Update the mean value
        xb = x[j, ids] - m                                             # x centered
        cov_u = (model.cov[c, ids, :][:, ids] - sp.outer(xb, xb) * alpha[c]) * beta  # Update the covariance matrix
        logdet, rcond = safe_logdet(cov_u)
        Kloo[c] = logdet - 2 * log_prop_u[c] + sp.vdot(xb, mylstsq(cov_u, xb.T, rcond))  # Compute the new decision rule
        del cov_u, xb, m, c

        yloo = sp.argmin(Kloo) + 1
        loocv_temp += float(yloo == y[j])  # Check the correct/incorrect classification rule
    ids.pop()                              # Remove the current variable
    return loocv_temp / n                  # Compute the loocv for this variable

def IDrefsub_BrainSync(sub_data):
    '''
    Input:
        sub_data: vector x time x subject data matrix containing the reference
            subjects; load data by using module stats_utils.load_bfp_data
    Output:
        subRef_data: vector x time matrix of the most representative subject
        q: index of the reference subject according to the order of the
            sub_data input
    '''
    nSub = sub_data.shape[2]
    print('calculating pairwise correlations between all pairs of ' +
          str(nSub) + ' subjects')
    dist_all_orig = sp.zeros([nSub, nSub])
    dist_all_rot = dist_all_orig.copy()
    for ind1 in range(nSub):
        for ind2 in range(nSub):
            dist_all_orig[ind1, ind2] = sp.linalg.norm(sub_data[:, :, ind1] -
                                                       sub_data[:, :, ind2])
            sub_data_rot, _ = brainSync(X=sub_data[:, :, ind1],
                                        Y=sub_data[:, :, ind2])
            dist_all_rot[ind1, ind2] = sp.linalg.norm(sub_data[:, :, ind1] -
                                                      sub_data_rot)
            print(ind1, ind2, dist_all_rot[ind1, ind2])

    q = sp.argmin(dist_all_rot.sum(1))
    subRef_data = sub_data[:, :, q]
    print('Subject number ' + str(q) + ' identified as most representative subject')
    return subRef_data, q

def fit2D_2ref(ref1, ref2, data, res):
    [a, b, c] = data.shape[0], data.shape[1], data.shape[2]
    print a, b, c
    matrix_ref1 = np.copy(data)
    matrix_ref2 = np.copy(data)
    for i in range(c):
        matrix_ref1[:, :, i] = ref1[i]
        matrix_ref2[:, :, i] = ref2[i]
    total = 100 / res + 1
    total = int(total)
    factor = np.linspace(0, 1, total)
    fRGB = np.zeros((3, total), dtype=np.float16)
    fRGB[0, :] = factor
    fRGB[1, :] = 1 - factor
    sum_sqdata = np.sum(np.square(data), axis=2)
    R_ref = np.empty((a, b, total), dtype=np.float16)
    for i in range(total):
        print i
        matrix_ref_com = fRGB[0, i] * matrix_ref1 + fRGB[1, i] * matrix_ref2
        sqr = np.square(data - matrix_ref_com)
        R_ref[:, :, i] = np.sum(sqr, axis=2) / sum_sqdata
    min_R = np.amin(R_ref, axis=2)
    index = scipy.argmin(R_ref, axis=2)
    save_dir = 'D:/Research/BNL_2014_Summer_Intern/xanes_PyQT'
    f = open(save_dir + '/index.txt', 'w')
    for i in range(a):
        f.write("%14.5f\n" % (index[i, 500]))
    f.close()
    return min_R, index, fRGB

def __init__(self, func, pop0, args=(), crossover_rate=0.5, scale=None,
             strategy=('rand', 2, 'bin'), eps=1e-6):
    self.func = func
    self.population = sp.array(pop0)
    self.npop, self.ndim = self.population.shape
    self.args = args
    self.crossover_rate = crossover_rate
    self.strategy = strategy
    self.eps = eps

    self.pop_values = [self.func(m, *args) for m in self.population]
    bestidx = sp.argmin(self.pop_values)
    self.best_vector = self.population[bestidx]
    self.best_value = self.pop_values[bestidx]

    if scale is None:
        self.scale = self.calculate_scale()
    else:
        self.scale = scale

    self.generations = 0
    self.best_val_history = []
    self.best_vec_history = []

    self.jump_table = {
        ('rand', 1, 'bin'): (self.choose_rand, self.diff1, self.bin_crossover),
        ('rand', 2, 'bin'): (self.choose_rand, self.diff2, self.bin_crossover),
        ('best', 1, 'bin'): (self.choose_best, self.diff1, self.bin_crossover),
        ('best', 2, 'bin'): (self.choose_best, self.diff2, self.bin_crossover),
        ('rand-to-best', 1, 'bin'):
            (self.choose_rand_to_best, self.diff1, self.bin_crossover),
    }

def _set_reach_dist(setofobjects, point_index, epsilon):

    # Assumes that the query returns ordered (smallest distance first)
    # entries. This is the case for the balltree query.
    dists, indices = setofobjects.query(setofobjects.data[point_index],
                                        setofobjects._nneighbors[point_index])

    # Check whether there is more than one member in the neighborhood
    if sp.iterable(dists):

        # Mask processed values; n_pr is 'not processed'
        n_pr = indices[(setofobjects._processed[indices] < 1)[0].T]
        rdists = sp.maximum(dists[(setofobjects._processed[indices] < 1)[0].T],
                            setofobjects.core_dists_[point_index])

        new_reach = sp.minimum(setofobjects.reachability_[n_pr], rdists)
        setofobjects.reachability_[n_pr] = new_reach

        # Check whether everything is already processed;
        # if so, return control to the main loop
        if n_pr.size > 0:
            # Define the return order based on reachability distance
            return n_pr[sp.argmin(setofobjects.reachability_[n_pr])]
        else:
            return point_index

def detect_intronreten(genes, gidx, log=False, edge_limit=1000):
    # [idx_intron_reten, intron_intron_reten] = detect_intronreten(genes)

    idx_intron_reten = []
    intron_intron_reten = []
    for iix, ix in enumerate(gidx):
        if log:
            sys.stdout.write('.')
            if (iix + 1) % 50 == 0:
                sys.stdout.write(' - %i/%i, found %i\n' % (iix + 1, genes.shape[0] + 1, len(idx_intron_reten)))
            sys.stdout.flush()

        genes[iix].from_sparse()
        num_exons = genes[iix].splicegraph.get_len()
        vertices = genes[iix].splicegraph.vertices
        edges = genes[iix].splicegraph.edges.copy()
        genes[iix].to_sparse()

        if edges.shape[0] > edge_limit:
            print '\nWARNING: not processing gene %i (%s); has %i edges; current limit is %i; adjust edge_limit to include.' % (ix, genes[iix].name, edges.shape[0], edge_limit)
            continue

        introns = sp.zeros((0, 2), dtype='int')
        for exon_idx in range(num_exons - 1):  # start of intron
            idx = sp.where(edges[exon_idx, exon_idx + 1:num_exons] == 1)[0]
            if idx.shape[0] == 0:
                continue
            idx += (exon_idx + 1)
            for exon_idx2 in idx:  # end of intron
                # skip introns that have already been recorded
                if sp.sum((introns[:, 0] == vertices[1, exon_idx]) & (introns[:, 1] == vertices[0, exon_idx2])) > 0:
                    continue
                ### find the shortest fully overlapping exon
                iidx = sp.where((vertices[0, :] < vertices[1, exon_idx]) & (vertices[1, :] > vertices[0, exon_idx2]))[0]
                if len(iidx) > 0:
                    iidx = iidx[sp.argmin(vertices[1, iidx] - vertices[0, iidx])]
                    idx_intron_reten.append(ix)
                    intron_intron_reten.append([exon_idx, exon_idx2, iidx])
                    introns = sp.r_[introns, [[vertices[1, exon_idx], vertices[0, exon_idx2]]]]

    if log:
        print '\nNumber of intron retentions:\t\t\t\t\t%d' % len(idx_intron_reten)

    return (idx_intron_reten, intron_intron_reten)

def nearestNeighborDist(mySet, dType):
    dMatX = sd.cdist(mySet, mySet, dType)
    minD = []
    j = 0
    for i in range(len(dMatX)):
        arr = dMatX[i]
        ind = sc.argmin(arr)
        if ind == j:
            # drop the zero self-distance, then take the nearest-neighbour distance
            arr = np.delete(arr, ind)
            myMin, ind = np.min(arr), sc.argmin(arr)
            minD.append(myMin)
        j += 1
    nnDist = float(np.sum(minD) / len(minD))
    return nnDist

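# Usage sketch (added; not from the original source), assuming `sd`, `sc` and
# `np` are the scipy.spatial.distance, scipy and numpy modules the function expects:
import numpy as np
pts = np.array([[0., 0.], [0., 1.], [5., 5.]])
print(nearestNeighborDist(pts, 'euclidean'))   # mean nearest-neighbour distance, ~2.80
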
def pso(func, nswarm, lbound, ubound, vmax, args=(), maxiter=1000, cp=2.0, cg=2.0):
    ndim = len(lbound)
    lbound = sp.asarray(lbound)
    ubound = sp.asarray(ubound)
    vmax = sp.asarray(vmax)

    # initialize the swarm
    swarm = lbound + sp.rand(nswarm, ndim) * (ubound - lbound)

    # initialize the "personal best" values
    pbestv = sp.zeros(nswarm, sp.Float)
    for i in sp.arange(nswarm):
        pbestv[i] = func(swarm[i])
    pbest = sp.array(swarm)

    # initialize the "global best" values
    gbesti = sp.argmin(pbestv)
    gbestv = pbestv[gbesti]
    gbest = pbest[gbesti]

    # initialize velocities
    velocities = 2 * vmax * sp.randn(nswarm, ndim) - vmax

    for i in sp.arange(maxiter):
        values = sp.zeros(nswarm, sp.Float)
        for j in sp.arange(nswarm):
            values[j] = func(swarm[j])

        mask = values < pbestv
        mask2d = sp.repeat(mask, ndim)
        mask2d.shape = (nswarm, ndim)
        pbestv = sp.where(mask, values, pbestv)
        pbest = sp.where(mask2d, swarm, pbest)

        if sp.minimum.reduce(pbestv) < gbestv:
            gbesti = sp.argmin(pbestv)
            gbestv = pbestv[gbesti]
            gbest = pbest[gbesti]

        velocities += (cp * sp.rand() * (pbest - swarm) +
                       cg * sp.rand() * (gbest - swarm))
        velocities = sp.clip(velocities, -vmax, vmax)
        swarm += velocities
        swarm = sp.clip(swarm, lbound, ubound)
        yield gbest

def computeDistances(data, centroids, f):
    N = centroids.shape[0]
    T = data.shape[0]
    clusterAssignments = sp.zeros(T)
    for i in xrange(T):
        dists = sp.array([f(data[i, :], centroids[j, :]) for j in xrange(N)])
        clusterAssignments[i] = sp.argmin(dists)
    return clusterAssignments

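# Usage sketch (added; not from the original source); the function is Python 2
# (xrange) and accepts any point-to-point distance as the callable `f`:
import scipy as sp
data = sp.array([[0., 0.], [1., 0.], [10., 10.]])
centroids = sp.array([[0., 0.], [10., 10.]])
sqdist = lambda a, b: ((a - b) ** 2).sum()
print(computeDistances(data, centroids, sqdist))   # -> [ 0.  0.  1.]
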
def best_match(self, target_FV):
    loc = scipy.argmin((((self.nodes - target_FV)**2).sum(axis=2))**0.5)
    r = 0
    while loc > self.width:
        loc -= self.width
        r += 1
    c = loc
    return (r, c)

def calc_levels(self, data, fraction=(0.1, 0.9), nbins=500, samples=None):
    """
    fraction is a tuple with (low, high) in the range of 0 to 1
    nbins is the number of bins for the histogram resolution
    if samples: draw this number of samples (random inds) for a faster calculation
    """
    if samples:
        data = data.flatten()[random.randint(sp.prod(data.shape), size=samples)]
    else:
        data = data.flatten()

    y, x = sp.histogram(data, bins=nbins)
    cy = sp.cumsum(y).astype('float32')
    cy = cy / cy.max()

    minInd = sp.argmin(sp.absolute(cy - fraction[0]))
    maxInd = sp.argmin(sp.absolute(cy - fraction[1]))

    levels = (x[minInd], x[maxInd])
    return levels

def calculateThreshold(image, coveragePercent):
    import scipy
    data = image.data
    histogram = scipy.histogram(data, len(scipy.unique(data)))
    cumsum = scipy.cumsum(histogram[0])
    targetValue = cumsum[-1] * coveragePercent
    index = scipy.argmin(scipy.absolute(cumsum - targetValue))
    threshold = histogram[1][index]
    return threshold * image.unit

def rotate(file_name):
    box = find(file_name)
    m = sp.argmin(sp.sum(box, 1), 0)
    img = cv2.imread(file_name, 0)
    rows, cols = img.shape
    angle = degrees(atan(float(box[m, 0] - box[m-1, 0]) / (box[m-1, 1] - box[m, 1])))
    M = cv2.getRotationMatrix2D((cols/2, rows/2), angle, 1)
    dst = cv2.warpAffine(img, M, (cols, rows))
    cv2.imwrite("temp.png", dst)

def matchfinder(stuff):
    i, k = stuff
    q = (k - lutable)**2
    q_sum = scipy.sum(q, 1)
    q_n = scipy.sqrt(q_sum)
    q_d = scipy.sqrt(sum(k**2, 0))
    if scipy.all(k == 0):
        return (i, lutable_zero_index)
    else:
        q = q_n / q_d * 100
        return (i, scipy.argmin(q))

def classify(self, data):
    if type(data) is not scipy.ndarray:
        print "The input data is of the wrong type, it should be an array"
        return None
    nnMinDistance = []
    for d in data:
        # ensure each observation has the correct number of variables
        if len(d) == self.dataTraining.shape[1]:
            distance = []
            # uses the previously written SetCluster class
            distance.append(scipy.array([c.distanceToSet(d) for c in self.clusterLabel]))
            nnMinDistance.append(scipy.argmin(distance))
    return nnMinDistance

def best_match(self, target_FV):
    """ Returns the (y, x) location of the node that best matches target_FV.
    Uses Euclidean distance. target_FV is a scipy array. """
    loc = scipy.argmin((((self.nodes - target_FV)**2).sum(axis=2))**0.5)
    x = loc
    y = 0
    while x >= self.width:
        x -= self.width
        y += 1
    return (y, x)

def getclosest(self, coords, timelist=None):
    """
    This method will get the closest set of parameters in the coordinate space.
    It will return the parameters from all times.

    Input
        coords - A list of x, y and z coordinates.
    Output
        paramout - A Nt x Np array of the closest output params.
        sphereout - A Nc length array of the spherical coordinates of the closest point.
        cartout - Cartesian coordinates of the closest point.
        distance - The spatial distance between the returned location and the
            desired location.
        minidx - The spatial index point.
        tvec - The times of the returned data.
    """
    X_vec = self.Cart_Coords[:, 0]
    Y_vec = self.Cart_Coords[:, 1]
    Z_vec = self.Cart_Coords[:, 2]

    xdiff = X_vec - coords[0]
    ydiff = Y_vec - coords[1]
    zdiff = Z_vec - coords[2]
    distall = xdiff**2 + ydiff**2 + zdiff**2
    minidx = np.argmin(distall)
    paramout = self.Param_List[minidx]
    velout = self.Velocity[minidx]
    datatime = self.Time_Vector
    tvec = self.Time_Vector
    if sp.ndim(self.Time_Vector) > 1:
        datatime = datatime[:, 0]

    if isinstance(timelist, list):
        timelist = sp.array(timelist)
    if timelist is not None:
        timeindx = []
        for itime in timelist:
            if sp.isscalar(itime):
                timeindx.append(sp.argmin(sp.absolute(itime - datatime)))
            else:
                # look for overlap between requested and stored time intervals
                log1 = (tvec[:, 0] >= itime[0]) & (tvec[:, 0] < itime[1])
                log2 = (tvec[:, 1] > itime[0]) & (tvec[:, 1] <= itime[1])
                log3 = (tvec[:, 0] <= itime[0]) & (tvec[:, 1] > itime[1])
                log4 = (tvec[:, 0] > itime[0]) & (tvec[:, 1] < itime[1])
                tempindx = sp.where(log1 | log2 | log3 | log4)[0]
                timeindx = timeindx + tempindx.tolist()
        paramout = paramout[timeindx]
        velout = velout[timeindx]
        tvec = tvec[timeindx]
    sphereout = self.Sphere_Coords[minidx]
    cartout = self.Cart_Coords[minidx]
    return (paramout, velout, sphereout, cartout, np.sqrt(distall[minidx]),
            minidx, tvec)

def predict_gmm(self, testSamples, featIdx=None, tau=0):
    """ Function that predicts the label for testSamples using the learned model
        Inputs:
            testSamples: the samples to be classified
            featIdx: indices of the features to use for classification
            tau: regularization parameter
        Outputs:
            predLabels: the class
            scores: the decision value for each class
    """
    # Get information from the data
    nbTestSpl = testSamples.shape[0]  # Number of testing samples

    # Initialization
    scores = sp.empty((nbTestSpl, self.C))

    # If not specified, predict with all features
    if featIdx is None:
        idx = range(testSamples.shape[1])
    else:
        idx = list(featIdx)

    # Allocate storage for the eigenvalue decomposition
    if self.idxDecomp != idx:
        self.vp = sp.empty((self.C, len(idx)))           # array of eigenvalues
        self.Q = sp.empty((self.C, len(idx), len(idx)))  # array of eigenvectors
        flagDecomp = True
    else:
        flagDecomp = False

    # Start the prediction for each class
    for c in xrange(self.C):
        testSamples_c = testSamples[:, idx] - self.mean[c, idx]

        if flagDecomp:
            self.vp[c, :], self.Q[c, :, :], _ = self.decomposition(self.cov[c, idx, :][:, idx])

        regvp = self.vp[c, :] + tau
        logdet = sp.sum(sp.log(regvp))
        cst = logdet - 2 * sp.log(self.prop[c])  # Pre-compute the constant term

        # Compute ||lambda^{-0.5} q^T (x - mu)||^2 + cst for all samples
        scores[:, c] = sp.sum(
            sp.square(sp.dot((self.Q[c, :, :] / sp.sqrt(regvp)).T, testSamples_c.T)),
            axis=0) + cst

        del testSamples_c
    self.idxDecomp = idx

    # Assign the label to the minimum value of scores
    predLabels = sp.argmin(scores, 1) + 1
    return predLabels, scores
