def fast_ica(image, components):
    """Reconstruct an image from Fast ICA compression using a specific number of components

    Args:
        image: PIL Image, Numpy array or path of 3D image
        components: Number of components used for reconstruction

    Returns:
        Reconstructed image

    Example:

    >>> from PIL import Image
    >>> import numpy as np
    >>> from ipfml.processing import reconstruction
    >>> image_values = Image.open('./images/test_img.png')
    >>> reconstructed_image = reconstruction.fast_ica(image_values, 25)
    >>> reconstructed_image.shape
    (200, 200)
    """
    lab_img = transform.get_LAB_L(image)
    lab_img = np.array(lab_img, 'uint8')

    # use the number of components requested by the caller
    ica = FastICA(n_components=components)

    # run ICA on the image and project it onto the independent components
    image_ica = ica.fit_transform(lab_img)

    # reconstruct the image from the independent components
    restored_image = ica.inverse_transform(image_ica)

    return restored_image
def test_inverse_transform(): # Test FastICA.inverse_transform n_features = 10 n_samples = 100 n1, n2 = 5, 10 rng = np.random.RandomState(0) X = rng.random_sample((n_samples, n_features)) expected = {(True, n1): (n_features, n1), (True, n2): (n_features, n2), (False, n1): (n_features, n2), (False, n2): (n_features, n2)} for whiten in [True, False]: for n_components in [n1, n2]: n_components_ = (n_components if n_components is not None else X.shape[1]) ica = FastICA(n_components=n_components, random_state=rng, whiten=whiten) with warnings.catch_warnings(record=True): # catch "n_components ignored" warning Xt = ica.fit_transform(X) expected_shape = expected[(whiten, n_components_)] assert_equal(ica.mixing_.shape, expected_shape) X2 = ica.inverse_transform(Xt) assert_equal(X.shape, X2.shape) # reversibility test in non-reduction case if n_components == X.shape[1]: assert_array_almost_equal(X, X2)
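The reversibility property checked by the test above can be sketched in isolation. This is a minimal illustration, not part of the original test suite: the helper name `roundtrip_error`, the seed and `max_iter=1000` are assumptions chosen for the example, and `whiten` is left at the library default.

```python
import warnings

import numpy as np
from sklearn.decomposition import FastICA


def roundtrip_error(X, n_components, seed=0):
    """Mean squared error between X and inverse_transform(fit_transform(X)).

    Hypothetical helper for illustration only.
    """
    ica = FastICA(n_components=n_components, random_state=seed, max_iter=1000)
    with warnings.catch_warnings():
        # convergence warnings do not affect the validity of the inverse
        warnings.simplefilter("ignore")
        Xt = ica.fit_transform(X)
    X2 = ica.inverse_transform(Xt)
    return float(np.mean((X - X2) ** 2))


rng = np.random.RandomState(0)
X_demo = rng.random_sample((100, 10))

# all 10 components kept: the round trip should reproduce X up to rounding
full_err = roundtrip_error(X_demo, 10)
# only 5 components kept: information is discarded, so some error remains
reduced_err = roundtrip_error(X_demo, 5)
```

With as many components as features the round trip is expected to be lossless, which is why the tests only assert equality in the non-reduction case.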
def ICASingleTrial(X):
    """
    OD-ICA: remove ocular (EOG) artifacts from single-trial sample data

    Parameters
    ----------
    X : T sample points x N channels

    Returns
    ----------
    reconSetZero : T sample points x N channels
    """
    # pca = PCA()
    pca_w = PCA(whiten=True)
    ica = FastICA()
    # fit_transform
    # X: array-like, shape (n_samples, n_features)
    # returns: array-like, shape (n_samples, n_components)
    # PCAsig = pca.fit_transform(X)
    PCAsig_W = pca_w.fit_transform(X)
    ICAsig = ica.fit_transform(PCAsig_W)
    setZero = np.zeros(ICAsig.shape)
    for i in range(ICAsig.shape[1]):
        isOutlier = ADJBP(ICAsig[:, i])
        vector = ICAsig[:, i]  # take one column of the matrix
        vector[isOutlier] = 0
        setZero[:, i] = vector
    reconSetZero = pca_w.inverse_transform(ica.inverse_transform(setZero))
    return reconSetZero
def test_inverse_transform():
    """Test FastICA.inverse_transform"""
    rng = np.random.RandomState(0)
    X = rng.random_sample((100, 10))
    n_features = X.shape[1]
    expected = {(True, 5): (n_features, 5),
                (True, 10): (n_features, 10),
                (False, 5): (n_features, 10),
                (False, 10): (n_features, 10)}
    for whiten in [True, False]:
        for n_components in [5, 10]:
            ica = FastICA(n_components=n_components, random_state=rng,
                          whiten=whiten)
            Xt = ica.fit_transform(X)
            expected_shape = expected[(whiten, n_components)]
            assert_equal(ica.mixing_.shape, expected_shape)
            X2 = ica.inverse_transform(Xt)
            assert_equal(X.shape, X2.shape)

            # reversibility test in non-reduction case
            if n_components == X.shape[1]:
                assert_array_almost_equal(X, X2)
def _ica(self, batch, num_comps=4): ica = FastICA(n_components=batch.shape[1], max_iter=300) comps = ica.fit_transform(batch) m = (-np.abs(comps)).min(axis=0) selected = np.argsort(m)[:num_comps] comps[:, selected] = 0 return ica.inverse_transform(comps)
def PerformIca(X,Y,num_components,random_state): result = {} algo = FastICA(random_state=random_state,max_iter=800) algo.fit(X) full_mixing_matrix = algo.mixing_ full_unmixing_matrix = algo.components_ _x = algo.transform(X) kt_value = np.abs(kt(_x)) largest_kt_values_idx = np.argsort(kt_value)[::-1] result["ica_kt_all"] = kt_value for n in num_components: prefix = "ica_" + str(n) + "_" component_idx_to_select = largest_kt_values_idx[0:n] mixing_matrix = full_mixing_matrix.T[component_idx_to_select,:].T unmixing_matrix = full_unmixing_matrix[component_idx_to_select,:] algo.components_ = unmixing_matrix algo.mixing_ = mixing_matrix result[prefix+"mm"] = mixing_matrix result[prefix+"umm"] = unmixing_matrix _x = algo.transform(X) result[prefix+"data"] = _x X_recons = algo.inverse_transform(_x) result[prefix+"reconstruction_error"] = ComputeReconstructionSSE(X,X_recons) n_kt_value = kt_value[component_idx_to_select] avg_kt = n_kt_value.mean() #print("ICA num dim {0} : reconstruction error {1} avg kt {2}".format(str(n),str(result[prefix+"reconstruction_error"]),str(avg_kt))) #print(np.sort(n_kt_value)) return result
def test_inverse_transform(whiten, n_components, expected_mixing_shape, global_random_seed, global_dtype): # Test FastICA.inverse_transform n_samples = 100 rng = np.random.RandomState(global_random_seed) X = rng.random_sample((n_samples, 10)).astype(global_dtype) ica = FastICA(n_components=n_components, random_state=rng, whiten=whiten) with warnings.catch_warnings(): # For some dataset (depending on the value of global_dtype) the model # can fail to converge but this should not impact the definition of # a valid inverse transform. warnings.simplefilter("ignore", ConvergenceWarning) Xt = ica.fit_transform(X) assert ica.mixing_.shape == expected_mixing_shape X2 = ica.inverse_transform(Xt) assert X.shape == X2.shape # reversibility test in non-reduction case if n_components == X.shape[1]: # XXX: we have to set atol for this test to pass for all seeds when # fitting with float32 data. Is this revealing a bug? if global_dtype: # XXX: dividing by a smaller number makes # tests fail for some seeds. atol = np.abs(X2).mean() / 1e5 else: atol = 0.0 # the default rtol is enough for float64 data assert_allclose(X, X2, atol=atol)
def ica_experiment(X, name, dims, max_iter=5000, tol=1e-04): """Run ICA on specified dataset and saves mean kurtosis results as CSV file. Args: X (Numpy.Array): Attributes. name (str): Dataset name. dims (list(int)): List of component number values. """ ica = FastICA(random_state=0, max_iter=max_iter, tol=tol) kurt = [] loss = [] X = StandardScaler().fit_transform(X) for dim in dims: print(dim) ica.set_params(n_components=dim) tmp = ica.fit_transform(X) df = pd.DataFrame(tmp) df = df.kurt(axis=0) kurt.append(kurtosistest(tmp).statistic.mean()) proj = ica.inverse_transform(tmp) loss.append(((X - proj)**2).mean()) res = pd.DataFrame({"kurtosis": kurt, "loss": loss}) # save results as CSV resdir = 'results/ICA' resfile = get_abspath('{}_kurtosis.csv'.format(name), resdir) res.to_csv(resfile, index_label='n')
def reconstruction_error(X): pca = PCA(n_components=0.95, svd_solver='full') X_pca = pca.fit_transform(X) X_pca_proj = pca.inverse_transform(X_pca) print(pca.n_components_) pca_mse = calc_mse(X, X_pca_proj) print("Recontruction error for PCA : ", pca_mse) ica = FastICA(n_components=17, tol=0.0001) X_ica = ica.fit_transform(X) X_ica_proj = ica.inverse_transform(X_ica) ica_mse = calc_mse(X, X_ica_proj) print("Recontruction error for ICA : ", ica_mse) # rp = GaussianRandomProjection(n_components=16) # X_rp = rp.fit_transform(X) # X_rp_proj = np.matmul(X_rp, rp.components_) # rp_mse = calc_mse(X, X_rp_proj) # # print("Recontruction error for RP : ", rp_mse) tsvd = TruncatedSVD(n_components=16) X_tsvd = tsvd.fit_transform(X) X_tsvd_proj = tsvd.inverse_transform(X_tsvd) tsvd_mse = calc_mse(X, X_tsvd_proj) print("Recontruction error for TSVD : ", tsvd_mse)
def test_inverse_transform(): # Test FastICA.inverse_transform n_features = 10 n_samples = 100 n1, n2 = 5, 10 rng = np.random.RandomState(0) X = rng.random_sample((n_samples, n_features)) expected = { (True, n1): (n_features, n1), (True, n2): (n_features, n2), (False, n1): (n_features, n2), (False, n2): (n_features, n2) } for whiten in [True, False]: for n_components in [n1, n2]: n_components_ = (n_components if n_components is not None else X.shape[1]) ica = FastICA(n_components=n_components, random_state=rng, whiten=whiten) with warnings.catch_warnings(record=True): # catch "n_components ignored" warning Xt = ica.fit_transform(X) expected_shape = expected[(whiten, n_components_)] assert ica.mixing_.shape == expected_shape X2 = ica.inverse_transform(Xt) assert X.shape == X2.shape # reversibility test in non-reduction case if n_components == X.shape[1]: assert_array_almost_equal(X, X2)
class ICA(method.Method): def __init__(self, params): self.params = params self.ica = FastICA(**params) def __str__(self): return "FastICA" def train(self, data): """ Train the FastICA on the withened data :param data: whitened data, ready to use """ self.ica.fit(data) def encode(self, data): """ Encodes the ready to use data :returns: encoded data with dimension n_components """ return self.ica.transform(data) def decode(self, components): """ Decode the data to return whitened reconstructed data :returns: reconstructed data """ return self.ica.inverse_transform(components)
def remove_blinks(x, fs=500 / 3): ica = FastICA(tol=0.1, max_iter=500).fit(x) x_ica = ica.transform(x) # applies unmixing matrix to x win_length = 200 for i, col in enumerate(x_ica.T): # iterate columns num_blinks = 0 for j in range(0, len(col) - win_length, win_length): window = col[j:j + win_length] # window through signal is_blink = detect_blink_peak(window) # detect blink in window # count number of blinks if is_blink: num_blinks += 1 # compute kurtosis of whole signal kur = kurtosis(col) # compute hjorth complexity comp = hjorth_complexity(col)[0][0] # blink artifact source decision rule if num_blinks >= len(x_ica) / (fs * 40) and kur > 5 and comp > 3: x_ica[:, i] = 0 # apply inverse transformation x_clean = ica.inverse_transform(x_ica) return x_clean
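The artifact-removal pattern used above (unmix, zero the offending component, remix) can be tried on synthetic data. The sketch below is only an illustration under stated assumptions: the helper `remove_most_kurtotic_component` and the demo signals are hypothetical, and picking the single component with the highest kurtosis is a simplification that stands in for the blink, kurtosis and Hjorth-complexity decision rule of `remove_blinks`.

```python
import numpy as np
from scipy.stats import kurtosis
from sklearn.decomposition import FastICA


def remove_most_kurtotic_component(X, seed=0):
    """Zero the estimated source with the highest kurtosis and remix.

    Hypothetical helper: X has shape (n_samples, n_channels).
    """
    ica = FastICA(n_components=X.shape[1], random_state=seed, max_iter=1000)
    S = ica.fit_transform(X)                    # estimated sources
    worst = int(np.argmax(kurtosis(S, axis=0)))  # spiky sources score high
    S_clean = S.copy()
    S_clean[:, worst] = 0                       # drop that source entirely
    return ica.inverse_transform(S_clean), worst


# synthetic demo: two smooth sources plus one sparse, spiky "blink" source
rng = np.random.RandomState(0)
t = np.linspace(0, 8, 2000)
s1 = np.sin(2 * t)
s2 = np.sign(np.sin(3 * t))
s3 = np.zeros_like(t)
s3[::250] = 10.0
S_true = np.c_[s1, s2, s3] + 0.05 * rng.normal(size=(2000, 3))
A = np.array([[1.0, 0.5, 1.5],
              [0.5, 2.0, 1.0],
              [1.5, 1.0, 2.0]])                 # assumed mixing matrix
X_demo = S_true @ A.T

X_clean, removed = remove_most_kurtotic_component(X_demo)
```

Working on a copy of the sources keeps the original decomposition available, which helps when the rejection rule needs tuning.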
def ICA(input_data_folder, experiment): path_input_images = f"{input_data_folder}/{experiment}/images/" try: os.mkdir(f'{input_data_folder}/{experiment}/ICAimages/') except FileExistsError: pass path_output = f"{input_data_folder}/{experiment}/ICAimages" images = images_list(path_input_images) for img, image_nb in tqdm(zip(images, range(len(images)))): # image = images[0] image = io.imread(f"{path_input_images}/{img}", as_gray=True) ica = FastICA(n_components=20) # whiten=True, # max_iter = 2000, # tol = 0.01) image_ica = ica.fit_transform(image) image_restored = ica.inverse_transform(image_ica) # image_restored = image_restored.astype(np.uint8) # show image to screen # io.imshow(image_ica) io.imsave(f'{path_output}/image{image_nb}.png', image_ica)
def solve_ica2(A, B, p=3, q=None): """Solve AX=B with ICA """ ica = FastICA(n_components=p) ica.fit(A) Ap = ica.inverse_transform(ica.transform(A)) if q: ica = FastICA(n_components=q) Bq = ica.inverse_transform(ica.transform(B)) Y = LA.lstsq(Ap, Bq, rcond=None)[0] X = Y @ Wh[:q,:] else: Y = LA.lstsq(W, B, rcond=None)[0] X = Y return X, ''
def ICA_decomposition(data,keepVar=0.95): ica=FastICA(whiten = True, max_iter = 300, tol = 0.0001) demixed=ica.fit_transform(data) M=ica.mixing_ var_scores=ordered_ICAcomps(data, M, demixed) varSum=0 n_comp=0 for value in var_scores[:,0]: varSum = varSum + value n_comp = n_comp + 1 if varSum >= keepVar: break selected_comps=var_scores[:n_comp,1] return ica, selected_comps remix=ica.inverse_transform(demixed) varSum = 0 for nc in range(0,data.shape[1]): varSum = varSum + explained_variance_score(data,data-demixed[:,nc]) n_comp = n_comp + 1 if varSum >= keepVar: break return ica, n_comp
def perform_ica(features, datasetLabel, components): print('ica start for ', datasetLabel) transformer = FastICA(max_iter=550, random_state=0, whiten=True) X_transformed = transformer.fit_transform(features) print(X_transformed.shape) unmodified_kurtosis = kurtosis(features) print('Unmodified_kurtosis ', unmodified_kurtosis) X_transformed = pd.DataFrame(X_transformed) feature_kurtosis = X_transformed.kurt(axis=0) print('Modified_kurtosis ', feature_kurtosis) recon_error = [] # validate RMSE for reconstruction for component in components: transformer = FastICA(max_iter=550, random_state=20, whiten=True, n_components=component) X_transformed = transformer.fit_transform(features) X_recon = transformer.inverse_transform(X_transformed) rmse = sqrt(mean_squared_error(features, X_recon)) recon_error.append(rmse) print('recon_error => ', recon_error) plt.style.use("seaborn") plt.plot(components, recon_error, marker='o') plt.xticks(components, rotation="90") plt.xlabel("ICA Components") plt.ylabel('Reconstruction error') plt.savefig('plots/dr/ica/' + datasetLabel + '/ica_recon_error.png') plt.clf()
def ica_dim_red(x_train_scaled, dataset_name, features_num=12):
    # random_state is expected to be defined at module level
    ica = FastICA(random_state=random_state)
    temp = ica.fit_transform(x_train_scaled)

    # order the components by decreasing absolute kurtosis
    order = [-abs(kurtosis(temp[:, i])) for i in range(temp.shape[1])]
    temp = temp[:, np.array(order).argsort()]
    ica_res = pd.Series([abs(kurtosis(temp[:, i])) for i in range(temp.shape[1])])

    l = plt.bar(list(range(len(ica_res))), ica_res, log=True)
    plt.title("ICA Feature Kurtosis (" + str(dataset_name) + ")")
    plt.ylabel("Kurtosis")
    plt.xlabel("Features (ordered by kurtosis)")
    plt.savefig((str(dataset_name)) + ' ica analysis.png')
    plt.show()
    # print(temp)
    print("List of features with kurtosis <= 3")
    print(np.where(np.log10(ica_res) <= 0.5)[0])

    ica = FastICA(n_components=features_num, random_state=random_state)
    ica_result = ica.fit_transform(x_train_scaled)
    print(ica_result.shape)
    x_projected_ica = ica.inverse_transform(ica_result)
    print(x_projected_ica.shape)
    print(x_train_scaled.shape)
    loss = ((x_train_scaled - x_projected_ica) ** 2).mean()
    print(loss)
    return ica_result, x_projected_ica
def ica_filter(field, nmodes, return_filter=False, **kwargs_ica):
    """
    Apply an Independent Component Analysis (ICA) filter to a field. This
    subtracts off functions in the frequency direction that correspond to the
    highest SNR *statistically independent* modes of the empirical
    frequency-frequency covariance.

    Uses `sklearn.decomposition.FastICA`. For more details, see:
    https://scikit-learn.org/stable/modules/generated/sklearn.decomposition.FastICA.html

    Parameters:
        field (array_like):
            3D array containing the field that the filter will be applied to.
            NOTE: This assumes that the 3rd axis of the array is frequency.
        nmodes (int):
            Number of eigenmodes to filter out.
        return_filter (bool, optional):
            Whether to also return the fitted ICA filter instance.
        **kwargs_ica (dict, optional):
            Keyword arguments for the `sklearn.decomposition.FastICA`

    Returns:
        cleaned_field (array_like), transformer (sklearn.decomposition.FastICA instance, optional):
            Foreground-filtered field and ICA filter object.

            - ``cleaned_field (array_like)``:
              Foreground-cleaned field.
            - ``transformer (sklearn.decomposition.FastICA instance, optional)``:
              Contains the ICA filter. Only returned if `return_filter = True`.
              To get the foreground model, you can do the following:
              ```
              x = field - mean_field # shape (Nfreq, Npix)
              x_trans = transformer.fit_transform(x.T) # mode amplitudes per pixel
              x_fg = transformer.inverse_transform(x_trans).T # foreground model
              ```
    """
    # Subtract mean vs. frequency
    x = mean_spectrum_filter(field).reshape((-1, field.shape[-1])).T

    # Build ICA model and get amplitudes for each mode per pixel
    transformer = FastICA(n_components=nmodes, **kwargs_ica)
    x_trans = transformer.fit_transform(x.T)

    # Construct foreground operator
    x_fg = transformer.inverse_transform(x_trans).T

    # Subtract foreground operator
    x_clean = (x - x_fg).T.reshape(field.shape)

    # Return FG-subtracted data (and, optionally, the ICA filter instance)
    if return_filter:
        return x_clean, transformer
    else:
        return x_clean
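The foreground-subtraction idea in `ica_filter` can be exercised on a synthetic field. This is a rough sketch under assumptions: `ica_filter_sketch` is a hypothetical stand-in that skips `mean_spectrum_filter` (not shown in this collection) and relies on FastICA's own per-frequency mean removal, and the power-law foreground and noise level are made up for the demo.

```python
import warnings

import numpy as np
from sklearn.decomposition import FastICA


def ica_filter_sketch(field, nmodes, seed=0):
    """Subtract the nmodes-component ICA reconstruction from a 3D field.

    Hypothetical simplification: the last axis is assumed to be frequency.
    """
    nfreq = field.shape[-1]
    x = field.reshape((-1, nfreq))               # (Npix, Nfreq)
    transformer = FastICA(n_components=nmodes, random_state=seed, max_iter=1000)
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")          # ignore convergence warnings
        x_trans = transformer.fit_transform(x)   # mode amplitudes per pixel
    # foreground model; inverse_transform adds the mean spectrum back in
    x_fg = transformer.inverse_transform(x_trans)
    return (x - x_fg).reshape(field.shape), transformer


# synthetic field: two smooth power-law foregrounds plus faint white noise
rng = np.random.RandomState(0)
freqs = np.linspace(1.0, 2.0, 16)
amp1 = rng.uniform(50, 100, size=(8, 8, 1))
amp2 = rng.uniform(10, 30, size=(8, 8, 1))
foreground = amp1 * freqs ** -2.7 + amp2 * freqs ** -2.1
noise = rng.normal(0.0, 0.1, size=(8, 8, 16))
field_demo = foreground + noise

cleaned_demo, ica_model = ica_filter_sketch(field_demo, nmodes=2)
```

Because the reconstruction spans only `nmodes` directions in frequency space, the residual should keep most of the noise while losing the bright, smooth foreground.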
def dump_data(data_path, task, reduce_sizes, trials=10): X, y, _, _ = load_data(data_path, is_shuffle=True, is_split=False) pca_components = reduce_sizes[0] pca = PCA(n_components=pca_components, random_state=10) X_PCA = pca.fit_transform(X) X_reconstructed = pca.inverse_transform(X_PCA) print("Reconstruction Error for PCA: %.6f" % np.mean( (X - X_reconstructed)**2)) data = np.hstack((X_PCA, np.array([y]).T)) PCA_path = create_path('data', task, filename='PCA.csv') np.savetxt(PCA_path, data, delimiter=",") ica_components = reduce_sizes[1] ica = FastICA(n_components=ica_components, random_state=10) X_ICA = ica.fit_transform(X) X_reconstructed = ica.inverse_transform(X_ICA) print("Reconstruction Error for ICA: %.6f" % np.mean( (X - X_reconstructed)**2)) data = np.hstack((X_ICA, np.array([y]).T)) ICA_path = create_path('data', task, filename='ICA.csv') np.savetxt(ICA_path, data, delimiter=",") rp_components = reduce_sizes[2] re_list = [] min_re_error = float("inf") X_RP = None for i in range(trials): rp = GaussianRandomProjection(n_components=rp_components) rp.fit(X) X_transformed = rp.transform(X) c_square = np.dot(rp.components_.T, rp.components_) X_reconstructed = np.dot(X_transformed, rp.components_) error = np.mean((X - X_reconstructed)**2) if error < min_re_error: min_re_error = error X_RP = X_transformed re_list.append(error) print(np.mean(re_list)) print(np.std(re_list)) print("Reconstruction Error for RP: %.6f" % min_re_error) data = np.hstack((X_RP, np.array([y]).T)) RP_path = create_path('data', task, filename='RP.csv') np.savetxt(RP_path, data, delimiter=",") mi_components = reduce_sizes[3] X_MI = SelectKBest(mutual_info_classif, k=mi_components).fit_transform(X, y) data = np.hstack((X_MI, np.array([y]).T)) MI_path = create_path('data', task, filename='MI.csv') np.savetxt(MI_path, data, delimiter=",")
def main(): # features = ['age', 'workclass', 'fnlwgt', 'education', 'education-num', 'marital-status', 'occupation', # 'relationship', 'race', 'sex', 'capital-gain', 'capital-loss', 'hours-per-week', # 'native-country', '<=50k'] # df = pd.read_csv('./adult-small.data', # names=features) # df.dropna() # df.drop_duplicates() # df = df[df['workclass'] != '?'] # df = df[df['occupation'] != '?'] # df = df[df['education'] != '?'] # df = df[df['marital-status'] != '?'] # df = df[df['relationship'] != '?'] # df = df[df['race'] != '?'] # df = df[df['sex'] != '?'] # df = df[df['native-country'] != '?'] # X = pd.get_dummies(df, columns=['workclass', 'education', 'marital-status', 'occupation', 'relationship', 'race', 'sex', 'native-country']) # X['<=50k'] = X['<=50k'].map({'<=50K':1, '>50K': 0}) # y = X['<=50k'] # X = X.drop(['<=50k'], axis=1) df = pd.read_csv('./bank-additional.csv', delimiter=';') df.dropna() df.drop_duplicates() X = pd.get_dummies(df, columns=[ 'job', 'marital', 'education', 'default', 'housing', 'loan', 'contact', 'month', 'day_of_week', 'poutcome' ]) X.dropna() X['y'].value_counts() X['y'] = X['y'].map({'yes': 1, 'no': 0}) y = X['y'] X = X.drop(['y'], axis=1) kurt_arr = [] loss_arr = [] fig, (ax1, ax2) = plt.subplots(2, 1, sharex=True) for i in range(X.shape[1]): print("Doing {} components...".format(i + 1)) ica = FastICA(whiten=True, n_components=i + 1) X_transformed = ica.fit_transform(X) print(kurtosis(X_transformed, axis=1)) print(np.average(kurtosis(X_transformed, axis=1))) kurt_arr.append(np.average(kurtosis(X_transformed, axis=1))) X_reconstructed = ica.inverse_transform(X_transformed) loss_arr.append(((X - X_reconstructed)**2).mean().sum()) ax1.plot(np.arange(X.shape[1]), kurt_arr, label="kurtosis") ax1.set_ylabel("Avg Kurtosis") ax2.plot(np.arange(X.shape[1]), loss_arr, label="reconstruction error") ax2.set_ylabel("Avg RMSE") plt.xlabel("n_components") fig.suptitle("Kurtosis vs ICA Dimensionality Reduction DS2") plt.savefig("icads2.png")
def recon_ica(data, targeet, targets, name):
    components = 42
    if len(data[0]) < 42:
        components = 7
    ica = FastICA(n_components=components, random_state=0, max_iter=100000)
    X_r = ica.fit(data).transform(data)
    recon = ica.inverse_transform(X_r)
    mse = ((data - recon)**2).mean()
    print('ICA reconstruction with %s components is %s' % (components, mse))
def var_test_ica(flux_arr_orig, exposure_list, wavelengths, low_n=3, hi_n=100, n_step=1,
                 show_plots=False, show_summary_plot=False, save_summary_plot=True,
                 test_ind=7, real_time_progress=False, idstr=None):
    start_ind = np.min(np.nonzero(flux_arr_orig[test_ind]))
    end_ind = np.max(np.nonzero(flux_arr_orig[test_ind]))

    perf_table = Table(names=["n", "avg_diff2", "max_diff_scaled"], dtype=["i4", "f4", "f4"])
    if hi_n > flux_arr_orig.shape[0]-1:
        hi_n = flux_arr_orig.shape[0]-1

    for n in range(low_n, hi_n, n_step):
        ica = FastICA(n_components=n, whiten=True, max_iter=750, random_state=1234975)
        test_arr = flux_arr_orig[test_ind].copy()

        flux_arr = np.vstack([flux_arr_orig[:test_ind], flux_arr_orig[test_ind+1:]])
        ica_flux_arr = flux_arr.copy()  # keep back one for testing

        ica.fit(ica_flux_arr)
        ica_trans = ica.transform(test_arr.copy(), copy=True)
        ica_rev = ica.inverse_transform(ica_trans.copy(), copy=True)

        avg_diff2 = np.ma.sum(np.ma.power(test_arr-ica_rev[0], 2)) / (end_ind-start_ind)
        max_diff_scaled = np.ma.max(np.ma.abs(test_arr-ica_rev[0])) / (end_ind-start_ind)
        perf_table.add_row([n, avg_diff2, max_diff_scaled])

        if real_time_progress:
            print("n: {:4d}, avg (diff^2): {:0.5f}, scaled (max diff): {:0.5f}".format(n, avg_diff2, max_diff_scaled))

        if show_plots:
            plt.plot(wavelengths, test_arr)
            plt.plot(wavelengths, ica_rev[0])
            plt.plot(wavelengths, test_arr-ica_rev[0])
            plt.legend(['orig', 'ica', 'orig-ica'])
            plt.xlim((wavelengths[start_ind], wavelengths[end_ind]))
            plt.title("n={}, avg (diff^2)={}".format(n, avg_diff2))
            plt.tight_layout()
            plt.show()
            plt.close()

    if show_summary_plot or save_summary_plot:
        plt.plot(perf_table['n'], perf_table['avg_diff2'])
        plt.plot(perf_table['n'], perf_table['max_diff_scaled'])
        plt.title("performance")
        plt.tight_layout()
        if show_summary_plot:
            plt.show()
        if save_summary_plot:
            if idstr is None:
                idstr = random.randint(1000000, 9999999)
            plt.savefig("ica_performance_{}.png".format(idstr))
        plt.close()

    return perf_table
def filter_with_ica(X):
    # pca = PCA(0.90).fit(X)
    pca = PCA().fit(X)
    n_c = pca.n_components_
    # n_c = 2
    ica = FastICA(n_components=n_c)
    # fit_transform fits the model and projects X in a single step
    eeg_ica = ica.fit_transform(X)
    eeg_restored = ica.inverse_transform(eeg_ica)
    return eeg_restored
def fast_ICA_fit(n_components, train, test, shape): # Set and fit FastICA fica = FastICA(n_components=n_components) fica.fit(train) # Reduce dimension test_reduced = fica.transform(test) # Recover data from the lower dimension test_recovered = fica.inverse_transform(test_reduced) # Calculate the MSE mse = np.mean((test_recovered - test)**2) # Reshape into a matrix test_recovered = test_recovered.reshape(shape) return fica, test_recovered, mse
def ica_analysis(spectra, nComps=5):
    sdssFluxes = get_data_spectra(spectra)
    ica = FastICA(n_components=nComps)
    # fit the model and get the per-spectrum component weights in a single step
    weights = ica.fit_transform(sdssFluxes)
    comps = ica.components_.transpose()
    # equivalent to weights.dot(ica.mixing_.T) + ica.mean_
    reconFluxes = ica.inverse_transform(weights)
    loss = ((sdssFluxes - reconFluxes)**2).mean()
    spectraICA = ica_spectra_dict(spectra, weights, reconFluxes)
    return spectraICA, ica.mixing_, ica.mean_
class FastICA(): '''This class implements the Fast ICA algorithm to remove artifacts it may do this automatically or the user can select the components to remove''' def __init__(self, signals, t, fr): self.signals = signals self.components = [] self.amUnits = [] self.duration = t self.frequency = fr self.selectedComponents = [] self.icaParameters = [] # actual FastICA algorithm part 1: just creating matrix of independent components def separateComponents(self): self.ica = ICA(n_components=len(self.signals[0]), max_iter=300) self.components = np.matrix.transpose( self.ica.fit_transform(self.signals)) self.amUnits = [np.amax(self.components), np.amin(self.components)] self.selectedComponents = list(range(len(self.components))) # recreates signals with the independent components selected def recreateSignals(self): # modify the mixing matrix so it only adds the selected components for i in range(len(self.components)): if not self.isSelected(i): # turn to 0 all this component self.components[i] = [0.0] * len(self.components[i]) self.signals = self.ica.inverse_transform( np.matrix.transpose(np.array(self.components))) # transposing self.signals = np.matrix.transpose(self.signals) def isSelected(self, i): for selected in self.selectedComponents: if i == selected: return True return False # returns the components so user can see them and select manually def getComponents(self): return self.components # sets the components to recreate signals after user selects them def setComponents(self, components): self.selectedComponents = components # returns the signals in there current state def getSignals(self): return self.signals
def test_inverse_transform(whiten, n_components, expected_mixing_shape): # Test FastICA.inverse_transform n_samples = 100 rng = np.random.RandomState(0) X = rng.random_sample((n_samples, 10)) ica = FastICA(n_components=n_components, random_state=rng, whiten=whiten) Xt = ica.fit_transform(X) assert ica.mixing_.shape == expected_mixing_shape X2 = ica.inverse_transform(Xt) assert X.shape == X2.shape # reversibility test in non-reduction case if n_components == X.shape[1]: assert_array_almost_equal(X, X2)
def PerformIca2(X,Y,num_components,random_state): result = {} for n in num_components: prefix = "ica_" + str(n) + "_" algo = FastICA(n_components=n,random_state=random_state) algo.fit(X) result[prefix+"algo"] = algo _x = algo.transform(X) X_recons = algo.inverse_transform(_x) result[prefix+"reconstruction_error"] = ComputeReconstructionSSE(X,X_recons) kt_value = np.abs(kt(_x)) avg_kt = kt_value.mean() print("ICA num dim {0} : reconstruction error {1} avg kt {2}".format(str(n),str(result[prefix+"reconstruction_error"]),str(avg_kt))) print(np.sort(kt_value)) return result
def wavelet_BSS(data): raw = data tp,channels = raw.shape wv = pywt.Wavelet('dmey') reduced_tp = int(pywt.dwt_coeff_len(data_len=tp, filter_len=wv.dec_len, mode='symmetric')) WT = np.zeros((reduced_tp,channels-1)) #coeff = np.zeros((22)) #coeffcients from wavelet transformation coeff = np.zeros((reduced_tp,channels-1)) for c in range(channels-1): #exclude the markers A,B = wavelet_trans(raw[:,c],wv) WT[:,c] = A #coeff[c] = B coeff[:,c] = B n=19 #general neurology techniques say to keep as many channels as possible #Essentially, remove the noisiest channel #The remaining channels will be cleaned # Z = WT #Define source space as wavelet data ica = FastICA(n_components = n-1,whiten=True) #use all components but one Z_ = ica.fit_transform(Z) #Estimated Source Space A_ = ica.mixing_ #Retrieve estimated mixing matrix with reduced dimension W_p = np.linalg.pinv(A_) #forced inverse of modified mixing matrix is the unmixing matrix #X_filt = np.matmul(Z_,A_) ####create cleaned stack with outlier_detection#### weights = np.zeros((reduced_tp,Z_.shape[1])) stacks = np.zeros((reduced_tp,Z_.shape[1])) for c_ in range(Z_.shape[1]): inp = Z_[:,c_] weights[:,c_],stacks[:,c_] = outlier_detection(inp) stacks = robust_referencing(stacks,weights) cleaned_Z_ = stacks.reshape((Z_.shape[0],Z_.shape[1])) inv_cleaned_Z_ = ica.inverse_transform(cleaned_Z_) inv_wavelet_cleaned_ica = np.zeros((tp,channels-1)) for c in range(channels-1): #exclude the markers cA = inv_cleaned_Z_[:,c] cD = coeff[:,c] inv_data = inverse_wavelet(cA,cD) inv_wavelet_cleaned_ica[:,c] = inv_data return inv_wavelet_cleaned_ica;
def reconstruction_similarity(X,method_name,comp = 2): if method_name == 'PCA': pca = PCA(n_components=comp) X_r = pca.fit(X).transform(X) X_inverse = pca.inverse_transform(X_r) if method_name == 'ICA': ica = FastICA(n_components=comp) X_r = ica.fit(X).transform(X) X_inverse = ica.inverse_transform(X_r) if method_name == 'RCA': rca = GaussianRandomProjection(n_components=comp) X_r = rca.fit(X).transform(X) X_inverse = np.matmul(X_r, rca.components_) similarity = cosine_similarity(X_inverse,X)[0][0] return similarity
def run_ica_and_plot(X, name, number_of_features, classification_labels): # plot recunstruction error reconstruction_error = [] for n_components in np.arange(1, number_of_features + 1): ica = FastICA(n_components=n_components) transformed_data = ica.fit_transform(X) reconstruction_error.append( np.sum(np.square(X - ica.inverse_transform(transformed_data))) / X.size) plt.figure() plt.plot(np.arange(1, number_of_features + 1), reconstruction_error) plt.xticks(np.arange(1, number_of_features + 1)) plt.xlabel('Components') plt.ylabel('Reconstruction Error') plt.title('{} : ICA Reconstruction Error'.format(name)) plt.grid() plt.savefig('plots/{}-ica-reconstruction.png'.format(name)) plt.clf()
class trafo_ica: def __init__(self, num_comp=0): self.num_comp = num_comp def transform(self, data): try: self.num_comp = len(data[0]) if self.num_comp == 0 else self.num_comp self.ica = FastICA(n_components=self.num_comp, max_iter=500, tol=0.01).fit(data) except: return data else: return self.ica.transform(data) def transform_back(self, points): try: return self.ica.inverse_transform(points) except: return points
def applyICA(label, method, X, n_components, usen, reconstructimages=False, seed=seed): print("doing %s..." % (method)) mse = [] firstimages = [] model = None n = -1 Xt = None ngratio = [] meank = [] for n in n_components: model = FastICA(n_components=n, random_state=seed) Xt = model.fit_transform(X) Xr = model.inverse_transform(Xt) mse.append(mean_squared_error(X, Xr)) firstimages.append(Xr[0, :]) k = pd.DataFrame(Xt).kurtosis().abs() meank.append(k.mean()) ngratio.append(len(k[k > 2]) * 1. / len(k)) print("done. plotting...") plot_re(label, method, mse, n_components) if reconstructimages: firstimages.insert(0, np.array(X.iloc[0, :])) plot_first_images(firstimages, n_components, method, label) print("mean kurtosis = %.5f" % (k.mean())) print("total no of components = %d" % (len(k))) print("no of components that are non-gaussian = %d" % (len(k[k > 2.]))) plot_2axis(meank, ngratio, n_components, 'mean kurtosis', 'n non-gaussian/n components', 'n components', '%s kurtosis and non-gaussian sources' % (method), '%s-%s-kurt-ng.png' % (label.replace(" ", "-"), method)) model = FastICA(n_components=usen, random_state=seed) model = model.fit(X) return model
def rec_err_plot(x, comps, title, filnam): components = np.arange(1, comps + 1, 1) err = [] for c in components: temp = [] for i in range(5): ica = FastICA(n_components=c, max_iter=500, tol=0.01) dataset = ica.fit_transform(x) transformed = ica.inverse_transform(dataset) temp.append(reconstruction_error(x, transformed)) err.append(float(sum(temp)) / len(temp)) plt.clf() plt.plot(components, err, marker='o') plt.xlabel("Components") plt.ylabel("Reconstruction Error") plt.title(title) plt.grid(b=True) plt.savefig(filnam) return
def ica(input_matrix: np.array, inverse: bool=False):
    """
    Performs ICA on an input in order to reduce dimensionality.
    """
    trials, channels, samples = np.shape(input_matrix)
    ica = FastICA(n_components=None)
    transform = ica.fit_transform(np.vstack(input_matrix))
    transform = np.reshape(transform, (-1, channels, samples))
    covariance = np.corrcoef(np.mean(transform, axis=0))
    high, low = _correlate(covariance)
    stacked = np.vstack(transform[:, low, :])
    if inverse:
        stacked = ica.inverse_transform(stacked)
    # regroup the selected (and optionally back-projected) rows per trial
    return np.reshape(stacked, (trials, -1, samples))
sp1.plot(t,signal2) plt.show() cpath='/home/bejar/MEG/Data/control/' cres='/home/bejar/Documentos/Investigacion/MEG/res/' #name='MMN-201205251030' name='control1-MEG' mats=scipy.io.loadmat( cpath+name+'.mat') data= mats['data'] chann=mats['names'] samplerate=500 length=5000 chann=120 width=25 dbuffer=data[[149,43,48,27,29,109,128],5000:10000] fica=FastICA(n_components=6,algorithm='deflation',fun='exp',max_iter=1000) res=fica.fit_transform(dbuffer.transpose()) #for i in range(res.shape[1]): # plotOneSignal(res[:,i]) plotOneSignal(res[:,0]) res[:,0]=0 inv=fica.inverse_transform(res) plotTwoSignal(data[149,5000:10000],inv[:,0])
def ica(self): fica = FastICA() utility_normal_fica = fica.fit_transform(self.ds.utility_normal) self.utility_normal_back = fica.inverse_transform(utility_normal_fica)
class ICA(object):
    """
    Wrapper for the sklearn package. Performs fast ICA (Independent Component Analysis).

    ICA has 5 methods:

    - fit(waveforms)
        update the class instance with the ICA fit

    - fit_transform(waveforms)
        do what fit() does, but additionally return the projection onto ICA space

    - inverse_transform(A)
        invert the decomposition; returns waveforms for an input A, using Z

    - get_params()
        return the metadata used for fits

    - get_basis()
        return the ICA basis vectors
    """
    def __init__(self, num_components=10,
                 catalog_name='unknown',
                 whiten=True,
                 fun='logcosh',
                 fun_args=None,
                 max_iter=600,
                 tol=.00001,
                 w_init=None,
                 random_state=None,
                 algorithm='parallel'):

        self._decomposition = 'Fast ICA'
        self._num_components = num_components
        self._catalog_name = catalog_name
        # NOTE: recent scikit-learn releases expect whiten to be
        # 'unit-variance', 'arbitrary-variance' or False rather than True.
        self._whiten = whiten
        self._fun = fun
        self._fun_args = fun_args
        self._max_iter = max_iter
        self._tol = tol
        self._w_init = w_init
        self._random_state = random_state
        self._algorithm = algorithm

        self._ICA = FastICA(n_components=self._num_components,
                            whiten=self._whiten,
                            fun=self._fun,
                            fun_args=self._fun_args,
                            max_iter=self._max_iter,
                            tol=self._tol,
                            w_init=self._w_init,
                            random_state=self._random_state,
                            algorithm=self._algorithm)

    def fit(self, waveforms):
        # TODO make sure there are more columns than rows (transpose if not)
        # normalize waveforms
        self._waveforms = waveforms
        self._ICA.fit(self._waveforms)

    def fit_transform(self, waveforms):
        # TODO make sure there are more columns than rows (transpose if not)
        # normalize waveforms
        self._waveforms = waveforms
        self._A = self._ICA.fit_transform(self._waveforms)
        return self._A

    def inverse_transform(self, A):
        # convert basis back to waveforms using fit
        new_waveforms = self._ICA.inverse_transform(A)
        return new_waveforms

    def get_params(self):
        # TODO know what catalog was used! (include waveform metadata)
        params = self._ICA.get_params()
        params['num_components'] = params.pop('n_components')
        params['Decomposition'] = self._decomposition
        return params

    def get_basis(self):
        r""" Return the ICA basis vectors (Z^\dagger)"""
        # get_mixing_matrix() was removed from FastICA; the mixing matrix
        # is exposed as the fitted attribute mixing_.
        return self._ICA.mixing_
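The basis returned by `get_basis` is the mixing matrix of the fitted estimator. A minimal sketch of how it relates to the unmixing matrix follows, assuming a scikit-learn release that exposes it as the fitted attribute `mixing_`. The `waveforms` array is random stand-in data, not a real catalog.

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.RandomState(0)
# Hypothetical catalog: 200 waveforms with 5 samples each, non-Gaussian entries.
waveforms = rng.laplace(size=(200, 5))

ica = FastICA(n_components=3, random_state=0, max_iter=1000)
A = ica.fit_transform(waveforms)  # projection onto ICA space, shape (200, 3)
basis = ica.mixing_               # shape (n_features, n_components)

# The mixing matrix is the pseudo-inverse of the unmixing matrix components_.
is_pinv = np.allclose(basis, np.linalg.pinv(ica.components_))
```

The basis has one column per retained component, so reducing `n_components` shrinks it column-wise.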
class SpatialFilter:
    def __init__(self, chanNum):
        global numComponents
        if chanNum == 14:
            numComponents = 4
        else:
            numComponents = 8
        self.ica = FastICA(n_components=numComponents, max_iter=800)
        self.chanNum = chanNum

    # Filters the signal using the ICA method; for the 64-electrode signal
    # it contains working heuristics for identifying P300 components.
    def icaFilter(self, signal):
        global numComponents
        if self.chanNum == 64:
            numComponents = 8
            channelPositions = channelPositionsDataset
            frontBottomBorder = 2
            centerLeftBorder = 2
            centerRightBorder = 9
            centerBottomBorder = 7
            centerTopBorder = 2
            leftBorder = 2
            rightBorder = 9
        else:
            numComponents = 4
            channelPositions = channelPositionsEpoc
            frontBottomBorder = 1
            centerLeftBorder = 1
            centerRightBorder = 7
            centerBottomBorder = 6
            centerTopBorder = 1
            leftBorder = 1
            rightBorder = 7

        components = self.ica.fit_transform(signal)

        ############################ Selection of P300 components ############################
        toRejectFound = []
        for i in range(numComponents):
            component = self.ica.mixing_[:, i]

            ## compute the average potentials over the individual regions of the head ##
            ## Front average potential ##
            frontChannels = channelPositions[:frontBottomBorder]
            frontChannels = [list(y for y in x if y) for x in frontChannels]
            sumFront = 0
            sumChanFront = 0
            for m in range(len(frontChannels)):
                for n in frontChannels[m]:
                    sumFront += abs(component[n-1])
                    sumChanFront += 1
            averageFront = sumFront/sumChanFront

            ## Center average potential ##
            centerChannels = channelPositions[centerTopBorder:centerBottomBorder, centerLeftBorder:centerRightBorder]
            centerChannels = [list(y for y in x if y) for x in centerChannels]
            sumCenter = 0
            sumChanCenter = 0
            for m in range(len(centerChannels)):
                for n in centerChannels[m]:
                    sumCenter += abs(component[n-1])
                    sumChanCenter += 1
            averageCenter = sumCenter/sumChanCenter

            ## Left average potential ##
            leftChannels = channelPositions[:, :leftBorder]
            leftChannels = [list(y for y in x if y) for x in leftChannels]
            sumLeft = 0
            sumChanLeft = 0
            for m in range(len(leftChannels)):
                for n in leftChannels[m]:
                    sumLeft += abs(component[n-1])
                    sumChanLeft += 1
            averageLeft = sumLeft/sumChanLeft

            ## Right average potential ##
            rightChannels = channelPositions[:, rightBorder:]
            rightChannels = [list(y for y in x if y) for x in rightChannels]
            sumRight = 0
            sumChanRight = 0
            for m in range(len(rightChannels)):
                for n in rightChannels[m]:
                    sumRight += abs(component[n-1])
                    sumChanRight += 1
            averageRight = sumRight/sumChanRight

            ratioSum = (averageFront+averageRight+averageLeft)/averageCenter
            ratioFrontalBack = (averageFront+averageCenter)/(averageLeft+averageRight)
            toUse = 0

            ### heuristics based on the spatial distribution
            if self.chanNum == 64:
                if int(ratioSum) == 1:
                    toUse = 1
                if ratioFrontalBack > 1 and ratioFrontalBack < 3 and not (averageFront/averageCenter > 2):
                    toUse = 1
                if ratioSum < 1:
                    toUse = 1
            else:
                totalAverage = abs(component.mean())
                maxChan = abs(max(component.min(), component.max(), key=abs))
                if totalAverage*3 > maxChan and ratioSum < 5:
                    toUse = 1
                if averageCenter > averageFront and averageCenter > averageRight and averageCenter > averageLeft and ratioSum < 5:
                    toUse = 1
                if ratioSum < 2:
                    toUse = 1

            if toUse == 0:
                toRejectFound.append(i+1)
                print("IC"+str(i+1)+"\t Front:"+str(int(averageFront))+"\t Center:"+str(int(averageCenter))
                      +"\t Right:"+str(int(averageRight))+"\t Left:"+str(int(averageLeft))+"\t Ratio:"+str(ratioSum))
            else:
                print("USED: IC"+str(i+1)+"\t Front:"+str(int(averageFront))+"\t Center:"+str(int(averageCenter))
                      +"\t Right:"+str(int(averageRight))+"\t Left:"+str(int(averageLeft))+"\t Ratio:"+str(ratioSum))

        #### Display the components (heatmap, time series) ######
        # visualizer = Visualizer()
        # visualizer.plotComponents(self.ica,channelPositions,components)

        #################### Remove unwanted components from the mixing matrix ##################
        compNums = toRejectFound
        compNums.sort(reverse=True)

        ### Manual component selection
        # rejected = raw_input("Enter numbers of rejected components")
        # compNums = rejected.split()
        # for m in range (len(compNums)):
        #     compNums[m] = int(compNums[m])
        # compNums.sort(reverse=True)

        # Component numbers are 1-based, mixing matrix columns are 0-based.
        for i in compNums:
            self.ica.mixing_[:, i-1] = 0
        reconstructed = self.ica.inverse_transform(components)

        ### Plot the filtered signal ###
        # visualizer.plotSignalRepairing(signal,reconstructed)
        return reconstructed.T

    # Averaged char list for one target flash type; the output is a single
    # element computed from the original N electrodes.
    # mode = 0 - reduce: from the 64 electrodes take only those in the found subset
    # mode = 1 - guess: take all electrodes; the received signal already comes only from the subset
    def grandAveragingFilter(self, isiBinList, subset, mode, isiCount=12):
        chl = [IsiBin() for i in range(isiCount)]
        chl = isiBinList
        output = []
        for i in range(len(chl)):
            #print "Grand averaging letter:",i,"\n"
            charChannList = chl[i]
            epoch = []
            for m in range(len(charChannList.channelsSignalsAveraged[0])):
                tmp = []
                if mode == 0:
                    for l in subset:
                        tmp.append(charChannList.channelsSignalsAveraged[l][m])
                else:
                    for l in range(len(charChannList.channelsSignalsAveraged)):
                        tmp.append(charChannList.channelsSignalsAveraged[l][m])
                epoch.append(np.mean(tmp))
            output.append(epoch)
        return output
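`icaFilter` rejects components by zeroing columns of `mixing_` before calling `inverse_transform`. The sketch below assumes the inverse transform is the linear map `sources @ mixing.T + mean`, which is how scikit-learn's FastICA behaves with whitening enabled. Under that assumption, zeroing a mixing column gives the same reconstruction as zeroing the matching source column. The matrices `S`, `A`, `mean` and the `reconstruct` helper are hypothetical, NumPy-only stand-ins for a fitted model.

```python
import numpy as np

rng = np.random.RandomState(0)
S = rng.laplace(size=(100, 4))  # hypothetical sources, (n_samples, n_components)
A = rng.normal(size=(6, 4))     # hypothetical mixing matrix, (n_channels, n_components)
mean = rng.normal(size=6)       # hypothetical per-channel mean


def reconstruct(sources, mixing, offset):
    """Linear inverse transform: sources @ mixing.T + offset."""
    return sources @ mixing.T + offset


rejected = [1, 3]  # zero-based indices of components to drop (illustrative)

# Variant 1: zero the rejected columns of the mixing matrix.
A_zeroed = A.copy()
A_zeroed[:, rejected] = 0

# Variant 2: zero the rejected columns of the source signals.
S_zeroed = S.copy()
S_zeroed[:, rejected] = 0

via_mixing = reconstruct(S, A_zeroed, mean)
via_sources = reconstruct(S_zeroed, A, mean)
same = np.allclose(via_mixing, via_sources)
```

Zeroing the source columns leaves the fitted estimator unmodified, which may matter if the same model is reused for later transforms.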