Python PreProcessing.remove_stop_words Examples

Programming language: Python

Namespace / package name: pre_processing

Class / Type: PreProcessing

Method / Function: remove_stop_words

Examples at hotexamples.com: 2

PreProcessing.remove_stop_words in Python: 2 examples found. These are the top-rated real-world examples of pre_processing.PreProcessing.remove_stop_words in Python, extracted from open source projects. You can rate the examples to help improve their quality.

Frequent Methods

PreProcessing(30)

__normalize__(4)

__obfuscate__(3)

build_bow(3)

compute_tfidf(3)

remove_stop_words(2)

preprocess_reviews(2)

normalize_dictionary(2)

get_stemmed_text(2)

stem(2)

clean(2)

clean_and_stem(2)

filter_and_combine(1)

pre_processing_text_for_similarity(1)

__remove_stopwords__(1)

process2(1)

process(1)

change_categories_column(1)

preprocess(1)

pre_processing_page_rank_file(1)

pre_processing_text_for_neural_network(1)

get_binary_image(1)

load_calibration_params(1)

get_undistorted_image(1)

denoise(1)

get_df_reviews(1)

get_df_meta(1)

get_sites_info(1)

Example #1

try:
    review_limit = int(args.review_limit)
except ValueError:
    raise Exception("Review limit must be a number")

if review_limit < 100:
    raise Exception("Review limit must be over 100")

# step 1 - pre processing the training data
# convert to combined pandas dataframe
# removing stopwords and stemming the review text
pre_processing = PreProcessing(limit_reviews=review_limit)
df_reviews = pre_processing.get_df_reviews()
df_meta = pre_processing.get_df_meta()
combined = pre_processing.filter_and_combine(df_reviews, df_meta)

reviews_clean = pre_processing.preprocess_reviews(
    combined['reviewTextProcessed'].tolist())
no_stop_words = pre_processing.remove_stop_words(reviews_clean)
stemmed_reviews = pre_processing.get_stemmed_text(no_stop_words)

combined['reviewTextProcessed'] = stemmed_reviews
combined = pre_processing.change_categories_column(combined)
combined.to_csv(args.output_file, sep='\t', encoding='utf-8')

# pickle the list of preprocessed reviews to file
# with open(args.output_file, 'wb') as fp:
#     pickle.dump(stemmed_reviews, fp)
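The excerpt above calls remove_stop_words on a list of cleaned review strings, but this page does not show the method body. The following is only a minimal sketch of what such a step might look like, not the project's actual implementation. The STOP_WORDS set, the tokenizing regular expression, and the choice to return space-joined strings are all assumptions made for illustration; the original project may rely on a library-provided stop word list instead.

```python
import re

# Hypothetical stop word list, for illustration only. The original project
# may use a much larger list (for example one shipped with an NLP library).
STOP_WORDS = frozenset(
    {"a", "an", "and", "in", "is", "it", "of", "the", "this", "to", "was"}
)


def remove_stop_words(reviews, stop_words=STOP_WORDS):
    """Return each review lowercased, with stop words dropped.

    `reviews` is an iterable of strings; the result is a list of
    space-joined strings in the same order.
    """
    cleaned = []
    for review in reviews:
        # Assumed tokenization: runs of lowercase letters, digits, apostrophes.
        tokens = re.findall(r"[a-z0-9']+", review.lower())
        cleaned.append(" ".join(t for t in tokens if t not in stop_words))
    return cleaned


if __name__ == "__main__":
    sample = ["This is the best phone in the world", "It was a waste of money"]
    print(remove_stop_words(sample))
```

Returning plain strings keeps the output usable as input to a later stemming step, which is how the excerpt chains remove_stop_words into get_stemmed_text.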

Example #2

File: lstm-preprocessing.py Project: jacksimmonds0/com3025_coursework

if review_limit < 100:
    raise Exception("Review limit must be over 100")

# step 1 - pre processing the training data
# convert to combined pandas dataframe
# removing stopwords and stemming the review text
pre_processing = PreProcessing(limit_reviews=review_limit)
df_reviews = pre_processing.get_df_reviews()
df_meta = pre_processing.get_df_meta()
combined = pre_processing.filter_and_combine(df_reviews, df_meta)

combined['reviewTextProcessed'] = pre_processing.preprocess_reviews(
    combined['reviewTextProcessed'])
combined['reviewTextProcessed'] = pre_processing.remove_stop_words(
    combined['reviewTextProcessed'])
combined['reviewTextProcessed'] = pre_processing.get_stemmed_text(
    combined['reviewTextProcessed'])

reviews_and_sentiment = combined[['reviewTextProcessed', 'overall']]

# convert string rating values to numerical values
reviews_and_sentiment['overall'] = pd.to_numeric(
    reviews_and_sentiment['overall'])

# convert the rating value to 1 or 0 (sentiment value)
# if the average rating is 1, 2, 3 then 0 (negative sentiment)
# if the average rating is 4 or 5 then 1 (positive sentiment)
reviews_and_sentiment['sentiment'] = reviews_and_sentiment[
    'overall'].apply(lambda x: 1 if x > 3 else 0)
reviews_and_sentiment['sentiment'] = [