Python examples for Preprocessing.create_n_gram

Programming language: Python

Namespace/package name: preprocessing

Class/type: Preprocessing

Method/function: create_n_gram

Examples on hotexamples.com: 1

Preprocessing.create_n_gram in Python: 1 example found. This is the top-rated real-world Python example of preprocessing.Preprocessing.create_n_gram, extracted from open source projects.

Example #1

File: usage.py Project: rungroj-m/NG-Autosklearn

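import pandas as pd

from preprocessing import Preprocessing
# Assumption: Classification lives in a sibling module named `classification`;
# the original excerpt does not show its import.
from classification import Classification

if __name__ == "__main__":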
    # Example of preprocessing usage
    df = pd.read_csv('dataset/Sample of removed SATD comments - RQ2.csv')
    prep = Preprocessing()

    # remove special characters
    df['clean_comment'] = df['SATD comment'].apply(
        prep.special_character_removal)

    # apply lemmatization
    df['clean_comment'] = df['clean_comment'].apply(prep.lemmatization)
    df['On-hold or not'] = df['On-hold or not'].map(dict(yes=1, no=0))

    # create n-gram using ngweight
    ngweight_folder = 'path_to_ngweight/'
    n_gram = prep.create_n_gram(df['clean_comment'], df['On-hold or not'],
                                ngweight_folder, 'dataset/n_gram')

    df = df[['clean_comment', 'On-hold or not']]
    df.to_csv('dataset/clean_dataset.csv')

    # Example of ten-fold classification usage
    df = pd.read_csv('dataset/clean_dataset.csv')
    classification = Classification('onhold')

    # load n-gram and use it as corpus
    classification.set_n_gram("dataset/n_gram")

    # vectorize comment word frequency
    X = classification.vectorization(df['clean_comment'])
    y = df['On-hold or not']
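
The listing above delegates n-gram creation to `prep.create_n_gram` and the external ngweight tool, neither of which is shown in the excerpt. To illustrate what a labelled n-gram feature set looks like, here is a minimal pure-Python sketch. It is not the project's implementation and does not reproduce ngweight's weighting; the names `extract_ngrams` and `count_ngrams_by_label` and the sample comments are hypothetical and chosen only for illustration.

```python
from collections import Counter


def extract_ngrams(text, n_max=2):
    """Return all word n-grams of length 1..n_max found in `text`."""
    tokens = text.split()
    grams = []
    for n in range(1, n_max + 1):
        # Slide a window of n tokens across the comment.
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


def count_ngrams_by_label(comments, labels, n_max=2):
    """Count, per label, how many comments contain each n-gram."""
    counts = {}
    for comment, label in zip(comments, labels):
        counter = counts.setdefault(label, Counter())
        # Use a set so each n-gram is counted once per comment.
        counter.update(set(extract_ngrams(comment, n_max)))
    return counts


# Hypothetical cleaned comments with their on-hold labels (1 = on hold).
comments = ["fix this later", "wait for upstream fix", "fix this hack"]
labels = [0, 1, 0]
counts = count_ngrams_by_label(comments, labels)
```

Per-label document counts like these are a common starting point for deciding which n-grams are informative for a class; a real pipeline would typically weight or filter them before using them as a vectorizer vocabulary.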