Python DataProcessor.getWords Examples

Programming Language: Python

Namespace/Package Name: DataProcessor

Class/Type: DataProcessor

Method/Function: getWords

Examples at hotexamples.com: 1

Python DataProcessor.getWords - 1 example found. This is a real-world Python example of DataProcessor.DataProcessor.getWords extracted from an open source project.
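
The vocabulary-building pattern used in the example on this page can be tried in isolation. The sketch below is a minimal, self-contained illustration, not the project's code. `StubDataProcessor` is a hypothetical stand-in for the real `DataProcessor`, whose implementation is not shown here. It assumes that `getWords()` and `getTags()` return each distinct item once. `encode` is an illustrative helper that does not appear in the original.

```python
import json


class StubDataProcessor:
    """Hypothetical stand-in for DataProcessor (the real class reads a corpus directory)."""

    def __init__(self, sentences):
        # Each sentence is a list of (word, tag) tuples.
        self._sentences = sentences

    def getListOfTuples(self):
        return self._sentences

    def getWords(self):
        # Assumption: distinct words only; sorted here to get a stable order.
        return sorted({w for s in self._sentences for w, _ in s})

    def getTags(self):
        # Assumption: distinct tags only; sorted here to get a stable order.
        return sorted({t for s in self._sentences for _, t in s})


processor = StubDataProcessor([
    [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("the", "DET"), ("dog", "NOUN")],
])

# Same scheme as the example: 0 = padding, 1 = unknown word, real words start at 2.
word2idx = {w: i + 2 for i, w in enumerate(processor.getWords())}
word2idx["unk"] = 1
word2idx["pad"] = 0
idx2word = {i: w for w, i in word2idx.items()}


def encode(words, max_len):
    """Illustrative helper: map words to ids, then truncate or pad to max_len."""
    ids = [word2idx.get(w, word2idx["unk"]) for w in words][:max_len]
    return ids + [word2idx["pad"]] * (max_len - len(ids))


# "bird" is not in the vocabulary, so it maps to the 'unk' id.
encoded = encode(["the", "bird", "sleeps"], max_len=5)

# JSON object keys are always strings, so integer keys must be converted back.
restored = json.loads(json.dumps(idx2word))
idx2word_restored = {int(i): w for i, w in restored.items()}
```

One detail worth noting when the dictionaries are written to JSON, as the example does: `idx2word` has integer keys, and these come back as strings when the file is loaded, so a consumer needs to convert them as in the last line of the sketch.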

Example #1

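from matplotlib import pyplot as plt
from keras.preprocessing.text import text_to_word_sequence
from keras.preprocessing.text import Tokenizer
import tensorflowjs as tfjs

# DataProcessor is a project-local module (see Namespace/Package Name above)
from DataProcessor import DataProcessor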

# Params
BATCH_SIZE = 512  # Number of examples used in each iteration
EPOCHS = 100  # Number of passes through entire dataset
EMBEDDING = 40  # Dimension of word embedding vector

# Load the annotated corpus
dir_path = 'annotated/corpus'
dataProcessor = DataProcessor(dir_path, 'tei')
sentences = dataProcessor.getListOfTuples()

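# Map each word to an index; 0 and 1 are reserved for padding and unknown words
word2idx = {w: i + 2 for i, w in enumerate(dataProcessor.getWords())}
word2idx['unk'] = 1
word2idx['pad'] = 0

# Reverse lookup: index -> word
idx2word = {i: w for w, i in word2idx.items()}

# Map each tag to an index; 0 is reserved for padding
tag2idx = {t: i + 1 for i, t in enumerate(dataProcessor.getTags())}
tag2idx['pad'] = 0

# Reverse lookup: index -> tag
idx2tag = {i: t for t, i in tag2idx.items()}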

# Write the lookup dictionaries to JSON (integer keys are stored as strings)
import json
with open('model4_js/vocab/word2idx.json', 'w') as fp:
    json.dump(word2idx, fp)
with open('model4_js/vocab/idx2word.json', 'w') as fp: