Python tokenize Examples

Programming language: Python

Namespace/package name: model

Method/function: tokenize

Examples at hotexamples.com: 11

Python tokenize - 11 examples found. These are the top-rated real-world Python examples of model.tokenize, extracted from open source projects.

Example #1

File: app.py Project: sim-my/intelligent-bot

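from flask import render_template, request, session

# `tokenize` is the project's own helper; its import is not shown in the source.
def sendData():
    # Prefer a submitted URL; otherwise fall back to pasted text.
    if request.form.get('input-url'):
        value = request.form.get('input-url')
        session['sentence_list'] = tokenize('url', value)
    elif request.form.get('input-text'):
        value = request.form.get('input-text')
        session['sentence_list'] = tokenize('text', value)
    # Use .get() so an empty submission does not raise KeyError.
    return render_template('ask.html', sentence_list=session.get('sentence_list', []))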
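
The project's `tokenize` is not shown on this page. The call sites only suggest that it takes an input kind (`'url'` or `'text'`) plus a value and returns a list of sentences. A minimal sketch under that assumption follows; the function bodies, the `split_sentences` helper, and the plain-text treatment of fetched pages are all hypothetical.

```python
import re
from urllib.request import urlopen

# Hypothetical stand-in for the project's tokenize(); the real one is not shown.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def split_sentences(text):
    """Split text at '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in _SENTENCE_END.split(text) if s.strip()]


def tokenize(kind, value):
    """Return a list of sentences from raw text or from the body of a URL."""
    if kind == "text":
        return split_sentences(value)
    if kind == "url":
        # Assumption: the page body is fetched and treated as plain text.
        with urlopen(value) as response:
            raw = response.read().decode("utf-8", errors="replace")
        return split_sentences(raw)
    raise ValueError("unknown input kind: %r" % (kind,))
```

With this sketch, `tokenize("text", "Hello there. How are you? Fine!")` returns `["Hello there.", "How are you?", "Fine!"]`. A real implementation would likely strip HTML and handle abbreviations, which this regex does not.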

Example #2

File: updater.py Project: chatch/pinyin-toolkit

    def updatefactalways(self, fact, reading):
        # We better still give it a miss if the update will fail
        if 'reading' not in fact:
            return

        # Identify probable pinyin in the user's freeform input, reformat them according to the
        # current rules, and pop the result back into the field
        fact['reading'] = preparetokens(self.config, [model.Word(*model.tokenize(reading))])
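
The comment above describes picking out probable pinyin from freeform input, and `model.Word(*model.tokenize(reading))` implies that `tokenize` returns a sequence of tokens. The toolkit's actual tokenizer is not shown here, so the following is only a sketch of the idea. It assumes pinyin is written as letters followed by a tone digit 1-5, and the `Pinyin` and `Text` token types are hypothetical.

```python
import re
from collections import namedtuple

# Hypothetical token types; the toolkit's own model classes are not shown.
Pinyin = namedtuple("Pinyin", ["syllable", "tone"])
Text = namedtuple("Text", ["value"])

# Assumption: a pinyin syllable is a run of letters followed by a tone digit.
_PINYIN = re.compile(r"([A-Za-z]+)([1-5])")


def tokenize(reading):
    """Split freeform input into Pinyin tokens and the Text between them."""
    tokens = []
    pos = 0
    for match in _PINYIN.finditer(reading):
        if match.start() > pos:
            # Keep unrecognised text so the input can be rebuilt later.
            tokens.append(Text(reading[pos:match.start()]))
        tokens.append(Pinyin(match.group(1).lower(), int(match.group(2))))
        pos = match.end()
    if pos < len(reading):
        tokens.append(Text(reading[pos:]))
    return tokens
```

For example, `tokenize("ni3 hao3!")` yields `Pinyin("ni", 3)`, `Text(" ")`, `Pinyin("hao", 3)`, `Text("!")`. Keeping the unrecognised text as its own token is what lets a caller reformat only the pinyin and leave the rest untouched.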

Example #3

File: updater.py Project: idavydov/pinyin-toolkit

    def updatefactalways(self, fact, reading):
        # We better still give it a miss if the update will fail
        if 'reading' not in fact:
            return

        # Identify probable pinyin in the user's freeform input, reformat them according to the
        # current rules, and pop the result back into the field
        fact['reading'] = preparetokens(self.config,
                                        [model.Word(*model.tokenize(reading))])

Example #4

File: updater.py Project: idavydov/pinyin-toolkit

    def reformataudio(self, audio):
        output = u""
        # r"..." replaces the Python 2-only ur"..." prefix, a syntax error in Python 3
        for recognised, match in utils.regexparse(
                re.compile(r"\[sound:([^\]]*)\]"), audio):
            if recognised:
                # Must be a sound tag - leave it well alone
                output += match.group(0)
            else:
                # Process as if this non-sound tag were a reading, in order to turn it into some tags
                output += generateaudio(self.notifier, self.mediamanager,
                                        self.config,
                                        [model.Word(*model.tokenize(match))])

        return output
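
`utils.regexparse` is not shown on this page. Judging from the loop, it appears to yield `(recognised, match)` pairs: a match object where the pattern matched, and the plain text in between otherwise. A minimal sketch of a helper with that behaviour, written as an assumption rather than the toolkit's actual code:

```python
import re


def regexparse(pattern, text):
    """Yield (True, match) for each regex match and (False, str) for the gaps."""
    pos = 0
    for match in pattern.finditer(text):
        if match.start() > pos:
            # Text between the previous match and this one.
            yield (False, text[pos:match.start()])
        yield (True, match)
        pos = match.end()
    if pos < len(text):
        # Trailing text after the last match.
        yield (False, text[pos:])


sound_tag = re.compile(r"\[sound:([^\]]*)\]")
parts = [
    (recognised, item.group(0) if recognised else item)
    for recognised, item in regexparse(sound_tag, "ma1 [sound:ma1.mp3] tail")
]
```

Here `parts` is `[(False, "ma1 "), (True, "[sound:ma1.mp3]"), (False, " tail")]`, which matches how the loop treats the two cases: sound tags are copied through and everything else is tokenized as a reading.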

Example #5

File: updater.py Project: chatch/pinyin-toolkit

    def reformataudio(self, audio):
        output = u""
        # r"..." replaces the Python 2-only ur"..." prefix, a syntax error in Python 3
        for recognised, match in utils.regexparse(re.compile(r"\[sound:([^\]]*)\]"), audio):
            if recognised:
                # Must be a sound tag - leave it well alone
                output += match.group(0)
            else:
                # Process as if this non-sound tag were a reading, in order to turn it into some tags
                output += generateaudio(self.notifier, self.mediamanager, self.config, [model.Word(*model.tokenize(match))])

        return output

Example #6

File: updatergraph.py Project: yinzi/pinyin-toolkit

    def reformatreading(self, reading):
        return preparetokens(self.config, [model.Word(*model.tokenize(reading))])

Example #7

File: updatergraph.py Project: yinzi/pinyin-toolkit

def unpreparetokens(flat):
    return [model.Word(*model.tokenize(striphtml(flat)))]

Example #8

File: train.py Project: Trailblazer97/Text-classification

import pickle

import matplotlib.pyplot as plt
import pandas as pd
from keras import backend as K
from keras.engine.topology import Layer
from keras import initializers

import model  # the project's own module

# Select a non-interactive backend so plots can be written without a display.
plt.switch_backend('agg')

texts = []
labels = []

df = pd.read_csv('../dataset/dataset.csv')
df = df.dropna()
df = df.reset_index(drop=True)
print("Information on the dataset")
print('Shape of dataset ', df.shape)
print(df.columns)
print('No. of unique news types: ', len(set(df['Type'])))
print(df.head())

texts, labels, sorted_type, indexed_type = model.df_to_list(df, texts, labels)
with open('indexed_type.sav', 'wb') as f:
    pickle.dump(indexed_type, f)
word_index, embedding_matrix, data, labels, sequences = model.tokenize(texts, labels)
# Bind the trained network to its own name: rebinding `model` would shadow the
# module and break the save_model() and plot() calls below.
trained_model, history = model.model(word_index, embedding_matrix, sorted_type, data, labels)
model.save_model(trained_model)
model.plot(history)

Example #9

def predict(text: str):
    input_id, attention_mask = model.tokenize(text)
    prediction = model.predict(input_id, attention_mask)
    return {"prediction": prediction}
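
In this example `model.tokenize(text)` returns an input-ID sequence and an attention mask, but the tokenizer itself is not shown. The sketch below illustrates what such a pair typically looks like, using a whitespace tokenizer over a toy vocabulary. The `vocab`, `max_len`, `pad_id`, and `unk_id` parameters are hypothetical and not part of the original signature.

```python
def tokenize(text, vocab, max_len=8, pad_id=0, unk_id=1):
    """Map words to IDs, then pad to max_len and build the matching mask."""
    # Unknown words fall back to unk_id; overly long inputs are truncated.
    ids = [vocab.get(word, unk_id) for word in text.lower().split()][:max_len]
    # 1 marks a real token, 0 marks padding the model should ignore.
    mask = [1] * len(ids)
    padding = max_len - len(ids)
    return ids + [pad_id] * padding, mask + [0] * padding


toy_vocab = {"hello": 2, "world": 3}
input_id, attention_mask = tokenize("Hello brave world", toy_vocab, max_len=5)
```

Here `input_id` is `[2, 1, 3, 0, 0]` and `attention_mask` is `[1, 1, 1, 0, 0]`. A production tokenizer would normally use subword units and special tokens, which this sketch leaves out.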

Example #10

    def reformatreading(self, reading):
        return preparetokens(self.config, [model.Word(*model.tokenize(reading))])

Example #11

def unpreparetokens(flat):
    return [model.Word(*model.tokenize(striphtml(flat)))]