Python Corpus.preprocess Examples

Programming language: Python

Namespace/package name: Corpus

Class/type: Corpus

Method/function: preprocess

Examples on hotexamples.com: 3

Python Corpus.preprocess - 3 code examples found. These are real-world Python examples of Corpus.Corpus.preprocess extracted from open source projects, selected as the top rated. Rating the examples helps surface higher-quality ones.

Frequently used methods


Corpus(30)

find(5)

get_postag_set(4)

read(3)

__init__(2)

verificarPlagio(2)

add_source_document(2)

add_target_document(2)

get_file_name(2)

buildCorpus(2)

emails_as_string(2)

dump(2)

preprocess(2)

get_data(2)

read_ner(2)

outputWords(1)

pickledumpwords(1)

output_rules(1)

ner(1)

outputPOStags(1)

nettoyer_texte(1)

most_frequent_word_by_year(1)

most_frequent_word_by_month(1)

most_frequent_word_by_day(1)

most_frequent_word(1)

most_frequent_trigrams(1)

most_frequent_content_words(1)

picklegetwords(1)

read_label(1)

prepapre_to_matrix(1)

search_ambiguous(1)

vectoriserDocCorpus(1)

url_to_dir(1)

train_word2vec(1)

tag_words_with_most_likely_parses(1)

spanishTags(1)

set_lista_texto(1)

save_json(1)

process(1)

save(1)

results(1)

resetSentStats(1)

read_word2vec(1)

read_prediction(1)

load_json(1)

read_data(1)

most_frequent_bigrams(1)

get_instances(1)

lemmatiserCorpus(1)

calculSimilarite(1)

Example #1


File: app.py Project: Tuan-Lee-23/Vietnamese-corpus-search-and-analysis-Web-app

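import dash
import dash_bootstrap_components as dbc  # provides dbc.themes.BOOTSTRAP used below
import dash_html_components as html
import plotly.express as px
from dash.dependencies import Input, Output
import plotly.graph_objects as go
import plotly.figure_factory as ff
from statsmodels.graphics.gofplots import qqplot

import pandas as pd
import numpy as np

from Corpus import Corpus  # project-local module, imported the same way in create_NER_pickle.py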

app = dash.Dash(__name__, external_stylesheets=[dbc.themes.BOOTSTRAP])

corpus = Corpus()  # initialize
dirr = "resources/vn_express.txt"  # path to the corpus file
corpus.read(dirr)  # read the corpus from that path
corpus.preprocess()  # preprocessing
corpus.read_word2vec()  # load the word2vec model used to find synonyms
corpus.read_ner()  # load the preprocessed named-entity and part-of-speech data


def genResult(res):
    result = [
        html.P(children=sen,
               style={
                   'backgroundColor': 'white',
                   'borderBottom': '2px solid #4F2992',
                   'margin': '30px',
                   'padding': '10px'
               }) for sen in res
    ]
    result.append(html.Br())  # trailing line break after the rendered sentences
    return result

Example #2


File: main.py Project: LemDes/Deft-2012

	# Announce how many files will be loaded
	print("\nLoading %i files" % nb_files)
	print("-------------------------------------------")

	docs = []

	# Load the files, rewriting a single progress line in place
	for i, f in enumerate(files):
		sys.stdout.write("\r%3i/%i %s" % (i + 1, nb_files, '{:<70}'.format(f)))
		sys.stdout.flush()
		docs.append(Document(f))

	corpus = Corpus(docs, termino)

	print("\n\nCorpus preprocessing")
	print("-------------------------------------------")
	corpus.preprocess()

	print("\n\nExtracting the keywords")
	print("-------------------------------------------")
	corpus.process()

	# Scores are only reported in testing mode
	if Config().testing:
		print("\n\nResults (%s average)" % ("Macro" if config.macro_average else "Micro"))
		print("-------------------------------------------")
		corpus.results()
	else:
		print("\n")

	# Save the results only when an output file is configured
	if config.save_file != "":
		print("\nResults saved in %s" % config.save_file)
		corpus.save(config.save_file)
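
The results header in this snippet switches between "Macro" and "Micro" averaging through config.macro_average. The snippet does not show how the project computes either one, so the following is only a sketch of the usual distinction, using precision as the metric. The function names and the (correct, predicted) count pairs are hypothetical and do not come from Deft-2012.

```python
def precision(n_correct, n_predicted):
    """Fraction of predicted keywords that are correct (0.0 if none predicted)."""
    return n_correct / n_predicted if n_predicted else 0.0


def macro_precision(per_doc):
    """Macro average: score each document separately, then take the mean."""
    scores = [precision(c, p) for c, p in per_doc]
    return sum(scores) / len(scores) if scores else 0.0


def micro_precision(per_doc):
    """Micro average: pool the counts of all documents, then score once."""
    total_correct = sum(c for c, _ in per_doc)
    total_predicted = sum(p for _, p in per_doc)
    return precision(total_correct, total_predicted)


# Hypothetical counts: (correct keywords, predicted keywords) per document
counts = [(1, 2), (3, 4)]
print(macro_precision(counts))  # mean of 0.5 and 0.75 -> 0.625
print(micro_precision(counts))  # 4 / 6, about 0.667
```

Macro averaging gives every document the same weight, while micro averaging gives every predicted keyword the same weight, so documents with many predictions dominate the micro score.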

Example #3


File: create_NER_pickle.py Project: Tuan-Lee-23/Vietnamese-corpus-search-and-analysis-Web-app

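from Corpus import Corpus

corpus = Corpus()
dirr = input("Input directory of corpus: ")
# dirr = "resources/corpus_mini.txt"
corpus.read(dirr)  # load the raw corpus
corpus.preprocess()  # preprocess before the NER and training steps
corpus.ner()  # run the named-entity step
corpus.train_word2vec()  # train a word2vec model on the corpus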
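
All three examples load or construct the corpus first and call preprocess() before any later step. None of them shows the body of preprocess(), so the following is only a minimal sketch of a class with the same call order. This Corpus is a hypothetical stand-in, not the class from either project, and its cleaning rule (lowercasing plus a regex word split) is an assumption chosen for illustration.

```python
import os
import re
import tempfile


class Corpus:
    """Hypothetical stand-in; NOT the Corpus class of the projects above."""

    def __init__(self):
        self.sentences = []  # raw lines, filled by read()
        self.tokens = []     # one token list per sentence, filled by preprocess()

    def read(self, path):
        """Load non-empty lines from a UTF-8 text file."""
        with open(path, encoding="utf-8") as fh:
            self.sentences = [line.strip() for line in fh if line.strip()]

    def preprocess(self):
        """Lowercase each sentence and split it into word tokens (assumed rule)."""
        if not self.sentences:
            raise ValueError("call read() before preprocess()")
        self.tokens = [re.findall(r"\w+", s.lower()) for s in self.sentences]
        return self.tokens


if __name__ == "__main__":
    # Write a tiny throwaway corpus so the demo does not depend on project files
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "corpus.txt")
        with open(path, "w", encoding="utf-8") as fh:
            fh.write("First sentence.\n\nSecond sentence here.\n")
        corpus = Corpus()
        corpus.read(path)
        print(corpus.preprocess())
```

Raising an error when preprocess() runs on an empty corpus makes the required call order explicit; whether the real classes enforce it is not visible in the snippets.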