Python examples for Preprocessor.split_space

Programming language: Python

Namespace/package name: other_tools.preprocessor

Class/type: Preprocessor

Method/function: split_space

Examples on hotexamples.com: 2

Preprocessor.split_space in Python: 2 examples found. These are the top real-world Python examples of other_tools.preprocessor.Preprocessor.split_space, extracted from open source projects.

Frequently used methods

Preprocessor(2)

remove_mark_docs(2)

remove_mark(1)

split_space(1)

Example 1

File: doc_to_vector.py Project: silverwolfceh/nlp

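import math
import os

import pandas as pd

# Import path taken from the namespace/package name listed for this method.
from other_tools.preprocessor import Preprocessor
# SortedWords comes from elsewhere in the project; its import is not shown in this excerpt.


class DocToVector:

    def __init__(self):
        # Directory containing this module, used to locate the bundled resources.
        self.current_dir = os.path.dirname(__file__)
        self.sorted_words = SortedWords()
        self.get_dictionary()
        self.preprocess = Preprocessor()

    def get_dictionary(self):
        # Load the 'word' column of the CSV into the sorted word collection.
        pd_file = pd.read_csv(self.current_dir + '/../resources/word_frequency.csv')
        self.sorted_words.set(pd_file['word'])

    def add_to_dict_counter(self, docs):
        # Only this first step of the method appears in the excerpt.
        docs = self.preprocess.split_space(docs)

    def tf_idf(self, frequency_sentence, frequency_docs, num_words):
        # (1 + ln(frequency_sentence)) * ln(frequency_docs / num_words)
        return (1 + math.log(frequency_sentence)) * math.log(frequency_docs * 1.0 / num_words)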
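
To make the weighting in tf_idf concrete, here is a small worked sketch. It copies the expression from the method above into a free function; the counts passed in are hypothetical values chosen only for illustration, and the excerpt does not say exactly what each of the three arguments counts in the project.

```python
import math


def tf_idf(frequency_sentence, frequency_docs, num_words):
    # Same expression as DocToVector.tf_idf, without the class around it.
    return (1 + math.log(frequency_sentence)) * math.log(frequency_docs * 1.0 / num_words)


# Hypothetical counts, not taken from the project.
# With frequency_sentence = 1, ln(1) = 0, so the first factor is exactly 1
# and the weight reduces to ln(frequency_docs / num_words) = ln(10).
weight_once = tf_idf(1, 100, 10)

# A higher first count scales the same second factor by (1 + ln(3)).
weight_thrice = tf_idf(3, 100, 10)

print(weight_once, weight_thrice)
```

Because both factors go through math.log, a zero count raises ValueError, so the method presumably expects strictly positive inputs.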

Example 2

File: doc_to_vector.py Project: lijielife/nlp-7

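import math
import os

import pandas as pd

# Import path taken from the namespace/package name listed for this method.
from other_tools.preprocessor import Preprocessor
# SortedWords comes from elsewhere in the project; its import is not shown in this excerpt.


class DocToVector:
    def __init__(self):
        # Directory containing this module, used to locate the bundled resources.
        self.current_dir = os.path.dirname(__file__)
        self.sorted_words = SortedWords()
        self.get_dictionary()
        self.preprocess = Preprocessor()

    def get_dictionary(self):
        # Load the 'word' column of the CSV into the sorted word collection.
        pd_file = pd.read_csv(self.current_dir +
                              '/../resources/word_frequency.csv')
        self.sorted_words.set(pd_file['word'])

    def add_to_dict_counter(self, docs):
        # Only this first step of the method appears in the excerpt.
        docs = self.preprocess.split_space(docs)

    def tf_idf(self, frequency_sentence, frequency_docs, num_words):
        # (1 + ln(frequency_sentence)) * ln(frequency_docs / num_words)
        return (1 + math.log(frequency_sentence)) * math.log(
            frequency_docs * 1.0 / num_words)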
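
Neither excerpt shows the body of split_space, so its real behavior is not known from this page. The sketch below is a hypothetical stand-in, not the project's Preprocessor: it assumes that docs is a list of strings and that each one is split on runs of whitespace. The class name StandInPreprocessor and the sample sentences are made up for illustration.

```python
class StandInPreprocessor:
    """Hypothetical stand-in; the real other_tools.preprocessor.Preprocessor may differ."""

    def split_space(self, docs):
        # Assumption: each document is a string; str.split() with no argument
        # splits on any run of whitespace and drops empty tokens.
        return [doc.split() for doc in docs]


preprocess = StandInPreprocessor()
tokens = preprocess.split_space(["the quick  fox", "hello world"])
print(tokens)
```

Under that assumption, the call in add_to_dict_counter would turn each document into a list of tokens that later steps could count.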