Python SqliteVocabulary.insert_vocabulary Examples

Programming Language: Python

Namespace/Package Name: sqlite_vocabulary

Class/Type: SqliteVocabulary

Method/Function: insert_vocabulary

Examples at hotexamples.com: 2

Python SqliteVocabulary.insert_vocabulary: 2 examples found. These are the top-rated real-world Python examples of sqlite_vocabulary.SqliteVocabulary.insert_vocabulary, extracted from open source projects.

Frequently Used Methods

query_words_with_sql(3)

insert_vocabulary(2)

check_existed_word(2)

commit(2)

close(2)

update_word_status(2)

delete_word(1)

get_col_names(1)

get_uk_sound(1)

get_us_sound(1)

clear_local_count(1)

query_words_with_status(1)

query_words_with_status_and_date(1)

update_uk_pron(1)

update_uk_sound(1)

update_us_pron(1)

update_us_sound(1)

update_word_count(1)

Example #1

File: vocabulary.py Project: tuyendothanh/nlp

import nltk
from time import gmtime, strftime

from sqlite_vocabulary import SqliteVocabulary


def main():
    # Split the document into sentences with NLTK's pre-trained Punkt model.
    sent_tokenizer = nltk.data.load('tokenizers/punkt/english.pickle')
    with open('document.txt') as f:
        text = f.read()
    sents = sent_tokenizer.tokenize(text)

    sqlVocab = SqliteVocabulary("studyenglish.db", "vocabulary")
    # sqlVocab.delete_vocabulary()

    for sent in sents:
        # Lowercase and deduplicate the tokens of each sentence.
        tokens = nltk.word_tokenize(sent)
        vocab = sorted({w.lower() for w in tokens})

        for v in vocab:
            # Insert only words that are not in the table yet.
            if not sqlVocab.check_existed_word(v):
                sqlVocab.insert_vocabulary(v, 1, "", "", strftime("%Y-%m-%d", gmtime()), sent)

    sqlVocab.commit()
    sqlVocab.close()
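
The source of SqliteVocabulary is not shown on this page, so the following is only a minimal sketch of the check-then-insert pattern that Example #1 relies on, written directly against Python's standard sqlite3 module. The table layout (word, added_on, sentence) and the helper names open_vocabulary, word_exists and add_word_if_missing are assumptions made for illustration and are not the project's API. Plain str.split stands in for NLTK tokenization to keep the sketch self-contained.

```python
import sqlite3
from time import gmtime, strftime


def open_vocabulary(path=":memory:"):
    """Open a database and create a hypothetical vocabulary table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS vocabulary ("
        "word TEXT PRIMARY KEY, added_on TEXT, sentence TEXT)"
    )
    return conn


def word_exists(conn, word):
    """Return True if the word is already stored (hypothetical helper)."""
    row = conn.execute(
        "SELECT 1 FROM vocabulary WHERE word = ?", (word,)
    ).fetchone()
    return row is not None


def add_word_if_missing(conn, word, sentence):
    """Insert the word with today's UTC date; return False if it was present."""
    if word_exists(conn, word):
        return False
    conn.execute(
        "INSERT INTO vocabulary (word, added_on, sentence) VALUES (?, ?, ?)",
        (word, strftime("%Y-%m-%d", gmtime()), sentence),
    )
    return True


conn = open_vocabulary()
sentence = "The cat saw the other cat."
# Naive whitespace tokenization, lowercased and deduplicated per sentence.
words = sorted(set(w.lower() for w in sentence.rstrip(".").split()))
inserted = [w for w in words if add_word_if_missing(conn, w, sentence)]
conn.commit()
```

Parameter placeholders (?) keep the tokenized text out of the SQL string itself, which matters here because the words come from arbitrary documents.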

Example #2

File: editor.py Project: tuyendothanh/nlp

        def nature_language_processing(self):
            # Excerpt from a class in editor.py; nltk, SqliteVocabulary, strftime, gmtime,
            # END and the text widget `st` are assumed to be defined elsewhere in that file.
            sent_tokenizer = nltk.data.load('tokenizers/punkt/english.pickle')
            text = st.get(1.0, END)  # full contents of the text widget
            sents = sent_tokenizer.tokenize(text)

            sqlVocab = SqliteVocabulary("studyenglish.db", "vocabulary")
            sqlVocab.clear_local_count()
            for sent in sents:
                # Tag each token of the sentence with its part of speech.
                tokens = nltk.word_tokenize(sent)
                tagged = nltk.pos_tag(tokens)

                for v, t in tagged:
                    word = v.lower()
                    if not sqlVocab.check_existed_word(word):
                        # New word: store it with its POS tag and source sentence.
                        sqlVocab.insert_vocabulary(word, "", "", t, "", "", sent, -2, strftime("%Y-%m-%d", gmtime()), 1, 1)
                    else:
                        # Known word: update its counts.
                        sqlVocab.update_word_count(word, 1, 1)

            sqlVocab.commit()
            sqlVocab.close()

            self.show_all_words()
            self.master.destroy()
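
Example #2 inserts a word the first time it is seen and updates a count on every later occurrence. As a rough illustration of that insert-or-increment flow, the sketch below uses the standard sqlite3 module with an in-memory database. The two-column table, the seen_count column and the record_word helper are hypothetical and do not reflect the real SqliteVocabulary schema, which has more fields than are modelled here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one row per word plus a running occurrence count.
conn.execute(
    "CREATE TABLE vocabulary (word TEXT PRIMARY KEY, seen_count INTEGER NOT NULL)"
)


def record_word(conn, word):
    """Insert a new word with count 1, or increment an existing word's count."""
    exists = conn.execute(
        "SELECT 1 FROM vocabulary WHERE word = ?", (word,)
    ).fetchone()
    if exists is None:
        conn.execute(
            "INSERT INTO vocabulary (word, seen_count) VALUES (?, 1)", (word,)
        )
    else:
        conn.execute(
            "UPDATE vocabulary SET seen_count = seen_count + 1 WHERE word = ?",
            (word,),
        )


# Whitespace splitting stands in for nltk.word_tokenize in this sketch.
for token in "the cat saw the other cat".split():
    record_word(conn, token.lower())
conn.commit()

counts = dict(conn.execute("SELECT word, seen_count FROM vocabulary"))
```

The explicit SELECT followed by INSERT or UPDATE mirrors the branch in the example above. On SQLite 3.24 or newer, a single INSERT with an ON CONFLICT ... DO UPDATE clause would be an alternative that avoids the extra query.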