def conditional_freq(self):
    """Flatten the bigram model into (condition, word, frequency) triples.

    Builds a ConditionalFreqDist from ``self.bigram_list`` and returns one
    tuple per (condition, following word) pair observed, carrying its count.
    """
    dist = ConditionalFreqDist(self.bigram_list)
    return [
        (condition, word, count)
        for condition, freq_dist in dist.items()
        for word, count in freq_dist.items()
    ]
class BigramWordCandidateProvider(object):
    """Provides candidate next words given a word using a bigram model."""

    def __init__(self, corpus):
        """Initializer of the BigramWordCandidateProvider.

        Args:
            corpus: An iterable of word strings.
        """
        self._cfd = ConditionalFreqDist(bigrams(corpus))

    def candidates(self, word_sequence):
        """Returns a list of candidate next words given a word sequence."""
        last_word = word_sequence[-1]
        following = self._cfd[last_word].most_common()
        return [candidate for candidate, _ in following]

    def random_word(self):
        # Pick a uniformly random condition (word) known to the model.
        return random.choice(list(self._cfd.items()))[0]
#%%
from nltk.corpus import inaugural
from nltk import ConditionalFreqDist
from nltk.probability import FreqDist

# Unigram frequency distribution over every token of the inaugural corpus.
# FreqDist accepts any iterable directly; no intermediate list is needed.
fd3 = FreqDist(inaugural.words())
print(fd3.freq('freedom'))

# Count frequency of word lengths, per address, in descending order of count.
# The fileid-only filter is applied BEFORE iterating the words of each file,
# so excluded addresses are never tokenized (the original filtered after the
# inner loop, tokenizing every file).  Fileids begin with the year, so the
# lexicographic comparison selects the 1981-2009 addresses.
cfd = ConditionalFreqDist(
    (fileid, len(w))
    for fileid in inaugural.fileids()
    if '1980' < fileid < '2010'
    for w in inaugural.words(fileid)
)
print(cfd.items())
cfd.plot()
# %%
# NOTE(review): this chunk starts mid-script — `lpt`, `r`, `vocab`, and `fr`
# are defined earlier in the file (not visible here), and the first statement
# was probably indented inside an enclosing loop.  The indentation below is a
# reconstruction from a whitespace-mangled source; confirm nesting against the
# original file before relying on it.
UNK += lpt.prob(r[0])
print('UNK | ', UNK)

print('=========== BIGRAMS ===========')
# Read the sample corpus and strip the sentence-boundary markers before
# tokenizing; a single '<s>' is re-appended afterwards.
file = open('sampledata.txt', 'r')
filetext = file.read()
filetext = filetext.replace('</s>', '')
filetext = filetext.replace('<s>', '')
tokens = word_tokenize(filetext)
tokens.append('<s>')
print(set(tokens))

# Extend the vocabulary with the end marker and an unknown token.
# NOTE(review): `vocab2 = vocab` aliases the list, so the appends below also
# mutate `vocab` — confirm this is intended.  'UTK' looks like a typo for
# 'UNK'; left as-is since it is a runtime value.
vocab2 = vocab
vocab2.append('</s>')
vocab2.append('UTK')

# Conditional frequency distribution over adjacent token pairs.
big = bigrams(tokens)
cfds = ConditionalFreqDist((w0, w1) for w0, w1 in big)
print(cfds.items())

# For each vocabulary word, print bigram probabilities using unigram counts
# taken from `fr` (presumably a FreqDist built earlier — verify).  Words with
# no bigram entry are counted as UNK.
for v3 in vocab2:
    Unk2 = 0
    fr2 = cfds.get(v3)
    if (fr2 != None):
        for i in fr2.items():
            unigramCount = 0
            # Linear scan for v3's unigram count in `fr`.
            for s in fr.items():
                if v3 == s[0]:
                    unigramCount = s[1]
            print('P(' + v3 + '|' + str(i[0]) + ') = ' + str((i[1] / unigramCount).__round__(2)))
    else:
        Unk2 += 1
        print('P(' + v3 + '|UNK) = ' + str(Unk2))
print('======= BIGRAMS SMOOTHING =======')