def tagged_words(self, fileids=None, categories=None):
    """Return the corpus's (word, tag) pairs for the requested selection.

    *fileids*/*categories* are first mapped to a concrete fileid list via
    ``self._resolve`` (presumably the categorized-reader helper — confirm
    against the enclosing class), then the base-class reader does the work.
    """
    resolved = self._resolve(fileids, categories)
    return ConllCorpusReader.tagged_words(self, resolved)
from __future__ import division

from nltk.corpus.reader import ConllCorpusReader
from nltk.probability import (
    DictionaryProbDist,
    FreqDist,
    LaplaceProbDist,
    MLEProbDist,
    SimpleGoodTuringProbDist,
)

# Training corpus: CoNLL-style file with one (word, POS) pair per line.
conllreader = ConllCorpusReader(".", "de-train.tt", ('words', 'pos'))

# The 12 coarse POS tags used as HMM states.
states = ('VERB', 'NOUN', 'PRON', 'ADJ', 'ADV', 'ADP', 'CONJ', 'DET',
          'NUM', 'PRT', 'X', '.')

sentslen = len(conllreader.tagged_sents())  # number of sentences
# Frequency of each tag over the whole corpus.
tagfdist = FreqDist(pair[1] for pair in conllreader.tagged_words())
# Frequency of each sentence-initial tag.
firsttagfdist = FreqDist(pair[0][1] for pair in conllreader.tagged_sents())

# Initial-state distribution A0j = P(tag | sentence start).
# Dict comprehension + .items() replaces the Python-2-only
# `lambda (k, x)` tuple unpacking and `iteritems()`, which fail on Python 3.
A0j = DictionaryProbDist({k: x / sentslen for k, x in firsttagfdist.items()})
A0jLap = LaplaceProbDist(firsttagfdist)
A0jGT = SimpleGoodTuringProbDist(firsttagfdist)
A0jMLE = MLEProbDist(firsttagfdist)

# Transition counts: consecutive (tag_i, tag_{i+1}) pairs over the word
# stream.  NOTE(review): pairs also span sentence boundaries (last tag of
# one sentence -> first tag of the next) — confirm this is intended.
words = conllreader.tagged_words()
TagPair = [(words[i][1], words[i + 1][1]) for i in range(len(words) - 1)]
TagPairfdist = FreqDist(TagPair)

# Transition distribution Aij = P(tag_j | tag_i); FreqDist supports direct
# indexing, so tagfdist[k[0]] replaces tagfdist.get(k[0]).
Aij = DictionaryProbDist(
    {k: x / tagfdist[k[0]] for k, x in TagPairfdist.items()})
AijLap = LaplaceProbDist(TagPairfdist)
AijGT = SimpleGoodTuringProbDist(TagPairfdist)
AijMLE = MLEProbDist(TagPairfdist)

# Emission distribution Biw = P(word | tag).
TagWordfdist = FreqDist(conllreader.tagged_words())
Biw = DictionaryProbDist(
    {k: x / tagfdist[k[1]] for k, x in TagWordfdist.items()})
BiwLap = LaplaceProbDist(TagWordfdist)
BiwGT = SimpleGoodTuringProbDist(TagWordfdist)
def tagged_words(self, fileids=None, categories=None):
    """Delegate to the CoNLL base reader for the tagged-word stream.

    The *fileids*/*categories* arguments are collapsed into an explicit
    fileid list by ``self._resolve`` before the base-class call.
    """
    return ConllCorpusReader.tagged_words(
        self,
        self._resolve(fileids, categories),
    )
## Function to add an adjective to a noun key
def add_adj(noun_param, adj_param):
    """Record *adj_param* in the list of adjectives kept for *noun_param*
    in the module-level dict ``a``."""
    # setdefault replaces the check-then-insert (LBYL) pattern: one lookup
    # instead of two, same resulting mapping.
    a.setdefault(noun_param, []).append(adj_param)


# NOTE(review): the global dict ``a`` is never initialised in this chunk —
# presumably `a = {}` appears earlier in the file; confirm before running.

filedir = '/Users/fnascime/Documents/Sicily_Project/texts/'
filename = 'ilgattopardo_prima'

mycorpus = ConllCorpusReader(
    filedir,
    filename + '.conll',
    ('ignore', 'words', 'ignore', 'pos',
     'ignore', 'ignore', 'ignore', 'ignore'),
)

words = mycorpus.tagged_words()
list_len = len(words)

## Loop through file and retrieve adjetives directly associated with nouns (adjunct words)
for i in range(list_len):
    # Per the comment above, 'S' marks nouns and 'A' adjectives in this
    # corpus's tagset — TODO confirm against the .conll annotation scheme.
    if words[i][1] == 'S':
        if i > 0 and words[i - 1][1] == 'A':
            add_adj(words[i][0], words[i - 1][0])
        # NOTE(review): because of the elif, a noun flanked by adjectives on
        # both sides only records the preceding one — confirm intended.
        elif i < list_len - 1 and words[i + 1][1] == 'A':
            add_adj(words[i][0], words[i + 1][0])

## Loop through the list of words and verify the ones with more adjective
nouns_counting = len(a)
adj_counting = 0