Python nGramsToKWICDict Examples

Programming language: Python

Namespace/package name: obo

Method/function: nGramsToKWICDict

Examples at hotexamples.com: 2

Python nGramsToKWICDict: 2 examples found. These are the top-rated real-world Python examples of obo.nGramsToKWICDict, extracted from open source projects.

Example #1

File: html-to-kwic.py Project: jdungan/oscn-scrape

# html-to-kwic.py

import obo

# create dictionary of n-grams
n = 7
url = 'http://www.oldbaileyonline.org/browse.jsp?id=t17800628-33&div=t17800628-33'

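text = obo.webPageToText(url)
# pad both ends with n//2 '#' tokens so that words at the edges of the
# text can also appear as the middle word of an n-gram
fullwordlist = ('# ' * (n//2)).split()
fullwordlist += obo.stripNonAlphaNum(text)
fullwordlist += ('# ' * (n//2)).split()
ngrams = obo.getNGrams(fullwordlist, n)
worddict = obo.nGramsToKWICDict(ngrams)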

# output KWIC and wrap with html
target = 'black'
outstr = '<pre>'
if target in worddict:  # dict.has_key() was removed in Python 3
    for k in worddict[target]:
        outstr += obo.prettyPrintKWIC(k)
        outstr += '<br />'
else:
    outstr += 'Keyword not found in source'

outstr += '</pre>'
obo.wrapStringInHTMLMac('html-to-kwic', url, outstr)
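
The `#` padding in the script above adds `n // 2` placeholder tokens on each side of the word list, presumably so that words near the start and end of the text can still sit in the middle of an n-gram. The following is a minimal standalone sketch of that idea. `pad_words` and `sliding_ngrams` are hypothetical helpers written for illustration and are not part of `obo`.

```python
def pad_words(words, n, filler="#"):
    """Return words with n // 2 filler tokens added on each side."""
    pad = [filler] * (n // 2)
    return pad + list(words) + pad


def sliding_ngrams(words, n):
    """Return every run of n consecutive words as a list."""
    return [words[i:i + n] for i in range(len(words) - n + 1)]


words = "the quick brown fox".split()
n = 3

# Middle word of each n-gram, without and with padding.
unpadded_centers = [g[n // 2] for g in sliding_ngrams(words, n)]
padded_centers = [g[n // 2] for g in sliding_ngrams(pad_words(words, n), n)]

print(unpadded_centers)  # ['quick', 'brown']
print(padded_centers)    # ['the', 'quick', 'brown', 'fox']
```

Without padding, the first and last words never reach the middle position, so they could not be looked up as keywords.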

Example #2

#get-keywords.py

import obo

test = 'this test sentence has eight words in it'
ngrams = obo.getNGrams(test.split(), 5)

print(obo.nGramsToKWICDict(ngrams))
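
For readers without `obo` installed, the sketch below illustrates the kind of structure a keyword-in-context dictionary holds: each n-gram is filed under its middle word. `kwic_dict` is a hypothetical stand-in written for this illustration, and it assumes the keyword is the element at index `len(gram) // 2`. The actual `obo.nGramsToKWICDict` may differ in detail.

```python
from collections import defaultdict


def kwic_dict(ngrams):
    """Group n-grams by their middle word (assumed to be the keyword)."""
    grouped = defaultdict(list)
    for gram in ngrams:
        grouped[gram[len(gram) // 2]].append(gram)
    return dict(grouped)


test = 'this test sentence has eight words in it'
words = test.split()
# Build the 5-grams inline so the sketch does not depend on obo.
ngrams = [words[i:i + 5] for i in range(len(words) - 4)]

result = kwic_dict(ngrams)
print(sorted(result))   # ['eight', 'has', 'sentence', 'words']
print(result['eight'])  # [['sentence', 'has', 'eight', 'words', 'in']]
```

Only four of the eight words appear as keys here, because the word list is not padded before the n-grams are built.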