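Python conc Examples

Programming language: Python

Namespace/package name: corpkit

Method/function: conc

Examples at hotexamples.com: 2

Python conc - 2 examples found. These are the top-rated real-world Python examples of corpkit.conc, extracted from open source projects.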

Example #1

File: session-3.py Project: datakid/nltk

merged = merger(aust.results, [1, 2],  newname = 'australian(s)')
plot('After merging Australian and Australians', merged, num_to_plot = 2)

# <headingcell level=4>
# conc()

# <markdowncell>
# The final function is *conc()*, which produces concordances of a subcorpus based on a Tregex query. Its main arguments are:

# 1. A subcorpus to search *(remember to put it in quotation marks!)*
# 2. A Tregex query

# <codecell>
# here, we use a subcorpus of politics articles,
# rather than the total annual editions.
conc(os.path.join(path, '1966'), r'/(?i)\baustral.?/') # words beginning with "austral", case-insensitive

# <markdowncell>
# You can set *conc()* to print *n* random concordances with the *random = n* parameter. You can also store the output in a variable for further searching.

# <codecell>
randoms = conc(os.path.join(path,'1963'), r'/(?i)\baustral.?/', random = 5)
randoms

# <markdowncell>
# *conc()* takes another argument, *window*, which alters the amount of co-text appearing on either side of the match.

# <codecell>
conc(os.path.join(path,'1981'), r'/(?i)\baustral.?/', random = 5, window = 50)
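
# <markdowncell>
# To make the effect of *window* concrete, here is a minimal plain-Python sketch of a keyword-in-context search. It is not corpkit's implementation: *kwic()* is a hypothetical helper, it searches a raw string with Python's *re* module rather than parse trees with Tregex, and its pattern uses \w* in place of .? so that the whole word is captured.

# <codecell>
```python
import re

def kwic(text, pattern, window=50):
    """Hypothetical helper: return (left, match, right) tuples with up to
    `window` characters of co-text on each side of every match."""
    lines = []
    for m in re.finditer(pattern, text):
        left = text[max(0, m.start() - window):m.start()]
        right = text[m.end():m.end() + window]
        lines.append((left, m.group(0), right))
    return lines

sample = "The Australian team toured England while Australia debated the new policy."

# Print each match with 15 characters of context, right-aligning the left side
for left, match, right in kwic(sample, r"(?i)\baustral\w*", window=15):
    print(f"{left:>15} | {match} | {right}")
```

# <markdowncell>
# Shrinking *window* shortens both context columns without changing which matches are found.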

# <markdowncell>

Example #2

File: orientation.py Project: agogear/corpkit

# <codecell>
#

# <markdowncell>
# ### conc()

# <markdowncell>
# `conc()` produces concordances of a subcorpus. Its main arguments are:

# 1. A subcorpus to search *(remember to put it in quotation marks!)*
# 2. A query

# If your data consists of parse trees, you can use a Tregex query. If your data is one or more plain-text files, you can just use a regex. We'll show Tregex style here.

# <codecell>
lines = conc('data/nyt/years/1999', r'/JJ.?/ << /(?i).?\brisk.?\b/') # adj containing a risk word
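
# <markdowncell>
# The word-level part of the query, `(?i).?\brisk.?\b`, can be checked on its own before running a search. The sketch below uses Python's `re` module and a made-up word list. It assumes the pattern may match anywhere in a word and that these constructs behave the same in the Java regex engine behind Tregex.

# <codecell>
```python
import re

# Word-level part of the query, compiled with Python's re module
# (assumption: same behaviour as the Java regex used by Tregex)
RISK = re.compile(r"(?i).?\brisk.?\b")

# Illustrative words only; not taken from the corpus
words = ["risk", "Risky", "risks", "at-risk", "brisk", "asterisk", "risking"]

# Keep the words in which the pattern is found somewhere
matches = [w for w in words if RISK.search(w)]
print(matches)
```

# <markdowncell>
# The word boundary before `risk` excludes words such as "brisk", and the single optional character after it admits "risks" and "risky" but not longer forms such as "risking".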

# <markdowncell>
# You can set `conc()` to print only the first ten examples with `n = 10`, or 15 random examples with `n = 15, random = True`.

# <codecell>
lines = conc('data/nyt/years/2007', r'/VB.?/ < /(?i).?\brisk.?\b/', n = 15, random = True)
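
# <markdowncell>
# The difference between taking the first `n` lines and taking `n` random lines can be sketched in plain Python. `pick_lines()`, its `randomize` and `seed` parameters, and the placeholder lines are all hypothetical and only illustrate the idea; they are not part of corpkit.

# <codecell>
```python
import random

def pick_lines(lines, n, randomize=False, seed=None):
    """Hypothetical helper: return the first n lines, or n lines sampled
    without replacement when randomize is True."""
    if not randomize:
        return lines[:n]
    rng = random.Random(seed)  # a fixed seed makes the sample repeatable
    return rng.sample(lines, min(n, len(lines)))

# Placeholder data standing in for real concordance output
all_lines = [f"concordance line {i}" for i in range(100)]

first_ten = pick_lines(all_lines, 10)
random_fifteen = pick_lines(all_lines, 15, randomize=True, seed=0)
print(len(first_ten), len(random_fifteen))
```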

# <markdowncell>
# `conc()` takes another argument, `window`, which alters the amount of co-text appearing on either side of the match. The default is 50 characters.

# <codecell>
lines = conc('data/nyt/topics/health/2013', r'/VB.?/ << /(?i).?\brisk.?\b/', n = 15, random = True, window = 20)

# <markdowncell>
# `conc()` can also display parse trees. This option is false by default: