# <markdowncell>

# Note that the tags are a little different from those of the last parser we were using:

# <codecell>

quicktree("Melbourne has been transformed over the let 18 months in preparation for the visitors")

# <markdowncell>

# Neither parse is perfect, but the one we just generated has a major flaw: *Melbourne* is parsed as an adverb! Stanford CoreNLP correctly identifies it as a proper noun, and it also does a better job of handling the 'let' mistake.

# <markdowncell>

# *searchtree()* is a tiny function that searches a syntax tree. We'll use the sample sentence and *searchtree()* to practice our Tregex queries. We can feed it either *tags* (S, NP, VBZ, DT, etc.) or *tokens* enclosed in forward slashes.

# <codecell>

# any plural noun
query = r'NNS'
searchtree(melbtree, query)

# <codecell>

# a token matching the regex Melb.?
query = r'/Melb.?/'
searchtree(melbtree, query)

# <codecell>

# any noun phrase
query = r'NP'
searchtree(melbtree, query)

# <markdowncell>

# To make things more specific, we can write queries with multiple criteria and specify the relationship between each criterion we want to match. Tregex will print everything matching **the leftmost criterion**.

# <codecell>

# NP with 18 as a descendant
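# <markdowncell>

# To see what these three kinds of query are doing, here is a minimal pure-Python sketch. It is not how *searchtree()* or Tregex is implemented: the tree is a hand-written, hypothetical parse of the sample sentence, and the helpers *find_by_tag()*, *find_tokens()* and *find_dominating()* are names made up for this illustration. The last helper mimics a two-criterion query (in Tregex, "A dominates B" is written with `<<`) and returns only the leftmost criterion, the NP.

```python
import re

# Hypothetical, hand-written parse of the sample sentence (not real parser output).
# A node is a tuple (label, child, ...); a leaf is a plain token string.
np_months = ("NP", ("DT", "the"), ("JJ", "let"), ("CD", "18"), ("NNS", "months"))
np_visitors = ("NP", ("DT", "the"), ("NNS", "visitors"))
np_prep = ("NP", ("NP", ("NN", "preparation")), ("PP", ("IN", "for"), np_visitors))
vp_inner = ("VP", ("VBN", "transformed"),
            ("PP", ("IN", "over"), np_months),
            ("PP", ("IN", "in"), np_prep))
sample_tree = ("S", ("NP", ("NNP", "Melbourne")),
               ("VP", ("VBZ", "has"), ("VP", ("VBN", "been"), vp_inner)))


def subtrees(node):
    """Yield every labelled node, top-down and left to right."""
    if isinstance(node, str):
        return
    yield node
    for child in node[1:]:
        yield from subtrees(child)


def leaves(node):
    """Return the tokens under a node, in sentence order."""
    if isinstance(node, str):
        return [node]
    tokens = []
    for child in node[1:]:
        tokens.extend(leaves(child))
    return tokens


def find_by_tag(tree, tag):
    """Tag query: the words under every node labelled `tag`."""
    return [" ".join(leaves(n)) for n in subtrees(tree) if n[0] == tag]


def find_tokens(tree, pattern):
    """Token query: every token in which the regex is found (assumed semantics)."""
    return [tok for tok in leaves(tree) if re.search(pattern, tok)]


def find_dominating(tree, tag, pattern):
    """Two criteria: nodes labelled `tag` with a matching token anywhere below."""
    return [" ".join(leaves(n)) for n in subtrees(tree)
            if n[0] == tag and any(re.search(pattern, t) for t in leaves(n))]


print(find_by_tag(sample_tree, "NNS"))
print(find_tokens(sample_tree, r"Melb.?"))
print(find_dominating(sample_tree, "NP", r"18"))
```

# Only the noun phrase containing *18* comes back from the last call, even though the sketch tree holds five NPs, which is the "leftmost criterion" behaviour described above.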
# <markdowncell>

# There are a number of different parsers, and some are better than others:

# <codecell>

quicktree("Melbourne has been transformed over the let 18 months in preparation for the visitors")

# <markdowncell>

# Neither parse is perfect, but the one we just generated has a major flaw: *Melbourne* is parsed as an adverb! Stanford CoreNLP correctly identifies it as a proper noun, and it also does a better job of handling the 'let' mistake.

# <markdowncell>

# *searchtree()* is a tiny function that searches a syntax tree. We'll use the sample sentence and *searchtree()* to practice our Tregex queries. We can feed it either *tags* (S, NP, VBZ, DT, etc.) or *tokens* enclosed in forward slashes.

# <codecell>

# any plural noun
query = r'NNS'
searchtree(melbtree, query)

# <markdowncell>

# Here's some more documentation about Tregex queries:

# <codecell>

HTML('<iframe src=http://nlp.stanford.edu/~manning/courses/ling289/Tregex.html width=700 height=350></iframe>')

# <codecell>

#,,,

# <codecell>

#,,,

# <codecell>

#,,,