Alex-X-W/jettie

Jettie

A lightweight Python information extraction framework, a descendant of NYU's JET IE system.

Motivation

The Tipster architecture is good for:

  • pipelining
  • systematic experiments on IE tasks
  • alignment of annotations, which might facilitate future visualization work.

In this project, we implement a minimal version of the Tipster architecture.

Code Structure

Basic Design

Primary architecture: Tipster

Tipster is implemented in a subpackage of jettie, which is the core that keeps annotations well aligned. Because Python is a dynamically typed language, we are spared much of the effort of writing getter and setter methods.
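To make the alignment idea concrete, here is a minimal sketch of a Tipster-style document, in which annotations are stored as character offsets into the text rather than as copies of it. The names Span, Annotation, Document, annotate and text_of are hypothetical and chosen only for illustration; they are not taken from jettie's own code.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Span:
    """Half-open character range [start, end) into the document text."""
    start: int
    end: int


@dataclass
class Annotation:
    """A typed label attached to a span, with free-form features."""
    ann_type: str
    span: Span
    features: Dict[str, str] = field(default_factory=dict)


@dataclass
class Document:
    """Text plus stand-off annotations that point back into it."""
    text: str
    annotations: List[Annotation] = field(default_factory=list)

    def annotate(self, ann_type: str, start: int, end: int, **features: str) -> Annotation:
        # Reject spans that fall outside the text, so every annotation stays aligned.
        if not 0 <= start <= end <= len(self.text):
            raise ValueError(f"span [{start}, {end}) is outside the text")
        ann = Annotation(ann_type, Span(start, end), dict(features))
        self.annotations.append(ann)
        return ann

    def text_of(self, ann: Annotation) -> str:
        # Recover the covered text from the offsets alone.
        return self.text[ann.span.start:ann.span.end]

    def of_type(self, ann_type: str) -> List[Annotation]:
        # Select annotations by type, e.g. all tokens or all names.
        return [a for a in self.annotations if a.ann_type == ann_type]


doc = Document("Jettie descends from JET.")
name = doc.annotate("name", 21, 24, kind="system")
print(doc.text_of(name))  # prints "JET"
```

Because the attributes are plain fields, no getter or setter methods are needed; the only explicit check is the bounds test that keeps offsets inside the text.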

NLP component interface

For NLP task components to comply with Tipster, a natural approach is to require custom components to provide methods that annotate and align text in the Tipster way. In Java, such a requirement can be expressed through an interface; in Python, we define the methods in a base class and ask each custom component to inherit from it and implement them. For an example, see the BaseTokenizer class defined in ./tokenizer/base_tokenizer.py and its subclass SimpleTokenizer defined in ./demo/tokenizer.py.
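The base-class pattern described above might look roughly like the following sketch. It uses Python's abc module so that a subclass which forgets to implement the required method cannot be instantiated. The class names TokenizerInterface and WhitespaceTokenizer and the method names tokenize and tokens are assumptions made for this illustration; the actual signatures of BaseTokenizer and SimpleTokenizer may differ.

```python
import re
from abc import ABC, abstractmethod
from typing import List, Tuple


class TokenizerInterface(ABC):
    """Hypothetical base class playing the role of a Java interface."""

    @abstractmethod
    def tokenize(self, text: str) -> List[Tuple[int, int]]:
        """Return half-open (start, end) character offsets, one per token."""

    def tokens(self, text: str) -> List[str]:
        # Shared helper: subclasses only supply offsets; the text is sliced here.
        return [text[start:end] for start, end in self.tokenize(text)]


class WhitespaceTokenizer(TokenizerInterface):
    """Concrete component: every run of non-whitespace characters is one token."""

    def tokenize(self, text: str) -> List[Tuple[int, int]]:
        return [m.span() for m in re.finditer(r"\S+", text)]


tokenizer = WhitespaceTokenizer()
print(tokenizer.tokenize("Jettie tokenizes text"))
print(tokenizer.tokens("Jettie tokenizes text"))
```

Returning offsets instead of strings is the design choice that keeps components aligned: any later component can map a token back to its exact position in the original text.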

Tests

We run unit tests with the automatic unit testing tool nose; for an example test case, see /tests/test_tipster.py. Since we will continue to develop and use this code in our own future work, writing automated tests makes our life a little easier and neater.

One can run all the unit tests from the ./jettie/ directory by issuing $ nosetests . and the expected output when all test cases pass should look like ./doc/unittest.png.
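For readers unfamiliar with the layout of such a test, the following is a small sketch of a test case written with the standard unittest module. Saved in a file whose name starts with test_, a class like this should be collected by nosetests and can also be run with python -m unittest. The helper covered_text and the test names are hypothetical and are not taken from /tests/test_tipster.py.

```python
import unittest


def covered_text(text, start, end):
    """Return the substring covered by the half-open offsets [start, end)."""
    return text[start:end]


class TestAlignment(unittest.TestCase):
    def test_offsets_recover_token(self):
        # Offsets 6..11 should select the second word exactly.
        self.assertEqual(covered_text("hello world", 6, 11), "world")

    def test_empty_span_is_empty(self):
        # A zero-length span covers no text.
        self.assertEqual(covered_text("hello world", 3, 3), "")
```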

Distribution and install

We have distributed jettie through pip. To install it, run:

$ pip install jettie

To verify that it installed successfully, open a Python shell and do the following:

>>> import jettie
>>> jettie.woof()
I am a Shiba, and I woof! Woof!
>>>
