
pyoracc


Python tools for working with ORACC/C-ATF files

Depends on PLY, Mako, Multiprocessing and Pytest

Installation

If you don't use pip, you're missing out. Here are installation instructions.

Simply run:

    $ cd pyoracc
    $ git pull origin master
    $ pip install .

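Or install directly from GitHub over the git protocol (GitHub no longer serves unauthenticated git:// URLs, so prefer the HTTPS form that follows):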

$ pip install git+git://github.com/cdli-gh/pyoracc.git 

Or install directly from GitHub over HTTPS:

$ pip install git+https://github.com/cdli-gh/pyoracc.git 

Upgrading

If you have already installed the tool and want to upgrade it:

    $ cd pyoracc
    $ git pull origin master
    $ pip install . --upgrade

Or you can just do

$ pip install git+git://github.com/cdli-gh/pyoracc.git --upgrade

Or you can also do

$ pip install git+https://github.com/cdli-gh/pyoracc.git --upgrade

Usage

To use it:

$ pyoracc --help

*Only files with the .atf extension can be processed.*

To run it on a file:

$ pyoracc -i ./pyoracc/test/data/cdli_atf_20180104.atf -f cdli

For a fresh copy of the CDLI ATF, download the data bundle here: https://github.com/cdli-gh/data/blob/master/cdliatf_unblocked.atf

To run it on an ORACC file:

$ pyoracc -i ./pyoracc/test/data/cdli_atf_20180104.atf -f oracc

To run it on a folder:

$ pyoracc -i ./pyoracc/test/data -f cdli
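
Because only .atf files are processed, it can help to check which files in a folder qualify before starting a long run. The following is a minimal sketch using only the standard library, not part of pyoracc. The function name `list_atf_files` is hypothetical, it matches the lowercase `.atf` suffix exactly, and it looks only at the top level of the folder, since whether pyoracc descends into subfolders is not stated here.

```python
from pathlib import Path


def list_atf_files(folder):
    """Return the .atf files directly inside `folder`, sorted by path.

    Hypothetical helper: directories and files with other extensions
    are skipped; subfolders are not searched.
    """
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix == ".atf"
    )


# Example usage (assumes the repository's test data folder exists):
# for path in list_atf_files("./pyoracc/test/data"):
#     print(path)
```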

To disable segmentation and parse each file as a whole (this will be slow), use the -w/--whole switch:

$ pyoracc -i ./pyoracc/test/data -f cdli -w

To see the console messages of the tool, use the --verbose switch:

$ pyoracc -i ./pyoracc/test/data -f cdli --verbose

To output a summary, run the parser without -w/--whole and pass -s/--summary:

$ pyoracc -i ./pyoracc/test/data/cdli_atf_20180104.atf -f cdli -s [summary path]

Note that the verbose option both prints the log output on the command line and writes it to a parselog.txt file. The verbose output contains the lexical symbols, the parse grammar table and the LR parsing table states.

Note that if you parse a file containing multiple ATF records in whole mode, the parser stops at the first error it meets and reports it. To see a summary of the whole file, run without -w/--whole. In that case you may also use -s/--summary to specify the output path of the summary.

Also note that the first run with any ATF format always displays the parse tables, irrespective of the verbose switch.

If you omit the arguments, the tool prompts for the path and the ATF file type.
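
For scripted runs, the switches described in this section can be assembled programmatically. The following is a minimal sketch, not part of pyoracc: the helper name `build_pyoracc_command` and its parameter names are hypothetical, and it uses only the flags shown above (-i, -f, -w, --verbose, -s). It refuses to combine -w with -s, since the summary is only useful without whole mode.

```python
def build_pyoracc_command(input_path, atf_type, whole=False,
                          verbose=False, summary_path=None):
    """Build an argument list for the pyoracc CLI (hypothetical helper)."""
    if whole and summary_path is not None:
        # The summary is documented as only useful without -w/--whole.
        raise ValueError("-s/--summary is only useful without -w/--whole")
    cmd = ["pyoracc", "-i", str(input_path), "-f", atf_type]
    if whole:
        cmd.append("-w")
    if verbose:
        cmd.append("--verbose")
    if summary_path is not None:
        cmd.extend(["-s", str(summary_path)])
    return cmd


if __name__ == "__main__":
    cmd = build_pyoracc_command("./pyoracc/test/data", "cdli", verbose=True)
    print(" ".join(cmd))
    # To actually run it (requires pyoracc to be installed):
    # import subprocess; subprocess.run(cmd, check=True)
```

Returning a list rather than a single string lets the result be passed to `subprocess.run` without shell quoting concerns.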

Help

$ pyoracc --help
Usage: pyoracc [OPTIONS]

  My Tool does one work, and one work well.

Options:
  -i, --input_path PATH      Input the file/folder name.  [required]
  -f, --atf_type [cdli|atf]  Input the atf file type.  [required]
  -v, --verbose              Enables verbose mode
  -s, --summary              Input the summary path, only useful when run without -w/--whole
  --version                  Show the version and exit.
  --help                     Show this message and exit.

Internal Dev Usage

Development Guideline

  • ORACC-specific ATF changes go in pyoracc/atf/oracc
  • CDLI-specific ATF changes go in pyoracc/atf/cdli
  • Changes common to both ATF formats go in pyoracc/atf/common

To run on a directory:

$ python -m pyoracc.model.corpus ./pyoracc/test/data cdli

To run on an individual file:

$ python -m pyoracc.atf.common.atffile ./pyoracc/test/data/cdli_atf_20180104.atf cdli True

Running Tests

Before running the tests with coverage, install pytest and pytest-cov.

$ py.test --cov=pyoracc --cov-report xml --cov-report html --cov-report annotate --runslow

Before running pycodestyle, install pycodestyle.

$ pycodestyle

API Consumption

from pyoracc.atf.common.atffile import file_process

# pathname: path to an .atf file; atftype: the ATF type (e.g. "cdli"); verbose: True or False
file_process(pathname, atftype, verbose)
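
To process several files through the API, the call above can be wrapped in a loop. The sketch below is an illustration under stated assumptions, not pyoracc's own batch interface: `process_many` is a hypothetical helper, it assumes `file_process` raises an exception when a file fails to parse, and it accepts the processing function as a parameter so it can be tried without pyoracc installed.

```python
def process_many(paths, atftype, verbose=False, process=None):
    """Call `process` on each path; return {path: error message or None}.

    `process` defaults to pyoracc's file_process, imported lazily so the
    helper can be exercised with a stand-in function.
    """
    if process is None:
        from pyoracc.atf.common.atffile import file_process as process
    results = {}
    for path in paths:
        try:
            process(path, atftype, verbose)
            results[path] = None
        except Exception as exc:  # assumption: failures surface as exceptions
            results[path] = str(exc)
    return results


# Example usage (requires pyoracc to be installed):
# report = process_many(["./pyoracc/test/data/cdli_atf_20180104.atf"], "cdli")
```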
