
Copyright 2012-2014 Johns Hopkins University HLTCOE. All rights reserved. This software is released under the 2-clause BSD license. See LICENSE in the project root directory.

Concrete - Python

Python modules and scripts for working with Concrete, an HLT data specification defined using Thrift.

This repository contains the Python classes generated by the Thrift compiler, but not the .thrift definition files that were used to generate these classes. The .thrift definition files can be found in the Concrete-Thrift GitHub repository: https://github.com/hltcoe/concrete-thrift

Requirements

Concrete-Python requires the following:

  • Python >= 2.7
  • 'networkx' Python package
  • 'thrift' Python package >= 0.9.1

You do not need to install the Thrift compiler to use this library.

Installation

You can install Concrete using the pip package manager:

pip install git+https://github.com/hltcoe/concrete-python.git#egg=concrete

or by cloning this repository and running setup.py:

git clone https://github.com/hltcoe/concrete-python.git
cd concrete-python
python setup.py test
python setup.py install

Useful Scripts

The Concrete Python package comes with two scripts.

  • concrete2json.py reads in a Concrete Communication and prints a JSON version of the Communication to stdout. The JSON is "pretty printed" with indentation and whitespace, which makes the JSON easier to read and to use for diffs.

  • validate_communication.py reads in a Concrete Communication file and prints out information about any invalid fields. This script is a command-line wrapper around the functionality in the concrete.validate library.

Run either script with the '-h/--help' flag for details about its command-line arguments.
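To illustrate why indented JSON is easier to diff, here is a minimal sketch using only the standard library json module. It is not how concrete2json.py is implemented: the to_pretty_json helper is hypothetical, the dictionary is a stand-in rather than a real Communication, and sorting the keys is an assumption made here so that the output order is stable.

```python
import json


def to_pretty_json(obj):
    """Serialize obj with indentation, one key per line, keys sorted."""
    return json.dumps(obj, indent=2, sort_keys=True)


# Stand-in data, NOT a real Concrete Communication.
record = {"text": "hello world", "id": "doc-1"}
print(to_pretty_json(record))
```

Because each key lands on its own line, a line-based diff tool reports only the fields that actually changed.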

Using the code in your project

Compiled Python classes end up in the concrete namespace. You can use them by importing them as follows:

from concrete import Communication

foo = Communication()
foo.text = 'hello world'
...

Validating Concrete Communications

The Python version of the Thrift libraries does not validate Thrift objects. You should therefore call the validate_communication() function after reading a Concrete Communication and again before writing one:

from concrete.util import read_communication_from_file
from concrete.validate import validate_communication

comm = read_communication_from_file('tests/testdata/serif_dog-bites-man.concrete')

# Returns True|False, logs details using Python stdlib 'logging' module
validate_communication(comm)
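Since validate_communication() returns a boolean rather than raising, one possible way to enforce the "validate after reading and before writing" rule is a small guard function. The sketch below is an illustration only: validate_or_raise is a hypothetical helper that is not part of this package, and the validator is passed in as a parameter so that the block runs without Concrete installed.

```python
import logging

# Send log records to stderr so that any details a validator logs are visible.
logging.basicConfig(level=logging.DEBUG)


def validate_or_raise(comm, validator):
    """Return comm unchanged if validator(comm) is truthy, else raise ValueError.

    Hypothetical helper, not part of the concrete package.
    """
    if not validator(comm):
        raise ValueError("Communication failed validation; see the log output")
    return comm


# Intended wiring with the functions shown above (left as comments so this
# block does not require the concrete package):
#   comm = read_communication_from_file(path)
#   comm = validate_or_raise(comm, validate_communication)
```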

Thrift fields have three levels of requiredness:

  • explicitly labeled as required
  • explicitly labeled as optional
  • no requiredness label given ("default required")

The Java version of the Thrift libraries will raise an exception if a required field is missing on deserialization or serialization, and will raise an exception if a "default required" field is missing on serialization.

The Python version of the Thrift libraries (as of Thrift 0.9.1) does not perform any validation of Thrift objects on serialization or deserialization. The Python Thrift libraries do provide a validate() function, but it only checks explicitly required fields, not "default required" fields. It also performs only shallow validation: nested data structures are not checked for required fields.

The validate_communication() function recursively checks a Communication object for required fields, and also checks for UUID mismatches.
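The difference between shallow and recursive checking can be sketched with toy classes. Everything below is hypothetical: ToyUUID, ToyCommunication, the REQUIRED attribute, and both functions are stand-ins invented for illustration. They do not reflect the generated Concrete classes or the actual implementation of validate_communication().

```python
class ToyUUID(object):
    REQUIRED = ("uuidString",)  # hypothetical list of required field names

    def __init__(self, uuidString=None):
        self.uuidString = uuidString


class ToyCommunication(object):
    REQUIRED = ("id", "uuid")

    def __init__(self, id=None, uuid=None):
        self.id = id
        self.uuid = uuid


def shallow_missing(obj):
    """Check only the object's own required fields, ignoring nested objects."""
    return [name for name in obj.REQUIRED if getattr(obj, name) is None]


def deep_missing(obj, path="comm"):
    """Check required fields on obj and, recursively, on everything it holds."""
    missing = []
    if isinstance(obj, (list, tuple)):
        for i, item in enumerate(obj):
            missing.extend(deep_missing(item, "%s[%d]" % (path, i)))
    elif hasattr(obj, "REQUIRED"):
        for name in obj.REQUIRED:
            if getattr(obj, name) is None:
                missing.append("%s.%s" % (path, name))
        # Descend into every attribute, in a stable order.
        for name, value in sorted(vars(obj).items()):
            missing.extend(deep_missing(value, "%s.%s" % (path, name)))
    return missing


# The top-level fields are set, but the nested UUID is empty.
comm = ToyCommunication(id="doc-1", uuid=ToyUUID())
print(shallow_missing(comm))
print(deep_missing(comm))
```

The shallow check finds nothing wrong because both top-level fields are present, while the recursive check reports the unset field inside the nested object.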
