pywhip

pywhip is a Python package that validates data against whip specifications, a human- and machine-readable syntax for expressing data specifications.

Check the documentation pages for more information.

Installation

To install pywhip, run this command in your terminal:

pip install pywhip

For more detailed installation instructions, see the documentation pages.

Test pywhip in a Jupyter notebook

Launch a Jupyter notebook on Binder to try out the pywhip package interactively.

Quickstart

To validate a CSV data file with the field headers country, eventDate and individualCount, write whip specifications according to the whip syntax:

specifications = """
    country:
        allowed: [BE, NL]
    eventDate:
        dateformat: '%Y-%m-%d'
        mindate: 2016-01-01
        maxdate: 2018-12-31
    individualCount:
        numberformat: x  # needs to be an integer value
        min: 1
        max: 100
    """
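To make the intent of these specifications concrete, the sketch below restates them as plain Python checks on a single row. It is only an illustration of what the rules are meant to express, not how pywhip or Cerberus implement validation; the `check_row` helper and its messages are hypothetical, and the sample rows are invented.

```python
from datetime import date, datetime


# Hypothetical helper, not part of pywhip: restates the rules above in plain Python.
def check_row(row):
    """Return a list of human-readable problems found in one data row."""
    problems = []

    # country: allowed: [BE, NL]
    if row["country"] not in ("BE", "NL"):
        problems.append("country: value not in [BE, NL]")

    # eventDate: dateformat, mindate and maxdate
    try:
        event_date = datetime.strptime(row["eventDate"], "%Y-%m-%d").date()
    except ValueError:
        problems.append("eventDate: does not match %Y-%m-%d")
    else:
        if not date(2016, 1, 1) <= event_date <= date(2018, 12, 31):
            problems.append("eventDate: outside 2016-01-01 to 2018-12-31")

    # individualCount: integer value between 1 and 100
    try:
        count = int(row["individualCount"])
    except ValueError:
        problems.append("individualCount: not an integer")
    else:
        if not 1 <= count <= 100:
            problems.append("individualCount: outside 1 to 100")

    return problems


# Invented example rows; CSV values arrive as strings.
valid = check_row({"country": "BE", "eventDate": "2017-05-12", "individualCount": "3"})
invalid = check_row({"country": "FR", "eventDate": "12/05/2017", "individualCount": "0"})
```

Here `valid` is an empty list, while `invalid` holds one message per field, since the second row breaks all three specifications.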

To whip your data set, e.g. my_data.csv, pass the data file and the whip specifications to whip_csv:

from pywhip import whip_csv

example = whip_csv("my_data.csv", specifications, delimiter=',')

Then write the output report to an HTML file:

with open("report_example.html", "w") as index_page:
    index_page.write(example.get_report('html'))

The result is an HTML report of the validation. For a more detailed introduction, see the documentation tutorial.
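To try the quickstart without a data set of your own, a small my_data.csv with the three field headers can be generated with Python's standard csv module. This is a minimal sketch: the rows are invented for illustration, and the second one is meant to violate all three specifications.

```python
import csv

# Invented sample rows, for illustration only.
rows = [
    {"country": "BE", "eventDate": "2017-05-12", "individualCount": "3"},
    {"country": "FR", "eventDate": "2015-01-01", "individualCount": "250"},
]

# Write a comma-delimited file with the field headers used in the quickstart.
with open("my_data.csv", "w", newline="") as csv_file:
    writer = csv.DictWriter(
        csv_file,
        fieldnames=["country", "eventDate", "individualCount"],
        delimiter=",",
    )
    writer.writeheader()
    writer.writerows(rows)
```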

Credits

This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.

Data rows are validated using the Cerberus package.