Skip to content

backgroundcheck/jsonmapping

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

46 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

jsonmapping Build Status

To transform flat data structures into nested object graphs matching JSON schema definitions, this package defines a mapping language. It defines how the columns of a source data set (e.g. a CSV file, database table) are to be converted to the fields of a JSON schema.

The format allows mapping nested structures, including arrays. It also supports the application of very basic data transformation steps, such as generating a URL slug or hashing a column value.

Example mapping

The mapping format is independent of any particular JSON schema, such that multiple mappings could be defined for any one particular schema.

{
    "schema": {"$ref": "http://www.popoloproject.com/schemas/person.json"},
    "mapping": {
        "id": {"column": "person_id"},
        "name": {"column": "person_name"},
        "memberships": [{
            "mapping": {
                "role": {"default": "Member of Organization"},
                "organization": {
                    "mapping": {
                        "id": {
                            "columns": ["org_id"],
                            "constant": "default-org"
                        },
                        "name": {
                            "column": "org_name",
                            "constant": "Default Organization",
                            "transforms": ["strip"]
                        }
                    }
                }
            }
        }]
    }
}

This mapping would apply to a four-column CSV file and map it to a set of nested JSON objects (a Popolo person, with a membership in an organization).

Data Transforms

While jsonmapping is not a data cleaning tool, it supports some very basic data transformation operations that can be applied on a particular column or set of columns. These include:

  • coalesce: Select the first non-null value from the list of items.
  • slugify: Transform each string into a URL slug form.
  • join: Merge together the string values of all selected columns.
  • upper: Transform the text to upper case.
  • lower: Transform the text to lower case.
  • strip: Remove leading and trailing whitespace.
  • hash: Generate a SHA1 hash of the given value.

Usage

jsonmapping is available on the Python Package Index:

$ pip install jsonmapping

The library can then be used as follows:

from jsonschema import RefResolver
from jsonmapping import Mapper

# ... load the mapping ...
mapping = load_mapping()
resolver = RefResolver.from_schema(mapping)

# ... grab some data ...
rows = read_csv()
objs = []

# This will transform flat data rows into nested JSON objects:
for obj, err in Mapper.apply_iter(rows, mapping, resolver):
    if err is None:
        objs.append(obj)

Tests

The test suite will usually be executed in it's own virtualenv and perform a coverage check as well as the tests. To execute on a system with virtualenv and make installed, type:

$ make test

About

Transform flat data structures into nested object graphs matching JSON schema definitions.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Python 98.8%
  • Makefile 1.2%