pygeometa is a Python package to generate metadata for geospatial datasets. Metadata content is managed by pygeometa in simple Metadata Control Files (MCF), which consist of 'parameter = value' pairs. pygeometa generates metadata records from MCF files based on the schema specified by the user, such as ISO-19139. pygeometa supports nesting MCF files, which reduces duplication of metadata content common to multiple records and eases maintenance.
- simple configuration: inspired by Python's ConfigParser
- extensible: template architecture allows for easy addition of new metadata formats
- flexible: use as a command-line tool or integrate as a library
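To make the 'parameter = value' idea concrete, here is a minimal sketch that parses MCF-style text with the Python 3 standard library `configparser` module, which is the format the MCF design is inspired by. The section and parameter names (`metadata`, `contact`, `identifier`, and so on) are illustrative assumptions, not the official MCF vocabulary, and this is not how pygeometa itself reads MCF files.

```python
import configparser

# Hypothetical MCF content: the section and parameter names below are
# illustrative assumptions, not the official MCF vocabulary.
SAMPLE_MCF = """
[metadata]
identifier = example-dataset-001
language = en

[contact]
organization = Example Agency
email = data@example.org
"""


def read_mcf(text):
    """Parse MCF-style text into a dict of {section: {parameter: value}}."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {section: dict(parser[section]) for section in parser.sections()}


mcf = read_mcf(SAMPLE_MCF)
print(mcf["metadata"]["identifier"])
```

Refer to the Metadata Control File Reference documentation for the sections and parameters pygeometa actually expects.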
pygeometa is best installed and used within a Python virtualenv.
- Python 2.7 or above. Python 3 is expected to work but is not tested
- Python virtualenv package
Dependencies are listed in requirements.txt. Dependencies are automatically installed during pygeometa's installation.
virtualenv my-env
cd my-env
. bin/activate
git clone https://github.com/laurensdv/pygeometa.git
cd pygeometa
pip install -r requirements.txt
python setup.py build
python setup.py install
# iso19139 (XML) -> geodcat-ap (RDF)
generate_metadata.py --xml=path/to/file.xml # to stdout
generate_metadata.py --xml=path/to/file.xml --output=some_file.rdf # to file
# geodcat-ap (RDF) -> iso19139 (XML)
generate_metadata.py --rdf=path/to/file.rdf # to stdout
generate_metadata.py --rdf=path/to/file.rdf --output=some_file.xml # to file
With XML source files, you can include the --html flag to convert the XML to HTML instead of GeoDCAT-AP RDF.
generate_metadata.py --xml=path/to/file.xml --html # to stdout
generate_metadata.py --xml=path/to/file.xml --html --output=some_file.html # to file
With XML source files, you can also include the --validate flag to check whether the XML is valid against the latest iso19139 schema. If you include a schema parameter, the XML is validated against that supported schema instead. With --validate, the file is not converted to GeoDCAT-AP RDF.
generate_metadata.py --xml=path/to/file.xml --validate # to stdout
generate_metadata.py --xml=path/to/file.xml --validate --output=some_file # to file
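When the command-line tool is driven from another script, it can help to assemble the argument list in one place. The following is a rough sketch that uses only the flags shown above (`--xml`, `--rdf`, `--html`, `--validate`, `--output`). The helper name `build_command` is hypothetical and not part of pygeometa, and the rule that `--html` and `--validate` are rejected for RDF sources is an assumption drawn from the wording above.

```python
def build_command(xml=None, rdf=None, html=False, validate=False, output=None):
    """Assemble a generate_metadata.py argument list from the documented flags.

    Hypothetical helper: it only builds the list, it does not run anything.
    """
    if (xml is None) == (rdf is None):
        raise ValueError("pass exactly one of xml or rdf")
    if rdf is not None and (html or validate):
        # Assumption: --html and --validate are described for XML sources only.
        raise ValueError("--html and --validate apply to XML sources only")

    cmd = ["generate_metadata.py"]
    cmd.append(f"--xml={xml}" if xml is not None else f"--rdf={rdf}")
    if html:
        cmd.append("--html")
    if validate:
        cmd.append("--validate")
    if output is not None:
        cmd.append(f"--output={output}")
    return cmd


command = build_command(xml="path/to/file.xml", html=True, output="some_file.html")
print(" ".join(command))
```

The resulting list can be passed to `subprocess.run(command, check=True)` in an environment where pygeometa is installed.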
Schemas supported by this pygeometa branch:
- iso-19139-to-dcat-ap, tweaked version of the EU ISO19139->GeoDCAT-AP conversion
- iso19139-flanders, updated iso19139 to be compatible with open data in the Belgian/EU region Flanders.
- Local schema, specified with
--schema_local=/path/to/my-schema
Action | Schema Type |
---|---|
iso19139 (XML) -> geodcat-ap (RDF) | xslt |
geodcat-ap (RDF) -> iso19139 (XML) | pygeometadata |
from pygeometa import iso_to_dcat, dcat_to_iso
# default schemas
rdf_output = iso_to_dcat('/path/to/file.xml')
xml_output = dcat_to_iso('/path/to/file.rdf')
# user-defined schemas
rdf_output = iso_to_dcat('/path/to/file.xml', schema_local='/path/to/new-schema.xsl')
xml_output = dcat_to_iso('/path/to/file.rdf', schema_local='/path/to/new-schema')
# validation
from pygeometa.validation.validation import Validators
from lxml import etree
profiles = ["iso19139latest"] # or another profile
xml = '/path/to/file.xml'
v = Validators(profiles)
v_results = v.is_valid(etree.parse(xml))  # etree.parse accepts a path, so no file handle is left open
Workflow to generate metadata XML:
- Install pygeometa
- Create a 'metadata control file' .mcf file that contains metadata information
- Modify the sample.mcf example
- pygeometa supports nesting MCF files, so a single MCF file can hold metadata parameters common to many records (e.g. common contact information)
- Refer to the Metadata Control File Reference documentation
- Run pygeometa for the .mcf file with a specified target metadata schema
generate_metadata.py --mcf=path/to/file.mcf --schema=iso19139 # to stdout
generate_metadata.py --mcf=path/to/file.mcf --schema=iso19139 --output some_file.xml # to file
# to use your own defined schema:
generate_metadata.py --mcf=path/to/file.mcf --schema_local=/path/to/my-schema --output some_file.xml # to file
Schemas supported by pygeometa:
- iso19139, reference
- iso19139-hnap, reference
- iso19139-flanders, updated iso19139 to be compatible with open data in the Belgian/EU region Flanders.
- Local schema, specified with
--schema_local=/path/to/my-schema
from pygeometa import render_template
# default schema
xml_string = render_template('/path/to/file.mcf', schema='iso19139')
# user-defined schema
xml_string = render_template('/path/to/file.mcf', schema_local='/path/to/new-schema')
with open('output.xml', 'w') as ff:
    ff.write(xml_string)
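The same pattern extends to several MCF files at once. The sketch below is one possible way to do that, not part of pygeometa: `render_all` is a hypothetical helper that takes the rendering function as an argument, so in real use you could pass something like `lambda p: render_template(p, schema='iso19139')`. The demo uses a stand-in renderer so that it runs without pygeometa installed.

```python
import tempfile
from pathlib import Path


def render_all(mcf_paths, out_dir, render):
    """Render each MCF with `render` and write <name>.xml into out_dir.

    `render` is any callable that takes an MCF path and returns an XML string.
    Returns the list of files written.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for mcf_path in mcf_paths:
        target = out_dir / (Path(mcf_path).stem + ".xml")
        target.write_text(render(str(mcf_path)), encoding="utf-8")
        written.append(target)
    return written


# Stand-in renderer for the demo; replace with a call to render_template.
def fake_render(mcf_path):
    return "<record source='%s'/>" % Path(mcf_path).name


with tempfile.TemporaryDirectory() as tmp:
    outputs = render_all(["a.mcf", "b.mcf"], tmp, fake_render)
    names = [p.name for p in outputs]
    first = outputs[0].read_text(encoding="utf-8")

print(names)
```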
Setting up a development environment follows the same steps as installing the package. Use a virtualenv, and also install the developer requirements:
pip install -r requirements-dev.txt
Supported metadata schemas are listed in pygeometa/templates/
To add support for a new metadata schema:
cp -r pygeometa/templates/iso19139 pygeometa/templates/new-schema
Then modify the *.j2 files in the new pygeometa/templates/new-schema directory to comply with the new metadata schema.
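The copy step can also be scripted. The following is a minimal sketch using the Python 3 standard library: it copies an existing template directory and lists the `*.j2` files that then need editing. The helper name `scaffold_schema` and the demo file name `main.j2` are placeholders for illustration, not names taken from pygeometa.

```python
import shutil
import tempfile
from pathlib import Path


def scaffold_schema(templates_dir, new_name, base="iso19139"):
    """Copy templates_dir/base to templates_dir/new_name.

    Returns the sorted list of .j2 files in the new directory.
    """
    templates_dir = Path(templates_dir)
    target = templates_dir / new_name
    # copytree raises FileExistsError if the target directory already exists,
    # which protects an existing schema from being overwritten.
    shutil.copytree(templates_dir / base, target)
    return sorted(target.rglob("*.j2"))


# Demo against a throwaway directory; main.j2 is a placeholder file name.
with tempfile.TemporaryDirectory() as tmp:
    base_dir = Path(tmp) / "iso19139"
    base_dir.mkdir()
    (base_dir / "main.j2").write_text("{# template #}", encoding="utf-8")
    to_edit = [p.name for p in scaffold_schema(tmp, "new-schema")]

print(to_edit)
```

In a real checkout, `templates_dir` would be `pygeometa/templates`.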
# via distutils
python setup.py test
# manually
cd tests
python run_tests.py
All bugs, enhancements and issues are managed on GitHub.
This pygeometadata branch intends to make it possible to transform iso19139 -> geodcat-ap and vice versa (maximizing losslessness and validity).
pygeometa originated within an internal project called pygdm, which provided generic geospatial data management functions. pygdm (now at end of life) was used for generating MSC/CMC geospatial metadata. pygeometa was pulled out of pygdm to focus on the core requirement of generating geospatial metadata within a real-time environment.
In 2015 pygeometa was made publicly available in support of the Canadian Treasury Board Policy on Acceptable Network and Device Use.