forked from jmccrae/wn-rdf

WordNet RDF export

License: BSD-3-Clause (LICENSE); license.html is a second license file of unknown type

congwang-ai/wn-rdf

WordNet RDF Framework

The framework consists of the following elements:

  • WNRDF.py: The module for converting from the SQLite database to RDF
  • WNRDFWeb.py: The WSGI interface for rendering pages based on RDF data
  • WNFromRDF.py: Convert the RDF data back into SQLite format
  • WNRDFTest.py: Unit tests for the conversion

Other files are:

  • build_ontology.py: Generates ontology.rdf from the current state of the database
  • footer and header: The nearly-static content returned at the beginning and end of HTML pages generated by WNRDFWeb
  • index.html: The static welcome page of WNRDFWeb (without header/footer)
  • ontology.rdf: The OWL ontology for WN
  • rdf2html.xsl: XSLT for generating HTML from RDF
  • sparql.html: The static page for the SPARQL query interface
  • sparql2html.xsl: XSLT for generating HTML from SPARQL XML results
  • sparql_load.py: Script for generating database for SPARQL queries
  • wnrdf.css: CSS file for the web interface
  • wordnet_3.1+.db: The SQLite database; please symlink it here
  • wordnet.nt.gz: All the RDF data
  • flag/*.gif: Flags used to show language

The following files are not necessary to deploy the web interface:

  • roundtrip.sh: Test if database can be converted to RDF and reloaded into SQL
  • wn_schema.py: Contains data from some of the small tables (e.g., linktype) as Python dicts
  • write_schema.py: Generates wn_schema.py from the SQLite database
  • write_sql_schema.sh: Generates the header (SQL CREATE commands) necessary to create a new SQLite database from an existing database
  • extra_indexes.sql: Generates extra indexes in the SQLite database to speed up page load time

Deployment

The application can be deployed either as a WSGI application or as a standalone server. For WSGI, add the following to the apache2.conf or httpd.conf file:

WSGIScriptAlias /rdf /path/to/WNRDFWeb.py

More details here

Alternatively, start the server in standalone mode, e.g.:

python WNRDFWeb.py -p 8051 
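
For orientation, a single module can support both deployment modes if it exposes a WSGI callable named application (the name mod_wsgi looks for by default) and can also start its own server when given a port. The sketch below illustrates that pattern using only the standard library. It is not the code of WNRDFWeb.py: parse_port, run, the default port, and the response body are all hypothetical.

```python
import argparse
from wsgiref.simple_server import make_server


def application(environ, start_response):
    """Minimal WSGI callable; mod_wsgi looks for this name by default."""
    path = environ.get("PATH_INFO", "/")
    body = ("Requested path: %s\n" % path).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def parse_port(argv):
    """Read a -p option, mirroring `python WNRDFWeb.py -p 8051`."""
    parser = argparse.ArgumentParser()
    # The default value here is an assumption made for this sketch.
    parser.add_argument("-p", dest="port", type=int, default=8051)
    return parser.parse_args(argv).port


def run(argv):
    """Serve `application` as a standalone server until interrupted."""
    port = parse_port(argv)
    server = make_server("localhost", port, application)
    print("Serving on http://localhost:%d/" % port)
    server.serve_forever()
```

Calling run(["-p", "8051"]) would start the standalone server; under Apache, only the application callable is used.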

Requirements

To run this, the following are required:

  • RDFLib
  • LXML
  • rdflib-jsonld

RDFLib and LXML are packaged by most Linux distributions, e.g., in Ubuntu/Debian:

apt-get install python-rdflib python-lxml

rdflib-jsonld can be installed as follows:

git clone https://github.com/RDFLib/rdflib-jsonld.git
cd rdflib-jsonld
sudo python setup.py install 
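
After installation, a quick check that the libraries are importable can save a failed run later. The following is a minimal sketch, assuming the import names rdflib, lxml, and rdflib_jsonld and a Python version that provides importlib.util (3.4 or later); missing_modules is a hypothetical helper, not part of this repository.

```python
import importlib.util

# Import names assumed for the packages named above.
REQUIRED_MODULES = ["rdflib", "lxml", "rdflib_jsonld"]


def missing_modules(names):
    """Return the top-level module names that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing: " + ", ".join(missing))
    else:
        print("All requirements found")
```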

Mappings

All mappings are stored in the mapping folder. To run most of the mapping scripts it is necessary to create the mapping database; this can be done as follows:

gunzip wn20-30.csv.gz wn30-31.csv.gz w3c-wn20.csv.gz
sqlite3 mapping.db < mapping.sql
gzip wn20-30.csv wn30-31.csv w3c-wn20.csv
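
The three commands above unpack the CSV files, build the database, and pack the files again. The same idea can be sketched in Python with a context manager that unpacks the files only for the duration of the build. This is an illustration under assumptions: temporarily_gunzipped and build_mapping_db are hypothetical names, the sqlite3 command-line tool is assumed to be on the PATH, and the commands are assumed to run inside the mapping folder. Unlike gunzip followed by gzip, this variant leaves the .gz files untouched and simply deletes the unpacked copies afterwards.

```python
import contextlib
import gzip
import os
import shutil
import subprocess


@contextlib.contextmanager
def temporarily_gunzipped(gz_paths):
    """Decompress each .gz file next to itself; remove the copies on exit."""
    plain_paths = []
    try:
        for gz_path in gz_paths:
            plain_path = gz_path[:-len(".gz")]
            with gzip.open(gz_path, "rb") as src, open(plain_path, "wb") as dst:
                shutil.copyfileobj(src, dst)
            plain_paths.append(plain_path)
        yield plain_paths
    finally:
        for plain_path in plain_paths:
            os.remove(plain_path)


def build_mapping_db(folder="mapping"):
    """Assumed equivalent of `sqlite3 mapping.db < mapping.sql` run in folder."""
    names = ["wn20-30.csv.gz", "wn30-31.csv.gz", "w3c-wn20.csv.gz"]
    with temporarily_gunzipped([os.path.join(folder, n) for n in names]):
        with open(os.path.join(folder, "mapping.sql"), "rb") as script:
            # The CLI is used because mapping.sql may contain shell dot-commands.
            subprocess.run(["sqlite3", "mapping.db"], stdin=script,
                           cwd=folder, check=True)
```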

The following mappings can be generated and added to the database

All files can be generated either from the appropriate .py script or by running the N-Triples file through WNFromRDF.py (see next section)

Adding Mappings to DB

First, add the extra indexes and tables:

sqlite3 wordnet_3.1+.db < extra_indexes.sql    
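
Where the sqlite3 command-line tool is not available, a SQL file can also be applied through Python's built-in sqlite3 module. The sketch below assumes that extra_indexes.sql contains only plain SQL statements, since executescript does not understand sqlite3 shell dot-commands such as .import; apply_sql_file is a hypothetical helper.

```python
import sqlite3


def apply_sql_file(db_path, sql_path):
    """Execute every statement in sql_path against the database at db_path."""
    with open(sql_path, encoding="utf-8") as handle:
        script = handle.read()
    connection = sqlite3.connect(db_path)
    try:
        connection.executescript(script)
        connection.commit()
    finally:
        connection.close()
```

For example, apply_sql_file("wordnet_3.1+.db", "extra_indexes.sql") would correspond to the command above.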

Then, add each of the mappings as follows:

zcat mapping/omwn.nt.gz | python WNFromRDF.py | sqlite3 wordnet_3.1+.db
zcat mapping/uby.nt.gz | python WNFromRDF.py | sqlite3 wordnet_3.1+.db
zcat mapping/vn.nt.gz | python WNFromRDF.py | sqlite3 wordnet_3.1+.db
zcat mapping/w3c-synsets.nt.gz | python WNFromRDF.py | sqlite3 wordnet_3.1+.db
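
The four pipelines above differ only in the mapping name, so they can be driven from a loop. The following is a rough sketch for a POSIX system with zcat and sqlite3 on the PATH; run_pipeline and mapping_commands are hypothetical helpers, and the mapping names are taken from the commands above.

```python
import subprocess

MAPPINGS = ["omwn", "uby", "vn", "w3c-synsets"]


def run_pipeline(commands):
    """Chain commands like a shell pipeline; return the last command's stdout."""
    procs = []
    prev_stdout = None
    for argv in commands:
        proc = subprocess.Popen(argv, stdin=prev_stdout, stdout=subprocess.PIPE)
        if prev_stdout is not None:
            # Close our copy so the upstream process sees a broken pipe
            # if the downstream process exits early.
            prev_stdout.close()
        prev_stdout = proc.stdout
        procs.append(proc)
    output, _ = procs[-1].communicate()
    for proc in procs:
        proc.wait()
        if proc.returncode != 0:
            raise subprocess.CalledProcessError(proc.returncode, proc.args)
    return output


def mapping_commands(name, db="wordnet_3.1+.db", python="python"):
    """Build the three stages: zcat | python WNFromRDF.py | sqlite3."""
    return [
        ["zcat", "mapping/%s.nt.gz" % name],
        [python, "WNFromRDF.py"],
        ["sqlite3", db],
    ]


def add_all_mappings():
    """Load every mapping in MAPPINGS into the database, one after another."""
    for name in MAPPINGS:
        run_pipeline(mapping_commands(name))
```

Raising on a non-zero exit code stops the loop at the first mapping that fails, rather than silently continuing with a partially loaded database.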

Generating dumps and enabling SPARQL

The file wordnet.nt.gz should be regenerated each time the database is changed. This is done as follows:

python WNRDF.py
gzip wordnet.nt
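
The compression step, plus a quick sanity check of the resulting dump, can be sketched in Python as well. N-Triples stores one triple per line, so counting the non-empty, non-comment lines gives the number of triples. Both gzip_file and count_triples are hypothetical helpers, not part of this repository.

```python
import gzip
import os
import shutil


def gzip_file(path):
    """Compress path to path + '.gz' and remove the original, like gzip."""
    gz_path = path + ".gz"
    with open(path, "rb") as src, gzip.open(gz_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    os.remove(path)
    return gz_path


def count_triples(gz_path):
    """Count non-empty, non-comment lines in a gzipped N-Triples file."""
    count = 0
    with gzip.open(gz_path, "rt", encoding="utf-8") as handle:
        for line in handle:
            stripped = line.strip()
            if stripped and not stripped.startswith("#"):
                count += 1
    return count
```

For example, count_triples(gzip_file("wordnet.nt")) compresses the dump and reports how many triples it contains.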

Once the dump is generated, the SPARQL index in the folder store must be generated as follows:

python sparql_load.py
