Example 1
def main(fp='data/ecartico.nt'):

    # If there was no format issue in the streets data, this function would
    # work. Instead, download the data yourself and point to it:
    # datasets = downloadDatasets(datasets=(GEBOUWEN, PERSONS, WIJKEN))

    dsG = rdflib.Dataset()  # rdflib Dataset
    rdfSubject.db = dsG  # hook onto rdfAlchemy

    TITLE = ["ECARTICO"]
    DESCRIPTION = [
        Literal(
            """Linking cultural industries in the early modern Low Countries, ca. 1475 - ca. 1725. ECARTICO is a comprehensive collection of structured biographical data concerning painters, engravers, printers, book sellers, gold- and silversmiths and others involved in the ‘cultural industries’ of the Low Countries in the sixteenth and seventeenth centuries. As in other biographical databases, users can [search and browse](http://www.vondel.humanities.uva.nl/ecartico/persons/) for data on individuals or make selections of certain types of data. However, ECARTICO also allows users to [visualize and analyze](http://www.vondel.humanities.uva.nl/ecartico/analysis/) data on cultural entrepreneurs and their ‘milieus’.

## Focus on analysis

The focus on analysis sets ECARTICO apart from other (biographical) resources in this field. One of the reasons to start with ECARTICO was that we felt that available resources were primarily designed for storage and retrieval of single data with little ‑ if any ‑ opportunities for aggregation and analysis. As a consequence other resources also offer poor support for modelling social and genealogical networks.

ECARTICO was not designed as an electronic reference work, although it can be used as such. Rather think of ECARTICO as a ‘social medium’ for the cultural industries of the Dutch and Flemish Golden Ages.

## Old and new data

ECARTICO is standing on the shoulders of giants. Much of the data present in ECARTICO is derived from the wealth of biographical and genealogical studies, that has been published over the last centuries. Also much data is derived from original research on primary sources. Many biographical details can be found in ECARTICO, that can not be found anywhere else.

## History

ECARTICO has its roots in the research project [Economic and Artistic Competition in the Amsterdam art market c. 1630-1690: history painting in Amsterdam in Rembrandt's time](http://www.nwo.nl/onderzoek-en-resultaten/onderzoeksprojecten/19/2300136219.html), which was funded by the [Netherlands Organisation for Scientific Research](http://www.nwo.nl/), and headed by Eric Jan Sluijter and Marten Jan Bok. Initially it was intended as a prosopographical research database dealing with history painters in seventeenth century Amsterdam. However, the scope of the database has become much wider because we could build upon data compiled by Pieter Groenendijk for [his lexicon (2006)](http://www.primaverapers.nl/shop/index.php?main_page=product_info&cPath=16_12&products_id=151) of 16th and 17th century visual artists from the Northern and the Southern Netherlands.

During the period 2010-2013 ECARTICO was further expanded within the research project The Cultural Industry of Amsterdam in the Golden Age, which was funded by the [The Royal Netherlands Academy of Arts and Sciences](http://www.knaw.nl/), and headed by Eric Jan Sluijter and Harm Nijboer. As part of this project the scope of ECARTICO has widened to other cultural industries like printing, publishing, sculpture, goldsmithery and theatre.

## Lacunae

Up until now, data entry has been strongly inclined towards the Dutch Republic and with a focus on Amsterdam. Especially the Southern Netherlands are still underrepresented. For instance, data from the Antwerp _Liggeren_ and the Bruges _Memorielijst_ have not been entered systematically, yet.

At this moment ECARTICO is still mostly geared towards visual artists. However we are catching up with publishers and printers at a fast pace.

Do you want to assist in expanding ECARTICO? Please contact us!

## Future development

New data are added on an almost daily base. Meanwhile the technological infrastructure of ECARTICO is kept under continuous review.

Current projects are:

*   Adding data on Dutch printers and publishers, prior to 1720
*   Implementation of revision management
*   Making ECARTICO available as Linked Open Data""",
            lang='en')
    ]

    DATE = Literal(datetime.datetime.now().strftime('%Y-%m-%d'),
                   datatype=XSD.date)  # a date-only value, so xsd:date

    ds = Dataset(
        create.term('id/ecartico/'),
        label=TITLE,
        name=TITLE,
        dctitle=TITLE,
        description=DESCRIPTION,
        dcdescription=DESCRIPTION,
        image=URIRef(
            "http://www.vondel.humanities.uva.nl/ecartico/images/logo.png"),
        url=[URIRef("http://www.vondel.humanities.uva.nl/ecartico/")],
        temporalCoverage=[Literal("1475-01-01/1725-12-31")],
        spatialCoverage=[Literal("The Netherlands")],
        dateModified=DATE,
        dcdate=DATE,
        dcmodified=DATE,
        licenseprop=URIRef("https://creativecommons.org/licenses/by-sa/3.0/"))

    # Add the datasets as separate graphs. Metadata on these graphs is in the
    # default graph.
    guri = create.term('id/ecartico/')

    # download = DataDownload(None,
    #                         contentUrl=URIRef(uri),
    #                         encodingFormat="application/turtle")

    g = rdflib.Graph(identifier=guri)

    g.bind('schema', schema)
    g.bind('foaf', foaf)
    g.bind('dcterms', dcterms)
    g.bind('owl', OWL)
    g.bind('pnv', Namespace('https://w3id.org/pnv#'))
    g.bind(
        'ecartico',
        Namespace('http://www.vondel.humanities.uva.nl/ecartico/lod/vocab/#'))
    g.bind('bio', Namespace('http://purl.org/vocab/bio/0.1/'))
    g.bind('sem', Namespace('http://semanticweb.cs.vu.nl/2009/11/sem/#'))
    g.bind('skos', Namespace('http://www.w3.org/2004/02/skos/core#'))
    g.bind('time', Namespace('http://www.w3.org/2006/time#'))

    g.parse(fp, format='nt')

    dsG.add_graph(g)

    ds.triples = len(g)  # total number of triples in the graph

    dsG.bind('void', void)
    dsG.bind('dcterms', dcterms)
    dsG.bind('schema', schema)

    print("Serializing!")
    dsG.serialize('datasets/ecartico.trig', format='trig')
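
To sanity-check the output, the TriG file can be read back into a fresh Dataset: the named graph should hold the ECARTICO data, while the dataset metadata sits in the default graph. A minimal sketch, assuming the script above has already written datasets/ecartico.trig:

import rdflib

# Re-load the serialized dataset; TriG preserves the named-graph structure
d = rdflib.Dataset()
d.parse('datasets/ecartico.trig', format='trig')

# List each named graph with its identifier and triple count; the
# dataset metadata itself lives in the default graph
for graph in d.graphs():
    print(graph.identifier, len(graph))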
Example 2
def main(fp=None):

    # If there was no format issue in the streets data, this function would
    # work. Instead, download the data yourself and point to it:
    # datasets = downloadDatasets(datasets=(GEBOUWEN, PERSONS, WIJKEN))

    dsG = rdflib.Dataset()  # rdflib Dataset
    rdfSubject.db = dsG  # hook onto rdfAlchemy

    TITLE = ["STCN"]
    DESCRIPTION = [
        Literal(
            """STCN Golden Agents dump. schema:PublicationEvents explicitly typed.""",
            lang='en'),
        Literal(
            """De STCN is de retrospectieve nationale bibliografie van Nederland voor de periode 1540-1800; ook opgenomen zijn summiere beschrijvingen van Nederlandse (post-)incunabelen.
Het bestand staat als wetenschappelijk bibliografisch onderzoeksinstrument aan iedereen ter beschikking. Uiteindelijk zal de STCN beschrijvingen bevatten van alle boeken die tot 1800 in Nederland zijn verschenen, en van alle boeken die buiten Nederland in de Nederlandse taal zijn gepubliceerd.

De STCN wordt samengesteld op basis van collecties in binnen- en buitenland. Alle boeken zijn met het boek in de hand (in autopsie) beschreven. De omvang van het bestand was begin 2018 ca. 210.000 titels in ongeveer 550.000 exemplaren. Het bestand wordt dagelijks uitgebreid.

De STCN wordt samengesteld en uitgegeven door de Koninklijke Bibliotheek.""",
            lang='nl')
    ]

    DATE = Literal(datetime.datetime.now().strftime('%Y-%m-%d'),
                   datatype=XSD.date)  # a date-only value, so xsd:date

    ds = Dataset(
        create.term('id/stcn/'),
        label=TITLE,
        name=TITLE,
        dctitle=TITLE,
        description=DESCRIPTION,
        dcdescription=DESCRIPTION,
        image=URIRef(
            "https://www.kb.nl/sites/default/files/styles/indexplaatje_conditional/public/stcn-00.jpg"
        ),
        url=[
            URIRef(
                "https://www.kb.nl/organisatie/onderzoek-expertise/informatie-infrastructuur-diensten-voor-bibliotheken/short-title-catalogue-netherlands-stcn"
            )
        ],
        temporalCoverage=[Literal("1540-01-01/1800-12-31")],
        spatialCoverage=[Literal("The Netherlands")],
        dateModified=DATE,
        dcdate=DATE,
        dcmodified=DATE,
        licenseprop=URIRef(
            "https://creativecommons.org/publicdomain/zero/1.0/"))

    # Add the datasets as separate graphs. Metadata on these graphs is in the
    # default graph.
    guri = create.term('id/stcn/')

    # download = DataDownload(None,
    #                         contentUrl=URIRef(uri),
    #                         encodingFormat="application/turtle")

    g = rdflib.Graph(identifier=guri)

    g.bind('schema', schema)
    g.bind('foaf', foaf)
    g.bind('dcterms', dcterms)
    g.bind('owl', OWL)
    g.bind('pnv', Namespace('https://w3id.org/pnv#'))
    g.bind('kbdef', Namespace('http://data.bibliotheken.nl/def#'))
    # g.bind('bio', Namespace('http://purl.org/vocab/bio/0.1/'))
    g.bind('sem', Namespace('http://semanticweb.cs.vu.nl/2009/11/sem/#'))
    g.bind('skos', Namespace('http://www.w3.org/2004/02/skos/core#'))
    g.bind('time', Namespace('http://www.w3.org/2006/time#'))

    turtlefiles = [
        os.path.join('data/stcn', i) for i in os.listdir('data/stcn')
        if i.endswith('.ttl')
    ]
    for n, f in enumerate(turtlefiles, 1):
        print(f"Parsing {n}/{len(turtlefiles)}\t {f}")
        g.parse(f, format='turtle')

    dsG.add_graph(g)

    ds.triples = len(g)  # total number of triples in the graph

    dsG.bind('void', void)
    dsG.bind('dcterms', dcterms)
    dsG.bind('schema', schema)

    print("Serializing!")
    dsG.serialize('datasets/stcn.trig', format='trig')
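
A single malformed Turtle file makes the loop above abort halfway through. A variant that tolerates bad input, collecting failures instead of raising, could look like this (a sketch, assuming the same data/stcn directory layout as above):

import os
import rdflib

g = rdflib.Graph()
failed = []

turtlefiles = [
    os.path.join('data/stcn', i) for i in os.listdir('data/stcn')
    if i.endswith('.ttl')
]
for n, f in enumerate(turtlefiles, 1):
    print(f"Parsing {n}/{len(turtlefiles)}\t {f}")
    try:
        g.parse(f, format='turtle')
    except Exception as e:  # malformed Turtle raises a parser error
        failed.append((f, e))

# Report what was skipped so the dump can be repaired and re-run
for f, e in failed:
    print("Failed:", f, e)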
Example 3
print("Printing Dataset Length:")
print("---")
print(len(d))
print("---")
print()
print()

# Query one graph in the Dataset for all its triples
# we should get
"""
(rdflib.term.URIRef('http://example.com/subject-z'), rdflib.term.URIRef('http://example.com/predicate-z'), rdflib.term.Literal('Triple Z'))
(rdflib.term.URIRef('http://example.com/subject-x'), rdflib.term.URIRef('http://example.com/predicate-x'), rdflib.term.Literal('Triple X'))
"""
print("Printing all triple from one Graph in the Dataset:")
print("---")
for triple in d.triples((None, None, None, graph_1)):
    print(triple)
print("---")
print()
print()

# Query the union of all graphs in the dataset for all triples
# we should get nothing:
"""
"""
# By default a Dataset has no union graph (its default_union property is False)
print("Attempt #1 to print all triples in the Dataset:")
print("---")
for triple in d.triples((None, None, None, None)):
    print(triple)
print("---")
Example 4
from rdflib import Dataset, URIRef, Literal, Namespace

d = Dataset(default_union=True)  # query the union of all named graphs
d.parse("../data/items.trig", format="trig")
print(len(d))
# find subjects that still use http:// for linked.data.gov.au (cleaned up below)
for s, p, o in d.triples((None, None, None)):
    if "http://linked.data.gov.au" in str(s):
        print(s)
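
# The filter above only prints matches. A sketch of the actual cleanup:
# rewrite http:// subjects to https:// per named graph, collecting first
# because a graph should not be modified while it is being iterated.
for g in d.graphs():
    to_fix = [
        (s, p, o) for s, p, o in g
        if str(s).startswith("http://linked.data.gov.au")
    ]
    for s, p, o in to_fix:
        g.remove((s, p, o))
        g.add((URIRef(str(s).replace("http://", "https://", 1)), p, o))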
Example 5
def main(fp='data/onstage.nt'):

    # If there was no format issue in the streets data, this function would
    # work. Instead, download the data yourself and point to it:
    # datasets = downloadDatasets(datasets=(GEBOUWEN, PERSONS, WIJKEN))

    dsG = rdflib.Dataset()  # rdflib Dataset
    rdfSubject.db = dsG  # hook onto rdfAlchemy

    TITLE = ["ONSTAGE"]
    DESCRIPTION = [
        Literal(
            """Online Datasystem of Theatre in Amsterdam from the Golden Age to the present. This is your address for questions about the repertoire, performances, popularity and revenues of the cultural program in Amsterdam’s public theatre during the period 1637 - 1772. All data provided in this system links to archival source materials in contemporary administration.

The [Shows page](http://www.vondel.humanities.uva.nl/onstage/shows/) gives you access by date to chronological lists of the theater program, and the plays staged per day. At the [Plays page](http://www.vondel.humanities.uva.nl/onstage/plays/) you have access to the repertoire by title, and for each play you will find its performances and revenues throughout time. At the [Persons page](http://www.vondel.humanities.uva.nl/onstage/persons/) you can access the data for playwrights, actors and actresses, and translators involved in the rich national and international variety of the Amsterdam Theater productions.

Go see your favorite play!""",
            lang='en')
    ]

    DATE = Literal(datetime.datetime.now().strftime('%Y-%m-%d'),
                   datatype=XSD.date)  # a date-only value, so xsd:date

    ds = Dataset(
        create.term('id/onstage/'),
        label=TITLE,
        name=TITLE,
        dctitle=TITLE,
        description=DESCRIPTION,
        dcdescription=DESCRIPTION,
        image=URIRef(
            "http://www.vondel.humanities.uva.nl/onstage/images/logo.png"),
        url=[URIRef("http://www.vondel.humanities.uva.nl/onstage/")],
        temporalCoverage=[Literal("1637-01-01/1772-12-31")],
        spatialCoverage=[Literal("Amsterdam")],
        dateModified=DATE,
        dcdate=DATE,
        dcmodified=DATE,
        licenseprop=URIRef(
            "https://creativecommons.org/publicdomain/zero/1.0/"))

    # Add the datasets as separate graphs. Metadata on these graphs is in the
    # default graph.
    guri = create.term('id/onstage/')

    # download = DataDownload(None,
    #                         contentUrl=URIRef(uri),
    #                         encodingFormat="application/turtle")

    g = rdflib.Graph(identifier=guri)

    g.bind('schema', schema)
    g.bind('foaf', foaf)
    g.bind('dcterms', dcterms)
    g.bind('owl', OWL)
    g.bind('pnv', Namespace('https://w3id.org/pnv#'))
    g.bind(
        'onstage',
        Namespace('http://www.vondel.humanities.uva.nl/onstage/lod/vocab/#'))
    g.bind('bio', Namespace('http://purl.org/vocab/bio/0.1/'))
    g.bind('sem', Namespace('http://semanticweb.cs.vu.nl/2009/11/sem/#'))
    g.bind('skos', Namespace('http://www.w3.org/2004/02/skos/core#'))
    g.bind('time', Namespace('http://www.w3.org/2006/time#'))

    g.parse(fp, format='nt')

    dsG.add_graph(g)

    ds.triples = len(g)  # total number of triples in the graph

    dsG.bind('void', void)
    dsG.bind('dcterms', dcterms)
    dsG.bind('schema', schema)

    print("Serializing!")
    dsG.serialize('datasets/onstage.trig', format='trig')
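
Because the metadata lands in the default graph and the ONSTAGE data in the named graph, a SPARQL query on the Dataset without a GRAPH clause only sees the metadata. A small check along those lines, run against the serialized file (a sketch; it assumes the schema prefix bound above resolves to http://schema.org/ and that the Dataset class writes the title to schema:name):

import rdflib

d = rdflib.Dataset()
d.parse('datasets/onstage.trig', format='trig')

# With default_union off, SPARQL's default graph is the Dataset's default
# graph, so only the metadata triples match here
q = """
SELECT ?dataset ?name WHERE {
    ?dataset <http://schema.org/name> ?name .
}
"""
for row in d.query(q):
    print(row.dataset, row.name)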
Example 6
def main():

    # If there was no format issue in the streets data, this function would
    # work. Instead, download the data yourself and point to it:
    # datasets = downloadDatasets(datasets=(GEBOUWEN, PERSONS, WIJKEN))
    datasets = [
        ('https://adamlink.nl/data/rdf/streets', 'data/adamlinkstraten.ttl'),
        ('https://adamlink.nl/data/rdf/buildings',
         'data/adamlinkgebouwen.ttl'),
        ('https://adamlink.nl/data/rdf/districts', 'data/adamlinkbuurten.ttl'),
        ('https://adamlink.nl/data/rdf/persons', 'data/adamlinkpersonen.ttl')
    ]

    dsG = rdflib.Dataset()  # rdflib Dataset
    rdfSubject.db = dsG  # hook onto rdfAlchemy

    TITLE = ["Adamlink"]
    DESCRIPTION = [
        Literal(
            """Adamlink, een project van [Stichting AdamNet](http://www.adamnet.nl), wil Amsterdamse collecties verbinden en als LOD beschikbaar maken.

Om collecties te verbinden hebben we identifiers ([URIs](https://nl.wikipedia.org/wiki/Uniform_resource_identifier)) voor concepten als straten, personen en gebouwen nodig. Vaak zijn die al beschikbaar, bijvoorbeeld in de [BAG](https://nl.wikipedia.org/wiki/Basisregistraties_Adressen_en_Gebouwen), [RKDartists](https://rkd.nl/nl/explore/artists) of [Wikidata](https://www.wikidata.org).

Hier voegen we onze eigen Adamlink URIs aan die identifiers toe. Niet omdat we die beter vinden dan BAG, RKDartists of Wikidata, maar omdat bepaalde concepten - verdwenen straten bijvoorbeeld - niet in genoemde authority sets terug te vinden zijn. En omdat we op Adamlink allerlei naamvarianten van concepten bijeen kunnen brengen.

We proberen Adamlink als hub laten fungeren, door bijvoorbeeld bij een straat naar zowel BAG als Wikidata te verwijzen. Regelmatig nemen we data eerst op Adamlink op, bijvoorbeeld alle geportretteerden die we in de beeldbank van het Stadsarchief tegenkomen, om die personen vervolgens (zowel scriptsgewijs als handmatig) te verbinden met bestaande authority sets als Wikidata, Ecartico of RKDartists.

Maakt en publiceert u data met (historische) straat-, gebouw- of persoonsnamen? Gebruik dan altijd een identifier die door zoveel mogelijk anderen ook gebruikt wordt. U heeft dan toegang tot alle andere informatie die over zo'n concept beschikbaar is, zoals naamsvarianten of de locatie of de tijd waarin het concept leefde of bestond. En u verbindt uw data ook met de collecties van Amsterdamse erfgoedinstellingen.""",
            lang='nl'),
        Literal("Reference data for Amsterdam collections.", lang='en')
    ]
    DATE = Literal(datetime.datetime.now().strftime('%Y-%m-%d'),
                   datatype=XSD.date)  # a date-only value, so xsd:date

    ds = Dataset(create.term('id/adamlink/'),
                 label=TITLE,
                 name=TITLE,
                 dctitle=TITLE,
                 description=DESCRIPTION,
                 dcdescription=DESCRIPTION,
                 image=URIRef("https://adamlink.nl/img/footerimg.jpg"),
                 url=[URIRef("https://www.adamlink.nl/")],
                 temporalCoverage=[Literal("1275-10-27/..")],
                 spatialCoverage=[Literal("Amsterdam")],
                 dateModified=DATE,
                 dcdate=DATE,
                 dcmodified=DATE)

    subdatasets = []
    totaltriples = 0  # running total over all subgraphs

    # Add the datasets as separate graphs. Metadata on these graphs is in the
    # default graph.
    for uri, fp in datasets:

        graphtype = uri.replace(PREFIX, '')
        guri = create.term('id/adamlink/' + graphtype + '/')

        TITLE = [f"Adamlink {graphtype.title()}"]
        DESCRIPTION = [
            Literal(
                f"Data over {graphtype} uit Adamlink - Referentiedata voor Amsterdamse collecties.",
                lang='nl'),
            Literal(
                f"Data on {graphtype} from Adamlink - Reference data for Amsterdam collections.",
                lang='en')
        ]

        download = DataDownload(None,
                                contentUrl=URIRef(uri),
                                encodingFormat="application/turtle")

        subds = Dataset(guri,
                        label=TITLE,
                        name=TITLE,
                        dctitle=TITLE,
                        description=DESCRIPTION,
                        dcdescription=DESCRIPTION,
                        url=[URIRef("https://www.adamlink.nl/")],
                        temporalCoverage=[Literal("1275-10-27/..")],
                        spatialCoverage=[Literal("Amsterdam")],
                        distribution=[download])

        # Add data to the respective graph
        print("Parsing", uri)
        subgraph = rdflib.Graph(identifier=guri)
        subgraph.parse(fp, format='turtle')

        dsG.add_graph(subgraph)

        # Record the triple count now, while this subgraph is still in scope
        subds.triples = len(subgraph)
        totaltriples += len(subgraph)

        subdatasets.append(subds)

    print("Adding more meta data and dataset relations")
    for subds in subdatasets:
        subds.isPartOf = ds
        subds.inDataset = ds

        subds.triples = sum(1 for i in subgraph.subjects())

    ds.hasPart = subdatasets
    ds.subset = subdatasets

    # The parent dataset spans all subgraphs, so report their combined size
    ds.triples = totaltriples

    dsG.bind('void', void)
    dsG.bind('dcterms', dcterms)
    dsG.bind('schema', schema)

    print("Serializing!")
    dsG.serialize('datasets/adamlink.trig', format='trig')
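
With the triple counts recorded per subgraph, the serialized file can be cross-checked by re-parsing it and comparing each named graph's actual size against the count stored in the default graph. A sketch, assuming the Dataset class's triples attribute is mapped to void:triples (as the void binding above suggests):

import rdflib
from rdflib import Namespace

VOID = Namespace('http://rdfs.org/ns/void#')

d = rdflib.Dataset()
d.parse('datasets/adamlink.trig', format='trig')

# The declared count lives in the default graph; the default graph itself
# also appears in this loop, with no declared count
for g in d.graphs():
    declared = d.value(g.identifier, VOID.triples)
    print(g.identifier, 'actual:', len(g), 'declared:', declared)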