forked from informagi/REL

REL: Radboud Entity Linker

REL is a modular Entity Linking package that can either be integrated into existing pipelines or be used as an API. The name REL has several meanings. It first calls to mind "relation", a fitting name for the problems this package tackles. Additionally, in Dutch a 'rel' means a disturbance of the public order, which is exactly what we aim to achieve with the release of this package.

Setup API

This section explains how to use our API. The steps are obtaining an API key and querying the API. Please note that we do not currently require a key; key management is left for future work.

Obtaining a key

Not necessary at this point in time; please continue to the next step.

Querying our API

Users may access our API with the example script below. For entity linking (EL), leave the spans field empty. To run entity disambiguation (ED) only, fill the spans field with tuples of two integers: the starting position and the length of each mention, in that order.

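import requests

# Public REL API endpoint and port.
IP_ADDRESS = "http://gem.cs.ru.nl/api"
PORT = "80"
text_doc = "If you're going to try, go all the way - Charles Bukowski"

# Example EL: leave "spans" empty so that mentions are detected automatically.
el_document = {
    "text": text_doc,
    "spans": []
}

# Example ED: each span is (start position, length) of a mention in the text.
ed_document = {
    "text": text_doc,
    "spans": [(41, 16)]
}

# Post the ED example; pass el_document instead for EL.
API_result = requests.post("{}:{}".format(IP_ADDRESS, PORT), json=ed_document).json()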
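
Span offsets for ED are easy to get wrong when counted by hand. The sketch below derives (start, length) tuples from mention strings with plain `str.find`. The helper names `find_spans` and `build_document` are hypothetical and not part of REL; the payload shape simply mirrors the example above.

```python
def find_spans(text, mentions):
    """Return (start, length) tuples for the first occurrence of each mention.

    Hypothetical helper, not part of REL. Mentions that do not occur in the
    text are skipped.
    """
    spans = []
    for mention in mentions:
        start = text.find(mention)
        if start != -1:
            spans.append((start, len(mention)))
    return spans


def build_document(text, mentions=None):
    """Build a request payload: empty spans for EL, explicit spans for ED."""
    spans = find_spans(text, mentions) if mentions else []
    return {"text": text, "spans": spans}


text_doc = "If you're going to try, go all the way - Charles Bukowski"
ed_document = build_document(text_doc, ["Charles Bukowski"])
el_document = build_document(text_doc)
print(ed_document["spans"])  # [(41, 16)]
```

Note that `str.find` only locates the first occurrence, so a mention that appears several times in the text would need its other offsets added separately.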

Setup package

The following installation and download steps focus on local usage of our package. If you wish to use our API instead, see the section above.

Installation

Please run the following command in a terminal to install REL:

pip install git+https://github.com/informagi/REL
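
After installation, a quick check that the package is visible to the current interpreter can save debugging time later. The sketch below assumes the package is importable under the module name `REL`, which is inferred from the repository name and is not stated above; `is_installed` is a hypothetical helper.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(module_name) is not None


# "REL" as the import name is an assumption based on the repository name.
if is_installed("REL"):
    print("REL is available")
else:
    print("REL not found; re-run the pip command above")
```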

Download

The files used for this project fall into three categories. The first is a generic set of documents and embeddings used throughout the project; it includes the GloVe embeddings used by Le et al. and the unprocessed datasets that were used to train the ED model. The second and third categories are Wikipedia-corpus-related files, which in our case originate from either a 2014 or a 2019 corpus. Alternatively, users may use their own corpus, for which we refer to the tutorials.

Download generic files

Download Wikipedia corpus (2014)

Download Wikipedia corpus (2019)
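
Once downloaded, the three categories can be kept side by side under one base folder. The sketch below reports which of them are present. The folder names `generic`, `wiki_2014` and `wiki_2019` and the helper `check_data_folders` are assumptions for illustration, so adjust them to match the extracted archives.

```python
from pathlib import Path

# Assumed folder names, one per download category; adjust to your layout.
EXPECTED_FOLDERS = ("generic", "wiki_2014", "wiki_2019")


def check_data_folders(base_dir, expected=EXPECTED_FOLDERS):
    """Map each expected folder name to True if it exists under base_dir."""
    base = Path(base_dir)
    return {name: (base / name).is_dir() for name in expected}


if __name__ == "__main__":
    for name, present in check_data_folders(".").items():
        print(f"{name}: {'found' if present else 'missing'}")
```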

Tutorials

To promote usage of this package, we developed several tutorials. If you simply want to use our API, see the section above. If you feel a tutorial is missing or unclear, please create an issue, which is much appreciated :)! The first two tutorials are for users who want to use our package for EL/ED with the data files that we provide. The remaining tutorials are optional and are for users who wish to, e.g., train their own embeddings.

  1. How to get started (project folder and structure).
  2. End-to-End Entity Linking.
  3. Evaluate on GERBIL.
  4. Deploy REL for a new Wikipedia corpus:
    1. Extracting a new Wikipedia corpus and creating a p(e|m) index.
    2. Training your own Embeddings.
    3. Generating training, validation and test files.
    4. Training your own Entity Disambiguation model.
  5. Reproducing our results.
  6. REL as systemd service.

Cite

How to cite us.

Contact

Please email your questions or comments to Mick van Hulst.

Acknowledgements

Our thanks go out to the authors who open-sourced their code, enabling us to create this package, which we hope will be of service to many.
