JMendes1995/Time-Matters-Query

Time-Matters-Query

Time-Matters-Query is the result of research conducted by Ricardo Campos during his PhD at the University of Porto. It builds on the Time-Matters algorithm, which was originally implemented in C#. The current version is available as a Python package, developed by Jorge Mendes under the supervision of Professor Ricardo Campos as part of the Final Project of the Computer Science degree at the Polytechnic Institute of Tomar, Portugal.

What is Time-Matters-Query?

Time-Matters-Query is a Python package that scores the relevance of temporal expressions found (through the Time-Matters package) within a set of texts. To obtain these texts, users query a search system. The current version retrieves results from Arquivo.pt, the Portuguese web archive, and support for other systems can be added.

Where can I find Time-Matters-Query?

Time-Matters-Query is available as a standalone installation on GitHub and as an API [http://time-matters-query.inesctec.pt/api].

How to Install Time Matters Query

Install Time-Matters-Query library

pip install git+https://github.com/LIAAD/Time-Matters-Query.git

Install External Dependencies

You will need nltk. Install it from the command line with the following command:

pip install nltk

Then open your Python interpreter and run the following code (on Linux, you can set download_dir to /home/nltk_data):

import nltk
nltk.download('punkt', download_dir='c:/nltk_data')

More about this here

Time-Matters-Query rests on the extraction of relevant keywords and temporal expressions found in the text.

For the former (the extraction of relevant keywords), we rely on the YAKE! keyword extractor.

pip install git+https://github.com/LIAAD/yake

For the latter (the extraction of temporal expressions), there are two options:

The first is an internal rule-based approach built with regular expressions. The second is a Python wrapper for the well-known HeidelTime temporal tagger.
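To give a feel for what a regex-based approach to temporal expressions looks like, here is a minimal sketch. It is not the py_rule_based implementation: the pattern, the name DATE_PATTERN and the function extract_dates are hypothetical and cover only a few numeric date formats.

```python
import re

# Hypothetical pattern for illustration only; py_rule_based defines its own rules.
DATE_PATTERN = re.compile(
    r"\b(?:"
    r"\d{4}-\d{2}-\d{2}"        # ISO-style dates such as 2011-03-11
    r"|\d{1,2}/\d{1,2}/\d{4}"   # slash-separated dates such as 11/03/2011
    r"|(?:1[0-9]|20)\d{2}"      # standalone years from 1000 to 2099
    r")\b"
)


def extract_dates(text):
    """Return the date-like strings matched in text, in order of appearance."""
    return DATE_PATTERN.findall(text)


if __name__ == "__main__":
    sample = "The 2011-03-11 earthquake struck Japan; rebuilding lasted until 2015."
    print(extract_dates(sample))  # ['2011-03-11', '2015']
```

A rule set like this needs only Python itself, whereas HeidelTime requires Java and Perl.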

To use the Time-Matters-Query package, install the following packages:

pip install git+https://github.com/JMendes1995/py_rule_based
pip install git+https://github.com/JMendes1995/py_heideltime

You should also have the Java JDK and Perl installed on your machine, since HeidelTime depends on them (neither is needed if you plan to use only the rule-based approach).
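One quick way to see whether these prerequisites are reachable is to check that both executables are on your PATH. The sketch below uses only the Python standard library. The function name check_heideltime_prerequisites is hypothetical and not part of any of the packages above, and the check assumes that being on PATH is what matters for your setup.

```python
import shutil


def check_heideltime_prerequisites(commands=("java", "perl")):
    """Return a dict mapping each command name to True if it is found on PATH."""
    return {name: shutil.which(name) is not None for name in commands}


if __name__ == "__main__":
    for name, found in check_heideltime_prerequisites().items():
        print(f"{name}: {'found' if found else 'NOT found on PATH'}")
```

If either command is reported as missing, follow the platform-specific steps for your operating system and run the check again in a new terminal session.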

Windows users

To install the Java JDK, begin by downloading it here. Once it is installed, don't forget to add its path to the environment variables. Under User variables for Administrator, add JAVA_HOME as the Variable name and the path (e.g., C:\Program Files\Java\jdk-12.0.2\bin) as the Variable value. Then, under System variables, edit the Path variable and append the same path (e.g., ;C:\Program Files\Java\jdk-12.0.2\bin) to the end of the variable value.

For Perl, we recommend that you download and install the following distribution. Once it is installed, don't forget to restart your PC.

Note that Perl doesn't need to be installed if you are using Anaconda instead of a plain Python distribution.

Linux users

Perl usually comes preinstalled on Linux, so you don't need to install it.

If your user does not have execute permission on the Python lib folder, run the following command:

sudo chmod 111 /usr/local/lib//dist-packages/py_heideltime/HeidelTime/TreeTaggerLinux/bin/*

How to use Time-Matters-Query to query Arquivo.pt

Users can either issue a query or provide a URL in which to look for information. To make this happen, we use the textsearch and versionHistory features provided by Arquivo.pt (a description of both APIs is available here).
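The two request types can be sketched as follows. This is not the Time-Matters-Query API. It is a standalone illustration that assumes the public Arquivo.pt endpoint https://arquivo.pt/textsearch with the parameters q, versionHistory and maxItems, and a JSON response whose results sit under a response_items key. Please check these names against the Arquivo.pt API description before relying on them. All function names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed public endpoint; verify against the Arquivo.pt API documentation.
ARQUIVO_TEXTSEARCH = "https://arquivo.pt/textsearch"


def build_query_url(query, max_items=10):
    """Build a request URL that searches archived pages for a free-text query."""
    return f"{ARQUIVO_TEXTSEARCH}?{urlencode({'q': query, 'maxItems': max_items})}"


def build_version_history_url(url, max_items=10):
    """Build a request URL that lists the archived versions of a given URL."""
    params = {"versionHistory": url, "maxItems": max_items}
    return f"{ARQUIVO_TEXTSEARCH}?{urlencode(params)}"


def fetch_items(request_url, timeout=30):
    """Fetch a request URL and return the list under 'response_items' (assumed key)."""
    with urlopen(request_url, timeout=timeout) as response:
        payload = json.load(response)
    return payload.get("response_items", [])


if __name__ == "__main__":
    # Only the URLs are built here; call fetch_items(...) to contact Arquivo.pt.
    print(build_query_url("Dilma Rousseff", max_items=5))
    print(build_version_history_url("http://www.publico.pt", max_items=5))
```

Keeping URL construction separate from the network call makes it easy to inspect the request before sending it, and to swap in another search system later.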

We highly recommend this Python Notebook if you want to experiment with the standalone version of Time-Matters-Query.
