hbredin/Evaluation2015

This repository provides evaluation and submission scripts for the MediaEval Person Discovery task.

Installation

git clone https://github.com/MediaevalPersonDiscoveryTask/evaluation.git
cd evaluation
pip install -r requirements.txt

Evaluation metric

The official evaluation metric is the Evidence-weighted Mean Average Precision (or EwMAP).
A detailed description can be found in the wiki of this repository.

We provide a Python implementation of EwMAP.
This implementation will be used for the final ranking of submissions.
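
For intuition only, here is a minimal sketch of the metric, under the assumption that each query's average precision (AP) is weighted by the correctness of the evidence returned for that query, and that the result is averaged over all queries. The function and variable names (ewmap, ranked, relevant, correctness) are hypothetical; the authoritative definition is in the wiki and the reference implementation is evaluation.py.

# Illustrative sketch only -- NOT the reference implementation.
# Assumed definition: EwMAP = mean over queries of correctness(q) * AP(q).

def average_precision(ranked_shots, relevant_shots):
    """Standard average precision for one query."""
    if not relevant_shots:
        return 0.0
    hits, score = 0, 0.0
    for rank, shot in enumerate(ranked_shots, start=1):
        if shot in relevant_shots:
            hits += 1
            score += hits / float(rank)
    return score / len(relevant_shots)

def ewmap(ranked, relevant, correctness):
    """ranked, relevant and correctness are dicts indexed by query;
    correctness[q] is 1.0 if the evidence submitted for q is correct, else 0.0."""
    queries = list(ranked)
    total = sum(correctness[q] * average_precision(ranked[q], relevant[q])
                for q in queries)
    return total / len(queries)

To run the official implementation on the provided samples: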

$ python evaluation.py --queries=samples/queries.lst \
                       samples/dev.test2.shot \
                       samples/dev.test2.ref \
                       samples/dev.test2.eviref \
                       samples/dev.test2.label \
                       samples/dev.test2.evidence
EwMAP = 51.39 %  # <-- official evaluation metric (higher is better)
MAP = 51.77 %    # <-- standard mean average precision (higher is better)
C = 58.75 %      # <-- evidence correctness (higher is better)

The --queries option points to the list of queries; the positional arguments are, in order:

  • samples/dev.test2.shot: reference list of shots
  • samples/dev.test2.ref: label reference
  • samples/dev.test2.eviref: evidence reference
  • samples/dev.test2.label: label hypothesis
  • samples/dev.test2.evidence: evidence hypothesis

More information about file formats can be found in the wiki.

Submission

Each team must submit exactly one primary run, following strict "no supervision" constraints described in the private task wiki.
Additionally, each team is allowed to submit up to four contrastive runs.
The official ranking will be based on primary runs.

We provide a Python script to manage your submissions (create, list, and delete them).

$ python submission.py --help

Changelog

Version 0.2 (2015-06-08)

  • feat: submission script

Version 0.1 (2015-05-27)

  • first version

Contribute

Feel free to contribute to the evaluation tool, or to share implementations in other languages, through GitHub pull requests. We will gladly add them to this repository.
