
Ontology-driven Event Type Classification in Images

This is the official GitHub page for the paper:

Eric Müller-Budack, Matthias Springstein, Sherzod Hakimov, Kevin Mrutzek, and Ralph Ewerth: "Ontology-driven Event Type Classification in Images". In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), pages 2928-2938, IEEE, Virtual Conference, 2021.

The paper is available on:

Further information can be found on the EventKG website: http://eventkg.l3s.uni-hannover.de/VisE

Content

Setup

We provide three different ways to set up the project. The results can be reproduced using the Singularity setup. The Singularity image is built with an optimized PyTorch implementation on Arch Linux, which we used for training and testing.

The other two setups, using a virtual environment or Docker, produce results on our test sets that slightly differ from those reported in the paper (deviation of around 0.1%).

Setup with Singularity (for Reproducibility)

To install Singularity, please follow the instructions at: https://sylabs.io/guides/3.6/admin-guide/installation.html

Download our Singularity image from: link (file size: 5 GB)

To run code using Singularity, please run:

singularity exec \
  -B </PATH/TO/REPOSITORY>:/src \
  --nv </PATH/TO/SINGULARITY/IMAGE>.sif \
  bash

cd /src

Setup with Virtual Environment

Please run the following command to set up the project in your (virtual) environment:

pip install -r requirements.txt

NOTE: This setup produces slightly different results (deviation of around 0.1%) during testing. To fully reproduce our results, we provide a Singularity image, which is a copy of our training and testing environment and uses a highly optimized PyTorch implementation.

Setup with Docker

We provide a Dockerfile to execute our code. You can build the Docker image with:

docker build <PATH/TO/REPOSITORY> -t <DOCKER_NAME>

To run the container please use:

docker run \
  --volume <PATH/TO/REPOSITORY>:/src \
  --shm-size=256m \
  -u $(id -u):$(id -g) \
  -it <DOCKER_NAME> bash 

cd /src

NOTE: This setup produces slightly different results (deviation of around 0.1%) during testing. To fully reproduce our results, we provide a Singularity image, which is a copy of our training and testing environment and uses a highly optimized PyTorch implementation.

Download Ontology, Dataset and Models

You can automatically download the files (ontologies, models, etc.) that are required for inference and testing with the following command:

python download_resources.py

The files will be stored in a folder called resources/ relative to the repository path.

Models

We provide the trained models for the following approaches:

  • Classification baseline (denoted as C): link
  • Best ontology-driven approach using the cross-entropy loss (denoted as CO_cel): link
  • Best ontology-driven approach using the cosine similarity loss (denoted as CO_cos): link

The performance of these models in terms of top-k accuracy, Jaccard similarity coefficient (JSC), and cosine similarity (CS) on the VisE-Bing and VisE-Wiki test sets, obtained with the provided Singularity image, is listed below:


VisE-Bing

Model     Top-1   Top-3   Top-5   JSC    CS
C          77.4    89.8    93.6   84.7   87.7
CO_cel     81.5    91.8    94.3   87.5   90.0
CO_cos     81.9    90.8    93.2   87.9   90.4


VisE-Wiki

Model     Top-1   Top-3   Top-5   JSC    CS
C          61.7    74.6    79.2   72.7   77.8
CO_cel     63.4    74.7    78.8   73.9   78.7
CO_cos     63.5    74.3    78.8   74.1   79.0

Inference

To apply our models to an image or a list of images, please execute the following command:

python infer.py -c </path/to/model.yml> -i </path/to/image(s)>

If you followed the instructions in Download Ontology, Dataset and Models, the model config is located in resources/VisE-D/models/<modelname>.yml relative to the repository path.

Optional parameters: By default, the multiplied values of the leaf node probability and subgraph cosine similarity are used to convert the subgraph vector into a leaf node vector (details are presented in Section 4.2.3 of the paper). An example invocation follows the parameter list below.

--batch_size <int> specifies the batch size (default 16)

--num_predictions <int> sets the number of top predictions printed on the console (default 3)

--s2l_strategy [leafprob, cossim, leafprob*cossim] specifies the strategy used to retrieve the leaf node vector from a subgraph vector (default leafprob*cossim)
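For example, a call that sets these parameters explicitly could look as follows. The config filename CO_cos.yml is only an assumed placeholder; use whichever .yml file download_resources.py placed in resources/VisE-D/models/ on your system:

# The config filename below is a placeholder; adapt it to the .yml file
# in resources/VisE-D/models/ on your system.
python infer.py \
    -c resources/VisE-D/models/CO_cos.yml \
    -i /path/to/images/ \
    --batch_size 32 \
    --num_predictions 5 \
    --s2l_strategy "leafprob*cossim"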

Test

This step requires downloading the test images of the VisE-Bing or VisE-Wiki dataset. You can run the following command to download the images automatically:

python download_images.py -d </path/to/dataset.jsonl> -o </path/to/output/root_directory/>

If you followed the instructions in Download Ontology, Dataset and Models, the dataset is located in resources/VisE-D/<datasetname>.jsonl and the model config in resources/VisE-D/models/<modelname>.yml relative to the repository path.

Optional parameters:

-t <int> sets the number of parallel threads (default 32)

-r <int> sets the number of retries to download an image (default 5)

--max_img_dim <int> sets the size of the longer image dimension (default 512)

NOTE: This step also allows you to download the training and validation images in case you want to build your own models.
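For example, downloading the test images with fewer threads and more retries could look as follows. The .jsonl filename is an assumed placeholder; use the test set file that download_resources.py placed in resources/VisE-D/:

# The dataset filename below is a placeholder; adapt it to the test set
# .jsonl in resources/VisE-D/ on your system.
python download_images.py \
    -d resources/VisE-D/VisE-Bing_test.jsonl \
    -o /path/to/image/root_directory/ \
    -t 16 \
    -r 10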


After downloading the test images you can calculate the results using the following command:

python test.py \
    -c </path/to/model.yml> \
    -i </path/to/image/root_directory> \
    -t </path/to/testset.jsonl> \
    -o </path/to/output.json>

Optional parameters: By default, the multiplied values of the leaf node probability and subgraph cosine similarity are used to convert the subgraph vector into a leaf node vector (details are presented in Section 4.2.3 of the paper). An example invocation follows the parameter list below.

--batch_size <int> specifies the batch size (default 16)

--s2l_strategy [leafprob, cossim, leafprob*cossim] specifies the strategy used to retrieve the leaf node vector from a subgraph vector (default leafprob*cossim)
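Putting the pieces together, a full test run could look as follows. The config and test set filenames are assumed placeholders; adapt them to the files that download_resources.py placed in resources/VisE-D/:

# Config and test set filenames below are placeholders; adapt them to the
# files in resources/VisE-D/ on your system.
python test.py \
    -c resources/VisE-D/models/CO_cel.yml \
    -i /path/to/image/root_directory \
    -t resources/VisE-D/VisE-Wiki_test.jsonl \
    -o /path/to/output.json \
    --batch_size 16 \
    --s2l_strategy "leafprob*cossim"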

VisE-D: Visual Event Classification Dataset

The Visual Event Classification Dataset (VisE-D) is available on: https://data.uni-hannover.de/de/dataset/vise

You can automatically download the dataset by following the instructions in Download Ontology, Dataset and Models. To download the images from the provided URLs, please run the following command:

python download_images.py -d </path/to/dataset.jsonl> -o </path/to/output/root_directory/>

Optional parameters:

-t <int> sets the number of parallel threads (default 32)

-r <int> sets the number of retries to download an image (default 5)

--max_img_dim <int> sets the size of the longer image dimension (default 512)
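For example, downloading the dataset images with more threads and a larger maximum image dimension could look as follows. The .jsonl filename is an assumed placeholder; use the dataset file that download_resources.py placed in resources/VisE-D/:

# The dataset filename below is a placeholder; adapt it to the .jsonl file
# in resources/VisE-D/ on your system.
python download_images.py \
    -d resources/VisE-D/dataset.jsonl \
    -o /path/to/output/root_directory/ \
    -t 64 \
    --max_img_dim 1024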

VisE-O: Visual Event Ontology

In Section 3.2 of the paper, we have presented several methods to create an Ontology for newsworthy event types. Statistics are presented in Table 1 of the paper.

Different versions of the Visual Event Ontology (VisE-O) can be downloaded here: link

Furthermore, you can explore the Ontologies using the following links:

  • Initial Ontology (result of Section 3.2.2): explore
  • Disambiguated Ontology (result of Section 3.2.3): explore
  • Refined Ontology (result of Section 3.2.4): explore

USAGE: After opening an Ontology, the Leaf Event Nodes (blue), Branch Event Nodes (orange), and Root Node (yellow) as well as their Relations are displayed. By clicking on a specific Event Node, additional information such as the Wikidata ID and the related child (Incoming) and parent (Outgoing) nodes is shown. In addition, the search bar can be used to directly access a specific Event Node.

Benchmark Ontologies

In order to evaluate the presented ontology-driven approach on other benchmark datasets, we have manually linked the classes of the Web Images for Event Recognition (WIDER), Social Event Dataset (SocEID), and Rare Event Dataset (RED) to the Wikidata knowledge base as described in Section 5.3.3. The resulting Ontologies for these datasets can be downloaded and explored here:

Supplemental Material

Detailed information on the sampling strategy used to gather event images, statistics for the training and testing datasets presented in Section 3.3, and results using different inference strategies (Section 4.2.3) are available in vise_supplemental.pdf.

LICENSE

This work is published under the GNU GENERAL PUBLIC LICENSE Version 3, 29 June 2007. For details please check the LICENSE file in the repository.
