Skip to content

anukat2015/AIR

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

89 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Lerot: an Online Learning to Rank Framework

This project is designed to run experiments on online learning to rank methods for information retrieval, implementations of predecessors of Lerot can be found here: http://ilps.science.uva.nl/resources/online-learning-framework . A paper describing Lerot can found here: http://www.anneschuth.nl/wp-content/uploads/2013/09/cikm-livinglab-2013-lerot.pdf . Below is a short summary of its prerequisites, how to run an experiment, and possible extensions.

Prerequisites

  • Python (2.6, 2.7)
  • PyYaml
  • Numpy
  • Scipy
  • Celery (only for distributed runs)
  • Gurobi (only for OptimizedInterleave)

All prerequisites (except for Celery and Gurobi) are included in the academic distribution of Enthought Python, e.g., version 7.1.

Installation

Install the prerequisites plus Lerot as follows:

$ pip install PyYAML numpy scipy celery
$ git clone https://bitbucket.org/ilps/lerot.git
$ cd lerot
$ python setup.py install

Running experiments

  1. prepare data in svmlight format, e.g., download the MQ2007 (see next section on Data) :

    $ mkdir data
    $ wget http://research.microsoft.com/en-us/um/beijing/projects/letor/LETOR4.0/Data/MQ2007.rar -O data/MQ2007.rar
    $ unrar x data/MQ2007.rar data/
  2. prepare a configuration file in yml format, e.g., starting from the template below, store as config/experiment.yml (or simply use config/config.yml instead ) :

    training_queries: data/MQ2007/Fold1/train.txt
    test_queries: data/MQ2007/Fold1/test.txt
    feature_count: 46
    num_runs: 1
    num_queries: 10
    query_sampling_method: random
    output_dir: outdir
    output_prefix: Fold1
    user_model: environment.CascadeUserModel
    user_model_args:
        --p_click 0:0.0,1:0.5,2:1.0
        --p_stop 0:0.0,1:0.0,2:0.0
    system: retrieval_system.ListwiseLearningSystem
    system_args:
        --init_weights random
        --sample_weights sample_unit_sphere
        --comparison comparison.ProbabilisticInterleave
        --delta 0.1
        --alpha 0.01
        --ranker ranker.ProbabilisticRankingFunction
        --ranker_arg 3
        --ranker_tie random
    evaluation:
        - evaluation.NdcgEval
  3. run the experiment using python:

    $ python src/scripts/learning-experiment.py -f config/experiment.yml
  4. summarize experiment outcomes:

    $ python src/scripts/summarize-learning-experiment.py --fold_dirs outdir

    Arbitrarily many folds can be listed per experiment. Results are aggregated over runs and folds. The output format is a simple text file that can be further processed using e.g., gnuplot. The columns are: mean_offline_perf stddev_offline_perf mean_online_perf stddev_online_perf

Data

Lerot acceptes data formatted in the SVMlight (see http://svmlight.joachims.org/) format. You can download learning to rank data sets here:

Note that Lerot reads from both plain text and text.gz files.

Extensions

The code can easily be extended with new learning and/or feedback mechanisms for future experiments. The most obvious points for extension are:

  1. comparison - extend ComparisonMethod to add new interleaving or inference methods; existing methods include balanced interleave, team draft, and probabilistic interleave.
  2. retrieval_system - extend OnlineLearningSystem to add a new mechanism for learning from click feedback. New implementations need to be able to provide a ranked list for a given query, and ranking solutions should have the form of a vector.

License

This program is free software: you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.

You should have received a copy of the GNU Lesser General Public License along with this program. If not, see http://www.gnu.org/licenses/.

Citation

If you use Lerot to produce results for your scientific publication, please refer to this paper: :

@inproceedings{schuth_lerot_2013,
title = {Lerot: an Online Learning to Rank Framework},
author = {A. Schuth, K. Hofmann, S. Whiteson, M. de Rijke},
url = {http://www.anneschuth.nl/wp-content/uploads/2013/09/cikm-livinglab-2013-lerot.pdf},
year = {2013},
booktitle = {Living Labs for Information Retrieval Evaluation workshop at CIKM’13.}
}

Publications -----------Lerot has been used in numerous publication, including these:

    1. Hofmann, A. Schuth, S. A. Whiteson, M. de Rijke (2013): Reusing Historical Interaction Data for Faster Online Learning to Rank for IR. In: Proceeding of the sixth ACM international conference on Web Search and Data Mining, 2013.
    1. Chuklin, A. Schuth, K. Hofmann, P. Serdyukov, M. de Rijke (2013): Evaluating Aggregated Search Using Interleaving. In: Proceedings of the International Conference on Information and Knowledge Management, 2013.
    1. Schuth, F. Sietsma, S. Whiteson, M. de Rijke (2014): Optimizing Base Rankers Using Clicks: A Case Study using BM25. In: 36th European Conference on Information Retrieval (ECIR’14), 2014.
    1. Hofmann, A. Schuth, A. Bellogin, M. de Rijke (2014): User Behavior and Bias in Click-Based Recommender Evaluation*. In: 36th European Conference on Information Retrieval (ECIR’14), 2014.

A paper describing Lerot is published in the living labs workshop at CIKM’13: A. Schuth, K. Hofmann, S. Whiteson, M. de Rijke (2013): Lerot: an Online Learning to Rank Framework. In: Living Labs for Information Retrieval Evaluation workshop at CIKM’13., 2013.

Acknowledgements --------------The development of Lerot is partially supported by the EU FP7 project LiMoSINe (http://www.limosine-project.eu).

About

Kognetics -> Advanced Information Retrieval - Information Extraction; Lerot: An Online Learning to Rank Framework

Resources

License

LGPL-3.0, GPL-3.0 licenses found

Licenses found

LGPL-3.0
COPYING.LESSER
GPL-3.0
COPYING

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Python 98.4%
  • Other 1.6%