GitHub - hurshprasad/active-learning-elo-letor: Active Learning for Learning to Rank (LETOR)

Project for UCL Information Retrieval 2016 Learning to Rank (LETOR)

Implementation of Active Learning for Ranking through Expected Loss Optimization

We compare two LETOR models, AdaRank and LambdaMART, and then evaluate another approach to LETOR: active learning through Expected Loss Optimization (ELO).
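To make the ELO idea concrete, here is a rough sketch of a query-level expected DCG loss. It assumes an ensemble of rankers has already scored every document of a query, and it uses the common gain 2^s - 1 with a log2 rank discount. The function names and the shape of `ensemble_scores` are illustrative assumptions, not the code in active_learning/elo_active_learning.py.

```python
import numpy as np


def best_dcg(gains, k=10):
    """DCG@k of the best possible ordering for the given per-document gains."""
    top = np.sort(np.asarray(gains, dtype=float))[::-1][:k]
    discounts = 1.0 / np.log2(np.arange(2, top.size + 2))
    return float(np.sum(top * discounts))


def expected_dcg_loss(ensemble_scores, k=10):
    """Estimate the expected DCG loss of one query.

    ensemble_scores: array of shape (n_models, n_docs) holding each model's
    predicted relevance grade per document (an assumed input format).
    """
    gains = 2.0 ** np.asarray(ensemble_scores, dtype=float) - 1.0
    # Average of the best DCG each model believes is achievable.
    mean_of_best = np.mean([best_dcg(g, k) for g in gains])
    # Best DCG achievable when ranking by the ensemble's mean gain.
    best_of_mean = best_dcg(gains.mean(axis=0), k)
    return float(mean_of_best - best_of_mean)


# Models that agree give zero loss; disagreement gives a positive loss.
agree = np.array([[2, 1, 0], [2, 1, 0]])
disagree = np.array([[2, 0], [0, 2]])
loss_agree = expected_dcg_loss(agree)
loss_disagree = expected_dcg_loss(disagree)
```

Under this sketch, the queries with the largest estimated loss would be the ones proposed for labelling next.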

How to Run the code

Active Learning (ELO)

- Download the pre-processed data from this link (file ELO_ACTIVE_LEARNING_PRE_PROCESSED_DATA.zip).
- Place the zip file in your project directory under data/MQ2016/active_learning/pre_processed/
- The code uses the Python scikit-learn (sklearn) machine learning library, numpy, and cPickle (a Python 2 standard-library module).
- Run the Python file active_learning/elo_active_learning.py
- We used PyCharm, which set up the module paths (see the screen-shot GIF, RecordingWorkingELO.gif).
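The placement step above can be sketched as a small helper. This is a Python 3 convenience sketch only; `place_archive` and its arguments are hypothetical names and are not part of the repository.

```python
import shutil
from pathlib import Path

# Target directory and archive name as given in the steps above.
PRE_PROCESSED_DIR = Path("data/MQ2016/active_learning/pre_processed")
ARCHIVE_NAME = "ELO_ACTIVE_LEARNING_PRE_PROCESSED_DATA.zip"


def place_archive(downloaded_zip, project_root="."):
    """Copy the downloaded zip into the pre-processed data directory.

    Hypothetical helper: creates the directory if needed and returns
    the path of the copied archive.
    """
    target_dir = Path(project_root) / PRE_PROCESSED_DIR
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / ARCHIVE_NAME
    shutil.copy(str(downloaded_zip), str(target))
    return target
```

For example, `place_archive("/path/to/download.zip")` run from the project root would leave the archive where the steps above expect it.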

RankLib Models (AdaRank, LambdaMart)

The following runs AdaRank (-ranker 3) on our dataset; change -ranker to 6 to run LambdaMART.

$ java -jar bin/RankLib.jar -train ../data/MQ2016/base1024/Fold1/train.txt -test ../data/MQ2016/active_learning/test.txt -validate ../data/MQ2016/base1024/Fold1/vali.txt -ranker 3 -metric2t DCG@10
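When switching between the two models often, the command can be assembled programmatically. The sketch below only builds the argument list; `ranklib_command` and `RANKER_IDS` are illustrative names, and the paths are the ones from the command above.

```python
# Ranker ids as used in the command above: 3 = AdaRank, 6 = LambdaMART.
RANKER_IDS = {"AdaRank": 3, "LambdaMART": 6}


def ranklib_command(ranker, train, validate, test,
                    metric="DCG@10", jar="bin/RankLib.jar"):
    """Build the RankLib argument list for the chosen ranker."""
    return [
        "java", "-jar", jar,
        "-train", train,
        "-test", test,
        "-validate", validate,
        "-ranker", str(RANKER_IDS[ranker]),
        "-metric2t", metric,
    ]


cmd = ranklib_command(
    "LambdaMART",
    train="../data/MQ2016/base1024/Fold1/train.txt",
    validate="../data/MQ2016/base1024/Fold1/vali.txt",
    test="../data/MQ2016/active_learning/test.txt",
)
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the ranklib directory, assuming Java is installed.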

Data

The data is MQ2007.
It is segmented into the following folders, representing record sizes 2^[9 10 11 12 13 14 15 16], for NDCG@10 comparison with ELO Active Learning:
- base512
- base1024
- base2048
- base4096
- base8192
- base16384
- base32768
- base65536
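The folder names follow directly from the exponents 9 to 16. The sketch below derives them and shows one possible way to cut a subset of a given size; taking the first N records is an assumption made for illustration, since the actual segmentation procedure is not described here.

```python
from itertools import islice

# Record sizes 2^9 .. 2^16, matching the folder list above.
SIZES = [2 ** exponent for exponent in range(9, 17)]


def folder_name(size):
    """Folder name used for a subset of the given size, e.g. base512."""
    return "base%d" % size


def first_records(lines, size):
    """Return the first `size` records from an iterable of lines.

    Assumption: a subset is simply a prefix of the full file.
    """
    return list(islice(lines, size))


folders = [folder_name(size) for size in SIZES]
```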
Data Description (further reading)

Folds	Training Set	Validation Set	Test Set
Fold1	{S1,S2,S3}	S4	S5
Fold2	{S2,S3,S4}	S5	S1
Fold3	{S3,S4,S5}	S1	S2
Fold4	{S4,S5,S1}	S2	S3
Fold5	{S5,S1,S2}	S3	S4
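The table is a rotation of the five partitions S1 to S5: each fold trains on three consecutive partitions, validates on the next, and tests on the last. A minimal sketch that regenerates it (`letor_folds` is an illustrative name):

```python
def letor_folds(n_parts=5):
    """Rebuild the fold table by rotating the partitions S1..Sn."""
    parts = ["S%d" % (i + 1) for i in range(n_parts)]
    folds = {}
    for i in range(n_parts):
        rotated = parts[i:] + parts[:i]
        folds["Fold%d" % (i + 1)] = {
            "train": rotated[:-2],      # first n-2 partitions
            "validation": rotated[-2],  # next partition
            "test": rotated[-1],        # last partition
        }
    return folds


folds = letor_folds()
```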

Frameworks

RankLib
Add ranklib/bin/RankLib.jar to CLASSPATH
Command Line Parameters
Running RankLib from the command line or terminal:
$ java -jar bin/RankLib.jar -train ../data/MQ2008/Fold1/train.txt -test ../data/MQ2008/Fold1/test.txt -validate ../data/MQ2008/Fold1/vali.txt -ranker 6 -metric2t NDCG@10 -metric2T ERR@10 -save mymodel.txt
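In the command above, -metric2t selects the metric optimised during training, -metric2T the metric reported on the test set, and -save the file the trained model is written to. For reference, a minimal sketch of NDCG@k is given below. It assumes the common gain 2^rel - 1 and a log2 rank discount; RankLib's own implementation may differ in details, so values should not be expected to match its output exactly.

```python
import math


def dcg_at_k(labels, k=10):
    """DCG@k for relevance labels listed in ranked order."""
    return sum((2 ** rel - 1) / math.log2(rank + 2)
               for rank, rel in enumerate(labels[:k]))


def ndcg_at_k(labels, k=10):
    """NDCG@k: DCG@k divided by the DCG@k of the ideal ordering."""
    ideal = dcg_at_k(sorted(labels, reverse=True), k)
    return dcg_at_k(labels, k) / ideal if ideal > 0 else 0.0


perfect = ndcg_at_k([2, 1, 0])   # labels already in ideal order
reversed_ = ndcg_at_k([0, 1, 2])  # worst ordering of the same labels
```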
Letor Framework
$ git clone https://bitbucket.org/ilps/lerot.git
$ cd lerot
$ pip install -r requirements.txt

Folder Structure

.  
├── data 
|	 ├── MQ2016  						# segmented MQ2007 data
|	 │   ├── S1.txt
|	 │   ├── S2.txt
|	 │   ├── S3.txt
|	 │   ├── S4.txt
|	 │   ├── S5.txt
|	 │   ├── base512					# segmented data
|	 │   ├── base1024
|	 │   ├── base2048
|	 │   ├── base4096
|	 │   ├── base8192
|	 │   ├── base16384
|	 │   ├── base32768
|	 │   ├── base65536
|	 │   └── active_learning/*		# all pre-processed data
├── literature  
├── poster  
├── ranklib  
├── report  
├── results
└── active_learning					# source code for active learning
	├── __init__.py
	├── constants.py  
	├── elo_active_learning.py
	├── pre_processing.py
	└── util.py
