Simple and performant implementations of learner performance prediction algorithms:
- Performance Factors Analysis (PFA)
- DAS3H
- Deep Knowledge Tracing (DKT)
- Self-Attentive Knowledge Tracing (SAKT)
Create a new conda environment, install PyTorch, then install the remaining requirements:
conda create -n learner-performance-prediction python=3.7
conda activate learner-performance-prediction
conda install pytorch==1.2.0 torchvision==0.4.0 -c pytorch
pip install -r requirements.txt
The code supports the following datasets:
- ASSISTments 2009-2010 (assistments09)
- ASSISTments 2012-2013 (assistments12)
- ASSISTments 2015 (assistments15)
- ASSISTments Challenge 2017 (assistments17)
- Bridge to Algebra 2006-2007 (bridge_algebra06)
- Algebra I 2005-2006 (algebra05)
- Spanish (spanish)
- Statics (statics)
Dataset | # Users | # Items | # Skills | # Interactions | Mean # skills/item | Timestamps | Median length |
---|---|---|---|---|---|---|---|
assistments09 | 3,241 | 17,709 | 124 | 278,868 | 1.20 | No | 35 |
assistments12 | 29,018 | 53,086 | 265 | 2,711,602 | 1.00 | Yes | 49 |
assistments15 | 14,567 | 100 | 100 | 658,887 | 1.00 | No | 20 |
assistments17 | 1,708 | 3,162 | 102 | 942,814 | 1.23 | Yes | 441 |
bridge_algebra06 | 1,146 | 129,263 | 493 | 1,817,476 | 1.01 | Yes | 1,362 |
algebra05 | 574 | 173,113 | 112 | 607,025 | 1.36 | Yes | 574 |
spanish | 182 | 409 | 221 | 578,726 | 1.00 | No | 1,924 |
statics | 282 | 1,223 | 98 | 189,297 | 1.00 | No | 635 |
For your convenience, the preprocessed datasets are in the data/
folder. You do NOT need to preprocess the datasets yourself.
If you want to reproduce the preprocessing, download the data from one of the links above and:
- for an ASSISTments dataset, place the main file under data/<dataset codename>/data.csv
- for a KDDCup dataset, place the main file under data/<dataset codename>/data.txt
- for the Spanish dataset, place the two data files under data/<dataset codename>/(unknown)

Then run:
python prepare_data.py --dataset <dataset codename> --remove_nan_skills
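The --remove_nan_skills flag drops interactions whose skill label is missing. A minimal pandas sketch of that preprocessing step (the column names and schema here are illustrative, not necessarily the repository's actual ones):

```python
import pandas as pd

# Toy interaction log; the real datasets have many more columns.
df = pd.DataFrame({
    "user_id": [1, 1, 2, 2, 3],
    "item_id": [10, 11, 10, 12, 11],
    "skill_id": [5.0, None, 5.0, 7.0, None],
    "correct": [1, 0, 1, 1, 0],
})

# Drop rows with a missing skill label (what --remove_nan_skills does).
df = df.dropna(subset=["skill_id"]).reset_index(drop=True)

# Re-index users, items and skills to contiguous integers starting at 0.
for col in ["user_id", "item_id", "skill_id"]:
    df[col] = df[col].astype("category").cat.codes

print(df)
```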
To encode a sparse feature matrix with the specified features:
- Item Response Theory (IRT): -i
- PFA: -s -sc -w -a
- DAS3H: -i -s -sc -w -a -tw
- Best logistic regression features (Best-LR): -i -s -ic -sc -tc -w -a
python encode.py --dataset <dataset codename> <feature flags>
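The resulting X-<feature suffix>.npz file holds a sparse interactions-by-features matrix. A minimal illustration of that kind of encoding, one-hot item features (-i) stacked next to one-hot skill features (-s), built with scipy.sparse (this is a sketch, not encode.py's actual feature construction):

```python
import numpy as np
from scipy import sparse

# Toy interaction log: one (item, skill) pair per interaction.
items = np.array([0, 1, 0, 2])
skills = np.array([0, 0, 1, 1])
n_items, n_skills = 3, 2

rows = np.arange(len(items))
ones = np.ones(len(items))

# One-hot item block and one-hot skill block, concatenated column-wise.
X_item = sparse.csr_matrix((ones, (rows, items)), shape=(len(items), n_items))
X_skill = sparse.csr_matrix((ones, (rows, skills)), shape=(len(items), n_skills))
X = sparse.hstack([X_item, X_skill]).tocsr()

# Same .npz container format that train_lr.py loads (suffix name illustrative).
sparse.save_npz("X-is.npz", X)
print(X.shape)  # (4, 5)
```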
To train a logistic regression model with a sparse feature matrix encoded through encode.py:
python train_lr.py --X_file data/<dataset codename>/X-<feature suffix>.npz --dataset <dataset codename>
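The core idea behind the logistic regression step, fitting on the sparse feature matrix and evaluating AUC on held-out interactions, can be sketched with scikit-learn on synthetic data (illustrative only; train_lr.py's exact split and options may differ):

```python
import numpy as np
from scipy import sparse
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

# Synthetic sparse features and binary correct/incorrect labels.
X = sparse.random(2000, 50, density=0.1, random_state=0, format="csr")
w = rng.normal(size=50)
y = (X @ w + rng.normal(scale=0.5, size=2000) > 0).astype(int)

X_train, X_test = X[:1500], X[1500:]
y_train, y_test = y[:1500], y[1500:]

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
auc = roc_auc_score(y_test, clf.predict_proba(X_test)[:, 1])
print(f"test AUC: {auc:.2f}")
```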
To train a DKT model:
python train_dkt2.py --dataset <dataset codename>
To train a SAKT model:
python train_sakt.py --dataset <dataset codename>
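SAKT predicts each response by attending only to earlier interactions in the sequence. A minimal NumPy sketch of that causal (masked) self-attention step, independent of the repository's PyTorch implementation:

```python
import numpy as np

def causal_attention(Q, K, V):
    """Scaled dot-product attention where position t sees only positions <= t."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)
    # Mask out future positions before the softmax.
    mask = np.triu(np.ones(scores.shape, dtype=bool), k=1)
    scores[mask] = -np.inf
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V, weights

rng = np.random.default_rng(0)
T, d = 5, 8                   # sequence length, embedding size
Q = rng.normal(size=(T, d))   # queries: current exercise embeddings
K = rng.normal(size=(T, d))   # keys: past interaction embeddings
V = rng.normal(size=(T, d))   # values: past interaction embeddings
out, w = causal_attention(Q, K, V)
```

Each row of the weight matrix sums to 1 and puts zero mass on future positions, which is what keeps the model from peeking ahead.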
Test AUC of each algorithm on each dataset:

Algorithm | assist09 | assist12 | assist15 | assist17 | bridge06 | algebra05 | spanish | statics |
---|---|---|---|---|---|---|---|---|
IRT | 0.69 | 0.71 | 0.64 | 0.68 | 0.75 | 0.77 | 0.68 | 0.79 |
PFA | 0.72 | 0.67 | 0.69 | 0.62 | 0.77 | 0.76 | 0.85 | 0.69 |
DAS3H | - | 0.74 | - | 0.69 | 0.79 | 0.83 | - | - |
Best-LR | 0.77 | 0.75 | 0.70 | 0.71 | 0.80 | 0.83 | 0.86 | 0.82 |
DKT | 0.75 | 0.77 | 0.73 | 0.77 | 0.79 | 0.82 | 0.83 | 0.83 |
SAKT | 0.75 | 0.73 | 0.73 | 0.72 | 0.78 | 0.80 | 0.83 | 0.81 |
SAINT | 0.73 | 0.77 | - | - | - | - | - | - |
A sample behavioral test is provided in behavior_test.py. To test a model:
python behavior_test.py --dataset <dataset codename> --load_dir <load_directory> --filename <filename>
This sample code examines the first 1,000 test sequences. For each input sequence, we set all responses to 1 (correct) and check that the predicted correctness probability increases; similarly, we set all responses to 0 (incorrect) and check that it decreases.
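The check above can be sketched with a toy monotone model, here a PFA-style logistic over counts of past correct and incorrect responses (the real test calls the trained models; the parameters below are made up):

```python
import math

def predict(responses, beta=0.0, gamma=0.4, rho=0.6):
    """Toy PFA-style probability of answering the next item correctly."""
    wins = sum(responses)
    fails = len(responses) - wins
    z = beta + gamma * wins - rho * fails
    return 1.0 / (1.0 + math.exp(-z))

history = [1, 0, 1, 1, 0]
p = predict(history)
p_all_correct = predict([1] * len(history))
p_all_wrong = predict([0] * len(history))

# The behavioral expectation: an all-correct history raises the prediction,
# an all-incorrect history lowers it.
assert p_all_correct > p > p_all_wrong
print(p_all_wrong, p, p_all_correct)
```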
Algorithm | assist09 | assist12 | assist15 | assist17 | bridge06 | algebra05 | spanish | statics | ednet_small |
---|---|---|---|---|---|---|---|---|---|
Best-LR | |||||||||
DKT | 10.3% / 0.6% | 1.3% / 0.1% | 4.0% / 0.2% | 5.3% / 35.1% | 5.8% / 4.0% | 2.6% / 0.9% | 2.7% / 0.0% | 5.3% / 0.0% | 0.2% / 0.1% |
SAINT | 27.2% / 6.9% | 8.6% / 2.4% | 7.2% / 2.1% | 2.7% / 17.7% | 6.4% / 4.3% | 6.0% / 4.2% | | | |