This repository contains the algorithms explained in
Cosentino, Oberhauser, Abate
"Caratheodory Sampling for Stochastic Gradient Descent"
The files are divided as follows:
- The IPython notebooks contain the experiments to be run;
- The *.py files are the libraries with all the necessary functions.
Some general notes:
- The names of the ipynb files refer directly to the experiments in the cited work.
- The last cells of the notebooks produce the figures of the pdf, except for plots_rules.ipynb.
- To reduce the running time, the parameters can easily be changed, e.g. by decreasing N, n or sample.
Two of the *.py libraries contain the algorithms relative to the acceleration of GD-based methods via
Caratheodory's theorem: the first is specialised to logistic regression, while the second,
CaGD_ls.py, covers the least-squares case and also contains functions which replicate the
behaviour of ADAM and SAG. Only the functions in CaGD_ls.py are parallelized.
Requirement: recombination.py.
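To give an idea of what these libraries do, here is a minimal, illustrative sketch, not the
repository's implementation (which relies on the randomized recombination of recombination.py):
by Caratheodory's theorem, the mean of N gradients in R^n equals a convex combination of at most
n+1 of them, so a GD step can be driven by a tiny weighted subset of the data. The LP-based
reduction below is purely expository.

```python
import numpy as np
from scipy.optimize import linprog

def caratheodory_subset(G):
    """G: (N, n) array of per-sample gradients carrying uniform weights 1/N.
    Returns indices and weights of at most n+1 rows with the same mean."""
    N, n = G.shape
    A = np.vstack([G.T, np.ones((1, N))])  # n moment constraints + total mass
    b = np.append(G.mean(axis=0), 1.0)
    # Any vertex of {w >= 0 : A w = b} has at most n+1 nonzero entries
    # (Caratheodory's theorem); a simplex-type solver returns such a vertex.
    res = linprog(np.zeros(N), A_eq=A, b_eq=b, bounds=(0, None), method="highs-ds")
    idx = np.flatnonzero(res.x > 1e-12)
    return idx, res.x[idx]

# One GD step for least squares driven by the reduced measure: it reproduces
# the full-batch mean gradient exactly while touching at most n+1 samples.
rng = np.random.default_rng(0)
X, y = rng.normal(size=(1000, 5)), rng.normal(size=1000)
theta = np.zeros(5)
G = (X @ theta - y)[:, None] * X      # per-sample least-squares gradients
idx, w = caratheodory_subset(G)       # here len(idx) <= 6
theta -= 0.1 * (w @ G[idx])           # identical to a full-batch GD step
```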
These are the files necessary for the experiments done using the rules of [Nutini et al.]:
- Create_dataset_A.m creates the Dataset A of [Nutini et al.] using Matlab.
- Train.py contains the functions which form the skeleton of the optimization procedure.
  It corresponds to trainval.py in [Nutini URL]. We have:
  • removed dependencies not relevant for our experiments;
  • added the skeleton for the optimization procedure using the Caratheodory Sampling Procedure.
- src/losses.py follows the same logic as losses.py from [Nutini URL]. We have kept only the
  least-squares object and modified it, because the Caratheodory Sampling Procedure requires
  the gradient of every sample.
- src/update_rules/update_rules.py has the same structure as the corresponding file from
  [Nutini URL]. The function update_Caratheodory(…) plays the role of update(…) in the cited
  repository. We added the functions update_Caratheodory, recomb_step and
  Caratheodory_Acceleration; a sketch of how such an update can be organised follows this list.
- The rest of the files are the same as in [Nutini URL].
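As a rough, hypothetical sketch of how such an update can be organised (reusing
caratheodory_subset from the sketch above; the function name, signature and schedule below are
illustrative, not the repository's API): select a coordinate block with a Nutini-style rule,
reduce the samples with respect to that block's gradients, then take several cheap steps on the
reduced measure.

```python
import numpy as np

def update_caratheodory_sketch(theta, X, y, block, lr=0.1, inner_steps=5):
    """One outer iteration of block coordinate descent for least squares,
    with the inner steps taken on a Caratheodory-reduced set of samples."""
    # Per-sample gradients restricted to the chosen coordinate block.
    G = (X @ theta - y)[:, None] * X[:, block]
    idx, w = caratheodory_subset(G)   # at most len(block)+1 samples survive
    Xs, ys = X[idx], y[idx]
    for _ in range(inner_steps):      # cheap steps on the small weighted subset
        g = w @ ((Xs @ theta - ys)[:, None] * Xs[:, block])
        theta[block] -= lr * g
    return theta
```

After the first inner step the reduced measure only approximates the full gradient; deciding
when to recompute the reduction is the crux of the actual algorithm.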
[Nutini et al.] Julie Nutini, Issam Laradji, and Mark Schmidt - "Let’s make block coordinate
descent go fast: Faster greedy rules, message-passing, active-set complexity, and
superlinear convergence", arXiv preprint arXiv:1712.08859, 2017.
[Nutini URL] https://github.com/IssamLaradji/BlockCoordinateDescent
recombination.py contains the algorithms relative to the reduction of the measure presented in
Cosentino, Oberhauser, Abate - "A randomized algorithm to reduce the support of discrete measures",
NeurIPS 2020, available at https://github.com/FraCose/Recombination_Random_Algos.
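Concretely, "reducing the support of a discrete measure" means finding far fewer atoms whose new
weights reproduce prescribed moments exactly. A small self-check of that property, again reusing
the expository caratheodory_subset from above (recombination.py implements a much faster
randomized reduction):

```python
import numpy as np

rng = np.random.default_rng(1)
atoms = rng.normal(size=500)                             # 500 uniform atoms
tests = np.vstack([atoms ** k for k in range(1, 4)]).T   # 3 moment test functions
idx, w = caratheodory_subset(tests)                      # at most 4 atoms survive
assert np.allclose(w @ tests[idx], tests.mean(axis=0))   # moments are preserved
```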
The notebooks "CaGD_paths.ipynb" and "Comparison_GD_vs_CaGD.ipynb" contain multiple experiments.
You have to comment/uncomment the respective parts of the code as indicated to reproduce the
wanted experiments.
To run the experiments, the following datasets need to be downloaded and saved in the /Datasets folder:
- 3D_spatial_network.txt -
  https://archive.ics.uci.edu/ml/machine-learning-databases/00246/3D_spatial_network.txt
- household_power_consumption.txt -
  https://archive.ics.uci.edu/ml/machine-learning-databases/00235/household_power_consumption.zip
  (extract the .txt file)
- NY_train.csv -
  https://www.kaggle.com/c/nyc-taxi-trip-duration/data?select=train.zip
  (extract the .csv file and rename it to NY_train.csv)
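A quick, optional way to check that the downloads are in place, assuming the usual formats of
these public datasets (comma-separated without header, semicolon-separated with '?' for missing
values, and a standard CSV, respectively); adjust if your copies differ:

```python
import pandas as pd

road = pd.read_csv("Datasets/3D_spatial_network.txt", header=None)
power = pd.read_csv("Datasets/household_power_consumption.txt",
                    sep=";", na_values="?", low_memory=False)
taxi = pd.read_csv("Datasets/NY_train.csv")
print(road.shape, power.shape, taxi.shape)
```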
The authors want to thank The Alan Turing Institute and the University of Oxford for their
financial support. FC is supported by The Alan Turing Institute, TU/C/000021, under the
EPSRC Grant No. EP/N510129/1. HO is supported by the EPSRC grant DataSig [EP/S026347/1],
The Alan Turing Institute, and the Oxford-Man Institute.