Nesterov Gradient Methods

Implementation of methods from "Gradient methods for minimizing composite functions" by Yu. Nesterov for minimization problem:

where y is a vector of observations (empirical data) of endogenous variables, X is a matrix of observations of exogenous variables, beta is a vector of model's parameters and lambda is a penalty parameter.

link to paper: https://link.springer.com/article/10.1007/s10107-012-0629-5

Please note that some transformations were needed to obtain a method from the paper above (like three pages of them). They are included in the pdf file.

Conclusions:

Accelerated Method were always the fastest to converge, both in terms of number of steps and computation time.
Basic Method and Dual Gradient Method always converged in the same pace when it comes to the number of steps.
Basic Method execution was slightly faster, causing a very small lead over Dual Gradient Method in convergence time.
In all tests, all three methods converged to the same final results.
The way methods estimated the Lipschitz constant differed throughout the problems. Accelerated Method tended to use higher estimations than the other two methods.
Values of gamma_u did not have a significant impact on the models' performance.
Values of gamma_u had a huge impact on the models' execution time. Raising gamma_u was causing the execution time to drop significantly.

Name		Name	Last commit message	Last commit date
Latest commit History 22 Commits
src		src
.gitignore		.gitignore
README.md		README.md
oda.pdf		oda.pdf

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

src

src

.gitignore

.gitignore

README.md

README.md

oda.pdf

oda.pdf

Repository files navigation

Nesterov Gradient Methods

About

Releases

Packages

Contributors 2

Languages

JakubBilski/nesterov-gradient-methods

Folders and files

Latest commit

History

Repository files navigation

Nesterov Gradient Methods

About

Resources

Stars

Watchers

Forks

Languages