TITLE: | util |
PURPOSE: | A machine learning, optimization, and data science utilities package. |
AUTHOR: | Thomas C.H. Lux |
EMAIL: | tchlux@vt.edu |
$ pip install git+https://github.com/tchlux/util.git
Contains various useful utilities.
Provides classes for many different approximation algorithms, along with wrappers that convert pure approximators (numeric outputs) into classifiers.
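One common way to turn a numeric approximator into a classifier is to fit it on one-hot encoded labels and take the largest output as the predicted class. The sketch below illustrates that idea only; it is not this package's wrapper, and the names "NearestNeighbor" and "ClassifierWrapper" are hypothetical.

```python
class NearestNeighbor:
    """Toy numeric approximator: returns the output of the closest training point."""

    def fit(self, x, y):
        self.x, self.y = list(x), list(y)
        return self

    def predict(self, points):
        def dist2(a, b):
            return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
        return [self.y[min(range(len(self.x)), key=lambda i: dist2(self.x[i], p))]
                for p in points]


class ClassifierWrapper:
    """Hypothetical wrapper: one-hot encode labels, fit, then argmax the outputs."""

    def __init__(self, approximator):
        self.approximator = approximator

    def fit(self, x, labels):
        self.classes = sorted(set(labels))
        # Each label becomes a numeric vector with a 1.0 in its class position.
        one_hot = [[1.0 if label == c else 0.0 for c in self.classes]
                   for label in labels]
        self.approximator.fit(x, one_hot)
        return self

    def predict(self, points):
        outputs = self.approximator.predict(points)
        # The index of the largest numeric output selects the class.
        return [self.classes[max(range(len(row)), key=row.__getitem__)]
                for row in outputs]


model = ClassifierWrapper(NearestNeighbor()).fit(
    [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]], ["a", "b", "a"])
predictions = model.predict([[0.1, 0.2], [0.9, 0.8]])
```

Any approximator exposing "fit" and "predict" with vector outputs could be swapped in for the toy nearest-neighbor model under this assumed interface.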
Class "Data" behaves like a modified pandas DataFrame, but is written in pure Python.
Decorator function "same_as" copies the signature and documentation of another function onto the decorated function. Decorator function "cache" generates a unique file name for each (input, output) pair of a function and stores the pair in a serialized file for faster re-execution. Decorator function "stability_lock" uses a cache of (input, output) pairs to check whether a function maintains the same behavior after modifications are made. Decorator function "timeout" uses a system call to cancel a Python function (which must respond to the global interpreter lock) after a certain amount of time has elapsed. Decorator function "type_check" performs (unpythonic) type checking of function inputs before executing the function.
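To make the file-backed caching idea concrete, here is a minimal standalone sketch. It is not the package's "cache" decorator: the name "disk_cache", its "directory" argument, and the hashing scheme are assumptions chosen for illustration.

```python
import functools
import hashlib
import os
import pickle
import tempfile


def disk_cache(directory):
    """Hypothetical decorator: store each (input, output) pair in its own pickle file."""
    os.makedirs(directory, exist_ok=True)

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Derive a file name from the function name and its pickled arguments.
            key = pickle.dumps((func.__name__, args, sorted(kwargs.items())))
            path = os.path.join(directory, hashlib.sha256(key).hexdigest() + ".pkl")
            if os.path.exists(path):
                with open(path, "rb") as f:
                    return pickle.load(f)[1]  # Element 1 is the stored output.
            output = func(*args, **kwargs)
            with open(path, "wb") as f:
                pickle.dump(((args, kwargs), output), f)
            return output
        return wrapper
    return decorator


calls = []


@disk_cache(tempfile.mkdtemp())
def slow_square(x):
    calls.append(x)  # Record real executions so cache hits are visible.
    return x * x


first = slow_square(12)
second = slow_square(12)  # Served from disk; the function body does not run again.
```

This sketch assumes all arguments are picklable and that equal arguments pickle to identical bytes, which does not hold for every type (sets, for example). A stability check in the spirit of "stability_lock" could reuse the same stored pairs by comparing a fresh output against the saved one instead of returning it.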
Function "make_test_data" splits a given data set into components that allow for detailed analysis of model performance as the dimension and the amount of training data increase. Class "MDA_Iterator" simplifies the process of iterating through test cases generated by the function "make_test_data".
Function "minimize" uses a meta-heuristic optimization algorithm to solve a global optimization problem given an arbitrary function.
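The README does not say which meta-heuristic "minimize" uses, so the following is only a generic illustration of the family: a simple (1+1) evolution strategy with step-size adaptation. The function name, parameters, and update factors are all assumptions, and unlike a true global optimizer this sketch can stall in a local minimum.

```python
import random


def random_search_minimize(objective, start, iterations=1000, step=1.0, seed=0):
    """Illustrative (1+1) evolution strategy; NOT the algorithm used by util."""
    rng = random.Random(seed)  # Private generator keeps runs reproducible.
    best, best_value = list(start), objective(start)
    for _ in range(iterations):
        # Perturb every coordinate of the current best with Gaussian noise.
        candidate = [x + rng.gauss(0.0, step) for x in best]
        value = objective(candidate)
        if value < best_value:
            best, best_value = candidate, value
            step *= 1.5  # Success: try larger moves.
        else:
            step *= 0.9  # Failure: search closer to the current best.
    return best, best_value


def sphere(point):
    """Simple test objective with its minimum of 0 at the origin."""
    return sum(x * x for x in point)


start = [3.0, -2.0]
solution, value = random_search_minimize(sphere, start)
```

Because only improving candidates are accepted, the returned value can never be worse than the objective at the starting point.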
Provides an extensive interface to HTML plotting through plotly. All documentation is within the module; view it with "from util import plotly; help(plotly)".
Contains useful statistical functions for data analysis.
Function "run" is a (python+OS)-safe interface to command-line execution that cleanly handles errors. Class "AtomicOpen" uses system locking mechanisms to enforce atomic operations on files.
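The two ideas above can be sketched with the standard library alone. This is not the package's implementation: "run_command" and "locked_open" are hypothetical names, the return convention is an assumption, and the locking shown relies on "fcntl", so it works on POSIX systems only.

```python
import contextlib
import fcntl  # POSIX only; Windows would need a different locking call.
import os
import subprocess
import sys
import tempfile


def run_command(command, timeout=None):
    """Hypothetical helper: return (exit_code, stdout, stderr) instead of raising."""
    try:
        done = subprocess.run(command, capture_output=True, text=True, timeout=timeout)
        return done.returncode, done.stdout, done.stderr
    except (OSError, subprocess.TimeoutExpired) as error:
        # A missing executable or a timeout is reported as exit code -1.
        return -1, "", str(error)


@contextlib.contextmanager
def locked_open(path, mode="a+"):
    """Hypothetical helper: hold an exclusive advisory lock while the file is open."""
    with open(path, mode) as f:
        fcntl.flock(f, fcntl.LOCK_EX)  # Blocks until no other process holds the lock.
        try:
            yield f
        finally:
            f.flush()  # Push buffered writes out before releasing the lock.
            fcntl.flock(f, fcntl.LOCK_UN)


code, out, err = run_command([sys.executable, "-c", "print('hello')"])
missing_code, _, _ = run_command(["this-command-does-not-exist-12345"])

log_path = os.path.join(tempfile.mkdtemp(), "log.txt")
with locked_open(log_path) as f:
    f.write("first\n")
with locked_open(log_path) as f:
    f.write("second\n")
with open(log_path) as f:
    contents = f.read()
```

Note that "flock" locks are advisory: they only protect against other processes that also acquire the lock before touching the file.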