shangrz/ramp

 
 


Ramp - Rapid Machine Learning Prototyping

Ramp is a Python module for rapid prototyping of machine learning solutions. It is essentially a pandas wrapper around various Python machine learning and statistics libraries (scikit-learn, rpy2, etc.), providing a simple, declarative syntax for exploring features, algorithms, and transformations quickly and efficiently.

Documentation: http://ramp.readthedocs.org

Why Ramp?

  • Complex feature transformations

    Chain and combine features:

    Normalize(Log('x'))
    Interactions([Log('x1'), (F('x2') + F('x3')) / 2])
    

    Reduce feature dimension:

    DimensionReduction([F('x%d'%i) for i in range(100)], decomposer=PCA(n_components=3))
    

    Incorporate residuals or predictions to blend with other models:

    Residuals(config_model1) + Predictions(config_model2)
    

    Any feature that uses the target ("y") variable will automatically respect the current training and test sets.

  • Caching

    Ramp caches every feature and model it computes, storing them on disk in fast HDF5 format (or in another store if you prefer), so nothing is recomputed unnecessarily. Stored results can be retrieved, compared, blended, and reused between runs.

  • Easy extensibility

    Ramp has a simple API, allowing you to plug in estimators from scikit-learn, rpy2 and elsewhere, or easily build your own feature transformations, metrics, feature selectors, reporters, or estimators.
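
To make the idea of chained, cached feature expressions concrete, here is a minimal sketch in plain Python. It is not Ramp's implementation: the classes below only borrow the names `F`, `Log` and `Normalize` from the examples above, while `Constant`, `BinaryOp`, `as_feature` and `cached_values` are hypothetical helpers invented for illustration. The sketch assumes the data is a dict of equal-length lists rather than a pandas DataFrame, and it uses an in-memory dict as the cache instead of HDF5.

```python
import math
import operator


class Feature:
    """Base class for a composable feature expression (illustrative only)."""

    def values(self, data):
        raise NotImplementedError

    def __add__(self, other):
        return BinaryOp(self, other, operator.add, "+")

    def __truediv__(self, other):
        return BinaryOp(self, other, operator.truediv, "/")


class F(Feature):
    """Reads a raw column from a dict of equal-length lists."""

    def __init__(self, name):
        self.name = name

    def values(self, data):
        return list(data[self.name])

    def __repr__(self):
        return f"F({self.name!r})"


class Constant(Feature):
    """Broadcasts a scalar to the length of the dataset (hypothetical helper)."""

    def __init__(self, value):
        self.value = value

    def values(self, data):
        n_rows = len(next(iter(data.values())))
        return [self.value] * n_rows

    def __repr__(self):
        return repr(self.value)


def as_feature(obj):
    """Strings become column lookups, other non-features become constants."""
    if isinstance(obj, Feature):
        return obj
    return F(obj) if isinstance(obj, str) else Constant(obj)


class BinaryOp(Feature):
    """Element-wise combination of two features (hypothetical helper)."""

    def __init__(self, left, right, op, symbol):
        self.left, self.right = as_feature(left), as_feature(right)
        self.op, self.symbol = op, symbol

    def values(self, data):
        pairs = zip(self.left.values(data), self.right.values(data))
        return [self.op(a, b) for a, b in pairs]

    def __repr__(self):
        return f"({self.left!r} {self.symbol} {self.right!r})"


class Log(Feature):
    """Natural logarithm of the wrapped feature."""

    def __init__(self, inner):
        self.inner = as_feature(inner)

    def values(self, data):
        return [math.log(v) for v in self.inner.values(data)]

    def __repr__(self):
        return f"Log({self.inner!r})"


class Normalize(Feature):
    """Rescales to zero mean and unit population std (assumes std > 0)."""

    def __init__(self, inner):
        self.inner = as_feature(inner)

    def values(self, data):
        vals = self.inner.values(data)
        mean = sum(vals) / len(vals)
        std = math.sqrt(sum((v - mean) ** 2 for v in vals) / len(vals))
        return [(v - mean) / std for v in vals]

    def __repr__(self):
        return f"Normalize({self.inner!r})"


def cached_values(feature, data, cache):
    """Computes a feature once per cache, keyed by its expression string."""
    key = repr(feature)  # assumes one dataset per cache
    if key not in cache:
        cache[key] = feature.values(data)
    return cache[key]
```

With these definitions, `cached_values(Normalize(Log('x')), data, cache)` computes the column on the first call and returns the stored list afterwards. Keying the cache on the expression's string form is one simple way to let structurally identical features share results; a real store would also need to account for the dataset and the training/test split.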

Quick example

Getting started with Ramp: Classifying insults

Status

Ramp is currently at an early alpha stage, so expect bugs, bug fixes, and API changes.

Requirements

  • NumPy
  • SciPy
  • pandas
  • PyTables
  • scikit-learn
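
A quick way to check whether the packages listed above are importable is sketched below. This helper is not part of Ramp; `REQUIRED_MODULES` and `missing_requirements` are hypothetical names. The only facts it relies on are the import names of the packages: PyTables is imported as `tables` and scikit-learn as `sklearn`.

```python
import importlib.util

# Display name -> top-level import name for each requirement.
REQUIRED_MODULES = {
    "NumPy": "numpy",
    "SciPy": "scipy",
    "pandas": "pandas",
    "PyTables": "tables",
    "scikit-learn": "sklearn",
}


def missing_requirements(required=REQUIRED_MODULES):
    """Returns the display names of packages whose module cannot be found."""
    return [
        name
        for name, module in required.items()
        if importlib.util.find_spec(module) is None
    ]
```

Calling `missing_requirements()` returns an empty list when every requirement can be located, and the names of the missing packages otherwise. It only checks that each module can be found, not which version is installed.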
