Python package to compute running descriptive statistics over data

livestat

Python module to compute running statistics over data, for example when measuring timings from a stream.

Properties:

  • count
  • min, max, mean
  • std and variance
  • kurtosis and skewness
  • merging two LiveStat objects while preserving their statistics
  • arithmetic operations on a stat: + - * /
  • standardization of the stat: uses the mean and std to compute an offset and a scale, applies them to the data, and updates the other statistics accordingly
  • min-max normalization to [0,1]: uses the min and max to compute an offset and a scale, applies them to the data, and updates the other statistics accordingly
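
How running statistics can be maintained without storing the data is sketched below using Welford's online algorithm. This is only an illustration: the `RunningStats` class and its attributes are hypothetical names, not the livestat implementation, and the use of the sample variance (n - 1 denominator) is an assumption.

```python
import math


class RunningStats:
    """Hypothetical stand-in for illustration; not the livestat implementation."""

    def __init__(self):
        self.count = 0
        self.min = math.inf
        self.max = -math.inf
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def append(self, x):
        # Welford's update: one pass, no stored samples
        self.count += 1
        self.min = min(self.min, x)
        self.max = max(self.max, x)
        delta = x - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self):
        # sample variance (assumed); undefined for fewer than two values
        if self.count < 2:
            return float("nan")
        return self._m2 / (self.count - 1)

    @property
    def std(self):
        return math.sqrt(self.variance)


stats = RunningStats()
for value in [10, 20, 30, 40, 50]:
    stats.append(value)
print(stats.count, stats.min, stats.max, stats.mean, stats.variance)
```

Skewness and kurtosis can be tracked the same way by also accumulating the third and fourth central moments.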

Normality tests:

  • jarque_bera
  • kurtosis and skewness
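
For reference, the Jarque-Bera statistic depends only on the count, the skewness and the excess kurtosis, which is why it fits a running-statistics object. The function below is a minimal sketch of the textbook formula; its name, signature and return value are assumptions and may differ from what livestat exposes.

```python
import math


def jarque_bera(n, skewness, excess_kurtosis):
    """Textbook Jarque-Bera statistic and its asymptotic p-value.

    Hypothetical helper for illustration; not the livestat API.
    excess_kurtosis is the kurtosis minus 3 (0 for a normal distribution).
    """
    jb = n / 6.0 * (skewness ** 2 + excess_kurtosis ** 2 / 4.0)
    # Under normality JB is asymptotically chi-squared with 2 degrees of
    # freedom, whose survival function is exp(-x / 2).
    p_value = math.exp(-jb / 2.0)
    return jb, p_value


jb, p = jarque_bera(n=600, skewness=0.1, excess_kurtosis=0.0)
print(jb, p)
```

The p-value is only asymptotically valid, so it should be read with caution for small counts.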

The main class is LiveStat, to which data can be appended with append(x). For incremental values, DeltaLiveStat is an easy-to-use helper that records the difference between consecutive appended values.

Usage:

from livestat import LiveStat, DeltaLiveStat

x = LiveStat("optionalname")  # the name is optional
x.append(10)
x.append(20)
print(x)  # count is 2

x = DeltaLiveStat("dt")
x.append(10)
x.append(20)
print(x)  # count is 1: the stat holds the difference between the two values

# values can also be appended from a sequence
x.extend([10, 20, 30, 40, 50])
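
The delta behaviour described above (two appended values yield one recorded difference) can be pictured with the following sketch. `DeltaTracker` is a hypothetical class written for illustration; it stores the differences in a plain list instead of feeding them to a LiveStat, and it does not claim to match how DeltaLiveStat is implemented.

```python
class DeltaTracker:
    """Hypothetical illustration of delta tracking; not the livestat class."""

    def __init__(self):
        self._last = None  # previously appended value, if any
        self.deltas = []   # differences between consecutive values

    def append(self, x):
        # the first value only sets the reference point
        if self._last is not None:
            self.deltas.append(x - self._last)
        self._last = x

    def extend(self, values):
        for value in values:
            self.append(value)


tracker = DeltaTracker()
tracker.append(10)
tracker.append(20)
print(len(tracker.deltas))  # one difference recorded so far
```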

Extra Features:

# LiveStat objects can be combined, for example when computing statistics over different data windows or in a multiprocessing environment
x.merge(y)  # y is another LiveStat; x now contains the merged statistics

# a LiveStat object can be multiplied by a scalar or translated, for example to perform a unit conversion; all the measures are transformed accordingly
x + 5
x * 5
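
Why merging and arithmetic can preserve the statistics is sketched below for the count, mean and variance. The merge uses the standard pairwise update for the sum of squared deviations (often attributed to Chan et al.). The function names and the tuple-based interface are hypothetical, chosen for illustration, and are not how livestat stores its state.

```python
def merge_moments(n_a, mean_a, m2_a, n_b, mean_b, m2_b):
    """Combine (count, mean, sum of squared deviations) of two data sets.

    Hypothetical helper for illustration; not the livestat API.
    """
    n = n_a + n_b
    if n == 0:
        return 0, 0.0, 0.0
    delta = mean_b - mean_a
    mean = mean_a + delta * n_b / n
    m2 = m2_a + m2_b + delta * delta * n_a * n_b / n
    return n, mean, m2


def affine_transform(mean, variance, scale=1.0, offset=0.0):
    """Mean and variance of scale * x + offset, given those of x."""
    return scale * mean + offset, scale * scale * variance


# statistics of [10, 20] merged with those of [30, 40, 50]
n, mean, m2 = merge_moments(2, 15.0, 50.0, 3, 40.0, 200.0)
variance = m2 / (n - 1)  # sample variance of the combined data
print(n, mean, variance)
print(affine_transform(mean, variance, scale=5.0))
print(affine_transform(mean, variance, offset=5.0))
```

A translation leaves the variance unchanged, while a scaling multiplies it by the square of the factor. With a negative scale the minimum and maximum swap roles and the skewness changes sign.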

Planned

  • NumPy support for fast batch appends and for vector-valued statistics

Package Repository

This project is maintained here: https://github.com/eruffaldi/pylivestat

Related

The faststat package is similar:

https://pypi.python.org/pypi/faststat/
https://github.com/doublereedkurt/faststat/

More sophisticated features can be found in scipy.stats: https://docs.scipy.org/doc/scipy-0.14.0/reference/tutorial/stats.html
