def midrange(data):
    """Returns the midrange of a sequence of numbers.

    >>> midrange([2.0, 3.0, 3.5, 4.5, 7.5])
    4.75

    The midrange is halfway between the smallest and largest element. It is
    a weak measure of central tendency.
    """
    try:
        L, H = minmax(data)
    except ValueError as e:
        e.args = ('no midrange defined for empty iterables',)
        raise
    return (L + H)/2


def range(data, interval=0):
    """range(iterable [, interval=0]) -> sample range R of data

    The range R is the difference between the smallest and largest element
    in the given sample. It is an unbiased but weak measure of variability,
    and is frequently used in process control applications.

    >>> range([1.0, 3.5, 7.5, 2.0, 0.25])
    7.25

    For N > 15, the sampling distribution of R becomes unstable and it is
    wise to treat the sample range with caution.

    An even better measure of variability is R/d2, where d2 is a value that
    depends only on N. For samples taken from a normally-distributed
    population, the d2 values are available by looking up N in the dict
    ``range.d2``. For small N (say, up to about 10) R/d2 makes a good
    estimator of the population standard deviation.

    Correction for binned or rounded data
    -------------------------------------

    If the data points have been uniformly rounded (perhaps by binning, or
    by rounding to a fixed number of decimal places, or simply due to
    measurement error), the samples represent intervals rather than exact
    values. E.g. if x=1.2 is given to one decimal place, x could actually be
    any number between 1.15 and 1.25. In this case, it is appropriate to
    make an adjustment to the sample range by taking into account the width
    of the data interval:

    >>> range([1.2, 3.0, 1.5, 2.4, 0.2], 0.1)
    2.9

    The ``interval`` argument is optional, with default value of 0. If
    given, it must be a non-negative number. No attempt is made to check
    that the data points actually are consistent with the given interval.
    """
    if interval < 0:
        raise ValueError('interval must be non-negative')
    try:
        a, b = minmax(data)
    except ValueError as e:
        e.args = ('no range defined for empty iterables',)
        raise
    return b - a + interval
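

# Illustrative sketch (not part of the original module): estimating the
# population standard deviation as R/d2, as described in the docstring of
# ``range`` above. The names ``_D2_SKETCH`` and ``_sigma_from_range`` are
# hypothetical. The constants are the commonly tabulated d2 control-chart
# values for normally distributed samples, rounded to three decimal places;
# the module's own ``range.d2`` dict is not shown in this section and may
# differ in coverage or precision.
_D2_SKETCH = {
    2: 1.128, 3: 1.693, 4: 2.059, 5: 2.326, 6: 2.534,
    7: 2.704, 8: 2.847, 9: 2.970, 10: 3.078,
}


def _sigma_from_range(data):
    """Estimate sigma as R/d2 for a small sample (2 <= N <= 10)."""
    values = list(data)
    n = len(values)
    if n not in _D2_SKETCH:
        raise ValueError('sketch only covers sample sizes 2 through 10')
    # Use the built-in min() and max() so the sketch stands alone and does
    # not depend on this module's minmax() helper.
    sample_range = max(values) - min(values)
    return sample_range / _D2_SKETCH[n]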


def fivenum(data):
    """Return Tukey's five number summary from data.

    The five summary numbers are:

        minimum, lower-hinge, median, upper-hinge, maximum

    >>> tuple(fivenum([2, 4, 6, 8, 10, 12, 14, 16, 18]))
    (2, 6, 10, 14, 18)

    The summary is a namedtuple with the following fields:

        minimum
        lower_hinge
        median
        upper_hinge
        maximum

    If the data has length N of the form ``4n+5`` (e.g. 5, 9, 13, 17...)
    then the hinges can be visualised by writing out the sorted data in the
    shape of a W, where each limb of the W is equal in length. For example,
    the data (A,B,C,...,M) has N=13 and would be written out like this:

        A           G           M
          B       F   H       L
            C   E       I   K
              D           J

    The hinges are D, G and J and the fivenum summary is (A, D, G, J, M).

    For data with length that doesn't match ``4n+5``, the three hinges are
    interpolated. They are equivalent to ``quartiles`` called with scheme=1.
    """
    if isinstance(data, str):
        raise TypeError('data argument cannot be a string')
    data = sorted(data)
    a, b = minmax(data)
    h1, m, h2 = quartiles(data, scheme=1)
    summary = collections.namedtuple(
        'fivenum', 'minimum lower_hinge median upper_hinge maximum')
    return summary(a, h1, m, h2, b)
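

# Illustrative sketch (not part of the original module): locating the three
# hinges directly when the data length has the form 4n+5, following the
# W-shaped layout described in the docstring of ``fivenum`` above. The name
# ``_w_hinges`` is hypothetical. Each limb of the W advances n+1 positions,
# so the hinges sit at 0-based indices n+1, 2n+2 and 3n+3 of the sorted data.
# Other lengths need the interpolation done by ``quartiles`` and are
# rejected here.
def _w_hinges(data):
    """Return (lower hinge, median, upper hinge) for data of length 4n+5."""
    values = sorted(data)
    n, remainder = divmod(len(values) - 5, 4)
    if len(values) < 5 or remainder != 0:
        raise ValueError('sketch only handles lengths of the form 4n+5')
    step = n + 1
    return (values[step], values[2*step], values[3*step])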