Ejemplo n.º 1
0
def pkurtosis(data, m=None, s=None):
    """pkurtosis(data [,m [,s]]) -> population kurtosis of data.

    This returns γ₂ "\\N{GREEK SMALL LETTER GAMMA}\\N{SUBSCRIPT TWO}", the
    population kurtosis relative to that of the normal distribution, also
    known as the excess kurtosis. For the "kurtosis proper" known as
    β₂ "\\N{GREEK SMALL LETTER BETA}\\N{SUBSCRIPT TWO}", add 3 to the result.

    For more information about kurtosis, see the sample kurtosis function
    ``kurtosis``.

    >>> pkurtosis([1.25, 1.5, 1.5, 1.75, 1.75, 2.5, 2.75, 4.5])
    ... #doctest: +ELLIPSIS
    0.7794232987...

    """
    n, total = stats._std_moment(data, m, s, 4)
    assert n >= 0
    assert total >= 1
    if n <= 1:
        raise StatsError('no kurtosis is defined for empty data')
    kurt = v.div(total, n)
    return v.sub(kurt, 3)
Ejemplo n.º 2
0
def kurtosis(data, m=None, s=None):
    """kurtosis(data [,m [,s]]) -> sample excess kurtosis of data.

    The kurtosis of a distribution is a measure of its shape. This function
    returns an estimate of the sample excess kurtosis usually known as g₂
    "g\\N{SUBSCRIPT TWO}". For the population kurtosis, see ``pkurtosis``.

        WARNING: The mathematical terminology and notation related to
        kurtosis is often inconsistent and contradictory. See Wolfram
        Mathworld for further details:

        http://mathworld.wolfram.com/Kurtosis.html

    >>> kurtosis([1.25, 1.5, 1.5, 1.75, 1.75, 2.5, 2.75, 4.5])
    ... #doctest: +ELLIPSIS
    3.03678892733564...

    If you already know one or both of the population mean and standard
    deviation, you can pass the mean as optional argument m and/or the
    standard deviation as s:

    >>> kurtosis([1.25, 1.5, 1.5, 1.75, 1.75, 2.5, 2.75, 4.5], m=2.25, s=1)
    2.3064453125

        CAUTION: "Garbage in, garbage out" applies here. You can pass
        any values you like as ``m`` or ``s``, but if they are not
        sensible estimates for the mean and standard deviation, the
        result returned as the kurtosis will likewise not be sensible.
        If you give either m or s, and the calculated kurtosis is out
        of range, a warning is raised.

    If m or s are not given, or are None, they are estimated from the data.

    If data is an iterable of sequences, each inner sequence represents a
    row of data, and ``kurtosis`` operates on each column. Every row must
    have the same number of columns, or ValueError is raised.

    >>> data = [[0, 1],
    ...         [1, 5],
    ...         [2, 6],
    ...         [5, 7]]
    ...
    >>> kurtosis(data)  #doctest: +ELLIPSIS
    [1.50000000000000..., 2.23486717956161...]

    Similarly, if either m or s are given, they must be either a single
    number or have the same number of items:

    >>> kurtosis(data, m=[3, 5], s=2)  #doctest: +ELLIPSIS
    [-0.140625, 18.4921875]

    The kurtosis of a population is a measure of the peakedness and weight
    of the tails. The normal distribution has kurtosis of zero; positive
    kurtosis generally has heavier tails and a sharper peak than normal;
    negative kurtosis generally has lighter tails and a flatter peak.

    There is no upper limit for kurtosis, and a lower limit of -2. Higher
    kurtosis means more of the variance is the result of infrequent extreme
    deviations, as opposed to frequent modestly sized deviations.

        CAUTION: As a rule of thumb, a non-zero value for kurtosis
        should only be treated as meaningful if its absolute value is
        larger than approximately twice its standard error. See also
        ``stderrkurtosis``.

    """
    n, total = stats._std_moment(data, m, s, 4)
    assert n >= 0
    v.assert_(lambda x: x >= 1, total)
    if n < 4:
        raise StatsError('sample kurtosis requires at least 4 data points')
    q = (n-1)/((n-2)*(n-3))
    gamma2 = v.div(total, n)
    # Don't do this:-
    # kurt = v.mul((n+1)*q, gamma2)
    # kurt = v.sub(kurt, 3*(n-1)*q)
    #   Even though the above two commented out lines are mathematically
    #   equivalent to the next two, and cheaper, they appear to be
    #   slightly less accurate.
    kurt = v.sub(v.mul(n+1, gamma2), 3*(n-1))
    kurt = v.mul(q, kurt)
    if v.isiterable(kurt): out_of_range = any(x < -2 for x in kurt)
    else: out_of_range = kurt < -2
    if m is s is None:
        assert not out_of_range, 'kurtosis failed: %r' % kurt
        # This is a "should never happen" condition, hence an assertion.
    else:
        # This, on the other hand, can easily happen if the caller
        # gives junk values for m or s. The difference between a junk
        # value and a legitimate value can be surprisingly subtle!
        if out_of_range:
            import warnings
            warnings.warn('calculated kurtosis out of range')
    return kurt