Statistics

Note: This module has been deprecated. See the stats module.

The statistics module in SymPy implements standard probability distributions and related tools. Its contents can be imported with the following statement:

>>> from sympy import *
>>> from sympy.statistics import *
>>> init_printing(use_unicode=False, wrap_line=False, no_global=True)

Normal distributions

Normal(mu, sigma) creates a normal distribution with mean value mu and standard deviation sigma. The Normal class defines several useful methods and properties. Various properties can be accessed directly as follows:

>>> N = Normal(0, 1)
>>> N.mean
0
>>> N.median
0
>>> N.variance
1
>>> N.stddev
1

You can generate random numbers from the desired distribution with the random method:

>>> N = Normal(10, 5)
>>> N.random() 
4.914375200829805834246144514
>>> N.random() 
11.84331557474637897087177407
>>> N.random() 
17.22474580071733640806996846
>>> N.random() 
9.864643097429464546621602494

The probability density function (pdf) and cumulative distribution function (cdf) of a distribution can be computed, either in symbolic form or for particular values:

>>> N = Normal(1, 1)
>>> x = Symbol('x')
>>> N.pdf(1)
   ___
 \/ 2
--------
    ____
2*\/ pi
>>> N.pdf(3).evalf()
0.0539909665131880
>>> N.cdf(x)
   /  ___        \
   |\/ 2 *(x - 1)|
erf|-------------|
   \      2      /   1
------------------ + -
        2            2
>>> N.cdf(-oo), N.cdf(1), N.cdf(oo)
(0, 1/2, 1)
>>> N.cdf(5).evalf()
0.999968328758167
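
The numerical values above can be cross-checked without the deprecated module. The following is a minimal sketch that assumes Python 3.8 or later and uses the standard library's statistics.NormalDist class. That class is not part of the SymPy API and returns floating-point numbers only, not symbolic expressions:

```python
from statistics import NormalDist

# Stand-in for Normal(1, 1); NormalDist is a stdlib class, not the SymPy one.
N = NormalDist(mu=1, sigma=1)

pdf_at_3 = N.pdf(3)   # density at x = 3
cdf_at_1 = N.cdf(1)   # half of the mass lies below the mean
cdf_at_5 = N.cdf(5)   # almost all of the mass lies below mean + 4*sigma

print(pdf_at_3, cdf_at_1, cdf_at_5)
```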

The method probability gives the total probability on a given interval (a convenient alternative syntax for cdf(b)-cdf(a)):

>>> N = Normal(0, 1)
>>> N.probability(-oo, 0)
1/2
>>> N.probability(-1, 1)
   /  ___\
   |\/ 2 |
erf|-----|
   \  2  /
>>> N.probability(-1, 1).evalf()
0.682689492137086
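
The identity probability(a, b) = cdf(b) - cdf(a) can be sketched numerically with math.erf from the standard library. The helper names normal_cdf and normal_probability are hypothetical and exist only for this illustration; they are not SymPy functions:

```python
import math

def normal_cdf(x, mu=0.0, sigma=1.0):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def normal_probability(a, b, mu=0.0, sigma=1.0):
    """Probability mass on [a, b], computed as cdf(b) - cdf(a)."""
    return normal_cdf(b, mu, sigma) - normal_cdf(a, mu, sigma)

p_one_sigma = normal_probability(-1.0, 1.0)        # equals erf(sqrt(2)/2)
p_left_half = normal_probability(-math.inf, 0.0)   # infinite bounds are accepted

print(p_one_sigma, p_left_half)
```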

You can also generate a symmetric confidence interval from a desired confidence level, given as a fraction between 0 and 1. For the normal distribution, confidence levels of 68%, 95% and 99.7% correspond to approximately 1, 2 and 3 standard deviations, respectively:

>>> N = Normal(0, 1)
>>> N.confidence(0.68)
(-0.994457883209753, 0.994457883209753)
>>> N.confidence(0.95)
(-1.95996398454005, 1.95996398454005)
>>> N.confidence(0.997)
(-2.96773792534178, 2.96773792534178)

Plug the interval back in to see that the value is correct:

>>> N.probability(*N.confidence(0.95)).evalf()
0.950000000000000
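
One way to reproduce such an interval numerically is to take the quantile at (1 + p)/2 and mirror it around the mean. The sketch below assumes Python 3.8 or later and uses statistics.NormalDist.inv_cdf from the standard library. The function name symmetric_confidence is hypothetical, and this is not necessarily how the SymPy method is implemented:

```python
from statistics import NormalDist

def symmetric_confidence(p, mu=0.0, sigma=1.0):
    """Return (lo, hi) centred on mu that contains probability p, for 0 < p < 1."""
    dist = NormalDist(mu, sigma)
    hi = dist.inv_cdf((1.0 + p) / 2.0)  # each tail holds (1 - p) / 2
    lo = 2.0 * mu - hi                  # mirror the upper bound around the mean
    return lo, hi

lo, hi = symmetric_confidence(0.95)
# Plug the interval back in to recover the confidence level.
coverage = NormalDist().cdf(hi) - NormalDist().cdf(lo)

print(lo, hi, coverage)
```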

Other distributions

Besides the normal distribution, uniform continuous distributions are also supported. Uniform(a, b) represents the distribution with uniform probability on the interval [a, b] and zero probability everywhere else. The Uniform class supports the same methods as the Normal class.

Additional distributions, including support for arbitrary user-defined distributions, are planned for the future.

API Reference

Sample

class sympy.statistics.distributions.Sample[source]

Sample([x1, x2, x3, ...]) represents a collection of samples. Sample parameters like mean, variance and stddev can be accessed as properties. The sample will be sorted.

Examples

>>> from sympy.statistics.distributions import Sample
>>> Sample([0, 1, 2, 3])
Sample([0, 1, 2, 3])
>>> Sample([8, 3, 2, 4, 1, 6, 9, 2])
Sample([1, 2, 2, 3, 4, 6, 8, 9])
>>> s = Sample([1, 2, 3, 4, 5])
>>> s.mean
3
>>> s.stddev
sqrt(2)
>>> s.median
3
>>> s.variance
2
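
The outputs above are consistent with population statistics (dividing by n rather than n - 1): for [1, 2, 3, 4, 5] the population variance is 2, whereas the sample variance would be 2.5. A standard-library sketch of the same quantities, not using the Sample class itself:

```python
import statistics

data = [8, 3, 2, 4, 1, 6, 9, 2]
ordered = sorted(data)               # Sample stores its values in sorted order

s = [1, 2, 3, 4, 5]
mean = statistics.mean(s)
median = statistics.median(s)
variance = statistics.pvariance(s)   # population variance: divide by n
stddev = statistics.pstdev(s)        # square root of the population variance

print(ordered, mean, median, variance, stddev)
```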

Continuous Probability Distributions

class sympy.statistics.distributions.ContinuousProbability[source]

Base class for continuous probability distributions

probability(s, a, b)[source]

Calculate the probability that a random number x generated from the distribution satisfies a <= x <= b

Examples

>>> from sympy.statistics import Normal
>>> from sympy.core import oo
>>> Normal(0, 1).probability(-1, 1)
erf(sqrt(2)/2)
>>> Normal(0, 1).probability(1, oo)
-erf(sqrt(2)/2)/2 + 1/2

random(s, n=None)[source]

Generate random numbers from the distribution: random() returns a single random number, and random(n) returns a Sample of n random numbers.

Examples

>>> from sympy.statistics import Uniform
>>> x = Uniform(1, 5).random()
>>> x < 5 and x > 1
True
>>> x = Uniform(-4, 2).random()
>>> x < 2 and x > -4
True
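
The two calling conventions can be imitated with the standard library's random module. In this sketch, uniform_random is a hypothetical helper, the seed is fixed only to make runs repeatable, and a sorted list stands in for the Sample object:

```python
import random

rng = random.Random(0)  # seeded only so that the run is repeatable

def uniform_random(a, b, n=None):
    """Return one draw from [a, b], or a sorted list of n draws if n is given."""
    if n is None:
        return rng.uniform(a, b)
    return sorted(rng.uniform(a, b) for _ in range(n))

x = uniform_random(1, 5)           # a single number
xs = uniform_random(-4, 2, n=10)   # ten numbers, sorted like a Sample

print(x, xs)
```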

class sympy.statistics.distributions.Normal(mu, sigma)[source]

Normal(mu, sigma) represents the normal or Gaussian distribution with mean value mu and standard deviation sigma.

Examples

>>> from sympy.statistics import Normal
>>> from sympy import oo
>>> N = Normal(1, 2)
>>> N.mean
1
>>> N.variance
4
>>> N.probability(-oo, 1)   # probability on an interval
1/2
>>> N.probability(1, oo)
1/2
>>> N.probability(-oo, oo)
1
>>> N.probability(-1, 3)
erf(sqrt(2)/2)
>>> _.evalf()
0.682689492137086

cdf(s, x)[source]

Return the cumulative distribution function as an expression in x

Examples

>>> from sympy.statistics import Normal
>>> Normal(1, 2).cdf(0)
-erf(sqrt(2)/4)/2 + 1/2
>>> from sympy.abc import x
>>> Normal(1, 2).cdf(x)
erf(sqrt(2)*(x - 1)/4)/2 + 1/2

confidence(s, p)[source]

Return a symmetric (p*100)% confidence interval. For example, p=0.95 gives a 95% confidence interval. Currently this function only handles numerical values except in the trivial case p=1.

For example, one standard deviation:

>>> from sympy.statistics import Normal
>>> N = Normal(0, 1)
>>> N.confidence(0.68)
(-0.994457883209753, 0.994457883209753)
>>> N.probability(*_).evalf()
0.680000000000000

Two standard deviations:

>>> N = Normal(0, 1)
>>> N.confidence(0.95)
(-1.95996398454005, 1.95996398454005)
>>> N.probability(*_).evalf()
0.950000000000000

static fit(sample)[source]

Create a normal distribution fit to the mean and standard deviation of the given distribution or sample.

Examples

>>> from sympy.statistics import Normal
>>> Normal.fit([1,2,3,4,5])
Normal(3, sqrt(2))
>>> from sympy.abc import x, y
>>> Normal.fit([x, y])
Normal(x/2 + y/2, sqrt((-x/2 + y/2)**2/2 + (x/2 - y/2)**2/2))
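
The first output above, Normal(3, sqrt(2)), matches the mean and the population standard deviation (dividing by n) of the input. A numerical sketch of that fit using only the standard library follows; fit_normal is a hypothetical name and this is not the SymPy implementation:

```python
import statistics

def fit_normal(sample):
    """Return (mu, sigma) matching the sample mean and population stddev."""
    mu = statistics.fmean(sample)
    sigma = statistics.pstdev(sample)  # population form: divide by n
    return mu, sigma

mu, sigma = fit_normal([1, 2, 3, 4, 5])
mu2, sigma2 = fit_normal([1, 4])  # two points x, y: mean (x + y)/2, stddev |x - y|/2

print(mu, sigma, mu2, sigma2)
```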

pdf(s, x)[source]

Return the probability density function as an expression in x

Examples

>>> from sympy.statistics import Normal
>>> Normal(1, 2).pdf(0)
sqrt(2)*exp(-1/8)/(4*sqrt(pi))
>>> from sympy.abc import x
>>> Normal(1, 2).pdf(x)
sqrt(2)*exp(-(x - 1)**2/8)/(4*sqrt(pi))

class sympy.statistics.distributions.Uniform(a, b)[source]

Uniform(a, b) represents a probability distribution with uniform probability density on the interval [a, b] and zero density everywhere else.

cdf(s, x)[source]

Return the cumulative distribution function as an expression in x

Examples

>>> from sympy.statistics import Uniform
>>> Uniform(1, 5).cdf(2)
1/4
>>> Uniform(1, 5).cdf(4)
3/4

confidence(s, p)[source]

Generate a symmetric (p*100)% confidence interval.

>>> from sympy import Rational
>>> from sympy.statistics import Uniform
>>> U = Uniform(1, 2)
>>> U.confidence(1)
(1, 2)
>>> U.confidence(Rational(1,2))
(5/4, 7/4)
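
For a uniform distribution, a symmetric interval holding a fraction p of the probability is simply the centred sub-interval of width p*(b - a). The sketch below uses fractions.Fraction to keep the arithmetic exact; uniform_confidence is a hypothetical helper, not the SymPy method:

```python
from fractions import Fraction

def uniform_confidence(a, b, p):
    """Centred sub-interval of [a, b] that holds a fraction p of the mass."""
    a, b, p = Fraction(a), Fraction(b), Fraction(p)
    mid = (a + b) / 2
    half_width = p * (b - a) / 2
    return mid - half_width, mid + half_width

full = uniform_confidence(1, 2, 1)                # the whole interval
half = uniform_confidence(1, 2, Fraction(1, 2))   # the middle half

print(full, half)
```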

static fit(sample)[source]

Create a uniform distribution fit to the mean and standard deviation of the given distribution or sample.

Examples

>>> from sympy.statistics import Uniform
>>> Uniform.fit([1, 2, 3, 4, 5])
Uniform(-sqrt(6) + 3, sqrt(6) + 3)
>>> Uniform.fit([1, 2])
Uniform(-sqrt(3)/2 + 3/2, sqrt(3)/2 + 3/2)
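
Both outputs above are consistent with moment matching. Uniform(a, b) has mean (a + b)/2 and standard deviation (b - a)/sqrt(12), so the endpoints are mean - sqrt(3)*stddev and mean + sqrt(3)*stddev, with stddev taken in its population form. A numerical sketch under that assumption, with the hypothetical helper name fit_uniform:

```python
import math
import statistics

def fit_uniform(sample):
    """Return (a, b) such that Uniform(a, b) matches the sample mean and stddev."""
    mu = statistics.fmean(sample)
    sigma = statistics.pstdev(sample)   # population form: divide by n
    half_width = math.sqrt(3.0) * sigma
    return mu - half_width, mu + half_width

a1, b1 = fit_uniform([1, 2, 3, 4, 5])   # expected: 3 -/+ sqrt(6)
a2, b2 = fit_uniform([1, 2])            # expected: 3/2 -/+ sqrt(3)/2

print(a1, b1, a2, b2)
```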

pdf(s, x)[source]

Return the probability density function as an expression in x

Examples

>>> from sympy.statistics import Uniform
>>> Uniform(1, 5).pdf(1)
1/4
>>> Uniform(2, 4).pdf(2)
1/2

class sympy.statistics.distributions.PDF(func, (x, a, b))[source]

PDF(func, (x, a, b)) represents a continuous probability distribution with probability density function func(x) on the interval (a, b).

If func is not normalized so that integrate(func, (x, a, b)) == 1, it can be normalized using the PDF.normalize() method.

Examples

>>> from sympy import Symbol, exp, oo
>>> from sympy.statistics.distributions import PDF
>>> from sympy.abc import x
>>> a = Symbol('a', positive=True)
>>> exponential = PDF(exp(-x/a)/a, (x,0,oo))
>>> exponential.pdf(x)
exp(-x/a)/a
>>> exponential.cdf(x)
1 - exp(-x/a)
>>> exponential.mean
a
>>> exponential.variance
a**2

cdf(x)[source]

Return the cumulative distribution function as an expression in x

Examples

>>> from sympy.statistics.distributions import PDF
>>> from sympy import exp, oo
>>> from sympy.abc import x, y
>>> PDF(exp(-x/y), (x,0,oo)).cdf(4)
y - y*exp(-4/y)
>>> PDF(2*x + y, (x, 10, oo)).cdf(0)
-10*y - 100
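
Both results above agree with integrating the given, unnormalized function from the lower bound of the interval up to the evaluation point. This reading is inferred from the printed outputs rather than stated in the source:

```latex
\begin{aligned}
\int_{0}^{4} e^{-t/y}\,dt &= \left[-y\,e^{-t/y}\right]_{0}^{4} = y - y\,e^{-4/y},\\
\int_{10}^{0} (2t + y)\,dt &= \left[t^{2} + y\,t\right]_{10}^{0} = -100 - 10\,y.
\end{aligned}
```

In the second case the evaluation point 0 lies outside the interval (10, oo) and the function cannot be normalized there, so the result should not be interpreted as a probability.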

normalize()[source]

Normalize the probability distribution function so that integrate(self.pdf(x), (x, a, b)) == 1

Examples

>>> from sympy import Symbol, exp, oo
>>> from sympy.statistics.distributions import PDF
>>> from sympy.abc import x
>>> a = Symbol('a', positive=True)
>>> exponential = PDF(exp(-x/a), (x,0,oo))
>>> exponential.normalize().pdf(x)
exp(-x/a)/a
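
The effect of normalization can be checked numerically: the integral of exp(-x/a) over (0, oo) equals a, so dividing by it gives exp(-x/a)/a. The sketch below is a plain midpoint-rule estimate with the infinite upper limit truncated. The concrete value of scale and the helper name integrate_midpoint are assumptions made for the illustration:

```python
import math

def integrate_midpoint(func, lo, hi, steps=100_000):
    """Midpoint-rule estimate of the integral of func over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(func(lo + (i + 0.5) * h) for i in range(steps))

scale = 2.0  # plays the role of the positive symbol a

def unnormalized(x):
    return math.exp(-x / scale)

# Truncate the infinite range at 40 * scale, where the tail is negligible.
total = integrate_midpoint(unnormalized, 0.0, 40.0 * scale)

def normalized(x):
    return unnormalized(x) / total

print(total, normalized(0.0))
```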

transform(func, var)[source]

Return the probability distribution of the random variable func(x). Currently only some simple injective functions are supported.

Examples

>>> from sympy.statistics.distributions import PDF
>>> from sympy import oo
>>> from sympy.abc import x, y
>>> PDF(2*x + y, (x, 10, oo)).transform(x, y)
PDF(0, ((_w,), x, x))
