Best Practices¶
This page outlines some of the best practices for users of SymPy. The best practices here will help avoid some common bugs and pitfalls that can occur when using SymPy.
This page primarily focuses on best practices that apply generally to all parts of SymPy. Best practices that are specific to certain SymPy submodules or functions are outlined in the documentation for those specific functions.
Basic Usage¶
Defining Symbols¶
Define symbols with symbols() or Symbol(). The symbols() function is the most convenient way to create symbols. It supports creating one or more symbols at once:
>>> from sympy import symbols
>>> x = symbols('x')
>>> a, b, c = symbols('a b c')
Additionally, it supports adding assumptions to symbols
>>> i, j, k = symbols('i j k', integer=True)
and defining Function objects:
>>> from sympy import Function
>>> f, g, h = symbols('f g h', cls=Function)
It also supports shorthands for defining many numbered symbols at once:
>>> symbols('x:10')
(x0, x1, x2, x3, x4, x5, x6, x7, x8, x9)
The Symbol() constructor may also be used directly. Unlike symbols(), Symbol() always creates one symbol. It is the best option if you want to make a symbol with unusual characters in its name or if you are creating symbols programmatically.
>>> from sympy import Symbol
>>> x_y = Symbol('x y') # This creates a single symbol named 'x y'
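As a minimal sketch of the programmatic case mentioned above, the helper below builds a list of symbols from a prefix and a count. The helper name make_indexed_symbols and the symbol names are hypothetical, chosen only for illustration.

```python
from sympy import Symbol

def make_indexed_symbols(prefix, count):
    """Return a list of `count` symbols named prefix_0, prefix_1, ..."""
    return [Symbol(f"{prefix}_{i}") for i in range(count)]

coeffs = make_indexed_symbols("c", 3)
names = [s.name for s in coeffs]

# A name containing a space stays a single symbol when passed to Symbol()
spaced = Symbol("total mass")
```

Because Symbol() never splits its argument, the name can come from any string, including one built at runtime.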
The var() function should be avoided, except when working interactively. It works like the symbols() function, except it automatically injects symbol names into the calling namespace. This function is designed solely for interactive typing convenience and is not recommended for programmatic use.
Do not use sympify() or S() to create symbols. This may appear to work:
>>> from sympy import S
>>> x = S("x") # DO NOT DO THIS
However, S()/sympify() are not designed to create symbols. They are designed to parse entire expressions. This method fails if the input string is not valid Python. It also fails if the string parses to a larger expression:
>>> # These both fail
>>> x = S("0x")
Traceback (most recent call last):
...
SyntaxError: invalid syntax (<string>, line 1)
>>> x = S("x+")
Traceback (most recent call last):
...
SyntaxError: invalid syntax (<string>, line 1)
Any Python string can be used as a valid Symbol name.
Furthermore, all the same issues described in the Avoid String Inputs section below apply here.
Add assumptions to symbols when they are known. Assumptions can be added by passing the relevant keywords to symbols(). The most common assumptions are real=True, positive=True (or nonnegative=True), and integer=True.
Assumptions are never required, but it is always recommended to include them if they are known because it will allow certain operations to simplify. If no assumptions are provided, symbols are assumed to be general complex numbers, and simplifications will not be made unless they are true for all complex numbers.
For example:
>>> from sympy import integrate, exp, oo
>>> a = symbols('a') # no assumptions
>>> integrate(exp(-a*x), (x, 0, oo))
Piecewise((1/a, Abs(arg(a)) < pi/2), (Integral(exp(-a*x), (x, 0, oo)), True))
>>> a = symbols('a', positive=True)
>>> integrate(exp(-a*x), (x, 0, oo))
1/a
Here, \(\int_0^\infty e^{-ax}\,dx\) gives a piecewise result when a is defined with no assumptions, because the integral only converges when a is positive. Setting a to be positive removes this piecewise.
When you do use assumptions, the best practice is to always use the same assumptions for each symbol name. SymPy allows the same symbol name to be defined with different assumptions, but these symbols will be considered unequal to each other:
>>> z1 = symbols('z')
>>> z2 = symbols('z', positive=True)
>>> z1 == z2
False
>>> z1 + z2
z + z
See also Avoid String Inputs and Don’t Hardcode Symbol Names in Python Functions for related best practices around defining symbols.
Avoid String Inputs¶
Don’t use strings as input to functions. Rather, create the objects symbolically using Symbols and the appropriate SymPy functions, and manipulate them.
Don’t
>>> from sympy import expand
>>> expand("(x**2 + x)/x")
x + 1
Do
>>> from sympy import symbols
>>> x = symbols('x')
>>> expand((x**2 + x)/x)
x + 1
It’s always best to create expressions explicitly using Python operators, but sometimes you really do start with a string input, such as when you accept an expression from the user. If you do have a string that you are starting with, you should parse it explicitly with parse_expr(). It is best to parse all strings early and only use symbolic manipulation from there on.
>>> from sympy import parse_expr
>>> string_input = "(x**2 + x)/x"
>>> expr = parse_expr(string_input)
>>> expand(expr)
x + 1
Reason
There are many disadvantages to using strings as input to SymPy functions:
It is unpythonic and makes code harder to read. See the Zen of Python “explicit is better than implicit”.
Support for string inputs in general SymPy functions is mostly accidental. It happens because these functions call sympify() on their inputs in order to convert things like Python ints into SymPy Integers. However, sympify() also parses strings into SymPy expressions, unless the strict=True flag is used. Automatic parsing of strings for general SymPy functions (other than sympify() or parse_expr()) may go away in a future version of SymPy.
Typos in symbol or function names can go unnoticed. This is because all undefined names in the string will be automatically parsed into Symbols or Functions. If the input has a typo, the string will still parse correctly, but the output will not be what was expected. For example
>>> from sympy import expand_trig
>>> expand_trig("sine(x + y)")
sine(x + y)
Compare this to the explicit error you get when not using strings:
>>> from sympy import sin, symbols
>>> x, y = symbols('x y')
>>> expand_trig(sine(x + y)) # The typo is caught by a NameError
Traceback (most recent call last):
...
NameError: name 'sine' is not defined
>>> expand_trig(sin(x + y))
sin(x)*cos(y) + sin(y)*cos(x)
In the first example, sine, a typo for sin, is parsed into Function("sine"), and it appears that expand_trig cannot handle it. In the second case, we immediately get an error from the undefined name sine, and fixing our typo, we see that expand_trig can indeed do what we want.
The biggest gotcha when using string inputs comes from using assumptions. In SymPy, if two symbols have the same name but different assumptions, they are considered unequal:
>>> z1 = symbols('z')
>>> z2 = symbols('z', positive=True)
>>> z1 == z2
False
>>> z1 + z2
z + z
It is generally recommended to avoid doing this, as it can lead to confusing expressions like the one above (see Defining Symbols above).
However, string inputs will always create symbols without assumptions. So if you have a symbol with an assumption and later try to use the string version of it, you will end up with confusing results.
>>> from sympy import diff
>>> z = symbols('z', positive=True)
>>> diff('z**2', z)
0
The answer here is apparently wrong, but what happened is that the z in "z**2" parsed to Symbol('z') with no assumptions, which SymPy considers to be a different symbol from z = Symbol('z', positive=True), which is used as the second argument to diff(). So as far as diff is concerned, the expression is constant and the result is 0.
This sort of thing is particularly bad because it generally doesn’t lead to any errors. It will just silently give the “wrong” answer because SymPy will be treating symbols that you thought were the same as different. The situation is avoided by not using string inputs.
If you are parsing strings, and you want some of the symbols in it to have certain assumptions, you should create those symbols and pass them in a dictionary to parse_expr(). For example:
Don’t
>>> a, b, c = symbols('a b c', real=True)
>>> # a, b, and c in expr are different symbols without assumptions
>>> expr = parse_expr('a**2 + b - c')
>>> expr.subs({a: 1, b: 1, c: 1}) # The substitution (apparently) doesn't work
a**2 + b - c
Do
>>> # a, b, and c are the same as the a, b, c with real=True defined above
>>> expr = parse_expr('a**2 + b - c', {'a': a, 'b': b, 'c': c})
>>> expr.subs({a: 1, b: 1, c: 1})
1
Many SymPy operations are defined as methods, not functions, that is, they are called like sympy_obj.method_name(). These methods won’t work on strings, since they are not yet SymPy objects. For example:
>>> "x + 1".subs("x", "y")
Traceback (most recent call last):
...
AttributeError: 'str' object has no attribute 'subs'
Contrasted with:
>>> x, y = symbols('x y')
>>> (x + 1).subs(x, y)
y + 1
Symbol names can contain any character, including things that aren’t valid Python. But if you use strings as input, it is impossible to use such symbols. For example
>>> from sympy import solve
>>> solve('x_{2} - 1')
ValueError: Error from parse_expr with transformed code: "Symbol ('x_' ){Integer (2 )}-Integer (1 )"
...
SyntaxError: invalid syntax (<string>, line 1)
This doesn’t work because x_{2} is not valid Python. But it is perfectly possible to use this as a Symbol name:
>>> x2 = symbols('x_{2}')
>>> solve(x2 - 1, x2)
[1]
Actually, the above is the best case scenario, where you get an error. It is also possible you might get something unexpected:
>>> solve('x^1_2 - 1')
[-1, 1, -I, I, -1/2 - sqrt(3)*I/2, -1/2 + sqrt(3)*I/2, 1/2 - sqrt(3)*I/2, 1/2 + sqrt(3)*I/2, -sqrt(3)/2 - I/2, -sqrt(3)/2 + I/2, sqrt(3)/2 - I/2, sqrt(3)/2 + I/2]
What happened here is that instead of parsing x^1_2 as \(x^1_2\), it is parsed as x**12 (^ is converted to ** and _ is ignored in numeric literals in Python).
If we instead create a Symbol, the actual contents of the symbol name are ignored. It is always represented as a single symbol.
>>> x12 = symbols('x^1_2')
>>> solve(x12 - 1, x12)
[1]
If you use strings, syntax errors won’t be caught until the line is run. If you build up the expressions, syntax errors will be caught immediately, before any of it runs.
Syntax highlighting in code editors doesn’t typically recognize and color-code the content of strings, whereas it can recognize Python expressions.
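The point above about sympify() and its strict=True flag can be seen in a small sketch. It assumes SympifyError is imported from sympy.core.sympify; the variable names are only illustrative.

```python
from sympy import Integer, sympify
from sympy.core.sympify import SympifyError

# Default (non-strict) mode: a string is parsed into an expression
parsed = sympify("x + 1")

# Strict mode still converts numeric types such as int ...
converted = sympify(1, strict=True)

# ... but refuses to parse a string
try:
    sympify("x + 1", strict=True)
    rejected = False
except SympifyError:
    rejected = True
```

This is why string arguments happen to work in many SymPy functions: those functions call sympify() in its default mode, where strings are parsed as a side effect.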
Avoid Manipulating Expressions as Strings¶
If you find yourself doing a lot of string or regular expression manipulations on symbolic expressions, this is generally a sign that you are using SymPy incorrectly. It’s better to build up expressions directly with operators like +, -, *, and / and SymPy’s various functions and methods. String-based manipulations can introduce errors, grow complex quickly, and lose the benefits of symbolic expression structures.
The reason for this is that there is no notion of a symbolic expression in a string. To Python, "(x + y)/z" is no different from "/x+)(y z ", which is the same string with the characters in another order. By contrast, a SymPy expression actually knows what type of mathematical object it represents. SymPy has many methods and functions for building and manipulating expressions, and they all operate on SymPy objects, not strings.
For example
Don’t
>>> expression_str = '+'.join([f'{i}*x_{i}' for i in range(10)])
>>> expr = parse_expr(expression_str)
>>> expr
x_1 + 2*x_2 + 3*x_3 + 4*x_4 + 5*x_5 + 6*x_6 + 7*x_7 + 8*x_8 + 9*x_9
Do
>>> from sympy import Add, Symbol
>>> expr = Add(*[i*Symbol(f'x_{i}') for i in range(10)])
>>> expr
x_1 + 2*x_2 + 3*x_3 + 4*x_4 + 5*x_5 + 6*x_6 + 7*x_7 + 8*x_8 + 9*x_9
See also the previous section on avoiding string inputs to functions.
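The same contrast applies when changing an existing expression rather than building one. The sketch below, using an arbitrary example expression, compares editing the printed text with using the replace() method to swap one function for another.

```python
from sympy import symbols, sin, cos

x, y = symbols("x y")
expr = sin(x) + sin(2*y) + x

# Fragile: edits the printed text, which would then have to be parsed again
as_text = str(expr).replace("sin", "cos")

# Structural: swaps the function class wherever it occurs in the expression tree
swapped = expr.replace(sin, cos)
```

The string version would also rewrite any symbol whose name happens to contain "sin", whereas the structural version only touches actual sin applications.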
Exact Rational Numbers vs. Floats¶
If a number is known to be exactly equal to some quantity, avoid defining it as a floating-point number.
For example,
Don’t
>>> expression = x**2 + 0.5*x + 1
Do
>>> from sympy import Rational
>>> expression = x**2 + Rational(1, 2)*x + 1
>>> expression = x**2 + x/2 + 1 # Equivalently
However, this isn’t to say that you should never use floating-point numbers in SymPy, only that if a more exact value is known it should be preferred. SymPy does support arbitrary precision floating-point numbers, but some operations may not perform as well with them.
This also applies to non-rational numbers which can be represented exactly. For example, one should avoid using math.pi and prefer sympy.pi, since the former is a numerical approximation to \(\pi\) and the latter is exactly \(\pi\) (see also Separate Symbolic and Numeric Code below; in general, one should avoid importing math when using SymPy).
Don’t
>>> import math
>>> import sympy
>>> math.pi
3.141592653589793
>>> sympy.sin(math.pi)
1.22464679914735e-16
Do
>>> sympy.pi
pi
>>> sympy.sin(sympy.pi)
0
Here sympy.sin(math.pi) is not exactly 0, because math.pi is not exactly \(\pi\).
One should also take care to avoid writing integer/integer where both integers are explicit integers. This is because Python will evaluate this to a floating-point value before SymPy is able to parse it.
Don’t
>>> x + 2/7 # The exact value of 2/7 is lost
x + 0.2857142857142857
In this case, use Rational to create a rational number, or use the S() shorthand if you want to save on typing.
Do
>>> from sympy import Rational, S
>>> x + Rational(2, 7)
x + 2/7
>>> x + S(2)/7 # Equivalently
x + 2/7
Reason
Exact values, if they are known, should be preferred over floats for the following reasons:
An exact symbolic value can often be symbolically simplified or manipulated. A float represents an approximation to an exact real number, and therefore cannot be simplified exactly. For example, in the above example, sin(math.pi) does not produce 0 because math.pi is not exactly \(\pi\). It is just a floating-point number that approximates \(\pi\) to 15 digits (effectively, a close rational approximation to \(\pi\), but not exactly \(\pi\)).
Some algorithms will not be able to compute a result if there are floating-point values, but can if the values are rational numbers. This is because rational numbers have properties that make it easier for these algorithms to work with them. For instance, with floats, one can have a situation where a number should be 0, but due to approximation errors, does not equal exactly 0.
A particularly notable example of this is with floating-point exponents. For example,
>>> from sympy import factor
>>> factor(x**2.0 - 1)
x**2.0 - 1
>>> factor(x**2 - 1)
(x - 1)*(x + 1)
SymPy Floats have the same loss of significance cancellation issues that can occur from using finite precision floating-point approximations:
>>> from sympy import expand
>>> expand((x + 1.0)*(x - 1e-16)) # the coefficient of x should be slightly less than 1
x**2 + 1.0*x - 1.0e-16
>>> expand((x + 1)*(x - Rational('1e-16'))) # Using rational numbers gives the coefficient of x exactly
x**2 + 9999999999999999*x/10000000000000000 - 1/10000000000000000
It is possible to avoid these issues in SymPy in many cases by making careful use of evalf with its ability to evaluate in arbitrary precision. This typically involves either computing an expression with symbolic values and substituting them later with expr.evalf(subs=...), or starting with Float values with a precision higher than the default of 15 digits:
>>> from sympy import Float
>>> expand((x + 1.0)*(x - Float('1e-16', 20)))
x**2 + 0.9999999999999999*x - 1.0e-16
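The other approach mentioned above, substituting values late with expr.evalf(subs=...), can be sketched as follows. The expression and the numeric values are arbitrary choices meant to provoke cancellation, not taken from a specific application.

```python
from sympy import symbols

x, y, z = symbols("x y z")
expr = x + y - z
values = {x: 1e16, y: 1, z: 1e16}

# Substituting floats first rounds 1e16 + 1 back to 1e16, so the 1 is lost
naive = expr.subs(values)

# evalf(subs=...) evaluates the symbolic expression with the accuracy it needs
careful = expr.evalf(subs=values)
```

Here naive evaluates to zero while careful stays close to the mathematically expected value of 1, because the cancellation is handled during evaluation rather than after the values have already been rounded.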
A Float number can be converted to its exact rational equivalent by passing it to Rational. Alternatively, you can use nsimplify to find the nicest rational approximation. This can sometimes reproduce the number that was intended if the number is supposed to be rational (although again, it’s best to just start with rational numbers in the first place, if you can):
>>> from sympy import nsimplify
>>> Rational(0.7)
3152519739159347/4503599627370496
>>> nsimplify(0.7)
7/10
Avoid simplify()¶
simplify() (not to be confused with sympify()) is designed as a general purpose heuristic. It tries various simplification algorithms on the input expression and returns the result that seems the “simplest” based on some metric.
simplify() is perfectly fine for interactive use, where you just want SymPy to do whatever it can to an expression. However, in programmatic usage, it’s better to avoid simplify() and use more targeted simplification functions instead (e.g., cancel(), expand(), or collect()).
There are a few reasons why this is generally preferred:
Due to its heuristic nature, simplify() can potentially be slow, since it tries a lot of different approaches to try to find the best simplification.
There are no guarantees about what form an expression will have after being passed through simplify(). It may actually end up “less simple” by whatever metric you were hoping for. To contrast, targeted simplification functions are very specific about what behaviors they have and what they guarantee about the output. For example, factor() will always factor a polynomial into irreducible factors. cancel() will always convert a rational function into the form \(p/q\) where \(p\) and \(q\) are expanded polynomials with no common factors.
The documentation for each function describes exactly what behavior it will have on the input expression.
A targeted simplification will not do something unexpected if the expression contains an unexpected form, or an unexpected subexpression. This is especially the case if simplification functions are applied with deep=False to only apply the simplification to the top-level expression.
Some other simplification functions are heuristic in nature, and care should be taken with them as well. For example, the trigsimp() function is a heuristic targeted to trigonometric functions, but the routines in the sympy.simplify.fu submodule allow applying specific trigonometric identities.
The simplify section of the tutorial and the simplify module reference list the various targeted simplification functions.
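As a rough illustration of the targeted functions named in this section, the sketch below applies cancel(), factor(), expand(), and collect() to small example expressions chosen only for demonstration.

```python
from sympy import symbols, cancel, expand, factor, collect

x, y = symbols("x y")

# cancel(): write a rational function as p/q with no common factors
ratio = cancel((x**2 + 2*x + 1)/(x**2 + x))

# factor() and expand() undo each other on this polynomial
factored = factor(x**2 - 1)
expanded = expand((x + 1)*(x - 1))

# collect(): group terms by powers of a chosen symbol
grouped = collect(x*y + x + 2*x**2 + y*x**2, x)
```

Each call has one documented job, so the shape of the result is predictable in a way that the output of simplify() is not.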
In some cases, you may know exactly what simplification operations you wish to apply to an expression, but there may not be an exact set of simplification functions that do them. When this happens, you can create your own targeted simplification using replace(), or in general, manually using advanced expression manipulation.
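One possible shape for such a custom rule is sketched below, using replace() with a Wild pattern to rewrite only squared sines. The identity and the example expression are illustrative assumptions, not a rule the SymPy documentation prescribes.

```python
from sympy import symbols, sin, cos, Wild

x = symbols("x")
a = Wild("a")

expr = sin(x)**2 + sin(2*x)**2 + sin(x)

# Rewrite sin(a)**2 as 1 - cos(a)**2; the bare sin(x) term does not match the pattern
rewritten = expr.replace(sin(a)**2, 1 - cos(a)**2)
```

Unlike a call to simplify(), this applies exactly one identity in one direction, so the result has a known form.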
Don’t Hardcode Symbol Names in Python Functions¶
Instead of hard-coding Symbol names inside of a function definition, make the symbols a parameter to the function.
For example, consider a function theta_operator that computes the theta operator \(\theta = zD_z\):
Don’t
def theta_operator(expr):
    z = symbols('z')
    return z*expr.diff(z)
Do
def theta_operator(expr, z):
    return z*expr.diff(z)
A hard-coded symbol name has the disadvantage of requiring all expressions to use that exact symbol name. In the above example, it is not possible to compute \(\theta = xD_x\) because it is hard-coded to \(zD_z\). What’s worse, trying to do so silently leads to a wrong result instead of an error, since x is treated as a constant expression:
>>> def theta_operator(expr):
...     z = symbols('z')
...     return z*expr.diff(z)
>>> theta_operator(x**2) # The expected answer is 2*x**2
0
This is particularly problematic if the function accepts arbitrary user input, as the user may be using a different variable name that makes more sense in their mathematical context. And if the user already used the symbol z but as a constant, they would need to swap things around with subs before being able to use the function.
The other reason this antipattern is problematic is due to the gotcha that symbols with assumptions are considered unequal to symbols without assumptions. If someone defined their expression using
>>> z = symbols('z', positive=True)
for example, to make further simplifications possible (see Defining Symbols above), the function hard-coding Symbol('z') without assumptions would not work:
>>> theta_operator(z**2)
0
By making the symbol an argument to the function, like theta_operator(expr, z), these problems all go away.
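A short usage sketch of the parameterized version follows. The variable names in_x, in_z, and z_pos are only illustrative.

```python
from sympy import symbols

def theta_operator(expr, z):
    """Apply the theta operator z*d/dz to expr."""
    return z*expr.diff(z)

x = symbols("x")
z_pos = symbols("z", positive=True)

# Works with whichever symbol the caller uses, including one with assumptions
in_x = theta_operator(x**2, x)
in_z = theta_operator(z_pos**3, z_pos)
```

Because the caller passes the very symbol object used in the expression, a mismatch in names or assumptions cannot arise.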
Separate Symbolic and Numeric Code¶
SymPy sets itself apart from most of the rest of the libraries in the Python ecosystem in that it operates symbolically, whereas other libraries, like NumPy, operate numerically. These two paradigms are different enough that it’s always best to keep them as separate as possible.
Importantly, SymPy is not designed to work with NumPy arrays, and conversely, NumPy will not work directly with SymPy objects.
>>> import numpy as np
>>> import sympy
>>> a = np.array([0., 1., 2.])
>>> sympy.sin(a)
Traceback (most recent call last):
...
AttributeError: 'ImmutableDenseNDimArray' object has no attribute 'as_coefficient'
>>> x = Symbol('x')
>>> np.sin(x) # NumPy functions do not know how to handle SymPy expressions
Traceback (most recent call last):
...
TypeError: loop of ufunc does not support argument 0 of type Symbol which has no callable sin method
If you want to use both SymPy and NumPy, you should explicitly convert your SymPy expressions into NumPy functions using lambdify(). The typical workflow in SymPy is to model your problem symbolically using SymPy, then convert the result into a numerical function with lambdify() that can be evaluated on NumPy arrays. For advanced use-cases, lambdify()/NumPy may not be enough and you may instead need to use SymPy’s more general code generation routines to generate code for other fast numerical languages such as Fortran or C.
>>> # First symbolically construct the expression you are interested in with SymPy
>>> from sympy import diff, sin, exp, lambdify, symbols
>>> x = symbols('x')
>>> expr = diff(sin(x)*exp(x**2), x)
>>> # Then convert it to a numeric function with lambdify()
>>> f = lambdify(x, expr)
>>> # Now use this function with NumPy
>>> import numpy as np
>>> a = np.linspace(0, 10)
>>> f(a)
[ 1.00000000e+00 1.10713341e+00 1.46699555e+00 ... -3.15033720e+44]
These are some antipatterns that should be generally avoided:
Do not use import math. It is virtually never necessary to use the standard library math module alongside SymPy (or NumPy). Every function that is in math is already in SymPy. SymPy can compute values numerically using evalf, which provides more precision and accuracy than math. Or better, SymPy will by default compute things symbolically. Functions and constants in math are floats, which are inexact. SymPy always works better with exact quantities when possible. For example,
>>> import math
>>> math.pi # a float
3.141592653589793
>>> import sympy
>>> sympy.sin(math.pi)
1.22464679914735e-16
The result of sympy.sin(math.pi) is not 0 as you might expect, because math.pi is only an approximation of \(\pi\), accurate to about 16 digits. On the other hand, sympy.pi is exactly equal to \(\pi\) because it is represented symbolically, so it is able to give the exact answer:
>>> sympy.sin(sympy.pi)
0
So in general, one should prefer symbolic representations. But even if you actually do want a float, you are better off using SymPy’s evalf() rather than math. This avoids the pitfall that math functions can only operate on float objects, not symbolic expressions
>>> x = Symbol('x')
>>> math.sin(x)
Traceback (most recent call last):
...
TypeError: Cannot convert expression to float
And furthermore, SymPy’s evalf() is more accurate than math, because it uses arbitrary precision arithmetic, and allows you to specify any number of digits.
>>> sympy.sin(1).evalf(30)
0.841470984807896506652502321630
>>> math.sin(1)
0.8414709848078965
Even when using NumPy, math should be avoided. NumPy functions are faster than their math equivalents, support a larger range of numerical dtypes, and can operate on arrays of values, whereas math functions can only operate on a single scalar at a time.
Don’t pass SymPy expressions to a NumPy function. You should not pass a SymPy expression to a NumPy function. This includes anything in the numpy or scipy namespaces, as well as most functions from other Python libraries such as matplotlib. These functions are only designed to work with NumPy arrays with numeric values.
Don’t pass SymPy expressions to a lambdified function. Similar to the previous point, you should not pass SymPy expressions to a function created with lambdify. In effect, the functions returned by lambdify are NumPy functions, so the situation here is exactly the same. It is possible that in some cases a function created from lambdify() will work with a SymPy expression, but this is just an accident of the way it works. See the “how it works” section of the lambdify() documentation for more details on why this happens.
Avoid storing SymPy expressions in a NumPy array. While it is technically possible to store SymPy expressions inside of a NumPy array, doing so usually represents a mistake. A sign that this is happening is if the dtype of the NumPy array is object (instead of a numeric dtype like float64 or int64).
Just as one should avoid using NumPy when doing symbolic calculations with SymPy, one should stop using SymPy once the calculations have moved over to the numeric side of things with NumPy.
A NumPy array that contains SymPy expressions effectively has the same problem as trying to call NumPy functions directly on a SymPy expression. They do not know how to operate on SymPy objects, so they will fail. This applies even if the SymPy objects are all SymPy Floats.
>>> import numpy as np
>>> import sympy
>>> a = np.asarray([sympy.Float(1.0), sympy.Float(0.0)]) # Do not do this
>>> print(repr(a)) # Note that the dtype is 'object'
array([1.00000000000000, 0.0], dtype=object)
>>> np.sin(a)
Traceback (most recent call last):
...
TypeError: loop of ufunc does not support argument 0 of type Float which has no callable sin method
If you are doing this, you should probably either be using native NumPy floats, or, if you really do want to store an array of SymPy expressions, you should use SymPy’s Matrix or NDimArray classes.
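A minimal sketch of that alternative, assuming NumPy is installed: symbolic entries are kept in a SymPy Matrix while the work is symbolic, and lambdify() is used only at the point where numbers are needed. The entries chosen here are arbitrary.

```python
import numpy as np
from sympy import Matrix, symbols, sin, cos, lambdify

x = symbols("x")

# Keep symbolic entries in a SymPy Matrix while the work is still symbolic
M = Matrix([x**2, sin(x)])
dM = M.diff(x)  # entrywise derivative

# Convert to a numeric function only when numbers are needed
f = lambdify(x, dM, "numpy")
values = f(0.0)  # a NumPy array with a numeric dtype
```

The array returned by f holds ordinary floating-point numbers, so NumPy functions can operate on it directly.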
Advanced Usage¶
Be Careful Comparing and Sorting Symbolic Objects¶
Be careful with programmatic code that compares numerical quantities, either directly using an inequality (<, <=, >, >=) or indirectly with something like sorted. The issue is that if an inequality is unknown, the result will be symbolic, like
>>> x > 0
x > 0
A symbolic inequality will raise an exception if bool() is called on it, due to the ambiguity:
>>> bool(x > 0)
Traceback (most recent call last):
...
TypeError: cannot determine truth value of Relational
A check like
if x > 0:
    ...
may work just fine if you only ever test it for numerical x. But if x can ever be symbolic, the above code is wrong. It will fail with TypeError: cannot determine truth value of Relational. If you ever see this exception, it means this error has been made somewhere (sometimes the error is in SymPy itself; if this appears to be the case, please open an issue).
The exact same issue occurs when using sorted, since this internally uses <.
>>> sorted([x, 0])
Traceback (most recent call last):
...
TypeError: cannot determine truth value of Relational
There are a few options for fixing this issue, and the correct one to choose depends on what you are doing:
Disallow symbolic inputs. If your function cannot possibly work on symbolic inputs, you can explicitly disallow them. The primary benefit here is to give a more readable error message to users than TypeError: cannot determine truth value of Relational. The is_number attribute can be used to check if an expression can be evaluated to a specific number with evalf(). If you want to only accept integers, you can check isinstance(x, Integer) (after calling sympify() to convert Python ints). Beware that is_integer uses the assumptions system and may be True even for symbolic objects, like Symbol('x', integer=True).
Use the assumptions system. If you do support symbolic inputs, you should use the assumptions system to check for things like x > 0, e.g., using x.is_positive. When doing this, you should always be aware of the nuances of the three-valued fuzzy logic used in the assumptions system. That is, always be aware that an assumption could be None, meaning its value is unknown and could be either true or false. For example,
if x.is_positive:
    ...
will only run the block if x.is_positive is True, but you may want to do something when x.is_positive is None.
Return a Piecewise result. If the result of a function depends on an inequality or other boolean condition, you can use Piecewise to return a result that represents both possibilities symbolically. This is generally preferred when possible, as it offers the most flexibility. This is because the result is represented symbolically, meaning, for instance, one can later substitute specific values for the symbols and it will evaluate to the specific case, even if it is combined with other expressions.
For example, instead of
if x > 0:
    expr = 1
else:
    expr = 0
this can be represented symbolically as
>>> from sympy import Piecewise, pprint
>>> expr = Piecewise((1, x > 0), (0, True))
>>> pprint(expr, use_unicode=True)
⎧1  for x > 0
⎨
⎩0  otherwise
>>> expr.subs(x, 1)
1
>>> expr.subs(x, -1)
0
Use ordered() to sort expressions into a canonical order. If you are trying to use sorted because you want a canonical ordering, but you don’t particularly care what that ordering is, you can use ordered.
>>> from sympy import ordered
>>> list(ordered([x, 0]))
[0, x]
Alternatively, try to write the function in a way so that the correctness of the result does not depend on the order that arguments are processed in.
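The first two options above, disallowing symbolic inputs and using the assumptions system, can be sketched as follows. The function names describe_sign and require_number are hypothetical, and the error message is only one possible wording.

```python
from sympy import Symbol, sympify

def describe_sign(value):
    """Classify value without ever calling bool() on an inequality."""
    value = sympify(value)
    positive = value.is_positive  # True, False, or None (unknown)
    if positive is True:
        return "positive"
    if positive is False:
        return "not positive"
    return "unknown"

def require_number(value):
    """Reject symbolic input early with a readable error message."""
    value = sympify(value)
    if not value.is_number:
        raise ValueError(f"expected a number, got the symbolic expression {value}")
    return value
```

Comparing against True and False explicitly keeps the None case visible, so a caller can decide what "unknown" should mean instead of having it silently treated as false.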
Custom SymPy Objects¶
SymPy is designed to be extended with custom classes, typically by subclassing Basic, Expr, or Function. All the symbolic classes in SymPy itself are written this way, and the points here apply equally to them as to user-defined classes.
For an in-depth guide on how to write a Function subclass, see the guide on writing custom functions.
Args Invariants¶
Custom SymPy objects should always satisfy the following invariants:
all(isinstance(arg, Basic) for arg in args)
expr.func(*expr.args) == expr
The first says that all elements of args should be instances of Basic. The second says that an expression should be rebuildable from its args (note that func is usually the same as type(expr), though it may not always be).
These two invariants are assumed throughout SymPy, and are essential for any function that manipulates expressions.
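As a quick sketch, both invariants can be checked directly on a few built-in expressions. The helper name check_args_invariants is hypothetical, and it is only applied here to non-atomic expressions, which have a non-empty args.

```python
from sympy import Basic, Integral, symbols, sin

x, y = symbols("x y")

def check_args_invariants(expr):
    """Return True if expr satisfies both args invariants."""
    args_are_basic = all(isinstance(arg, Basic) for arg in expr.args)
    rebuilds = expr.func(*expr.args) == expr
    return args_are_basic and rebuilds

examples = (x + sin(y), x*y**2, Integral(sin(x), x))
results = [check_args_invariants(e) for e in examples]
```

A check like this can be a useful unit test for a custom class.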
For example, consider this simple function, which is a simplified version of xreplace():
>>> def replace(expr, x, y):
...     """Replace x with y in expr"""
...     newargs = []
...     for arg in expr.args:
...         if arg == x:
...             newargs.append(y)
...         else:
...             newargs.append(replace(arg, x, y))
...     return expr.func(*newargs)
>>> replace(x + sin(x - 1), x, y)
y + sin(y - 1)
The function works by recursively traversing the args of expr, and rebuilding it except any instances of x are replaced by y.
It’s easy to see how this function would break if the args invariants did not hold:
If an expression had args that were not Basic, they would fail with AttributeError on a recursive call, because the non-Basic args would not have the .args or .func attributes.
If an expression did not rebuild from its args, the line return expr.func(*newargs) would fail, even in the trivial case where none of the args are changed by the replacement, which should effectively be a no-op.
Making all args instances of Basic usually just means calling _sympify() on the inputs to the class so that they are Basic instances. If you want to store a string on a class, you should either use a Symbol or sympy.core.symbol.Str.
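A minimal sketch of what this can look like for a direct Basic subclass is shown below. The class Tagged is hypothetical, and the imports assume that Str lives in sympy.core.symbol and _sympify in sympy.core.sympify.

```python
from sympy import Basic, Integer
from sympy.core.symbol import Str
from sympy.core.sympify import _sympify

class Tagged(Basic):
    """Hypothetical object pairing a text tag with a SymPy value."""

    def __new__(cls, tag, value):
        # Store the tag as Str so that every element of args is a Basic instance
        if isinstance(tag, str):
            tag = Str(tag)
        # _sympify converts Python numbers such as int into SymPy objects
        value = _sympify(value)
        return super().__new__(cls, tag, value)

obj = Tagged("mass", 2)
```

Because the constructor accepts a Str as well as a plain str, obj.func(*obj.args) rebuilds an equal object, which satisfies the second invariant.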
In some cases a class may accept args in multiple equivalent forms. It is important that whatever form is stored in args is one of the ways that can be used to reconstruct the class. It is okay to normalize args as long as that normalized form is accepted as input. For example, Integral always stores the variable argument as a tuple to make things easier to process internally, but this form is also accepted by the class constructor:
>>> from sympy import Integral
>>> expr = Integral(sin(x), x)
>>> expr.args # args are normalized
(sin(x), (x,))
>>> Integral(sin(x), (x,)) # Also accepted
Integral(sin(x), x)
Note that most user-defined custom functions should be defined by subclassing Function (see the guide to writing custom functions). The Function class automatically takes care of both of the args invariants, so if you are using it, you do not need to worry about this.
Avoid Too Much Automatic Evaluation¶
When defining a custom function, avoid doing too much automatic evaluation
(i.e., evaluation in the eval
or __new__
methods).
Generally, automatic evaluation should only be done in instances where it is fast, and it is something that no one would ever want not to happen. Automatic evaluation is difficult to undo. A good rule of thumb is to evaluate on explicit numeric values (isinstance(x, Number)), and leave everything else symbolically unevaluated.
Further simplification using more advanced identities should be done in
specific simplification functions or doit
(see the custom functions
guide for a list of common simplification routines that can
be defined on SymPy objects).
The custom functions guide goes over
this in depth (but note that this guideline applies equally to all SymPy
objects, not just functions). But in a nutshell, the reason for this is that
the only way to prevent automatic evaluation is to use evaluate=False
, which
is fragile. Additionally, code will invariably be written assuming the
invariants that are true due to automatic evaluations, meaning that
expressions created with evaluate=False
can lead to wrong results from this
code. This also means that removing automatic evaluation later can be
difficult.
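The fragility of evaluate=False can be seen with a built-in class. This is a small sketch using Add: the unevaluated form is structurally different from the evaluated one, and it disappears as soon as the expression is rebuilt by an ordinary operation.

```python
from sympy import Add, symbols

x = symbols('x')

# Suppress the automatic combination x + x -> 2*x.
unevaluated = Add(x, x, evaluate=False)
assert unevaluated.args == (x, x)
assert unevaluated != 2*x  # structurally different from the evaluated form

# The unevaluated form does not survive ordinary operations.
assert unevaluated + 1 == 2*x + 1
assert unevaluated.doit() == 2*x
```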
Evaluation that can potentially be expensive (for instance, applying a symbolic identity) is itself bad because it can make merely creating an expression, without even doing anything with it, slow. This also applies to checking for symbolic assumptions (like x.is_integer), so this should also be avoided in class constructors.
Don’t
from sympy import Add, Function

class f(Function):
    @classmethod
    def eval(cls, x):
        if x.is_integer: # Bad (checking general assumptions)
            return 0
        if isinstance(x, Add): # Bad (applying symbolic identities)
            return Add(*[f(i) for i in x.args])
Do
from sympy import Add, Function, Integer, S

class f(Function):
    @classmethod
    def eval(cls, x):
        if isinstance(x, Integer): # Good (only evaluating on explicit integers)
            return 0

    # Good (applying simplification on assumptions in doit())
    def doit(self, deep=True, **hints):
        x = self.args[0]
        if deep:
            x = x.doit(deep=deep, **hints)
        if x.is_integer:
            return S(0)
        return self

    # Good (applying symbolic identities inside of simplification functions)
    def _eval_expand_func(self, **hints):
        x = self.args[0]
        if isinstance(x, Add):
            return Add(*[f(i) for i in x.args])
        return self
Note that not all the classes in SymPy currently follow this guideline very well, but it is something that we are improving.
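To make the difference concrete, the following sketch repeats the recommended definition and exercises it. It differs from the listing above in two small ways that are choices of this example: it returns S.Zero from eval, and it uses self.func instead of the global name f.

```python
from sympy import Add, Function, Integer, S, symbols


class f(Function):
    @classmethod
    def eval(cls, x):
        # Only evaluate automatically on explicit integers.
        if isinstance(x, Integer):
            return S.Zero

    def doit(self, deep=True, **hints):
        x = self.args[0]
        if deep:
            x = x.doit(deep=deep, **hints)
        if x.is_integer:
            return S.Zero
        return self

    def _eval_expand_func(self, **hints):
        x = self.args[0]
        if isinstance(x, Add):
            return Add(*[self.func(i) for i in x.args])
        return self


x, y = symbols('x y')
n = symbols('n', integer=True)

assert f(2) == 0                  # explicit integer: evaluated at once
assert isinstance(f(n), f)        # symbolic integer: left unevaluated
assert f(n).doit() == 0           # evaluated only on request
assert f(x + y) != f(x) + f(y)    # identity is not applied automatically
assert f(x + y).expand(func=True) == f(x) + f(y)
```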
Don’t Denest Collections¶
Functions and classes that accept an arbitrary number of arguments should
either accept the arguments directly, like f(*args)
, or as a single
argument, like f(args)
. They should not try to support both at once.
The reason is that this makes it impossible to represent nested collections.
For example, take the FiniteSet
class. It is constructed like
FiniteSet(x, y, z)
(i.e., using *args
).
>>> from sympy import FiniteSet
>>> FiniteSet(1, 2, 3)
{1, 2, 3}
It might be tempting to also support FiniteSet([1, 2, 3])
, to match the
built-in set
. However, doing so would make it impossible to represent a
nested FiniteSet
containing a single FiniteSet
, like \(\{\{1, 2, 3\}\}\):
>>> FiniteSet(FiniteSet(1, 2, 3)) # We don't want this to be the same as {1, 2, 3}
FiniteSet({1, 2, 3})
As to whether args
or *args
should be used, if it is only possible for
there to be a finite number of arguments, *args
is generally better, as this
makes things easier to deal with using the object’s args, since
obj.args
will be the direct arguments of the class. However, if it is
possible that you might want to support a symbolic infinite collection in
addition to finite ones, like Integers
or Range
, then it
is better to use args
as this will be impossible to do with *args
.
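A sketch of the single-argument style follows. The class ElementsOf is hypothetical; the point is only that one constructor signature can hold a finite collection or a symbolic infinite one, while nested collections stay distinct.

```python
from sympy import Basic, FiniteSet, S
from sympy.core.sympify import _sympify


class ElementsOf(Basic):
    """Hypothetical wrapper that takes one collection argument."""

    def __new__(cls, collection):
        return super().__new__(cls, _sympify(collection))


finite = ElementsOf(FiniteSet(1, 2, 3))
infinite = ElementsOf(S.Integers)

# The collection is stored as a single arg in both cases.
assert finite.args == (FiniteSet(1, 2, 3),)
assert infinite.args == (S.Integers,)
assert infinite.func(*infinite.args) == infinite

# Because FiniteSet itself uses *args, a nested set is not flattened.
assert len(FiniteSet(1, 2, 3)) == 3
assert len(FiniteSet(FiniteSet(1, 2, 3))) == 1
```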
Avoid Storing Extra Attributes on an Object¶
A common reason that you might want to create a custom SymPy object is that you want to store extra attributes on the object. However, doing this in a naive way, i.e., by simply storing the data as a Python attribute on the object, is almost always a bad idea.
SymPy does not expect objects to have extra data stored in them beyond what is in their args. For instance, this breaks == checking, which only compares an object’s args. See the Don’t Overwrite __eq__ section below for why it is a bad idea to override __eq__. This section and that one are closely related.
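The problem can be reproduced in a few lines. In this sketch, Tagged is a hypothetical class that stores a tag as a plain Python attribute; the tag is ignored by == and by hashing, and it is silently lost whenever SymPy rebuilds the object.

```python
from sympy import Expr, symbols
from sympy.core.sympify import _sympify


class Tagged(Expr):
    """Hypothetical class that (wrongly) keeps data outside of args."""

    def __new__(cls, arg, tag=None):
        obj = super().__new__(cls, _sympify(arg))
        obj.tag = tag  # BAD: this data is not part of args
        return obj


x, y = symbols('x y')
a = Tagged(x, tag='a')
b = Tagged(x, tag='b')

# The tags differ, but SymPy treats the objects as the same.
assert a.tag != b.tag
assert a == b
assert hash(a) == hash(b)

# Rebuilding from args, as many SymPy functions do, drops the tag.
assert a.func(*a.args).tag is None
assert a.xreplace({x: y}).tag is None
```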
Typically, there is a better way to do what you are trying to do, depending on the specific details of your situation:
Store the extra data in the object’s args. This is the best approach if the extra data you want to store is part of the mathematical description of your object.
As long as the data is representable using other SymPy objects, it can be stored in args. Note that an object’s args should be usable to recreate the object (e.g., something like YourObject(*instance.args) should recreate instance).
Additionally, it should be mentioned that it is not a good idea to subclass Symbol if you plan to store anything extra in args. Symbol is designed around having no args. You are better off subclassing Function (see Writing Custom Functions) or Expr directly. If you simply want to have two symbols that are distinct from one another, the best approach is often just to give them different names. If you are concerned about how they are printed, you can replace them with a more canonical name when it comes time to print things, or use a custom printer.
Store the data about the object separately. This is the best approach if the extra data is not directly related to an object’s mathematical properties.
Remember that SymPy objects are hashable, so they can easily be used as dictionary keys. So maintaining a separate dictionary of {object: extra_data} pairs is straightforward.
Note that some SymPy APIs already allow redefining how they operate on objects separately from the objects themselves. A big example of this is the printers, which allow defining custom printers that change how any SymPy object is printed without modifying those objects themselves. Functions like lambdify() and init_printing() allow passing in a custom printer.
Represent the attribute using different subclasses. This is often a good idea if there are only a few possible values for the attribute (e.g., a boolean flag). Code duplication can be avoided by using a common superclass.
If the data you want to store is a Python function, it’s best to just use it as a method on the class. In many cases, the method may already fit into one of the existing set of overridable SymPy methods. If you want to define how a function evaluates itself numerically, you can use implemented_function().
Represent the information by modifying the object’s func. This solution is much more complicated than the others, and should only be used when it is necessary. In some extreme cases, it is not possible to represent every mathematical aspect of an object using args alone. This can happen, for example, because of the limitation that args should only contain Basic instances. It is still possible to create custom SymPy objects in these situations by using a custom func that is different from type(expr) (in this case, you would override __eq__ on the func rather than on the class).
However, this sort of situation is rare.
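The first two alternatives can be sketched as follows. The class Labeled, its label property, and the description strings are all hypothetical and only serve this example.

```python
from sympy import Basic, symbols
from sympy.core.symbol import Str
from sympy.core.sympify import _sympify


class Labeled(Basic):
    """Hypothetical object whose label is part of its identity."""

    def __new__(cls, expr, label):
        # Accept a plain str, or the Str already stored in args, so that
        # Labeled(*instance.args) recreates the instance.
        if not isinstance(label, Str):
            label = Str(label)
        return super().__new__(cls, _sympify(expr), label)

    @property
    def label(self):
        return self.args[1].name


x, y = symbols('x y')
left = Labeled(x, 'left')
right = Labeled(x, 'right')

assert left != right                    # the label takes part in ==
assert left.func(*left.args) == left    # and survives rebuilding
assert left.label == 'left'

# Data unrelated to the mathematics can live in a separate dictionary,
# because SymPy objects are hashable.
descriptions = {x: 'horizontal position', x + y: 'combined offset'}
assert descriptions[y + x] == 'combined offset'
```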
Don’t Overwrite __eq__¶
When building a custom SymPy object, it is sometimes tempting to overwrite
__eq__
to define custom logic for the ==
operator. This is almost always a
bad idea. Custom SymPy classes should leave __eq__
undefined and use the
default implementation in the Basic
superclass.
In SymPy, ==
compares objects using structural equality. That is, a == b
means that a
and b
are exactly the same object. They have the same
type and the same args. ==
does not perform any sort of
mathematical equality checking. For example,
>>> x*(x - 1) == x**2 - x
False
==
also always returns a boolean True
or False
. Symbolic equations can
be represented with Eq
.
There are several reasons for this:
Mathematical equality checking can be very expensive to compute, and in general, it is computationally impossible to determine.
Python itself automatically uses == in various places and assumes that it returns a boolean and is inexpensive to compute. For example, if b is a builtin Python container like list, dict, or set, then a in b uses ==.[1]
SymPy internally uses == all over the place, both explicitly and implicitly via things like in or dictionary keys. This usage all implicitly assumes that == operates structurally.
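A short sketch of what this means in practice: container membership and dictionary lookups follow structural equality, and a mathematical comparison has to be requested explicitly, for example with expand() or the equals method.

```python
from sympy import expand, symbols

x, y = symbols('x y')
factored = x*(x - 1)
expanded = x**2 - x

# Mathematically equal but structurally different expressions.
assert factored != expanded
assert factored not in [expanded]
assert factored not in {expanded: 'value'}

# Structurally identical expressions behave as the same object.
assert y + x == x + y
assert {x + y: 'value'}[y + x] == 'value'

# Mathematical equality is checked explicitly.
assert expand(factored) == expanded
assert factored.equals(expanded)
```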
In effect, structural equality means that if a == b is True, then a and b are for all intents and purposes the same object. This is because all SymPy objects are immutable. When a == b, any SymPy function may freely replace a with b in any subexpression.
The default __eq__
method on Basic checks if the two objects have
the same type and the same args
. There are also many parts of SymPy that
implicitly assume that if two objects are equal, then they have the same
args
. Therefore, it is not a good idea to try to override __eq__
as a way
to avoid storing some identifying information about an object in its args
.
The args of an object should contain everything that is needed to recreate it (see args). Note that it is possible for an object’s constructor to accept multiple forms of arguments, so long as it accepts the form stored in args (e.g., it is perfectly fine for some args to have default values).
Here are some examples of reasons you might be tempted to override __eq__
and the preferred alternatives:
To make == apply some smarter equality check than purely structural equality. As noted above, this is a bad idea because too many things implicitly assume == works structurally only. Instead, use a function or method to implement the smarter equality checking (for example, the equals method).
Another option is to define a canonicalization method that puts objects into canonical form (e.g., via doit), so that, for instance, x.doit() == y.doit() is true whenever x and y are mathematically equal. This is not always possible because not every type of object has a computable canonical form, but it is a convenient approach when one does exist.
To make == check for some additional attributes beyond those stored in the args of an expression. See the Avoid Storing Extra Attributes on an Object section above for more details on why it’s a bad idea to directly store extra attributes on a SymPy object, and what the best alternatives are.
To make == compare equal to some non-SymPy object. It is preferable to extend sympify to be able to convert this object into the SymPy object. The default __eq__ implementation will automatically call sympify on the other argument if it isn’t a Basic instance (e.g., Integer(1) == int(1) gives True). It is possible to extend sympify both for objects you control by defining a _sympy_ method and for objects you do not control by extending the converter dictionary. See the sympify() documentation for more details.
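For the last case, a minimal sketch of the _sympy_ hook looks like this. The class MyInt is hypothetical; defining _sympy_ on it is enough for sympify, ==, and arithmetic to convert it, without touching __eq__ on any SymPy class.

```python
from sympy import Integer, symbols, sympify


class MyInt:
    """Hypothetical non-SymPy wrapper around an integer value."""

    def __init__(self, value):
        self.value = value

    def _sympy_(self):
        # Tell sympify how to convert this object into a SymPy object.
        return Integer(self.value)


x = symbols('x')
three = MyInt(3)

assert sympify(three) == Integer(3)
assert Integer(3) == three   # == sympifies the non-Basic operand
assert x + three == x + 3    # arithmetic sympifies it as well
```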
Avoiding Infinite Recursion from Assumptions Handlers¶
When writing assumptions handlers on custom functions like _eval_is_positive
(see the custom
functions guide for details on how to do this),
there are two important things to keep in mind:
Firstly, avoid creating new expressions inside of an assumption handler. You should always pull apart the arguments of a function directly instead. The reason is that creating a new expression could itself result in an assumptions query. This can easily lead to infinite recursion. And even when it doesn’t, creating a new expression that itself could lead to many recursive assumptions queries is bad for performance compared to querying the desired property more directly.
This generally means using methods like as_independent() or as_coeff_mul() and checking the args of expressions directly (see the custom functions guide for an example).
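As a small sketch of the first point, consider a hypothetical function dbl(x) that stands for 2*x but is left unevaluated. Its evenness can be decided from the argument alone, so the handler never needs to build the product.

```python
from sympy import Function, Rational, symbols


class dbl(Function):
    """Hypothetical function representing 2*x, left unevaluated."""

    def _eval_is_even(self):
        # Avoid: (2*self.args[0]).is_even, which creates a new expression
        # and may trigger further assumptions queries.
        # Prefer: query the argument directly. 2*x is even exactly when
        # x is an integer.
        return self.args[0].is_integer


n = symbols('n', integer=True)
x = symbols('x')

assert dbl(n).is_even is True
assert dbl(Rational(1, 2)).is_even is False
assert dbl(x).is_even is None  # unknown for a generic symbol
```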
Secondly, do not recursively evaluate assumptions on self
in assumptions
handlers. Assumptions handlers should only check for assumptions on
self.args
. The global assumptions system will automatically handle
implications between different assumptions.
For example, you may be tempted to write something like
# BAD
from sympy import Function

class f(Function):
    def _eval_is_integer(self):
        # Quick return if self is not real (do not do this).
        if self.is_real is False:
            return False
        return self.args[0].is_integer
However, the if self.is_real is False
check is completely unnecessary. The
assumptions system already knows that integer
implies real
, and it will
not bother checking is_integer
if it already knows that is_real
is False.
If you define the function this way, it will lead to an infinite recursion:
>>> class f(Function):
...     def _eval_is_integer(self):
...         if self.is_real is False:
...             return False
...         return self.args[0].is_integer
>>> f(x).is_real
Traceback (most recent call last):
...
RecursionError: maximum recursion depth exceeded while calling a Python object
Instead, define the handler based on the arguments of the function only:
# GOOD
from sympy import Function

class f(Function):
    def _eval_is_integer(self):
        return self.args[0].is_integer
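With the handler written this way, the implication from integer to real comes for free. The following sketch repeats the definition so that it runs on its own, and checks the resulting assumptions on a symbol n that is assumed to be an integer.

```python
from sympy import Function, symbols


class f(Function):
    def _eval_is_integer(self):
        # Only look at the arguments, never at self.
        return self.args[0].is_integer


x = symbols('x')
n = symbols('n', integer=True)

assert f(n).is_integer is True
assert f(n).is_real is True      # deduced by the assumptions system
assert f(x).is_integer is None   # unknown for a generic symbol
```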