# why is symbolic comparison so slow?

 5 I tried to write a little differentiatior: from operator import add, mul, pow def mydiff(s,x=x): if x not in SR(s).variables(): return 0 elif s == x: return 1 elif s == log(x): return 1/x elif s == sin(x): return cos(x) elif s == cos(x): return -sin(x) elif s == tan(x): return 1+tan(x)^2 elif s == arcsin(x): return 1/sqrt(1-x^2) elif s == arctan(x): return 1/(1+x^2) op = s.operator() ops = s.operands() if op == pow and ops[0]==x and ops[1]._is_numeric(): return ops[1]*ops[0]^(ops[1]-1) elif op == add: return mydiff(ops[0]) + mydiff(reduce(op,ops[1:])) elif op == mul: f = ops[0]; g = reduce(mul,ops[1:]) return g*mydiff(f) + f*mydiff(g) elif len(ops)==1: return mydiff(ops[0],x)*mydiff(op(x),x).subs(x=ops[0]))  one my one year old macbook i need to wait more than 17 second to get the following answer %time mydiff(sum(k*x^k,k,0,10)) CPU times: user 12.39 s, sys: 0.78 s, total: 13.17 s Wall time: 17.49 s 100*x^9 + 81*x^8 + 64*x^7 + 49*x^6 + 36*x^5 + 25*x^4 + 16*x^3 + 9*x^2 + 4*x + 1  why does it that long? asked Aug 26 '10 Philipp Schneider 153 ● 3 ● 4 ● 11

 8 The previous answers (with the exception of Nils Bruin's) are incorrect in their assessment of the problem -- the slowdown is indeed coming from trying to decide whether "x == tan(x)" is True or False. In particular, the slowdown comes a lot from our overhead in talking to Maxima. Sage performs the following things when trying to test whether an equation is zero or not. Check to see if both sides are constants. If they are, compare the constants. Ask GiNaC / Pynac to see if the expressions are structurally equal (i.e., they have are the same expression trees.) If they are, then we can immediately say things are equal. Check to make sure that there are no assumptions on any of the variables. If there are, then we have to immediately send things to Maxima. If there are no assumptions, then we can try evaluating at a number of points to try to verify that they are provably different using interval arithmetic. This is done in the test_relation method of Expression objects. If that doesn't work, we send it to Maxima. Since the Maxima output gives False if they are not equal as well as if it can't prove they are equal, we can only rely on a True answer from Maxima. Thus, a number of simplifications are done to try to "prove" equality. This is all done in sage.symbolic.relation.test_relation_maxima If all of the above things don't work, then we have to return False. There are a number of areas for improvement: Most importantly there are quite a few cases in which the test_relation method fails. For example, there some code to check to see if using CIF for the values isn't working and if so switch over to RIF. However, that code is broken. Fixing that would allow something like sage: eq = x == tan(x) sage: eq.test_relation() False  instead of giving NotImplemented like it does now. We can add a syntactically_equal method which just exposes, GiNaC's relational_to_bool test. If we just do that and modify mydiff appropriately, then we get sage: k = var('k') sage: s = sum(k*x^k,k,0,10) sage: %time mydiff(s, x) CPU times: user 0.02 s, sys: 0.00 s, total: 0.02 s Wall time: 0.02 s 100*x^9 + 81*x^8 + 64*x^7 + 49*x^6 + 36*x^5 + 25*x^4 + 16*x^3 + 9*x^2 + 4*x + 1  I will make tickets for these and add the links to them here. posted Aug 27 '10 Mike Hansen 3840 ● 21 ● 46 ● 84 thank you! Philipp Schneider (Aug 27 '10)
 1 the problem is neither Python, nor recursion, but Sage expressions. If I replace those with lists and strings the computation time drops to 0: def mydiff(s,x='x'): if not x in str(s): return 0 if s == x: return 1 elif s == ['log',x]: return ['pow',x,-1] elif s == ['sin',x]: return ['cos',x] elif s == ['cos',x]: return ['mul',-1,['sin',x]] elif s == ['tan',x]: return ['add',1,['pow',['tan',x],2]] elif s == ['arcsin',x]: return ['pow',['add',1,['mul',-1,['pow',x,2]]],-1/2] elif s == ['arctan',x]: return ['pow',['add',1,['pow',x,2]],-1] op = s[0] ops = s[1:] if op == 'pow' and ops[0]==x and ops[1] in Rationals(): return ['mul',ops[1],['pow',ops[0],(ops[1]-1)]] elif op == 'add': return ['sum',map(mydiff,ops)] elif op == 'mul': if len(ops) == 2: if ops[0] in Rationals(): return ['mul',ops[0],mydiff(ops[1])] else: f = ops[0]; g = ops[1] return ['add',['mul',g,mydiff(f)],['mul',f,mydiff(g)]]  Now: sage: var('k') sage: s = ['add']+[['mul',k,['pow','x',k]] for k in [1..10]] sage: %time mydiff(s) CPU times: user 0.00 s, sys: 0.00 s, total: 0.00 s Wall time: 0.00 s ['sum', [['mul', 1, ['mul', 1, ['pow', 'x', 0]]], ['mul', 2, ['mul', 2, ['pow', 'x', 1]]], ['mul', 3, ['mul', 3, ['pow', 'x', 2]]], ['mul', 4, ['mul', 4, ['pow', 'x', 3]]], ['mul', 5, ['mul', 5, ['pow', 'x', 4]]], ['mul', 6, ['mul', 6, ['pow', 'x', 5]]], ['mul', 7, ['mul', 7, ['pow', 'x', 6]]], ['mul', 8, ['mul', 8, ['pow', 'x', 7]]], ['mul', 9, ['mul', 9, ['pow', 'x', 8]]], ['mul', 10, ['mul', 10, ['pow', 'x', 9]]]]]  posted Aug 27 '10 Philipp Schneider 153 ● 3 ● 4 ● 11
 1 One obvious thing is that the wall time is considerably longer than the cpu-time. That can mean your process didn't get 100% of the CPU time or that Sage called out to a different process (maxima?) to get some results. Indeed, if you run: import cProfile S=sum(k*x^k,k,0,10) cProfile.run("mydiff(S)")  you'll find 4995 calls to maxima.py:722(_eval_line)  indicating that there were 4995 calls to another process. There are also various calls to various "simplify" routines, so either in construction or in comparison of symbolic expressions, Sage did feel the need to simplify things. In short: If you want to do pattern matching on symbolic expressions, do not use "==". It is apparently already reserved for a fairly expensive equality test. I am surprised that this program still triggers calls out to maxima. I don't think it's "reduce", since that only receives 18 calls and takes negligible time in the profile. All the other operations, including "==" should really map to straight pynac operations. posted Aug 27 '10 nbruin 91 ● 2 ● 6
 1 This is more like a comment to Mike's answer, but it's rather long and I don't see how to add code to comments, so I'll put it here. A workaround to check if an expression is trivially equal to another is to use match(). sage: tan(x).match(tan(x)) is not None True sage: tan(x).match(log(x)) is not None False  This is very fast: sage: sage: %timeit bool(sin(x) == sin(x)) 625 loops, best of 3: 19.8 µs per loop sage: sage: %timeit bool(sin(x) == log(x)) 5 loops, best of 3: 639 ms per loop sage: sage: %timeit bool(sin(x).match(log(x)) is not None) 625 loops, best of 3: 8.3 µs per loop sage: sage: %timeit bool(sin(x).match(sin(x)) is not None) 625 loops, best of 3: 10.9 µs per loop  Here is the mydiff() function using this hack (and removing the reduce() call for sums): from operator import add, mul, pow def mydiff2(s,x=x): if not s.has(x): return 0 op = s.operator() ops = s.operands() if s.match(x) is not None: return 1 elif op is log and ops[0].match(x) is not None: return 1/x elif op is sin and ops[0].match(x) is not None: return cos(x) elif op is cos and ops[0].match(x) is not None: return -sin(x) elif op is tan and ops[0].match(x) is not None: return 1+tan(x)**2 elif op is arcsin and ops[0].match(x) is not None: return 1/sqrt(1-x**2) elif op is arctan and ops[0].match(x) is not None: return 1/(1+x**2) elif op is pow and (ops[0].match(x) is not None) and ops[1]._is_numeric(): return ops[1]*ops[0]**(ops[1]-1) elif op is add: return sum(map(mydiff2, ops)) elif op is mul: f = ops[0]; g = reduce(mul,ops[1:]) return g*mydiff2(f) + f*mydiff2(g)  Now I get: sage: var('k') k sage: inp = sum(k*x^k,k,0,10) sage: inp 10*x^10 + 9*x^9 + 8*x^8 + 7*x^7 + 6*x^6 + 5*x^5 + 4*x^4 + 3*x^3 + 2*x^2 + x sage: load ../symbolic_eq2.py sage: %time res2 = mydiff2(inp) CPU times: user 0.00 s, sys: 0.00 s, total: 0.00 s Wall time: 0.00 s sage: res2 100*x^9 + 81*x^8 + 64*x^7 + 49*x^6 + 36*x^5 + 25*x^4 + 16*x^3 + 9*x^2 + 4*x + 1  Note that speeding up symbolic comparison is #6799 on trac. posted Aug 28 '10 burcin 1098 ● 11 ● 34 http://erocal.org/burcin
 0 Because pure python is slow. No offense but why would you want to do this? Symbolic calculations are usually done with pynac (C++ wrapped into Sage) or Maxima (40 year old optimized Lisp wrapped into Sage). If you want something about symbolic differentiation improved you should contribute to those projects. What do you need that isn't already there? For your example on my computer: var('x,k') time diff(sum(k*x^k,k,0,10),x)  gives 100*x^9 + 81*x^8 + 64*x^7 + 49*x^6 + 36*x^5 + 25*x^4 + 16*x^3 + 9*x^2 + 4*x + 1 Time: CPU 0.02 s, Wall: 0.04 s  posted Aug 26 '10 mhampton 359 ● 2 ● 6 ● 15
 0 technically, apart from mhamptons comment, there are two problems: (I've just briefly looked at the code and maybe I didn't understand it) your code is recursive and that is not a good idea for python. you rip apart the expression and before each recursion and then you partially rebuild it with reduce. this creates a bunch of calculations for a new expression which you don't need if your differentiator would take a list of operands and a operation directly. posted Aug 27 '10 schilly 267 ● 4 ● 5 ● 10
 0 @mhampton i don't wont to write the next best differentiator in the world. I'm just trying to give an example on how to program with symbolic expressions in sage. @harald sorry, i should change the line return mydiff(ops[0]) + mydiff(reduce(op,ops[1:]))  into return sum(map(mydiff,ops))  then no ripping apart and reconstruction occurs. (I was trying to mimic pattern matching.) Still, this doesn't help much. Does anybody know how Sage tests equality of symbolic expressions? Does it try to simplify those each time before it compares them? posted Aug 27 '10 Philipp Schneider 153 ● 3 ● 4 ● 11

[hide preview]