# Could we have a leaf_count() function in base sagemath?

There is leaf_count() type function which is standard and buildin in other CAS systems (Mathematica and Maple). It is very useful. It is used to measure the size of a mathematical expression in a standard way.

This will make it easier for example, to compare the size of the anti-derivative from a call to integrate from different CAS systems.

There are couple of ad-hoc attempts on the net now to do this in sagemath, but they do not work for all expressions, (give errors in some cases) and do not produce good measure of the size of the expression.

leaf_count() is described in https://reference.wolfram.com/language/ref/LeafCount.html

and

https://www.maplesoft.com/support/help/Maple/view.aspx?path=MmaTranslator/Mma/LeafCount

LeafCount counts the number of subexpressions in expr that correspond to "leaves" on the expression tree.

So basically, if it possible to obtain the expression tree of any sagemath mathematical expression, leaf_count() should just return the number of leaf nodes of that tree.

Currently, the way I obtain leaf_count() for a sagemath expression, is by calling Maple and passing it the expression (as a string), then use Maple's LeafCount function there (after converting the expression back to Maple, using parse function).

I use Maple not Mathematica for this, since the syntax of the sagemath expressions are much closer to Maple's.

But I prefer to do this all in sagemath. Much simpler. But the current implementation I saw does not work well for everything, and gives different result of the measure of the expression.

I am sure a sagemath internals expert here could write such a function and add it to sagemath as a standard buildin function. May be for 9.4 version?

The following are some examples

expr:=x;
MmaTranslator:-Mma:-LeafCount(expr)

1

expr:=1/2*(log(b*x + a) - log(b*x - a))/(a*b);
MmaTranslator:-Mma:-LeafCount(expr)

25

expr:=arctan(b*x/a)/(a*b);
MmaTranslator:-Mma:-LeafCount(expr)

14

expr:=-1/12*sqrt(3)*arctan(1/12*(7*sqrt(3)*cos(x)^2 - 4*sqrt(3))/(cos(x)*sin(x)));
MmaTranslator:-Mma:-LeafCount(expr)

31

expr:=[-sqrt(-a*b)*log((a*x - b - 2*sqrt(-a*b)*sqrt(x))/(a*x + b))/(a*b), -2*sqrt(a*b)*arctan(sqrt(a*b)/(a*sqrt(x)))/(a*b)];
MmaTranslator:-Mma:-LeafCount(expr)

68


The sagemath leaf_count() does not have to give same exact value as the above, but it should be very close to it.

Thank you

--Nasser

edit retag close merge delete

Sort by » oldest newest most voted

To every expression one can associate a syntactic tree.

You are asking for the size of that tree.

Since all tree vertices count I would call it "tree size" rather than "leaf count".

So let us write a small tree_size function.

I don't know whether Sage already has such a function, or whether a shorter one might use Sage's "expression tree walker" and "map reduce".

The function below gives the tree size of a symbolic expression, as well as of a list, tuple, or vector of symbolic expressions, or any Sage object that can be converted to a symbolic expression.

def tree_size(expr):
r"""
Return the tree size of this expression.
"""
if expr not in SR:
# deal with lists, tuples, vectors
return 1 + sum(tree_size(a) for a in expr)
expr = SR(expr)
x, aa = expr.operator(), expr.operands()
if x is None:
return 1
else:
return 1 + sum(tree_size(a) for a in aa)


Here is a companion function for the string length:

def string_size(expr):
r"""
Return the length of the string representation of this expression.
"""
return len(str(expr))


Let us define a list of the test cases in the question.

a, b, x = SR.var('a, b, x')
ee = [
x,
1/2*(log(b*x + a) - log(b*x - a))/(a*b),
arctan(b*x/a)/(a*b),
-1/12*sqrt(3)*arctan(1/12*(7*sqrt(3)*cos(x)^2 - 4*sqrt(3))/(cos(x)*sin(x))),
[-sqrt(-a*b)*log((a*x - b - 2*sqrt(-a*b)*sqrt(x))/(a*x + b))/(a*b),
-2*sqrt(a*b)*arctan(sqrt(a*b)/(a*sqrt(x)))/(a*b)],
]


and run the function on them:

sage: print('\n'.join(f'* {e}\n  {tree_size(e)}, {string_size(e)}' for e in ee))
* x
1, 1
* 1/2*(log(b*x + a) - log(b*x - a))/(a*b)
25, 39
* arctan(b*x/a)/(a*b)
14, 19
* -1/12*sqrt(3)*arctan(1/12*(7*sqrt(3)*cos(x)^2 - 4*sqrt(3))/(cos(x)*sin(x)))
31, 75
* [-sqrt(-a*b)*log((a*x - b - 2*sqrt(-a*b)*sqrt(x))/(a*x + b))/(a*b),
-2*sqrt(a*b)*arctan(sqrt(a*b)/(a*sqrt(x)))/(a*b)]
68, 117


Perfect match with the desired values from the question!

Now also works for the follow-up requests in the comment:

sage: [tree_size(a) for a in [1, 1/2, 3.4, i, CC(3, 2)]]
[1, 1, 1, 1, 1]
sage: x = polygen(ZZ)
sage: p = x - x^2
sage: p
-x^2 + x
sage: tree_size(p)
7
sage: f = (x - x^2) / (1 - 3*x)
sage: f
(-x^2 + x)/(-3*x + 1)
sage: tree_size(f)
15


Related:

more

Great, will test it more. But would it be it possible to make it handle special cases when the input is non symbolic? i.e. just numbers? This will make it more complete. For example tree_size(1) and tree_size(3.4) and tree_size(1/2) all of these return 1 in Maple. now these calls generate errors. Either object has no attribute 'operator' or maximum recursion depth exceeded. I am sure a one extra special check is all what is needed?. Thanks.

The trick is probably to define correctly what is a leaf. First rough attempt :

def LC(x):
def is_leaf(x):
from sage.symbolic.expression import Expression
return type(x) is not Expression or x.is_symbol() or x.is_numeric()
if is_leaf(x): return 1
return sum(map(LC, x.operands()))
sage: LC(sin(a+b))
2
sage: LC(sin(a+b).trig_expand())
4
sage: LC(integrate(sin(x), x, a, b, hold=True))
4


Note that, by design, it won't work on anything else than a symbolic expression ; polynomials, rational fractions, series, etc..., here considered as a leaf, need their own implementation, whose definition of "leaf" isn't a trivial question. Consider :

sage: R1.<t>=QQ[]
sage: LC(t^3-3*t^2+t-1)
1
sage: LC(SR(t^3-3*t^2+t-1))
7
sage: len((t^3-3*t^2+t-1).coefficients())
4
sage: len((t^3).coefficients())
1
sage: len((t^3).coefficients(sparse=False))
4


Anyway, this is probably not sufficiently "general" to deserve an implementation in "core" Sage. A package grouping some tree-related utilities could be considered...

BTW, the "complexity" of an expression isn't only a function of the number of its leaves, but also of the number of its non-terminal nodes, the distribution of the depth of its branches and the structure of common subexpressions... AFAIK, there is no universally acknowledged metric of an expression's "complexity"...

HTH,

more