len(list) & ceil(log(4,2)) bugs

I was programming functions in Sage for Big numbers and found a few bugs.
I emailed Jason-sage and he suggested this forum.
I hope my contributions can help building a better system.
I run Sage 4.7.2 in a new Firefox offline via the Oracle Virtual Box for Fedora on an old Windows Vista, so that could have something to do with it.

BUG:
Suddenly the Python len(list) function wasn't working anymore on a simple [1,2,3]
array.
SOLUTION:
I repasted the same code on a new Worksheet and the len() function could be called again.

BUG:
I had output that was becoming very slow for mysterious reasons (a 2 to 3 seconds time lag) for some simple small values. I used t = cputime() before and cputime(t) after blocks of code and found the responsible...
CODE INSIDE FUNCTION:

def myfunction(num):              # stops for up to 3 seconds if num == 4
t = cputime();
print(cputime(t));

for m in range(2,12):
myfunction(m);

The problem is the ceil() around the log() I believe, but why at 4 and why only when that 4 is a function argument (not a local variable or a hard-coded 4) that is originally part of a range array?
SOLUTION:

    happy = ceil(log(num+0, 2));

These bugs seem to an innocent amateur like me that Sage is an unstable system.
However, I love the potential of this software!
Best Regards, GigaGerard

edit retag close merge delete

For (1) could you post a link to the entire original worksheet and the error message that it generated? If I had to bet, I'd say that you accidentally overrode the len function. As for (2), I can't reproduce it yet -- what Sage version were you using?

( 2011-12-05 13:45:23 -0500 )edit

I agree with DSM about the len function; that's almost certainly what is happening.

( 2011-12-05 18:54:34 -0500 )edit

Sort by » oldest newest most voted

(1) This is almost certainly a case of len being accidentally overwritten. The fact that it worked when cut and pasted is a giveaway. 90% of the time when something that usually works stops working, this is the problem. (The other 10% often involves Maxima, at least for me -- I have very bad luck with it.)

If you could show a worksheet that demonstrates that it wasn't this, I'd be happy to have a look.

(2) Yeah, that looks like a performance bug. Here's what's going on, and it's kind of quirky..

In some cases, log(n,b) is automatically simplified. If you use the (default) Sage Integers, for example:

sage: time log(4,2)
2
Time: CPU 0.00 s, Wall: 0.00 s
sage: time ceil(log(4,2))
2
Time: CPU 0.00 s, Wall: 0.00 s


so it's not surprising ceil is so fast. OTOH, if you do the same with Python ints (such as the bare "range" command produces) or you break the log apart, this doesn't happen:

sage: log(int(4),2)
log(4)/log(2)
sage: log(4)/log(2)
log(4)/log(2)


So in these cases Sage has to try harder. First it checks to see if this object has its own ceiling method, and it doesn't. So it builds a RealInterval object out of log(4)/log(2), and it turns out that this is about the worst thing it could do! It checks to see whether the lower and upper parts of the interval agree [on the ceiling, I mean] (so that it knows for certain what the ceiling is). But in this case, this is always going to look like:

sage: y = log(4)/log(2)
sage: rif = RealIntervalField(53)(y)
sage: rif
2.000000000000000?
sage: rif.endpoints()
(1.99999999999999, 2.00000000000001)


These two bounds [have ceilings which] aren't equal, so it keeps increasing the precision (to 20000 bits) to see if it can prove that they are, but by construction it's never going to work. Finally Sage gives up and tries to simplify it, which succeeds:

sage: time (log(4)/log(2)).full_simplify()
CPU times: user 0.02 s, sys: 0.00 s, total: 0.02 s
Wall time: 0.02 s


This is a little perverse: the very cases that give Sage the most trouble are precisely the integer cases, which should be the easiest!

It's actually a little unclear what the right way to address the problem is in general. One way might be to call full_simplify earlier in the process to see if we can short-circuit the more common cases (and looking at integers is definitely a common enough use case to optimize.) Alternatively, we could try bool(x == round(x)), which seems a little faster. I'd probably try one or two interval rounds first, to make sure we're not increasing the burden too much for noninteger cases.

(BTW, one ...

more

@DSM: Good analysis! If you open a ticket, please cc: me - this would be good Sage Days 35.5 fodder, at least part of it.

( 2011-12-06 01:59:58 -0500 )edit

This is now the agreeably palindromic trac #12121.

( 2011-12-06 03:36:29 -0500 )edit

Yes, I had a variable there somewhere called len and then I changed its name but it was still in what you call the global scope I guess as I repasted the codes from the Worksheet and then len([1,2,3]) was 3 again. Didn't know I couldn't call the function len then, a name that should be avoided is not so object-oriented!

I can understand DSM has touched the root of the problem, with the ceil(log(4,2))
Still: 2.5 seconds average cputime on a 2GHz CPU is an exceptionally long time lag! It happened every time I tried but under very specific circumstances as described above. Oh, and the myfunction(num) was called from the textfield below.

Sage 4.7.2 in Windows, I hope this helps. Thank you!

more

I don't understand your object orientation comment-- whether a builtin function can be redefined (as in Python and Ruby) isn't an issue of object orientation, only of whether or not there's a reserved set of names which can't be rebound. That is, it's not about object-oriented vs procedural (say, C++ vs. fortran) but about degree of permitted dynamism.

( 2011-12-06 02:54:35 -0500 )edit

Yes a reserved set of names, keywords, of course. I looked at Haskell and it has all its functions out in the open, I like Java better with its neat classes.

( 2011-12-06 05:12:08 -0500 )edit