answered 2011-01-01 15:44:36 -0500

DSM

One point first: in general you probably don't want to use "range" in a generator expression, because range (in Python 2, which this Sage version uses) actually builds the whole list up front, and avoiding list construction is one of the reasons people use generators and iterators in the first place. I admit you often need a pretty sizable range to notice the difference:


----------------------------------------------------------------------
| Sage Version 4.6, Release Date: 2010-10-30                         |
| Type notebook() for the GUI, and license() for information.        |
----------------------------------------------------------------------
sage: w = 10**7
sage: time gen = (n for n in xrange(1,w) if (is_prime(n) & is_prime(n+2)))
CPU times: user 0.00 s, sys: 0.00 s, total: 0.00 s
Wall time: 0.00 s
sage: time gen = (n for n in IntegerRange(1,w) if (is_prime(n) & is_prime(n+2)))
CPU times: user 0.00 s, sys: 0.00 s, total: 0.00 s
Wall time: 0.00 s
sage: time gen = (n for n in range(1,w) if (is_prime(n) & is_prime(n+2)))
CPU times: user 0.19 s, sys: 0.25 s, total: 0.44 s
Wall time: 0.44 s

Note that there are int construction overheads the first time, and for small ranges (for some definition of small) the difference won't matter; range might even be faster. But range still loops through every value to build its list before the generator can yield anything, so it'll break for really big sizes even if you only ever look at a few values. Try w=10**9 and prepare to wait.
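To illustrate the "only ever look at a few values" point, here's a rough sketch in plain Python 3 rather than Sage. Two assumptions to flag: in Python 3 the built-in range is already lazy (it behaves like the xrange above), and the is_prime below is a simple trial-division stand-in I made up for the example, not Sage's is_prime.

```python
from itertools import islice

def is_prime(n):
    """Hypothetical stand-in for Sage's is_prime: plain trial division."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

w = 10**9

# Python 3's range is lazy, so no billion-element list is ever built here.
gen = (n for n in range(1, w) if is_prime(n) and is_prime(n + 2))

# Only as many candidates are tested as it takes to find five matches.
first_five = list(islice(gen, 5))
print(first_five)  # [3, 5, 11, 17, 29]
```

With an eager list in place of the lazy range, the same five values would only show up after the whole list had been allocated.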

As for your real question, I don't know of a good way to clone a generator expression using the standard tools. (A ".restart()" method wouldn't make sense in many cases, which is one of the reasons it's not part of the iterator protocol.) You can use itertools.tee to make multiple independent iterators over the same output, but tee works by caching the values, so you could run into memory problems if one copy runs far ahead of another, and every copy replays the same cached values, which may not be what you want if the underlying function is nondeterministic.
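For what it's worth, the itertools.tee approach looks roughly like this. It's a minimal sketch in plain Python 3 (where range is lazy), and is_prime is again a hypothetical trial-division stand-in rather than Sage's function.

```python
from itertools import tee

def is_prime(n):
    """Hypothetical stand-in for Sage's is_prime: plain trial division."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

gen = (n for n in range(1, 100) if is_prime(n) and is_prime(n + 2))

# tee returns independent iterators that share one underlying generator.
# After this call, gen itself should not be advanced directly.
a, b = tee(gen)

# Advancing a pulls values from gen; tee buffers them until b catches up.
from_a = [next(a), next(a), next(a)]

# b starts from the beginning and replays the buffered values.
from_b = next(b)

print(from_a)  # [3, 5, 11]
print(from_b)  # 3
```

The buffer grows with the gap between the fastest and slowest copy, which is where the memory concern comes from.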

But if you want to reinitialize the generator and start over, why not simply redeclare it instead? If you dislike the code duplication, you could wrap it in a function:


sage: def twin_prime_gen(upto=10**4):
....:     return (n for n in xrange(1,upto+1) if is_prime(n) and is_prime(n+2))
....: 
sage: a = twin_prime_gen()
sage: next(a)
3
sage: next(a)
5
sage: a = twin_prime_gen()  # instead of a.restart(), which doesn't exist
sage: next(a)
3

which I think should have mostly the same effect.
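If you'd rather have an object you can loop over repeatedly, one possible variation on the same idea (my own sketch, not something Sage provides) is a small class whose __iter__ builds a fresh generator each time. This is plain Python 3 again; the class name TwinPrimes and the trial-division is_prime are both hypothetical, made up for the example.

```python
def is_prime(n):
    """Hypothetical stand-in for Sage's is_prime: plain trial division."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

class TwinPrimes:
    """Re-iterable wrapper: every iter() call starts a new generator."""

    def __init__(self, upto=10**4):
        self.upto = upto

    def __iter__(self):
        # A brand-new generator each time, so iteration always restarts.
        return (n for n in range(1, self.upto + 1)
                if is_prime(n) and is_prime(n + 2))

tp = TwinPrimes(upto=50)
first_pass = list(tp)
second_pass = list(tp)  # not exhausted: __iter__ made a fresh generator
print(first_pass)   # [3, 5, 11, 17, 29, 41]
print(second_pass)  # [3, 5, 11, 17, 29, 41]
```

Nothing is cached here, so each pass recomputes the values from scratch, which is the opposite trade-off from tee.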

[UPDATE: Oops, didn't update my window, so didn't see Niles already said the same thing!]