1

A fast function taking either a variable or a collection as argument

asked 2011-12-07 12:58:12 +0200

ADuC812

I want to create a function that takes either a single variable or a collection as its argument. If the input is any kind of collection, the function should be applied to every element and return a numpy array of the results. Currently, I do it this way:

import numpy

def simplefunc(x):
    <some code here> # it is assumed that x is a single variable
    return <some single result>

def arrayfunc(x):
    try: # check whether we have a collection
        iter(x)
    except TypeError: # x is a single value
        return simplefunc(x)
    else: # x is iterable: apply simplefunc to each element
        ret = []
        for xx in x:
            ret.append(simplefunc(xx))
        return numpy.array(ret)

It works as is; however, I do not think this is the fastest way possible, especially the method used to figure out whether the input is a collection. I also do not like that strings are treated as collections and split into characters (I can live with it, though). Is there a more elegant way?

I also tried calling arrayfunc() recursively on each element of the collection (instead of simplefunc()), to handle collections of collections in the same way; however, it recurses infinitely on strings, so I have to check explicitly whether the input is a string.


Comments

Perhaps using 'map(function, sequence)' would help? 'try: L = map(simplefunc, x) ...'. See http://docs.python.org/tutorial/datastructures.html#functional-programming-tools

John Palmieri (2011-12-07 13:15:00 +0200)

Thanks a lot, I may have overlooked this while googling. However, this does not work if my function has other, non-sequence parameters. Maybe it would be better to post this as an answer?

ADuC812 (2011-12-08 00:59:49 +0200)

Your function above doesn't have non-sequence parameters, so I'm not sure how you plan to deal with those. You could always make a partial function (look up partial in the functools package). List comprehensions deal nicely with this too.
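A minimal sketch of both suggestions, assuming a hypothetical simplefunc with two extra parameters (scale and offset are made-up names, not from the question). The extra parameters are bound by keyword so that the remaining positional argument is the element itself. On Python 3, map returns an iterator, hence the list() call.

```python
from functools import partial


def simplefunc(x, scale, offset=0):
    # Hypothetical stand-in: one value plus extra non-sequence parameters.
    return scale * x + offset


values = [1, 2, 3]

# Option 1: freeze the extra parameters with partial, then map over the values.
bound = partial(simplefunc, scale=2, offset=1)
via_map = list(map(bound, values))

# Option 2: a list comprehension passes the extra parameters directly.
via_comprehension = [simplefunc(v, scale=2, offset=1) for v in values]

print(via_map)            # [3, 5, 7]
print(via_comprehension)  # [3, 5, 7]
```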

Jason Grout (2011-12-08 09:33:01 +0200)

2 Answers

2

answered 2011-12-07 13:12:45 +0200

Jason Grout

updated 2011-12-08 09:28:47 +0200

As for the else statement, this is probably faster:

else:
    return numpy.array([simplefunc(xx) for xx in x])

I probably wouldn't even put that in an else statement either. Just put it as the last line of the function.
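Put together, the function might look like the sketch below. The body of simplefunc is a placeholder chosen only so the example runs; the real one is whatever the question's function computes.

```python
import numpy


def simplefunc(x):
    # Placeholder for the real single-value function.
    return x * x


def arrayfunc(x):
    try:
        iter(x)  # raises TypeError when x is not iterable
    except TypeError:
        return simplefunc(x)  # x is a single value
    # Reached only when x is iterable, so no else branch is needed.
    return numpy.array([simplefunc(xx) for xx in x])


print(arrayfunc(3))          # 9
print(arrayfunc([1, 2, 3]))  # [1 4 9]
```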

Update

It looks like map is slightly faster than a list comprehension in the test below:

sage: def f(x): return x*x
....: 
sage: timeit('map(f,xrange(1e6))')
5 loops, best of 3: 223 ms per loop
sage: timeit('[f(x) for x in xrange(1e6)]')
5 loops, best of 3: 277 ms per loop
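Outside a Sage session, a similar comparison can be sketched with the standard timeit module. The input size below is an arbitrary choice, smaller than the 1e6 above, to keep the run short. On Python 3, map is lazy, so it is wrapped in list() to make both statements build a full list. No timings are quoted here because they depend on the machine and the Python version.

```python
import timeit


def f(x):
    return x * x


data = range(10**5)  # assumed size, chosen only to keep the run short
loops = 5

# Best of 3 repeats, each repeat running the statement `loops` times.
map_time = min(timeit.repeat(lambda: list(map(f, data)), number=loops, repeat=3))
comp_time = min(timeit.repeat(lambda: [f(x) for x in data], number=loops, repeat=3))

print("map:                %.1f ms per loop" % (map_time / loops * 1000))
print("list comprehension: %.1f ms per loop" % (comp_time / loops * 1000))
```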

Comments

Is this variant, map(function, sequence) from the comment above, slower?

ADuC812 (2011-12-08 01:09:46 +0200)
1

answered 2011-12-08 08:58:44 +0200

jdc

I agree with the other responders on using 'map'. As to the other issue, detecting if x is a collection: are you attempting to construct a very general tool that can be used in just about any context, or just something that will work for the solution of a particular problem? If it's the latter, then (depending on your particular context) you might only encounter certain types of collections. For example, if the only kinds of collections you expect to see are lists, tuples, or sets, you could define

itertypes = ["<type 'tuple'>", "<type 'list'>", "<class 'sage.sets.set.Set_object_enumerated_with_category'>"]

then just test if str(type(x)) is in itertypes.

I did a smidge of benchmarking, using a few basic examples, comparing this approach to your "try... except TypeError" approach. (I only tested the step where you determine whether x is a collection or not, omitting the step where you actually apply 'simplefunc'.) When x is a list/tuple/set, my method is only about half as fast as yours, but otherwise it's 3 times as fast. So which is faster might depend on the proportion of the time you expect to encounter collections.

By the way, this may not be a worry for you, but I don't think that Sage would raise any objection if you ask it to iterate over an infinite set like ZZ. It just wouldn't terminate.


Comments

Isn't it easier and clearer to just test for the types directly using isinstance? Like: isinstance(x, (list, tuple))
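A sketch of how that test could replace the try/except check, with a placeholder simplefunc. Because a string is not an instance of list or tuple, it is handed to simplefunc whole, so the recursive variant from the question no longer loops on strings (with iter(), a one-character string keeps yielding itself). The choice of (list, tuple) as the accepted types is an assumption; extend it as needed.

```python
import numpy


def simplefunc(x):
    # Placeholder for the real single-value function.
    return x * x


def arrayfunc(x):
    # Only the listed types count as collections; anything else,
    # including a string, is treated as a single value.
    if isinstance(x, (list, tuple)):
        # Recurse so that collections of collections are handled too.
        return numpy.array([arrayfunc(xx) for xx in x])
    return simplefunc(x)


print(arrayfunc(3))                 # 9
print(arrayfunc([[1, 2], [3, 4]]))  # 2x2 array: [[1, 4], [9, 16]]
```

Note that nested collections of unequal length do not form a rectangular array, and recent numpy versions raise an error for them.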

Jason Grout (2011-12-08 09:31:36 +0200)

It sure is. I just didn't know about isinstance.

jdc (2011-12-08 10:50:06 +0200)


Stats

Asked: 2011-12-07 12:58:12 +0200

Seen: 496 times

Last updated: Dec 08 '11