# A fast function taking either a variable or a collection as argument

I want to create a function which takes either a single variable or a collection as its argument. If the input is any kind of collection, the function should be applied to every element in it and return a numpy array of the results. Currently, I do it this way:

import numpy

def simplefunc(x):
    <some code here>  # it is assumed that x is a single variable
    return <some single result>

def arrayfunc(x):
    try:  # check if we have a collection
        x_iterator = iter(x)
    except TypeError:  # this is a single expression
        return simplefunc(x)
    else:  # iterate through iterable x
        ret = []
        for xx in x:
            ret.append(simplefunc(xx))
        return numpy.array(ret)


It works as is; however, I do not think this is the fastest way possible, especially the method used to figure out whether the input is a collection. I also do not like that strings are treated as collections and split into characters (I can live with it, though). Is there a more elegant way?

I also tried calling arrayfunc() recursively on each element of the collection (instead of simplefunc()), to handle collections of collections in the same way. However, this runs into infinite recursion on strings, because iterating over a one-character string yields that same string again. I have to check explicitly whether the input is a string.


Perhaps using 'map(function, sequence)' would help? 'try: L = map(simplefunc, x) ...'. See http://docs.python.org/tutorial/datastructures.html#functional-programming-tools

( 2011-12-07 13:15:00 +0200 )

Thanks a lot... I may have overlooked this while googling. However, this does not work if I have some other non-sequence parameters in my function. Maybe better to post this as an answer?

( 2011-12-08 00:59:49 +0200 )

Your function above doesn't have non-sequence parameters, so I'm not sure how you plan to deal with those. You could always make a partial function (look up partial in the functools package). List comprehensions deal nicely with this too.

( 2011-12-08 09:33:01 +0200 )
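To make the partial suggestion concrete, here is a minimal sketch. The function scaled_power and its parameters scale and exponent are hypothetical stand-ins for a simplefunc that takes extra non-sequence arguments; they are not from the original question. Note that in Python 3, map returns a lazy iterator, so it is wrapped in list() here.

```python
from functools import partial

def scaled_power(x, scale, exponent):
    # Hypothetical stand-in for simplefunc with two extra non-sequence parameters.
    return scale * x ** exponent

# Freeze the extra parameters by keyword, leaving x as the only free argument.
square_times_three = partial(scaled_power, scale=3, exponent=2)

values = [1, 2, 3]

# Option 1: map over the partially applied function.
via_map = list(map(square_times_three, values))

# Option 2: a list comprehension that passes the extra arguments directly.
via_comprehension = [scaled_power(v, 3, 2) for v in values]

print(via_map)            # [3, 12, 27]
print(via_comprehension)  # [3, 12, 27]
```

Both options give the same result; the comprehension avoids the extra import, while partial is handy if the frozen function needs to be passed around.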


As for the else statement, this is probably faster:

else:
    return numpy.array([simplefunc(xx) for xx in x])


I probably wouldn't even put that in an else statement either. Just put it as the last line of the function.

Update

It looks like map is slightly faster than a list comprehension in the test below:

sage: def f(x): return x*x
....:
sage: timeit('map(f,xrange(1e6))')
5 loops, best of 3: 223 ms per loop
sage: timeit('[f(x) for x in xrange(1e6)]')
5 loops, best of 3: 277 ms per loop
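The session above uses Python 2 syntax (xrange, and a map that returns a list). For anyone wanting to repeat the comparison in plain Python 3, a rough sketch using the standard timeit module might look like the following. The size n and the repeat count are arbitrary choices made here to keep the run short, and the timings will vary by machine and Python version, so no results are quoted.

```python
import timeit

def f(x):
    return x * x

n = 10**5  # smaller than the 1e6 used above, to keep the run short

# In Python 3, map is lazy, so list() is needed to force the computation.
map_time = timeit.timeit(lambda: list(map(f, range(n))), number=5)
comp_time = timeit.timeit(lambda: [f(x) for x in range(n)], number=5)

print("map:                %.3f s for 5 runs" % map_time)
print("list comprehension: %.3f s for 5 runs" % comp_time)
```

Whichever turns out faster, the difference is usually small compared with the cost of the function being applied, so readability is a reasonable tie-breaker.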


Is the map(function, sequence) variant from the comment above slower?

( 2011-12-08 01:09:46 +0200 )

I agree with the other responders on using 'map'. As to the other issue, detecting if x is a collection: are you attempting to construct a very general tool that can be used in just about any context, or just something that will work for the solution of a particular problem? If it's the latter, then (depending on your particular context) you might only encounter certain types of collections. For example, if the only kinds of collections you expect to see are lists, tuples, or sets, you could define

itertypes = ["<type 'tuple'>", "<type 'list'>", "<class 'sage.sets.set.Set_object_enumerated_with_category'>"]

then just test if str(type(x)) is in itertypes.

I did a smidge of benchmarking, using a few basic examples, comparing this approach to your "try... except TypeError" approach. (I only tested the step where you determine whether x is a collection or not, omitting the step where you actually apply 'simplefunc'.) When x is a list/tuple/set, my method is only about half as fast as yours, but otherwise it's 3 times as fast. So which is faster might depend on the proportion of the time you expect to encounter collections.

By the way, this may not be a worry for you, but I don't think that Sage would raise any objection if you ask it to iterate over an infinite set like ZZ. It just wouldn't terminate.


Isn't it easier and clearer to just test for the types directly using isinstance? Like: isinstance(x, (list, tuple))

( 2011-12-08 09:31:36 +0200 )

It sure is. I just didn't know about isinstance.

( 2011-12-08 10:50:06 +0200 )
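Putting the isinstance idea together with the string problem from the question, one possible sketch (Python 3) is shown below. It is only an illustration under some assumptions: the body of simplefunc is a hypothetical placeholder, and collections.abc.Iterable is used here instead of a fixed tuple of types so that sets and other iterables are also covered. Strings and bytes are checked explicitly and treated as single values, which also makes the recursive version safe.

```python
from collections.abc import Iterable

import numpy

def simplefunc(x):
    # Hypothetical placeholder for the real single-value computation.
    return x * 2

def arrayfunc(x):
    # Strings and bytes are iterable, but are treated here as single values.
    if isinstance(x, (str, bytes)) or not isinstance(x, Iterable):
        return simplefunc(x)
    # Recurse so that nested collections are handled the same way.
    return numpy.array([arrayfunc(xx) for xx in x])

print(arrayfunc(3))                 # 6
print(arrayfunc("ab"))              # abab
print(arrayfunc([1, 2, 3]))         # [2 4 6]
print(arrayfunc(((1, 2), (3, 4))))  # a 2x2 array: [[2 4], [6 8]]
```

Two caveats: nested collections of unequal lengths will not convert cleanly to a regular numpy array, and, as noted above for ZZ, an infinite iterable would still never terminate.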