Ask Your Question
1

Speed comparison with numerical arrays: sage, R, matlab

asked 2013-10-28 18:52:08 +0100

Uffe_H_Thygesen gravatar image

Dear all

I am a sage newbie coming from matlab, R, and Maple, and although I really like many features, I find that my sage code for handling numerical arrays is much more complex and much slower than the corresponding code in R and Matlab.

Here's an example:

Simuating a sample path of standard Brownian motion

I want to simulate a random path of standard Brownian motion. The input is a vector of time points, and I want returned a (random) trajectory, sampled at these time points.

To do this I need: 1) Fast random number generation 2) Fast array operations: Diff, multiplication, and cumsum. I found that in sage I need to go through NumPy, so I wrote:

sage: import numpy
sage: # General routine to sample a Brownian motion on a time grid 
sage: def rB(tvec):
sage:    dt = numpy.diff(numpy.array(tvec))
sage:    sdt = sqrt(dt)
sage:    RNG = RealDistribution('gaussian',1)
sage:    dB = sdt * numpy.array([RNG.get_random_element() for j in range(dt.__len__())],dtype=float)
sage:    B1 = numpy.array([normalvariate(0,sqrt(tvec[0]))],dtype=float)
sage:    B = numpy.cumsum(numpy.concatenate((B1,dB)))
sage:    return(B)

I then time it, testing with 1 million time points:

sage: tvec = srange(0,1e3,1e-3)
sage: time B = rB(tvec)

I get a time of 3.5 seconds on my machine, a standard Laptop running Ubuntu, running sage from the shell. Using prun, I see that half of the time, roughly, is spent generating random numbers and the other half mainly constructing numerical arrays.

For comparison, the R code would be (here I assume that tvec[1]=0)

> tvec <- seq(0,1000,1e-3)
> system.time(B <- c(0,cumsum(rnorm(length(tvec)-1,sd=sqrt(diff(tvec))))))

which executes in 0.24 seconds. The matlab code would be

>> tvec = linspace(0,1e3,1e6+1);
>> tic;B = [0,cumsum(randn(1,length(tvec)-1) .* sqrt(diff(tvec)))];toc

which runs in 0.06 seconds.

You see that my sage code is not only clumsier, it is also between 10 and 100 times slower, so I guess I am doing something wrong.

I would really appreciate if you could point me in the direction of cleaner and faster code.

Cheers,

Uffe

edit retag flag offensive close merge delete

1 Answer

Sort by ยป oldest newest most voted
4

answered 2013-10-28 20:38:19 +0100

ppurka gravatar image

I don't know what you are doing in creating B1. Other than that, most of your time is being spent in converting between Sage numbers or arrays and numpy numbers or arrays. When writing your code, make sure to minimize these kinds of conversions.

Following is the code which is almost the same as your code, except everything is handled by numpy and there are hardly any conversions being done, except for a few floating values. Also, I removed the line containing B1 since it gives errors with zero variance.

import numpy
# General routine to sample a Brownian motion on a time grid 
def rB(tvec):
    dt = numpy.diff(tvec)
    sdt = numpy.sqrt(dt)
    dB = sdt * numpy.random.normal(0, 1, len(dt))
    B = numpy.cumsum(dB)
    return(B)

Here, we must ensure that the input tvec is also a numpy array, otherwise there will be a lot of time wasted in converting it. After we make this change, the time taken by the code is 0.17s (as compared to over 4s with your code).

tvec = numpy.arange(0, 1e3, 1e-3) # srange(0,1e3,1e-3)
time B = rB(tvec)

Time: CPU 0.16 s, Wall: 0.17 s

Output of r code:

%r
tvec <- seq(0,1000,1e-3)
system.time(B <- c(0,cumsum(rnorm(length(tvec)-1,sd=sqrt(diff(tvec))))))

user  system elapsed 
0.440   0.003   0.451
edit flag offensive delete link more

Comments

I think that this is a really nice post - it illustrates exactly when the overhead of all the Sage stuff is not appropriate. Which is why it's so nice that Sage includes these for free.

kcrisman gravatar imagekcrisman ( 2013-10-28 21:13:16 +0100 )edit

@Uffe_H_Thygesen - you may enjoy running Sage in Python mode, with `sage -ipython`. You can still import Sage if you want (`from sage.all import *`, I think), but can also just do pure Numpy and friends.

kcrisman gravatar imagekcrisman ( 2013-10-28 21:14:19 +0100 )edit

Thanks! I guess the take-home-message is to stay in numpy, whereas I was trying to do as much as possible in sage proper, and only shifting to numpy for bottlenecks.

Uffe_H_Thygesen gravatar imageUffe_H_Thygesen ( 2013-10-29 03:43:18 +0100 )edit

@kcrisman - thanks for the pointer. Now, most of my code contains much symbolic manipulation before going to numerics, so that's why I tend to start in sage. I will have to do some experimentation ...

Uffe_H_Thygesen gravatar imageUffe_H_Thygesen ( 2013-10-29 03:45:29 +0100 )edit
1

No problem. What I would say is to try to abstract out the numerics stuff in your code in a separate function that you can call from your symbolic code. Especially if you use it a lot in a few places, this would be more efficient and better design.

kcrisman gravatar imagekcrisman ( 2013-10-29 08:36:36 +0100 )edit

Your Answer

Please start posting anonymously - your entry will be published after you log in or create a new account.

Add Answer

Question Tools

1 follower

Stats

Asked: 2013-10-28 18:52:08 +0100

Seen: 1,600 times

Last updated: Oct 28 '13