1 | initial version |
As a complement to @tmonteil's answer, some hints about speed.
To gain speed, you can
Here are four variants of a function inspired by your code, with 'trials' and 'length' as arguments.
import numpy
def compute_a(trials, length):
mean = 0
results = [0]*trials
data = [0]*length
for i in range(trials):
data[0] = numpy.random.normal(0,1)
for j in range(1, length):
data[j] = data[j-1] + numpy.random.normal(0,1)
for k in range(2, length):
if data[k] > 2*sqrt(k*log(log(k+1))):
results[i] += 1
mean += results[i]
return mean
def compute_b(trials, length):
mean = 0
results = [0]*trials
data = [0]*length
for i in range(trials):
data[0] = numpy.random.normal(0,1)
for j in range(1, length):
data[j] = data[j-1] + numpy.random.normal(0,1)
for k in range(2, length):
if data[k] > 2*sqrt(k*log(log(k+1.))):
results[i] += 1
mean += results[i]
return mean
def compute_c(trials, length):
two = RDF(2)
mean = 0
results = [0]*trials
data = [0]*length
for i in xrange(trials):
data[0] = numpy.random.normal(0,1)
for j in range(1, length):
data[j] = data[j-1] + numpy.random.normal(0,1)
for k in range(2, length):
if RDF(data[k]) > two*(RDF(k)*RDF(k+1).log().log()).sqrt():
results[i] += 1
mean += results[i]
return mean
def compute_d(trials, length):
zero = RDF.zero()
four = RDF(4r)
mean = 0r
results = [0r]*trials
data = [0r]*length
for i in xrange(trials):
data[0r] = numpy.random.normal(0r,1r)
for j in range(1r, length):
data[j] = data[j-1r] + numpy.random.normal(0r,1r)
for k in range(2r, length):
a = RDF(data[k])
if a > zero and a*a > four*RDF(k)*RDF(k+1r).log().log():
results[i] += 1r
mean += results[i]
return mean
compute_a
is just making your code into a function.compute_b
uses @tmonteil's trick of k+1.
so that comparisons are done in RR
rather than SR
.compute_c
makes sure that all multiplications, logs, sqrts, comparisons are in RDF;
we also use xrange instead of range to avoid creating a list we don't need.compute_d
tries to avoid Sage's preparser, and to avoid square roots;
we also avoid a lot of computation when data[k] < 0, because we don't
even have to compute the right-hand side in that case.For (trials, length) = (2000, 2000), I get timings of roughly: (a) 4 s, (b) 40 ms, (c) 5 ms, (d) 2 ms.
sage: timeit('compute_a(2000,2000)')
5 loops, best of 3: 3.99 s per loop
sage: timeit('compute_b(2000,2000)')
5 loops, best of 3: 41.4 ms per loop
sage: timeit('compute_c(2000,2000)')
125 loops, best of 3: 4.89 ms per loop
sage: timeit('compute_d(2000,2000)')
125 loops, best of 3: 1.92 ms per loop