# How do I generate a random number according to the binomial distribution?

I would like to generate a random integer according to the binomial distribution. That is, I would like to generate a Bin(n,p) random value. That's number between 0 and n in which the probability we get the value k is C(n,k)p^k(1-p)^(n-k).

Here is an inefficient method (which requires n calls to random() ):

def bin_rv(n,p=0.5):
"""
Generate a binomial random variable with parameters n,p.
"""
return sum( random() < p for _ in range(n))


Is there a better way to do this (already built into Sage, I hope)?

(In Matlab, this can be done via the binord function and in R with the rbinom function.)

edit retag close merge delete

Sort by » oldest newest most voted

I usually use scipy for this:

sage: import scipy.stats
sage: scipy.stats.[TAB]
Display all 213 possibilities? (y or n)
scipy.stats.Tester                scipy.stats.dweibull              scipy.stats.gmean                 scipy.stats.linregress            scipy.stats.pointbiserialr        scipy.stats.t
scipy.stats.alpha                 scipy.stats.entropy               scipy.stats.gompertz              scipy.stats.loggamma              scipy.stats.poisson               scipy.stats.test
scipy.stats.anderson              scipy.stats.erlang                scipy.stats.gumbel_l              scipy.stats.logistic              scipy.stats.powerlaw              scipy.stats.threshold
scipy.stats.anglit                scipy.stats.expon                 scipy.stats.gumbel_r              scipy.stats.loglaplace
[.. etc.]
sage: B = scipy.stats.binom(10, 0.5)
sage: B
<scipy.stats.distributions.rv_frozen object at 0x6aba7d0>
sage: B.[TAB]
B.args      B.cdf       B.dist      B.entropy   B.interval  B.isf    B.kwds
B.mean      B.median    B.moment    B.pdf       B.pmf       B.ppf    B.rvs
B.sf        B.stats     B.std       B.var
sage: B.mean()
5.0
sage: B.rvs(15)
array([6, 5, 5, 5, 4, 7, 4, 7, 6, 4, 6, 7, 4, 7, 5])

more

Thanks, but this does not appear to be faster. In my application, I'm generating lots of Bin(n,0.5) RVs, but with a different n each time.

( 2012-04-02 09:43:17 -0500 )edit

Syntax

stats::binomialRandom(n, p, <seed =="" s="">) Description

stats::binomialRandom(n, p) returns a procedure that produces binomial-deviates (random numbers) with trial parameter n and probability parameter p.

The procedure f := stats::binomialRandom(n, p) can be called in the form f(). The return value of f() is an integer between 0 and n or a symbolic expression:

If n is a positive integer and p is a real value satisfying 0 ≤ p ≤ 1, then f() returns an integer between 0 and n. If p = 0 or p = 0.0, then f() returns 0 for any value of n. If p = 1 or p = 1.0, then f() returns n for any value of n. In all other cases, f() return the symbolic call stats::binomialRandom(n, p)(). Numerical values for n are only accepted if they are positive integers.

Numerical values for p are only accepted if they satisfy 0 ≤ p ≤ 1.

The values X = f() are distributed randomly according to the binomial distribution with trial parameter n and probability parameter p. For any , the probability of X ≤ x is given by

.

Without the option Seed = s, an initial seed is chosen internally. This initial seed is set to a default value when MuPAD® is started. Thus, each time MuPAD is started or re-initialized with the reset function, random generators produce the same sequences of numbers.

Note: With this option, the parameters n and p must evaluate to suitable numerical values at the time, when the generator is created. Note: In contrast to the function random, the generators produced by stats::binomialRandom do not react to the environment variable SEED. For efficiency, it is recommended to produce sequences of K random numbers via

f := stats::binomialRandom(n, p): f() $k = 1..K; rather than by stats::binomialRandom(n, p)()$k = 1..K; The latter call produces a sequence of generators each of which is called once. Also note that stats::binomialRandom(n, p, Seed = s)() \$k = 1..K; does not produce a random sequence, because a sequence of freshly initialized generators would be created each of them producing the same number.

more

Since you like R:

sage: L = r.rbinom(100,10,.5)
sage: L
[1] 5 4 6 4 3 6 9 1 7 5 5 4 6 3 5 6 5 1 5 6 2 4 7 4 7 7 5 3 3 7 6 3 2 2 4 6 4 4 6 4 4 5 6 5 7 3 7
[48] 7 3 5 4 7 7 6 4 3 2 4 5 5 7 2 5 2 3 2 3 4 6 7 8 6 6 3 4 2 6 3 5 3 8 6 8 5 5 2 5 5 5 2 5 7 5 5
[95] 7 3 5 5 6 4
sage: M = L._sage_()
sage: type(M), len(M)
(<type 'list'>, 100)


You do get Python ints, not Sage Integers, though that wouldn't be hard to change with a list comprehension, presumably.

more

Or map, which is faster.

( 2012-04-03 17:32:31 -0500 )edit

Unfortunately, the R routine is much slower than the naive one for my application.

( 2012-04-05 04:21:21 -0500 )edit