Sample a large random number

I am trying to randomly (uniformly) sample an integer between 0, and say 2^1000 (or ideally an arbitrary large number). Attempting to do by calling random sample yields an overflow error:

sample(range(2^130), 7)


OverflowError: Python int too large to convert to C ssize_t Is it possible to do this?

edit retag close merge delete

Sort by » oldest newest most voted

First, range uses the Python int class, which can overflow (as you have experienced). The function srange uses the Sagemath Integer class instead, which will not overflow. Second, using sample is probably not a good idea. On my somewhat old computer:

sage: %time sample(srange(2^24), 7)
CPU times: user 4.11 s, sys: 556 ms, total: 4.66 s
Wall time: 6 s
sage: %time sample(srange(2^25), 7)
CPU times: user 8.95 s, sys: 1.34 s, total: 10.3 s
Wall time: 14.2 s
[14635440, 2287708, 14649651, 26289169, 18124119, 30825037, 17257883]
sage: %time sample(srange(2^26), 7)
CPU times: user 17.1 s, sys: 2.34 s, total: 19.5 s
Wall time: 23 s
[9787083, 12313568, 7761365, 8998698, 55334892, 50923481, 65747342]


It seems to be roughly doubling in time as the argument to srange doubles. (The issue is that srange actually constructs the whole list first.) Needless to say, I don't want to try 2^130. Much faster is to use randint repeatedly (and the changes of getting the same integer twice are pretty small:

sage: %time [randint(1, 2^130) for n in range(7)]
CPU times: user 63 µs, sys: 1 µs, total: 64 µs
Wall time: 67 µs
[1341203791323696537546187325304798025292,
1106243203902472361349055754890433865742,
1038420201477214469366016121149956168427,
1242849022616461088502016645925189018763,
286347095973575864665046785261064404737,
286152873235140220458041877475917123637,
41617504738350918940022502684766969623]


This is almost instant. The command A = [randint(1, 2^100000000) for n in range(7)] took almost a second for me. It actually takes longer to print the elements of A. Also, you could write a loop instead to make absolutely sure you don't get any repetitions, or test it afterwards, but it's probably not necessary if you're using such large sample sizes.

more