efficient generation of restricted divisors

This question is in the same venue as my previous one on generation of unitary divisors.

This time I'm interested in generating all divisors of a given factored integer that are below a given bound B. For simplicity I will focus on just counting such divisors.

A straightforward way is to use the divisors() function:

import itertools
def divbelow_1(n,B):
return sum(1 for _ in itertools.takewhile(lambda d: d<=B, divisors(n)))

The downside is that this function constructs a list of all divisors of $n$ (there are $\tau(n)$ of them), which limits its applicability. Still, if this list fits into the memory, divbelow_1() works quite fast. I'm interested in the case when $\tau(n)$ is large, but the number of divisors below B is moderate. Ideally, I'd like to have a function that have comparable performance to divbelow_1(n,B) when B==n, and that at the same time avoids generation of divisors above B.

My attempt was:

# N = list(factor(n)), a list of pairs (prime,exponent) forming the factorization of n
def divbelow_2(N,B,i=0):
if B <= 1:
return B
if i>=len(N):
return 1

p,e = N[i]
result = 0
for k in (0..e):
result += divbelow_2(N,B,i+1)
B //= p
if B==0:
break

return result

White it looks quite efficient for me from theoretical perspective, it loses badly to divbelow_1() in practice. Here is a particular example:

sage: %time divbelow_1(47987366728745697656468743117440000, 61448226992991993)
CPU times: user 4.73 s, sys: 368 ms, total: 5.1 s
Wall time: 5.1 s
10074213

sage: %time divbelow_2(list(factor(47987366728745697656468743117440000)), 61448226992991993)
CPU times: user 1min 31s, sys: 7.89 ms, total: 1min 31s
Wall time: 1min 31s
10074213

I can cope with 2-3 fold slowdown, but not with 18-fold one. So, my question is:

Q: Why divbelow_2() is so slow, and if there is there a room for improvement? (without using divisors())

ADDED. I suspect the drastic slowdown is caused by the recursion overhead. Here is a non-recursive version that generates the list of all divisors below B:

def divbelow_7(n,B):
div = [1]
for p, e in factor(n):
ext = div
for i in range(e):
ext = [r for d in ext if (r := d * p) <= B]
div.extend(ext)
return len(div)

On the above example it even beats divbelow_1():

sage: %time divbelow_7(47987366728745697656468743117440000, 61448226992991993)
CPU times: user 1.97 s, sys: 176 ms, total: 2.15 s
Wall time: 2.15 s
10074213
edit retag close merge delete

Maybe this is relevant?

def divisors0(f,B):
output=[1]
for p, e in f:
prev = output[:]
pn = 1
for i in range(e):
pn *= p
output.extend(a * pn for a in prev)
output=[x for x in output if x<=B]
return (len(output))

from time import time
B=61448226992991993
f=list(factor(47987366728745697656468743117440000))
%time divisors0(f,B)

CPU times: user 2.54 s, sys: 156 ms, total: 2.7 s
Wall time: 2.69 s

10074213
( 2023-03-05 07:56:35 +0100 )edit

This is better, although you still generate all divisors, while we can check the inequality <= B and perform filtering while extending. See update in the question.

( 2023-03-05 14:10:47 +0100 )edit

Not all divisors, all divisors from output. This is a significant difference for me. Nice to see some similarities. I hope that sage divisors will be replaced by your new code soon

( 2023-03-05 14:51:09 +0100 )edit

I've improved it a bit even further.

( 2023-03-05 15:56:10 +0100 )edit

Sort by » oldest newest most voted

The best performance I was able to archive is by the following code, which avoids recursion and uses a few tricks to save memory:

def divbelow_10(n,B):
if B*B >= n:
return number_of_divisors(n) - divbelow_10(n,n//(B+1))

res = 0

bs = {B:1}   # dict mapping bounds to their multiplicities

for p,e in reversed(factor(n)):

n //= p^e
n2 = 0                       # number of bounds >= n

ext = bs
for i in range(e):
ext2 = dict()
for b,c in ext.items():
r = b//p
if r >= 1:
ext2[r] = ext2.get(r,0) + c
ext = ext2

# extending bs with ext
for b,c in ext.items():
if b>=n:
n2 += c
else:
bs.setdefault(b,0)
bs[b] += c

#print('Extended:', len(bs))

for b in [b for b in bs if b*b>=n]:
c = bs.pop(b)
n2 += c
b_ = n//(b+1)
if b_ >= 1:
bs.setdefault(b_,0)
bs[b_] -= c

res += bs.pop(1,0)
res += n2 * number_of_divisors(n)

#print('Trimmed:', len(bs))

return res

Its performance is very good:

sage: %time divbelow_10(47987366728745697656468743117440000, 61448226992991993)
CPU times: user 67.5 ms, sys: 75 µs, total: 67.6 ms
Wall time: 66.9 ms
10074213
more