Pythagorean triple count

I'm trying to calculate how many Pythagorean triples there are with all arguments under a given value and my program won't finish when I set my limit at 1000.

Here it calculates a number in about a second when I set the limit at 100:

sage: len([(a,b,c) for a in range(1,100) for b in range(1,100) for c in range(1,100) if a^2+b^2==c^2])

But when I switch 100 to 1000 it computes endlessly and I need to abort the program.

Is there a way to make this code more efficient so it doesn't get overloaded? For example I've been trying to find a way to condense the variable range declarations but I'm not getting answers. I'm also new to Sagemath.

edit retag close merge delete

Your code runs in O(n^3), n being your limit. If you need "about a second" for n=100, it wil need "about 1000 seconds" for n=1000. That's about 20 minutes. Add to that the (possibly considerable) time needed to print the result...

Is this homework ? In this case, just a hint : start with c^2 and retain two-element partitions whose members are both squares. This could be accelerated by computing a list if squares...

( 2024-02-22 10:29:05 +0200 )edit

len([...]) is a waste of memory since it creates a huge list just to compute its length. Use sum(1 for a in ...) instead.

( 2024-02-22 16:30:16 +0200 )edit

Sort by ยป oldest newest most voted

There is a known formula for the number of representations as the sum of two squares - see Wikipedia. Employing it, you can have summation $r_2(c^2$) just over $c$, which will give a dramatic speed-up.

ADDED. Since this question concerns squares of positive integers, the formula for $r_2(n)$ needs to be a bit re-worked:

# number of representations of n as an ordered sum of two squares of positive integers
def r2_positive(n):
r = 1
for p,k in factor(n):
if p%4==1:
r *= k+1
elif p%4==3 and k%2:
return 0
return r - int(is_square(n))


Then you can get the desired count as

sum( r2_positive(c^2) for c in range(1,1000) )

more

Thank you! I will make it a goal to understand this code because I'm actually so new to this that it's stupid. But I know it works because the results are the same as what a couple other people got in different languages ๐

( 2024-02-22 23:44:10 +0200 )edit

Note : @Max Alekseyev's answer, using more mathematical knowledge, is, of course, much better than this one, which illustrates a programming point.

I suppose that this is not homework. If it is, you would benefit from stopping reading this and solving it yoursef...

Running this :

# https://ask.sagemath.org/question/76159/pythagorean-triple-count/
Nmax=1000
Roots={u^2:u for u in range(1, Nmax+1)}
Squares=list(Roots.keys())
Triples=[]
from time import time as stime
t0=stime()
for asq in Squares:
for bsq in Squares:
if bsq>=asq: break
if (csq:=asq-bsq) in Squares: Triples+=[[Roots[u] for u in (bsq, csq, asq)]]
t1=stime()
print("N = %d, time=%6.3f seconds.\n"%(len(Triples), t1-t0))


prints :

N = 1762, time= 3.925 seconds.


Why ?

• The algorithm should be in O(n^(3/2)) because :
• It checks directly a^2-b^2 instead of looping on all possible sums.
• It shortcuts the inner loop as soon as possible.
• It uses precomputed "possible" values and "possible" retun values.

However, empirical check doesn't confirm this. Exploration ongoing...

EDIT : searching a precomputed list of squares seems a bad idea. Compare :

def AllTriples(Nmax):
Roots={u^2:u for u in range(1, Nmax+1)}
Squares=list(Roots.keys())
Triples=[]
for asq in Squares:
for bsq in Squares:
if bsq>=asq: break
if (csq:=asq-bsq) in Squares:
Triples+=[[Roots[u] for u in (bsq, csq, asq)]]
return Triples


with

def AllTriplesFaster(Nmax):
Roots={u^2:u for u in range(1, Nmax+1)}
Squares=list(Roots.keys())
Triples=[]
for asq in Squares:
for bsq in Squares:
if bsq>=asq: break
if Integer(csq:=asq-bsq).is_square(): # <-- Here's the rub !
Triples+=[[Roots[u] for u in (bsq, csq, asq)]]
return Triples


The latter is about 40 times faster :

sage: %time foo=AllTriples(1000)
CPU times: user 3.86 s, sys: 0 ns, total: 3.86 s
Wall time: 3.86 s
sage: %time foo=AllTriplesFaster(1000)
CPU times: user 95.1 ms, sys: 1 ยตs, total: 95.1 ms
Wall time: 95.1 ms


Further exploration : run

from time import time as stime
Res=[]
Perfs=[]
for J in range(1, 21):
t0=stime()
Res+=[((N:=100*J),len(AllTriples(N)))]
t1=stime()
Perfs+=[(N, t1-t0)]


Then :

sage: r.summary(r.lm(r.formula("logt~logn"), data=r.data_frame(logn=[log(u[0]).n() for u in Perfs], logt=[log(u[1]) for u in Perfs])))

Call:
lm(formula = sage58, data = sage61)

Residuals:
Min       1Q   Median       3Q      Max
-0.36117 -0.28585  0.09187  0.13295  0.65187

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -18.30137    0.49735  -36.80   <2e-16 ***
logn          2.91594    0.07348   39.68   <2e-16 ***
---
Signif. codes:  0 โ***โ 0.001 โ**โ 0.01 โ*โ 0.05 โ.โ 0.1 โ โ 1

Residual standard error: 0.2603 on 18 degrees of freedom
Multiple R-squared:  0.9887,    Adjusted R-squared:  0.9881
F-statistic:  1575 on 1 and 18 DF,  p-value: < 2.2e-16


suggest a running time in O(n= 3), whereas :

    Res2=[]
Perfs2=[]
for J in range(1, 21):
t2=stime()
Res2+=[((N:=100*J),len(AllTriplesFaster(N)))]
t3=stime()
Perfs2+=[(N, t3-t2)]

sage: r.summary(r.lm(r.formula("logt~logn"), data=r.data_frame(logn=[log(u[0]).n() for u in Perfs2], logt=[log(u[1]) for u in Perfs2])))

Call:
lm(formula = sage55, data = sage62)

Residuals:
Min        1Q    Median        3Q       Max
-0.051962 -0.026751 -0.005048  0.025395  0.076644

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -15.97022    0.07168  -222.8   <2e-16 ***
logn          1.99332    0.01059   188.2   <2e-16 ***
---
Signif. codes:  0 โ***โ 0.001 โ**โ 0.01 โ*โ 0.05 โ.โ 0.1 โ โ 1

Residual standard error: 0.03752 on 18 degrees of freedom
Multiple R-squared:  0.9995,    Adjusted R-squared:  0.9995
F-statistic: 3.543e+04 on 1 and 18 DF,  p-value: < 2.2e-16


suggests O(n^2).

Searching in a precomputed list seems to dominate the computing time and seems to be in O(n^3).

HTH,

more

This is for self study and I'm just diving headfirst into coding for the first time since that one C++ class I took over 10 years ago, so I'm very bad and can pretty much only learn from examples and being told what the characters mean lol.

In any case, I appreciate the help you've offered. I now see that, indeed, Max's code above is what I'm most interested in understanding. It's short and produces the same result that two other people in a Facebook thread got using different programming languages.

( 2024-02-22 23:28:52 +0200 )edit