Ask Your Question
0

Pythagorean triple count

asked 2024-02-22 06:02:42 +0100

Jwlmug gravatar image

I'm trying to calculate how many Pythagorean triples there are with all arguments under a given value and my program won't finish when I set my limit at 1000.

Here it calculates a number in about a second when I set the limit at 100:

sage: len([(a,b,c) for a in range(1,100) for b in range(1,100) for c in range(1,100) if a^2+b^2==c^2])

But when I switch 100 to 1000 it computes endlessly and I need to abort the program.

Is there a way to make this code more efficient so it doesn't get overloaded? For example I've been trying to find a way to condense the variable range declarations but I'm not getting answers. I'm also new to Sagemath.

edit retag flag offensive close merge delete

Comments

Your code runs in O(n^3), n being your limit. If you need "about a second" for n=100, it wil need "about 1000 seconds" for n=1000. That's about 20 minutes. Add to that the (possibly considerable) time needed to print the result...

Is this homework ? In this case, just a hint : start with c^2 and retain two-element partitions whose members are both squares. This could be accelerated by computing a list if squares...

Emmanuel Charpentier gravatar imageEmmanuel Charpentier ( 2024-02-22 10:29:05 +0100 )edit

len([...]) is a waste of memory since it creates a huge list just to compute its length. Use sum(1 for a in ...) instead.

Max Alekseyev gravatar imageMax Alekseyev ( 2024-02-22 16:30:16 +0100 )edit

2 Answers

Sort by ยป oldest newest most voted
1

answered 2024-02-22 17:30:07 +0100

Max Alekseyev gravatar image

updated 2024-02-22 18:29:31 +0100

There is a known formula for the number of representations as the sum of two squares - see Wikipedia. Employing it, you can have summation $r_2(c^2$) just over $c$, which will give a dramatic speed-up.


ADDED. Since this question concerns squares of positive integers, the formula for $r_2(n)$ needs to be a bit re-worked:

# number of representations of n as an ordered sum of two squares of positive integers
def r2_positive(n):
    r = 1
    for p,k in factor(n):
        if p%4==1:
            r *= k+1
        elif p%4==3 and k%2:
            return 0
    return r - int(is_square(n))

Then you can get the desired count as

sum( r2_positive(c^2) for c in range(1,1000) )
edit flag offensive delete link more

Comments

Thank you! I will make it a goal to understand this code because I'm actually so new to this that it's stupid. But I know it works because the results are the same as what a couple other people got in different languages ๐Ÿ™‚

Jwlmug gravatar imageJwlmug ( 2024-02-22 23:44:10 +0100 )edit
0

answered 2024-02-22 13:31:21 +0100

Emmanuel Charpentier gravatar image

updated 2024-02-22 17:50:05 +0100

Note : @Max Alekseyev's answer, using more mathematical knowledge, is, of course, much better than this one, which illustrates a programming point.

I suppose that this is not homework. If it is, you would benefit from stopping reading this and solving it yoursef...

Running this :

# https://ask.sagemath.org/question/76159/pythagorean-triple-count/
Nmax=1000
Roots={u^2:u for u in range(1, Nmax+1)}
Squares=list(Roots.keys())
Triples=[]
from time import time as stime
t0=stime()
for asq in Squares:
    for bsq in Squares:
        if bsq>=asq: break
        if (csq:=asq-bsq) in Squares: Triples+=[[Roots[u] for u in (bsq, csq, asq)]]
t1=stime()
print("N = %d, time=%6.3f seconds.\n"%(len(Triples), t1-t0))

prints :

N = 1762, time= 3.925 seconds.

Why ?

  • The algorithm should be in O(n^(3/2)) because :
    • It checks directly a^2-b^2 instead of looping on all possible sums.
    • It shortcuts the inner loop as soon as possible.
  • It uses precomputed "possible" values and "possible" retun values.

However, empirical check doesn't confirm this. Exploration ongoing...

EDIT : searching a precomputed list of squares seems a bad idea. Compare :

def AllTriples(Nmax):
    Roots={u^2:u for u in range(1, Nmax+1)}
    Squares=list(Roots.keys())
    Triples=[]
    for asq in Squares:
        for bsq in Squares:
            if bsq>=asq: break
            if (csq:=asq-bsq) in Squares:
                Triples+=[[Roots[u] for u in (bsq, csq, asq)]]
    return Triples

with

def AllTriplesFaster(Nmax):
    Roots={u^2:u for u in range(1, Nmax+1)}
    Squares=list(Roots.keys())
    Triples=[]
    for asq in Squares:
        for bsq in Squares:
            if bsq>=asq: break
            if Integer(csq:=asq-bsq).is_square(): # <-- Here's the rub !
                Triples+=[[Roots[u] for u in (bsq, csq, asq)]]
    return Triples

The latter is about 40 times faster :

sage: %time foo=AllTriples(1000)
CPU times: user 3.86 s, sys: 0 ns, total: 3.86 s
Wall time: 3.86 s
sage: %time foo=AllTriplesFaster(1000)
CPU times: user 95.1 ms, sys: 1 ยตs, total: 95.1 ms
Wall time: 95.1 ms

Further exploration : run

from time import time as stime
Res=[]
Perfs=[]
for J in range(1, 21):
    t0=stime()
    Res+=[((N:=100*J),len(AllTriples(N)))]
    t1=stime()
    Perfs+=[(N, t1-t0)]

Then :

sage: r.summary(r.lm(r.formula("logt~logn"), data=r.data_frame(logn=[log(u[0]).n() for u in Perfs], logt=[log(u[1]) for u in Perfs])))

Call:
lm(formula = sage58, data = sage61)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.36117 -0.28585  0.09187  0.13295  0.65187 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept) -18.30137    0.49735  -36.80   <2e-16 ***
logn          2.91594    0.07348   39.68   <2e-16 ***
---
Signif. codes:  0 โ€˜***โ€™ 0.001 โ€˜**โ€™ 0.01 โ€˜*โ€™ 0.05 โ€˜.โ€™ 0.1 โ€˜ โ€™ 1

Residual standard error: 0.2603 on 18 degrees of freedom
Multiple R-squared:  0.9887,    Adjusted R-squared:  0.9881 
F-statistic:  1575 on 1 and 18 DF,  p-value: < 2.2e-16

suggest a running time in O(n= 3), whereas :

    Res2=[]
    Perfs2=[]
    for J in range(1, 21):
        t2=stime()
        Res2+=[((N:=100*J),len(AllTriplesFaster(N)))]
        t3=stime()
        Perfs2+=[(N, t3-t2)]

sage: r.summary(r.lm(r.formula("logt~logn"), data=r.data_frame(logn=[log(u[0]).n() for u in Perfs2], logt=[log(u[1]) for u in Perfs2])))

Call:
lm(formula = sage55, data = sage62)

Residuals:
      Min        1Q    Median        3Q       Max 
-0.051962 -0.026751 -0.005048  0.025395  0.076644 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept) -15.97022    0.07168  -222.8   <2e-16 ***
logn          1.99332    0.01059   188.2   <2e-16 ***
---
Signif. codes:  0 โ€˜***โ€™ 0.001 โ€˜**โ€™ 0.01 โ€˜*โ€™ 0.05 โ€˜.โ€™ 0.1 โ€˜ โ€™ 1

Residual standard error: 0.03752 on 18 degrees of freedom
Multiple R-squared:  0.9995,    Adjusted R-squared:  0.9995 
F-statistic: 3.543e+04 on 1 and 18 DF,  p-value: < 2.2e-16

suggests O(n^2).

Searching in a precomputed list seems to dominate the computing time and seems to be in O(n^3).

HTH,

edit flag offensive delete link more

Comments

This is for self study and I'm just diving headfirst into coding for the first time since that one C++ class I took over 10 years ago, so I'm very bad and can pretty much only learn from examples and being told what the characters mean lol.

In any case, I appreciate the help you've offered. I now see that, indeed, Max's code above is what I'm most interested in understanding. It's short and produces the same result that two other people in a Facebook thread got using different programming languages.

Jwlmug gravatar imageJwlmug ( 2024-02-22 23:28:52 +0100 )edit

Your Answer

Please start posting anonymously - your entry will be published after you log in or create a new account.

Add Answer

Question Tools

Stats

Asked: 2024-02-22 06:02:42 +0100

Seen: 243 times

Last updated: Feb 22