Ask Your Question

Revision history [back]

click to hide/show revision 1
initial version

I suppose that this is not homework. If it is, you would benefit from stopping reading this and solving it yoursef...

Running this :

# https://ask.sagemath.org/question/76159/pythagorean-triple-count/
Nmax=1000
Roots={u^2:u for u in range(1, Nmax+1)}
Squares=list(Roots.keys())
Triples=[]
from time import time as stime
t0=stime()
for asq in Squares:
    for bsq in Squares:
        if bsq>=asq: break
        if (csq:=asq-bsq) in Squares: Triples+=[[Roots[u] for u in (bsq, csq, asq)]]
t1=stime()
print("N = %d, time=%6.3f seconds.\n"%(len(Triples), t1-t0))

prints :

N = 1762, time= 3.925 seconds.

Why ?

  • The algorithm is in O(n^(3/2)) because :
    • It checks directly a^2-b^2 instead of looping on all possible sums.
    • It shortcuts the inner loop as soon as possible.
  • It uses precomputed "possible" values and "possible" retun values.

HTH,

I suppose that this is not homework. If it is, you would benefit from stopping reading this and solving it yoursef...

Running this :

# https://ask.sagemath.org/question/76159/pythagorean-triple-count/
Nmax=1000
Roots={u^2:u for u in range(1, Nmax+1)}
Squares=list(Roots.keys())
Triples=[]
from time import time as stime
t0=stime()
for asq in Squares:
    for bsq in Squares:
        if bsq>=asq: break
        if (csq:=asq-bsq) in Squares: Triples+=[[Roots[u] for u in (bsq, csq, asq)]]
t1=stime()
print("N = %d, time=%6.3f seconds.\n"%(len(Triples), t1-t0))

prints :

N = 1762, time= 3.925 seconds.

Why ?

  • The algorithm is should be in O(n^(3/2)) because :
    • It checks directly a^2-b^2 instead of looping on all possible sums.
    • It shortcuts the inner loop as soon as possible.
  • It uses precomputed "possible" values and "possible" retun values.

However, empirical check doesn't confirm this. Exploration ongoing...

HTH,

I suppose that this is not homework. If it is, you would benefit from stopping reading this and solving it yoursef...

Running this :

# https://ask.sagemath.org/question/76159/pythagorean-triple-count/
Nmax=1000
Roots={u^2:u for u in range(1, Nmax+1)}
Squares=list(Roots.keys())
Triples=[]
from time import time as stime
t0=stime()
for asq in Squares:
    for bsq in Squares:
        if bsq>=asq: break
        if (csq:=asq-bsq) in Squares: Triples+=[[Roots[u] for u in (bsq, csq, asq)]]
t1=stime()
print("N = %d, time=%6.3f seconds.\n"%(len(Triples), t1-t0))

prints :

N = 1762, time= 3.925 seconds.

Why ?

  • The algorithm should be in O(n^(3/2)) because :
    • It checks directly a^2-b^2 instead of looping on all possible sums.
    • It shortcuts the inner loop as soon as possible.
  • It uses precomputed "possible" values and "possible" retun values.

However, empirical check doesn't confirm this. Exploration ongoing...

EDIT : searching a precomputed list of squares seems a bad idea. Compare :

def AllTriples(Nmax):
    Roots={u^2:u for u in range(1, Nmax+1)}
    Squares=list(Roots.keys())
    Triples=[]
    for asq in Squares:
        for bsq in Squares:
            if bsq>=asq: break
            if (csq:=asq-bsq) in Squares:
                Triples+=[[Roots[u] for u in (bsq, csq, asq)]]
    return Triples

with

def AllTriplesFaster(Nmax):
    Roots={u^2:u for u in range(1, Nmax+1)}
    Squares=list(Roots.keys())
    Triples=[]
    for asq in Squares:
        for bsq in Squares:
            if bsq>=asq: break
            if Integer(csq:=asq-bsq).is_square():
                Triples+=[[Roots[u] for u in (bsq, csq, asq)]]
    return Triples

The latter is about 40 times faster :

sage: %time foo=AllTriples(1000)
CPU times: user 3.86 s, sys: 0 ns, total: 3.86 s
Wall time: 3.86 s
sage: %time foo=AllTriplesFaster(1000)
CPU times: user 95.1 ms, sys: 1 µs, total: 95.1 ms
Wall time: 95.1 ms

Further exploration : run

from time import time as stime
Res=[]
Perfs=[]
for J in range(1, 21):
    t0=stime()
    Res+=[((N:=100*J),len(AllTriples(N)))]
    t1=stime()
    Perfs+=[(N, t1-t0)]

Then :

sage: r.summary(r.lm(r.formula("logt~logn"), data=r.data_frame(logn=[log(u[0]).n() for u in Perfs], logt=[log(u[1]) for u in Perfs])))

Call:
lm(formula = sage58, data = sage61)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.36117 -0.28585  0.09187  0.13295  0.65187 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept) -18.30137    0.49735  -36.80   <2e-16 ***
logn          2.91594    0.07348   39.68   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.2603 on 18 degrees of freedom
Multiple R-squared:  0.9887,    Adjusted R-squared:  0.9881 
F-statistic:  1575 on 1 and 18 DF,  p-value: < 2.2e-16

suggest a running time in O(n= 3), whereas :

    Res2=[]
    Perfs2=[]
    for J in range(1, 21):
        t2=stime()
        Res2+=[((N:=100*J),len(AllTriplesFaster(N)))]
        t3=stime()
        Perfs2+=[(N, t3-t2)]

sage: r.summary(r.lm(r.formula("logt~logn"), data=r.data_frame(logn=[log(u[0]).n() for u in Perfs2], logt=[log(u[1]) for u in Perfs2])))

Call:
lm(formula = sage55, data = sage62)

Residuals:
      Min        1Q    Median        3Q       Max 
-0.051962 -0.026751 -0.005048  0.025395  0.076644 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept) -15.97022    0.07168  -222.8   <2e-16 ***
logn          1.99332    0.01059   188.2   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.03752 on 18 degrees of freedom
Multiple R-squared:  0.9995,    Adjusted R-squared:  0.9995 
F-statistic: 3.543e+04 on 1 and 18 DF,  p-value: < 2.2e-16

suggests O(n^2).

Searching in a precomputed list seems to dominate the computing time and seems to be in O(n^3).

HTH,

Note : @Max Alekseyev's answer, using more mathematical knowledge, is, of course, much better than this one, which illustrates a programming point.

I suppose that this is not homework. If it is, you would benefit from stopping reading this and solving it yoursef...

Running this :

# https://ask.sagemath.org/question/76159/pythagorean-triple-count/
Nmax=1000
Roots={u^2:u for u in range(1, Nmax+1)}
Squares=list(Roots.keys())
Triples=[]
from time import time as stime
t0=stime()
for asq in Squares:
    for bsq in Squares:
        if bsq>=asq: break
        if (csq:=asq-bsq) in Squares: Triples+=[[Roots[u] for u in (bsq, csq, asq)]]
t1=stime()
print("N = %d, time=%6.3f seconds.\n"%(len(Triples), t1-t0))

prints :

N = 1762, time= 3.925 seconds.

Why ?

  • The algorithm should be in O(n^(3/2)) because :
    • It checks directly a^2-b^2 instead of looping on all possible sums.
    • It shortcuts the inner loop as soon as possible.
  • It uses precomputed "possible" values and "possible" retun values.

However, empirical check doesn't confirm this. Exploration ongoing...

EDIT : searching a precomputed list of squares seems a bad idea. Compare :

def AllTriples(Nmax):
    Roots={u^2:u for u in range(1, Nmax+1)}
    Squares=list(Roots.keys())
    Triples=[]
    for asq in Squares:
        for bsq in Squares:
            if bsq>=asq: break
            if (csq:=asq-bsq) in Squares:
                Triples+=[[Roots[u] for u in (bsq, csq, asq)]]
    return Triples

with

def AllTriplesFaster(Nmax):
    Roots={u^2:u for u in range(1, Nmax+1)}
    Squares=list(Roots.keys())
    Triples=[]
    for asq in Squares:
        for bsq in Squares:
            if bsq>=asq: break
            if Integer(csq:=asq-bsq).is_square():
Integer(csq:=asq-bsq).is_square(): # <-- Here's the rub !
                Triples+=[[Roots[u] for u in (bsq, csq, asq)]]
    return Triples

The latter is about 40 times faster :

sage: %time foo=AllTriples(1000)
CPU times: user 3.86 s, sys: 0 ns, total: 3.86 s
Wall time: 3.86 s
sage: %time foo=AllTriplesFaster(1000)
CPU times: user 95.1 ms, sys: 1 µs, total: 95.1 ms
Wall time: 95.1 ms

Further exploration : run

from time import time as stime
Res=[]
Perfs=[]
for J in range(1, 21):
    t0=stime()
    Res+=[((N:=100*J),len(AllTriples(N)))]
    t1=stime()
    Perfs+=[(N, t1-t0)]

Then :

sage: r.summary(r.lm(r.formula("logt~logn"), data=r.data_frame(logn=[log(u[0]).n() for u in Perfs], logt=[log(u[1]) for u in Perfs])))

Call:
lm(formula = sage58, data = sage61)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.36117 -0.28585  0.09187  0.13295  0.65187 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept) -18.30137    0.49735  -36.80   <2e-16 ***
logn          2.91594    0.07348   39.68   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.2603 on 18 degrees of freedom
Multiple R-squared:  0.9887,    Adjusted R-squared:  0.9881 
F-statistic:  1575 on 1 and 18 DF,  p-value: < 2.2e-16

suggest a running time in O(n= 3), whereas :

    Res2=[]
    Perfs2=[]
    for J in range(1, 21):
        t2=stime()
        Res2+=[((N:=100*J),len(AllTriplesFaster(N)))]
        t3=stime()
        Perfs2+=[(N, t3-t2)]

sage: r.summary(r.lm(r.formula("logt~logn"), data=r.data_frame(logn=[log(u[0]).n() for u in Perfs2], logt=[log(u[1]) for u in Perfs2])))

Call:
lm(formula = sage55, data = sage62)

Residuals:
      Min        1Q    Median        3Q       Max 
-0.051962 -0.026751 -0.005048  0.025395  0.076644 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept) -15.97022    0.07168  -222.8   <2e-16 ***
logn          1.99332    0.01059   188.2   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.03752 on 18 degrees of freedom
Multiple R-squared:  0.9995,    Adjusted R-squared:  0.9995 
F-statistic: 3.543e+04 on 1 and 18 DF,  p-value: < 2.2e-16

suggests O(n^2).

Searching in a precomputed list seems to dominate the computing time and seems to be in O(n^3).

HTH,