# Calculating large symbolic determinant is too slow

I want to calculate a series of symbolic Hankel matrices. The sizes of the matrices are from 2*2 to 50*50. I found the speed of calculating the determinants is too slow. It took 5 seconds to get the result of a 7*7 matrix, 45 seconds to get the result of an 8*8 matrix, and it was always killed when calculating the 9*9 determinants. According to https://math.stackexchange.com/questions/3561347/complexity-of-calculating-the-determinant (Complexity of calculating the determinant - Mathematics Stack Exchange), the time complexities for calculating determinants can lower to O(n^2.38), however, Sage seems to run in O(n!) from my measurement. How can I make the calculation more efficient?

Here is my full code.

import time
from functools import cache

import sage.libs.ecl
from sympy import simplify

# increase memory limit
sage.libs.ecl.ecl_eval("(ext:set-limit 'ext:heap-size 0)")

time_table = {}

l = 1
energy = var('E')

def begin(name):
time_table[name] = time.time()

def end(name):
print(f'{name}: {time.time() - time_table[name]}')
del time_table[name]

@cache
def expected_distance(n, l):
if n <= -2:
return 0
if n == -1:
return -2 * energy
if n == 0:
return 1
return  simplify((
- 4 * (2 * (n + 1) - 1) * expected_distance(n - 1, l) \
- n * ((n + 1) * (n - 1) - 4 * l * (l + 1)) * expected_distance(n - 2, l)
) / (8 * (n + 1) * energy))

min_rank = 2
max_rank = 50
positive_range_graph = Graphics()
xmin = -0.7
xmax = 0.5
current_ranges = [[xmin, xmax]]
offset = 1

for rank in range(min_rank, max_rank + 1):
print(f'rank:{rank}')
begin(f'rank:{rank}')

begin('generate matrix')
hankel = matrix.hankel([expected_distance(i + offset, l) for i in range(rank)],
[expected_distance(i + offset + rank - 1, l) for i in range(1, rank)], SR)
end('generate matrix')

begin('calculate det')
# pretty slow!
# It is always killed when the matrix size is 9*9
f = hankel.det()
end('calculate det')

begin('simplify')
f = f.simplify_full()
# print(f'rank:{rank} {f}' )
end('simplify')

begin('solve inequation')
positive_solutions = solve(f >= 0, energy)
end('solve inequation')

begin('plot')
positive_ranges = []
for solution in positive_solutions:
if len(solution) > 2:
print(f'more than two expression')
if len(solution) == 1:
if solution[0].operator() in [operator.le, operator.lt]:
positive_ranges.append((xmin, solution[0].rhs()))
elif solution[0].operator() in [operator.ge, operator.gt]:
positive_ranges.append((solution[0].rhs(), xmax))
elif solution[0].operator() == operator.eq:
positive_ranges.append((solution[0].rhs(), solution[0].rhs()))
else:
print(f'Cannot recognize operator: {solution[0]}')
elif len(solution) == 2:
positive_ranges.append((solution[0].rhs(), solution[1].rhs()))

hue = (rank - min_rank) / (max_rank - min_rank)
for positive_range in positive_ranges:
dots = [(x, rank) for x in positive_range]
positive_range_graph += line(dots, hue=hue)
positive_range_graph += points(dots, color='red')
end('plot')
end(f'rank:{rank}')
print()

edit retag close merge delete

Sort by » oldest newest most voted

I'm rather sceptical about symbolic investigations of big, dense matrices, but it seems that for n=20 your determinant is computable

import time

P.<E>=QQ[]
energy=E

@cached_function
def expected_distance(n, l):
if n <= -2:
return 0
if n == -1:
return -2 * energy
if n == 0:
return 1
return  (
- 4 * (2 * (n + 1) - 1) * expected_distance(n - 1, l) \
- n * ((n + 1) * (n - 1) - 4 * l * (l + 1)) * expected_distance(n - 2, l)
) / (8 * (n + 1) * energy)

offset=1
l=2
rank=20

hank = matrix.hankel([expected_distance(i + offset, l) for i in range(rank)],
[expected_distance(i + offset + rank - 1, l) for i in range(1, rank)],
P.fraction_field())

%time L, d = hank.indefinite_factorization('symmetric');d1=mul(d)


CPU times: user 776 ms, sys: 0 ns, total: 776 ms Wall time: 775 ms

The value of d1 is of the form (a*E^201+bE^200+...)/E^400, so numerical computations in usual precision are not adequate

Finding positive values works:

pos_sol=solve(SR(d1)>0,var('E'))


but its precision may be also to low. Using for example

nu=d1.numerator()
sol=nu.roots(ring=RealField(200))


one can check it with higher precision.

If you need faster method use pure numerical approach

import time
# I don't now if 3000 is a good choice,
# cf hd value below
R3000=RealField(3000)
energy=R3000(-0.1)

@cached_function
def expected_distance(n, l):
if n <= -2:
return 0
if n == -1:
return -2 * energy
if n == 0:
return 1
return  (
- 4 * (2 * (n + 1) - 1) * expected_distance(n - 1, l) \
- n * ((n + 1) * (n - 1) - 4 * l * (l + 1)) * expected_distance(n - 2, l)
) / (8 * (n + 1) * energy)

l=2
offset=1
rank=20
a1=[expected_distance(i + offset, l) for i in range(rank)];
a2=[expected_distance(i + offset + rank - 1, l) for i in range(1, rank)]
v1=vector(R3000,a1)
v2=vector(R3000,a2)
hank = matrix.hankel(v1,v2,R3000)
%time hd=hank.det()


CPU times: user 66.9 ms, sys: 0 ns, total: 66.9 ms Wall time: 66.4 ms

hd.n(20)


4.3908e394

more

P.fraction_field() changes the matrix from Matrix_symbolic_dense to Matrix_generic_dense but also keep the functionality of substitute(). So amazing! By changing SR to P.fraction_field() is computation time reduce a lot. Does it drop any feature when changing to Matrix_generic_dense?

( 2023-01-04 17:23:21 +0200 )edit

I tried your version on different rings. It seems that the time was reduced on QQ, but not RealField and RDF. On RealField and RDF, it's slower than on Matrix_symbolic_dense. The root of E (energy) is not rational, can I use QQ?

( 2023-01-05 13:27:58 +0200 )edit

( 2023-01-08 15:08:51 +0200 )edit

For energy = -0.1 your matrix has huge elements and is not comparable to usually tested matrices

hank[-2:,-2:].n(20)

( 2023-01-09 15:30:21 +0200 )edit

Unforunately Sage has no appropriate factorizations for RealField(precision) for larger precisions. Mpmath uses LU in determinant computations and is ~2 times faster for rank = 30

( 2023-01-11 11:39:31 +0200 )edit