# What is the effect of declaring a polynomial sparse'?

I do some work on polynomials over finite fields. Generated polynomials of larger degree are sparse in the sense that only a few coefficients are not zero. I tried to utilize this property by defining PR=PolynomialRing(GF(7),name='x',sparse=True). Then type(PR) is <class 'sage.rings.polynomial.polynomial_ring.polynomialring_field_with_category'=""> what is not very informative. It is not clear which library is involved (FLINT, NTL, Sage's own implementation, or something else), and libraries FLINT, NTL seem not to support sparse polynomials (sparse in the sense of storing only the non-zero coefficients similar to sparse matrices). So what does "sparse=True" mean? Is there a reduction of occupied memory? Anyway, time consumption for multiplication increases tremendously (I expected a major reduction). Thanks in advance.

Function "make_pol" defined below generates for any target degree "deg" a polynomial over a finite field. Only about "deg^ex" coefficients are not zero where "ex" depends on the field's order and is in interval (1/2, 1), in particular "ex" is about 3/4 for field order 7. The reason for the sparsity is not yet understood. Multiplication of polynomials over GF(7) of degrees 100043 and 100361 (and 6144 coefficients each) generated this way takes 36 milliseconds for dense implementation but 29 seconds (without "milli"!) for sparse implementation (similarly for the function call, the function uses multiplications). If you are curious about the chosen degrees: these are Sophie Germain primes, they are of special interest since the generated polynomials are irreducible.

Platform info: 64 bit, 3.3GHz, Linux, Sage 7.1 (Release Date: 2016-03-20)

from sage.all import ZZ, GF, PolynomialRing

def make_pol(deg, char, sparse):
"""
Generate a sparse polynomial over a finite field.

INPUT:

"deg" target degree (natural number)

"ch" field characteristic (prime number)

"sparse" whether a sparse implementation to use (boolean)

OUTPUT:

Polynomial in 'x' of degree "deg" over field "GF(ch)".
Only about "deg^ex" (where "ex"<1) coefficients are not zero,
value of "ex" depends on "ch", e.g. "ex==3/4" for "ch==7".
"""
if not deg in ZZ or deg<0:
raise ValueError("First arg must be a non-negative integer.")
if not char in ZZ or not is_prime(ch):
raise ValueError("Second arg must be a prime number.")
R = PolynomialRing(GF(ch),'x',sparse=sparse)
i0 = 0
r0 = R(1)
s = R.gen()
if deg == 0:
return r0
i1 = 1
r1 = s+R(1)
if deg == 1:
return r1
i2 = 2
r2 = s*r1 - r0
if deg == 2:
return r2
k = 1
while i2 < deg:
if deg & k != 0:
i3 = 2*i2-i1
r3 = s*r2 - r1
i0 = i1
r0 = r1
i1 = i3
r1 = r3
else:
i1 = i2
r1 = r2
k = 2*k
s = s**2 - 2
i2 = 2*i1 - i0
r2 = s*r1 - r0
if deg == i2:
return r2
return r3

edit retag close merge delete

Sort by » oldest newest most voted

Actually, the types do say a lot, but you have to look at the type of an element in this case, not of the ring. No mention of finite field is made, so it looks like you're ending up with a generic implementation of sparse polynomials, implemented in python. If you investigate the code a bit you'll find it is indeed a sparse implementation, based on python dictionaries. The class itself is also a straight python class.

When you're working with polynomials over rings with expensive arithmetic, or when you're computing with polynomials that would not fit in memory with dense representation, this is a reasonable solution. However, when you're working with polynomials over a finite field, the dense implementations are so efficient that your polynomial has to be very sparse before this generic implementation is faster.

You could try defining your polynomial ring as a multivariate one (e.g., add a dummy second variable). Multivariate polynomials tend to be implemented sparsely to begin with, and there probably are fairly efficient implementations for multivariate polynomial rings over finite fields in sage, and they may well be used by default.

Seeing the type of an element:

sage: P=PolynomialRing(GF(7),sparse=True,name="x")
sage: x=P.0
sage: type(x)
<class 'sage.rings.polynomial.polynomial_element_generic.Polynomial_generic_sparse_field'>


Furthermore, x? and x?? give you documentation and source, and the file names in which you can look up further details.

Some timing data for polynomial multiplication with a density of 0.1%

sage: e1=[ZZ.random_element(0,100000) for i in range(100)]
sage: e2=[ZZ.random_element(0,100000) for i in range(100)]
sage:
sage: P.<x>=PolynomialRing(GF(7),sparse=True)
sage: f1=sum(x^e for e in e1)
sage: f2=sum(x^e for e in e2)
sage: %timeit f1*f2
100 loops, best of 3: 4.84 ms per loop
sage:
sage: P.<x>=PolynomialRing(GF(7))
sage: f1=sum(x^e for e in e1) #this is a very bad way of initializing polynomials
sage: f2=sum(x^e for e in e2)
sage: %timeit f1*f2
10 loops, best of 3: 27.6 ms per loop
sage:
sage:
sage: P.<x,y>=PolynomialRing(GF(7))
sage: f1=sum(x^e for e in e1)
sage: f2=sum(x^e for e in e2)
sage: %timeit f1*f2
1000 loops, best of 3: 719 µs per loop


So even over GF(7) it seems that the generic sparse polynomial code can outperform the dense code in a reasonable regime. You can also see that using libsingular (via a multivariate ring) gives much better performance.

more

1

Thanks for the investigation. Let me just add a trick to use singular in the single variable case without having to add a dummy second variable, you can make a multivariate polynomial ring where the number of variables is set to 1 as follows:

sage: P.<x> = PolynomialRing(GF(7), 1)
`

You could see that there is not only a question of performance, but also the elements have different methods available. I was hoping that the category framework would be a tool for merging the best of the two implementations (since it is the same mathematical object), but apparently not.

( 2016-07-16 17:20:47 -0500 )edit

I see that I misunderstood the words "generic" and "with_category" in type name "sage.rings.polynomial.polynomial_element_generic.PolynomialRing_field_with_category.element_class". I thought the type name means something like a common superclass to the various implementations ("generic") with a switch to the actual implementation ("with_category"). Now I see that the type name specifies a third implementation besides FLINT and NTL. I read documentation "Sage Reference Manual: Polynomials" forth and back but I did not find a hint on this meaning. Would it be possible to add a relating comment into this paper (and into the "Tutorial")? Moreover, it may be wise to add a comment into the online documentation of "PolynomialRing.__init__" stating that argument "implementation" is ignored if "sparse=True" and a third (not otherwise mentioned) implementation is used instead.

Thanks for the hint to use multivariate polynomials instead of the univariate ones. Polynomials of a degree so big that they do not fit into memory when using a dense univariate implementation do not make problems anymore. But, of course, the multivariate implementation adds some computational overhead. For my polynomials when the dense univariate implementation is on its limits (degree about 10^7, density 1.2%), the computing times for polynomial generation (most time is spent for multiplication) are coarsely related as

univariate_dense : multivariate : univariate_sparse = 1 : 500 : 1000.

So switching to multivariate polynomials does not much help. In other words, the implementation of sparse univariate polynomials by a Python dictionary is not as bad as one might think.

more