# Goppa Codes: Cannot create decoder with large parameters

Anonymous

Hello,

I've been playing around with linear error-correcting codes with the intention of programming a demonstration of McEliece's Cryptosystem, which utilises Goppa codes.

I've been following the format outlined in the SAGE tutorial. I can't post the link for some reason, but it can be found by searching "sagemath Goppa codes". I have no issue running the following code to create a Goppa code over field GF(2^4) and its corresponding decoder.

F = GF(2^4)
R.<x> = F[]
g = x^2+x+1
L = [a for a in F.list if g(a) != 0]
C = codes.GoppaCode(g,L)
D = C.decoder()


However, the cryposystem I'm trying to program requires parameters higher than this, specifically over the field of GF(2^10). The code that I've used to create this Goppa code and its decoder are therefore as follows.

F = GF(2^10)
R.<x> = F[]
g = x^50+x^3+1
L = [a for a in F.list if g(a) != 0]
C = codes.GoppaCode(g,L)
D = C.decoder()


This creates a [1024, 524] Goppa code over GF(2) and there's no issue in generating the code C, or even using it to encode a message vector (of length 524).

Unfortunately, I seem to encounter a memory issue when I run the final line to create the decoder D. My initial response was to try it on a machine with larger memory (64GB), but I had no luck there. I assume it's because the code that I'm trying to create is pretty huge, but it would be really helpful if there's somehow a way around the problem to allow me implement this Goppa code and generate its decoder.

Any help would be greatly appreciated and I'm happy to provide more details if needed.

Thank you very much!!

edit retag close merge delete

Sort by » oldest newest most voted

This is an answer showing why there is a memory issue while building the decoder for a bigger field $F$ of characteristic two and/or a higher degree of the involved polynomial $g\in \Bbb F_2[x]$.

To illustrate this, let us consider first the "smaller" case from the question:

F = GF(2^4)
R.<x> = F[]
g = x^2 + x + 1
L = [a for a in F if g(a)]
C = codes.GoppaCode(g, L)
D = C.decoder()


Then the object D has a lot of "hidden structure", for instance there is a dictionary D.__dict__, it has a lot of keys,

sage: for key in D.__dict__:
....:     print(key)
....:
_maximum_error_weight
decoder_type
_code
_input_space
_connected_encoder_name
_build_lookup_table
_code_minimum_distance
_decoder_type
_lookup_table


and for instance D.__dict__['_lookup_table'] is again a dictionary,

sage: print(type(D.__dict__['_lookup_table']))
<class 'dict'>


it has "a lot of keys", translated into "a lot of values". We can count:

sage: len(D.__dict__['_lookup_table'])
256


so there are $256$ keys, and some first few lookup translations are...

sage: D.__dict__['_lookup_table']
{(0, 0, 0, 0, 0, 0, 0, 0): (0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0),
(1, 0, 0, 0, 0, 0, 0, 0): (1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0),
(0, 1, 1, 0, 0, 0, 1, 1): (0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0),
(1, 1, 1, 0, 1, 1, 1, 1): (0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0),
(1, 1, 0, 1, 1, 1, 1, 0): (0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0),
(0, 1, 1, 0, 0, 1, 0, 1): (0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0),
(1, 0, 0, 1, 0, 1, 1, 0): (0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0),
(0, 1, 1, 1, 0, 0, 0, 1): (0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0),
(1, 1, 1, 0, 0, 0, 0, 1): (0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0),


and so on, totally $256$ key-value pairs.

If we try the same with a field $F=\Bbb F_{2^{10}}=$GF(2^10) and then further with a polynomial of degree $50$... well, there is no chance to build the corresponding structure. In such cases, maybe an ad-hoc solution should and may be given, try to see exactly what is needed from the decoder instance in the intended application.

Some further notes. One can look into the code, e.g. via C.decoder?? - and the code is mainly building...

    if decoder_name in self._registered_decoders:
decClass = self._registered_decoders[decoder_name]
try:
return decClass(self, *args, **kwargs)


where self becomes the instance C, and the decoder_name is the default decoder,

sage: C._default_decoder_name
'Syndrome'


So we can build the corresponding class, and ask for some information...

sage: S = C._registered_decoders['Syndrome']
sage: S
<class 'sage.coding.linear_code.LinearCodeSyndromeDecoder'>
sage: S??


The double question mark gives...

Init signature: S(code, maximum_error_weight=None)
Docstring:
Constructs a decoder for Linear Codes based on syndrome lookup
table.

The decoding algorithm works as follows:

* First, a lookup table is built by computing the syndrome of every
error pattern of weight up to "maximum_error_weight".

* Then, whenever one tries to decode a word "r", the syndrome of
"r" is computed. The corresponding error pattern is recovered
from the pre-computed lookup table.

* Finally, the recovered error pattern is subtracted from "r" to
recover the original word.

"maximum_error_weight" need never exceed the covering radius of the
code, since there are then always lower-weight errors with the same
syndrome. If one sets "maximum_error_weight" to a value greater
determined while building the lookup-table. This lower value is
then returned if you query "decoding_radius" after construction.

If "maximum_error_weight" is left unspecified or set to a number at
least the covering radius of the code, this decoder is complete,
i.e. it decodes every vector in the ambient space.

Note:

Constructing the lookup table takes time exponential in the
length of the code and the size of the code's base field.
Afterwards, the individual decodings are fast.

INPUT:

* "code" -- A code associated to this decoder

* "maximum_error_weight" -- (default: "None") the maximum number of
errors to look for when building the table. An error is raised if
it is set greater than n-k, since this is an upper bound on the
covering radius on any linear code. If "maximum_error_weight" is
kept unspecified, it will be set to n - k, where n is the length
of "code" and k its dimension.


and a lot of further lines.

So a possibility to force some build of a decoder would be...

F = GF(2^10)
R.<x> = F[]
g = x^50 + x^3 + 1
L = [a for a in F if g(a)]
C = codes.GoppaCode(g, L)
D = C.decoder(maximum_error_weight=1)


Above, the option maximum_error_weight=1 may make the decoder useless for a lot of purposes, but this shows that the code is building D. We have in the above case:

sage: len(L)
1024
sage: len(D.__dict__['_lookup_table'])
1025


Using maximum_error_weight=2 instead, after a looonger wait we get a decoder with...

sage: len(D.__dict__['_lookup_table'])
524801

more

Thanks a lot for this, it's really helped shed light on the problem! It looks as though the decoder just creates a lookup table for all the possible syndromes and their corresponding codewords, so there's no surprise I run out of memory. Unless there's a more efficient syndrome decoder out there that I'm unaware of, it looks as though creating a decoder with these parameters is out of the question then, which is a shame. This has been very helpful though, so thank you!

( 2021-04-02 15:16:08 +0200 )edit