# Why is the Subwords class separated from FiniteWord?

Subwords could be a method of FiniteWord as factors is already one. Is there a reason why this is not the case?

edit retag close merge delete

Sort by » oldest newest most voted

Here is a function that compute the set of subwords (no repetition)

def subwords(w, W=None):
r"""
Return the subwords of w without repetition.

INPUT:

- w - a finite word (or string or list)

- W - an optional parent

EXAMPLES::

sage: subwords('aab')
['', 'a', 'aa', 'b', 'ab', 'aab']
sage: subwords('aba')
['', 'a', 'b', 'ab', 'ba', 'aa', 'aba']
sage: subwords('aaa')
['', 'a', 'aa', 'aaa']

sage: subwords(Word('abc'))
[word: , word: a, word: b, word: ab, word: c, word: bc, word: ac, word: abc]
"""
if W is None:
try:
W = w.parent()
except AttributeError:
W = type(w)

T = {}
L = [W()]
for a in w:
todo = [(W(),T)]
while todo:
cw, t = todo.pop()
todo.extend((cw+W(b),tt) for b,tt in t.items())
if a not in t:
t[a] = {}
L.append(cw + W(a))

return L

more

The module subword.py was created in 2007 during the translation by Mike Hansen of 30k lines of code from Mupad-Combinat to sage. Around the same time, in 2008, the module sage/combinat/words was created and nothing has ever been done to merge the two.

Notice that the code in subword.py is based on itertools.combinations and no wiser algorithm is coded to enumerate distinct subwords.

sage: s = Subwords('aaaaa', 3)
sage: s.__iter__??
...
iterator = itertools.combinations(self._w, self._k)
...
sage: list(s)
['aaa', 'aaa', 'aaa', 'aaa', 'aaa', 'aaa', 'aaa', 'aaa', 'aaa', 'aaa']


If you know the existence of a better algorithm, you are welcome to add it to the finite_word.py file which contains already 3 methods related to subwords:

sage: w = Word('abc')
sage: w.*subword*?
w.is_subword_of
w.longest_common_subword
w.nb_subword_occurrences_in

more