Ask Your Question

# How to find the longest consecutive repeated part of a string?

I found a programming exercise from https://www.ohjelmointiputka.net/post... . The problem goes as follows:

A composer has made a new song but forgot to denote the repetition parts. Help the composer to find the longest consecutive repetition part. For example, if the song has notes hfhfggccaggccagccafff, it has several parts that is repeated twice and consecutively, like hf, ggcca, gccag and gcca. The task is to output the case, how many notes the longest repetition part contains, and where the repetition part begins. For example above, the output should be 1 5 5. The rest cases are from https://www.ohjelmointiputka.net/tied... .

I tried to solve the problem by Sage by donwloadin the zip file, unzip it and by runnig the following code, that uses Ukkonen's suffix tree I guess:

from sage.combinat.words.suffix_trees import DecoratedSuffixTree
for case in range(1,11):
with open(str(case)+'.txt','r') as file:
s = file.read().replace('\n', '')
start, run2 = max((x[1], x) for x in DecoratedSuffixTree(Word(s)).square_vocabulary())[1]
print ("{} {} {}".format(case, run2/2, start))


The output was

1 5 5
2 8 387
3 14 3947
4 10 18374
5 19 285340
6 157 312
7 3663 12673
8 115 0
Traceback (most recent call last):
File "tree.sage.py", line 11, in <module>
start, run2 = max((x[_sage_const_1 ], x) for x in DecoratedSuffixTree(Word(s)).square_vocabulary())[_sage_const_1 ]
File "/home/jaakko/SageMath/local/lib/python3.7/site-packages/sage/combinat/words/suffix_trees.py", line 1627, in __init__
self.labeling = self._complete_labeling()
File "/home/jaakko/SageMath/local/lib/python3.7/site-packages/sage/combinat/words/suffix_trees.py", line 1810, in _complete_labeling
treat_node(0, 0, 0)
File "/home/jaakko/SageMath/local/lib/python3.7/site-packages/sage/combinat/words/suffix_trees.py", line 1801, in treat_node
treat_node(child, edge_label[0]-(j-i), edge_label[1])
File "/home/jaakko/SageMath/local/lib/python3.7/site-packages/sage/combinat/words/suffix_trees.py", line 1801, in treat_node
treat_node(child, edge_label[0]-(j-i), edge_label[1])
File "/home/jaakko/SageMath/local/lib/python3.7/site-packages/sage/combinat/words/suffix_trees.py", line 1801, in treat_node
treat_node(child, edge_label[0]-(j-i), edge_label[1])
[Previous line repeated 77 more times]
File "/home/jaakko/SageMath/local/lib/python3.7/site-packages/sage/combinat/words/suffix_trees.py", line 1805, in treat_node
walk_chain(current_node, child, l, square_start)
File "/home/jaakko/SageMath/local/lib/python3.7/site-packages/sage/combinat/words/suffix_trees.py", line 1782, in walk_chain
walk_chain(parent, child, depth, start+1)
File "/home/jaakko/SageMath/local/lib/python3.7/site-packages/sage/combinat/words/suffix_trees.py", line 1782, in walk_chain
walk_chain(parent, child, depth, start+1)
File "/home/jaakko/SageMath/local/lib/python3.7/site-packages/sage/combinat/words/suffix_trees.py", line 1782, in walk_chain
walk_chain(parent, child, depth, start+1)
[Previous line repeated 907 more times]
File "/home/jaakko/SageMath/local/lib/python3.7/site-packages/sage/combinat/words/suffix_trees.py", line 1759, in walk_chain
final_state = self.suffix_walk((u, v), l)
File "/home/jaakko/SageMath/local/lib/python3.7/site-packages/sage/combinat/words/suffix_trees.py", line 1418, in suffix_walk
return self._count_and_skip(parent, i, i+l)
File "/home/jaakko/SageMath/local/lib/python3.7/site-packages/sage/combinat/words/suffix_trees.py", line 1372, in _count_and_skip
trans = self._find_transition(node, self._letters[i])
File "/home/jaakko/SageMath/local/lib/python3.7/site-packages/sage/combinat/words/suffix_trees.py", line 759, in _find_transition
for ((k,p),s) in iteritems(self._transition_function[state]):
File "/home/jaakko/SageMath/local/lib/python3.7/site-packages/six.py", line 587, in iteritems
return iter(d.items(**kw))
RecursionError: maximum recursion depth exceeded while calling a Python object


Is there a way to fix this code or use some other approach?

edit retag close merge delete

## 1 Answer

Sort by ยป oldest newest most voted

The following worked for me, after a little dance around to understand what happens:

from sage.combinat.words.suffix_trees import DecoratedSuffixTree as DST

for case in range(1, 11):
with open(f'/home/dan/Downloads/mpera/{case}.txt', 'r') as f:
s = f.read().replace('\n', '')

sq_vocabulary = DST(Word(s)).square_vocabulary()
sq_vocabulary.sort(key=lambda x: (x[1], x[0]), reverse=True)
start, square_string_len = sq_vocabulary[0]
print(case, square_string_len/2, start)


(The zip file was downloaded, then unpacked under /home/dan/Downloads/mpera.)

Results:

1 5 5
2 8 387
3 14 3947
4 10 18374
5 19 285340
6 157 312
7 3663 12673
8 115 0
9 102971 80
10 355765 19558

more

## Your Answer

Please start posting anonymously - your entry will be published after you log in or create a new account.

Add Answer

## Stats

Asked: 2020-06-08 17:09:54 +0100

Seen: 228 times

Last updated: Jun 12 '20