# The shortest string containing all given substrings

I found a problem from https://www.ohjelmointiputka.net/post... . An English translation goes like this:

A person wants to learn playing tuba. His neigbours get angry for the noise so he tries to find a song that contains as few tunes as possible that he is able to play all four or less tune combinations. The tunes are c, d, e, f, g, a, and h. Output the minimal song.

I tried a code from https://artofproblemsolving.com/commu... .

def de_bruijn(k, n):
"""
de Bruijn sequence for alphabet k
and subsequences of length n.
"""
alphabet = k
k = len(k)

a =  * k * n
sequence = []

def db(t, p):
if t > n:
if n % p == 0:
sequence.extend(a[1:p + 1])
else:
a[t] = a[t - p]
db(t + 1, p)
for j in range(a[t - p] + 1, k):
a[t] = j
db(t + 1, t)
db(1, 1)
return "".join(alphabet[i] for i in sequence)

seq = de_bruijn("cdefgah", 4)
print(seq)


But the validator on the site says that hccc is missing.

So, how can I use Sagemath to solve the problem that if a set contains letters a, c, d, e, f, g, h, how to find the shortest string containing all one to four letter long substrings?

edit retag close merge delete

If you want a Sagemath solution, there is http://doc.sagemath.org/html/en/refer.... You would have to translate the numbers 1, 2, ..., 7 to the letters a, c, d, ... h, but that should be easy.

Sort by » oldest newest most voted

The program doesn't use any sage-specific functionality, and it doesn't have to. You can just use any correct python implementation. The routine presented seems like a verbatim copy of the one on https://en.wikipedia.org/wiki/De_Brui... . Note that the definition of a "de bruijn sequence" includes it's a cyclic sequence. So you should be appending the head of the sequence to the tail:

"hccc" in seq+seq[:3]

more