Ask Your Question

Revision history [back]

click to hide/show revision 1
initial version

The shortest string containing all given substrings

I found a problem from https://www.ohjelmointiputka.net/postit/tehtava.php?tunnus=tuet . An English translation goes like this:

A person wants to learn playing tuba. His neigbours get angry for the noise so he tries to find a song that contains as few tunes as possible that he is able to play all four tune combinations. The tunes are c, d, e, f, g, a, and h. Output the minimal song.

I tried a code from https://artofproblemsolving.com/community/c163h1889399_shortest_string_containing_all_substrings .

def de_bruijn(k, n):
    """
    de Bruijn sequence for alphabet k
    and subsequences of length n.
    """
    alphabet = k
    k = len(k)

    a = [0] * k * n
    sequence = []

    def db(t, p):
        if t > n:
            if n % p == 0:
                sequence.extend(a[1:p + 1])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                db(t + 1, t)
    db(1, 1)
    return "".join(alphabet[i] for i in sequence)

seq = de_bruijn("cdefgah", 4)
print(seq)

But the validator on the site says that hccc is missing.

So, how can I use Sagemath to solve the problem that if a set contains letters a, c, d, e, f, g, h, how to find the shortest string containing all one to four letter long substrings?

The shortest string containing all given substrings

I found a problem from https://www.ohjelmointiputka.net/postit/tehtava.php?tunnus=tuet . An English translation goes like this:

A person wants to learn playing tuba. His neigbours get angry for the noise so he tries to find a song that contains as few tunes as possible that he is able to play all four or less tune combinations. The tunes are c, d, e, f, g, a, and h. Output the minimal song.

I tried a code from https://artofproblemsolving.com/community/c163h1889399_shortest_string_containing_all_substrings .

def de_bruijn(k, n):
    """
    de Bruijn sequence for alphabet k
    and subsequences of length n.
    """
    alphabet = k
    k = len(k)

    a = [0] * k * n
    sequence = []

    def db(t, p):
        if t > n:
            if n % p == 0:
                sequence.extend(a[1:p + 1])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                db(t + 1, t)
    db(1, 1)
    return "".join(alphabet[i] for i in sequence)

seq = de_bruijn("cdefgah", 4)
print(seq)

But the validator on the site says that hccc is missing.

So, how can I use Sagemath to solve the problem that if a set contains letters a, c, d, e, f, g, h, how to find the shortest string containing all one to four letter long substrings?