Why is Words('ab',50).random_element().count('aa') always incorrect ???

asked 4 years ago

magviana

On the other hand, Words('ab',50).random_element().count('a')
or Words('ab',50).random_element().count('b') are always correct, and so are

Comments

This query was motivated by implementing a counter for occurrences of words in a given or random symbolic sequence, e.g. counting the occurrences of ACT in a fixed or random DNA sequence. It is now implemented as

def fcount(factor, word):
    # Number of occurrences of `factor` as a factor (contiguous block) of `word`
    word = Word(word)
    return word.parent()(Word(factor)).nb_factor_occurrences_in(word)

so that

fcount('ACT', 'ACTTCATTTCCCTTCTTTACTTTCT')  # = 2

which with the added function

def syse(sym, pos):
    # All words of length `pos` over the alphabet `sym`, as strings
    return [w.string_rep() for w in FiniteWords(sym).iterate_by_length(pos)]

can generate the counts for all words of a given length, for example

table([(x,fcount(x,'yyyyyyuuuy')) for x in syse('yu',1)])

returns

y   7
u   3

and

table([(x,fcount(x,'yyyyyyuuuy')) for x in syse('yu',2)])

returns

yy  5
yu  1
uy  1
uu  2
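
For readers without Sage at hand, the tables above can be reproduced in plain Python. This is only a sketch under stated assumptions: words are modelled as ordinary strings, and the helper names `overlapping_count` and `factor_table` are hypothetical, not part of Sage.

```python
from itertools import product


def overlapping_count(factor, text):
    """Count occurrences of `factor` as a contiguous block of `text`, overlaps included."""
    k = len(factor)
    return sum(1 for i in range(len(text) - k + 1) if text[i:i + k] == factor)


def factor_table(alphabet, length, text):
    """Map every word of the given length over `alphabet` to its count in `text`."""
    words = ("".join(letters) for letters in product(alphabet, repeat=length))
    return {word: overlapping_count(word, text) for word in words}


print(factor_table("yu", 1, "yyyyyyuuuy"))  # {'y': 7, 'u': 3}
print(factor_table("yu", 2, "yyyyyyuuuy"))  # {'yy': 5, 'yu': 1, 'uy': 1, 'uu': 2}
```

Note that Python's built-in `str.count` counts non-overlapping occurrences, so `'yyyyyyuuuy'.count('yy')` gives 3 rather than 5; that is why the sketch slides a window over the text instead.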
magviana ( 4 years ago )

1 Answer

answered 4 years ago

slelievre

updated 4 years ago

Correct or incorrect?

Surprisingly, the answer is actually correct. As is often the case, it is a matter of definition!

Let us walk through this.

What does count count?

Define a random word and give it a name:

sage: w = Words('ab', 20).random_element()
sage: w
baabaaaaaaabbabbbbba

Count 'aa' in w:

sage: w.count('aa')
0

Since we may be surprised, let us read the documentation for the count method:

sage: w.count?

or its source code:

sage: w.count??

Oh, so count counts the occurrences of letters.

Here, w is a word on the alphabet {'a', 'b'}, and 'aa' is not a letter in that alphabet.

So the count of how many times 'aa' appears in w as a letter must be zero.
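
The same behaviour can be seen in plain Python, outside Sage. The sketch below is only an analogy, assuming a word is modelled as a list of letters (this modelling and the variable names are illustrative, not how Sage implements words): counting an item in a list only ever matches whole letters.

```python
# Model the word from the session above as a list of letters (an assumption
# made for illustration; this is not how Sage stores words).
letters = list("baabaaaaaaabbabbbbba")

# list.count compares against whole items, which are single letters here.
count_a = letters.count("a")    # 'a' is a letter of the word
count_b = letters.count("b")    # 'b' is a letter of the word
count_aa = letters.count("aa")  # 'aa' is never a single letter

print(count_a, count_b, count_aa)  # 11 9 0
```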

Compare:

sage: v = Words(['a', 'b', 'aa'], 10).random_element()
sage: v
word: aa,aa,a,aa,a,a,a,b,a,aa
sage: v.count('aa')
4

Factors and subwords

So how do we count factors? Or subwords?

Get hold of the set of words.

sage: W = w.parent()
sage: W
Finite words over {'a', 'b'}

Define the factor we are looking for:

sage: f = W('aa')

Count its occurrences as a factor (a block of consecutive letters) or as a subword (letters taken in order, possibly skipping some) in w.

sage: f.nb_factor_occurrences_in(w)
7
sage: f.nb_subword_occurrences_in(w)
55
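
To see where 7 and 55 come from, here is a minimal plain-Python sketch of both counts. It assumes words are ordinary strings, and the function names `count_factor_occurrences` and `count_subword_occurrences` are hypothetical; it is meant to illustrate the definitions, not to mirror Sage's implementation.

```python
def count_factor_occurrences(factor, word):
    """Count occurrences of `factor` as a contiguous block of `word`, overlaps included."""
    n, k = len(word), len(factor)
    return sum(1 for i in range(n - k + 1) if word[i:i + k] == factor)


def count_subword_occurrences(sub, word):
    """Count the ways to pick positions of `word`, in increasing order, that spell `sub`."""
    # ways[j] = number of ways to spell sub[:j] using the letters read so far
    ways = [1] + [0] * len(sub)
    for letter in word:
        # Go right to left so the current letter is used at most once per choice.
        for j in range(len(sub), 0, -1):
            if sub[j - 1] == letter:
                ways[j] += ways[j - 1]
    return ways[len(sub)]


w = "baabaaaaaaabbabbbbba"
print(count_factor_occurrences("aa", w))   # 7
print(count_subword_occurrences("aa", w))  # 55
```

The word has 11 letters 'a', so the subword count for 'aa' is the number of ways to choose two of them, 11 * 10 / 2 = 55, while the factor count only keeps the pairs that are adjacent.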

Improve the documentation?

The question raises a valid point! The documentation for count should at least point to nb_factor_occurrences_in, maybe with an example such as the one here.

This is now tracked at

Comments

What prompted my query is the fact that Word('abbaabababbbb').count('ab') works as I expected, including when the word is an output of the form Words('ab', 20).random_element(). I only wanted to avoid the copy/paste of long random words into the former.

magviana ( 4 years ago )

That inconsistency is more of a bug, thanks for pointing it out. This is now tracked at

slelievre ( 4 years ago )

That example was maybe meant to be part of the question, which seems to suffer an editing error: it ends mid-sentence with "and so are". To edit the question, click the "Edit" button below it.

slelievre ( 4 years ago )

Stats

Asked: 4 years ago

Seen: 249 times

Last updated: Jul 14 '20