Correct or incorrect?
Surprisingly, the answer is actually correct.
As is often the case, it is a matter of definition!
Let us walk through this.
What does count count?
Define a random word and give it a name:
sage: w = Words('ab', 20).random_element()
sage: w
word: baabaaaaaaabbabbbbba
Count the occurrences of 'aa' in w:
sage: w.count('aa')
0
Since we may be surprised, let us read the documentation for the count method:
sage: w.count?
or its source code:
sage: w.count??
Oh, so count counts the occurrences of letters.
Here, w is a word on the alphabet {'a', 'b'}, and 'aa' is not a letter in that alphabet.
So the count of how many times 'aa' appears in w as a letter must be zero.
Compare:
sage: w = Words(['a', 'b', 'aa'], 10).random_element()
sage: w
word: aa,aa,a,aa,a,a,a,b,a,aa
sage: w.count('aa')
4
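The letter-counting behaviour can be imitated outside Sage. The following is a minimal plain-Python sketch, not Sage's implementation: it assumes a word can be modelled as a list of letters, and the variable names are made up for illustration. It reuses the two words shown above.

```python
# Plain-Python sketch (not Sage): a word is modelled as a list of its letters.
# First word: letters are the single characters 'a' and 'b'.
word_over_ab = list("baabaaaaaaabbabbbbba")
# Second word: the alphabet also contains the two-character letter 'aa'.
word_over_ab_aa = ["aa", "aa", "a", "aa", "a", "a", "a", "b", "a", "aa"]

# list.count compares whole items, so 'aa' can only match a letter equal to 'aa'.
letter_count_ab = word_over_ab.count("aa")
letter_count_ab_aa = word_over_ab_aa.count("aa")

print(letter_count_ab)     # 0
print(letter_count_ab_aa)  # 4
```

In both cases the question asked is "how many letters of the word are equal to 'aa'?", which is why the answer depends on the alphabet.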
Factors and subwords
So how do we count occurrences as a factor, or as a subword?
Going back to the first word w (the one over {'a', 'b'}), get hold of its set of words:
sage: W = w.parent()
sage: W
Finite words over {'a', 'b'}
Define the factor we are looking for:
sage: f = W('aa')
Count its occurrences as a factor or a subword in w.
sage: f.nb_factor_occurrences_in(w)
7
sage: f.nb_subword_occurrences_in(w)
55
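To make the difference between the two notions concrete, here is a small plain-Python sketch. It is not how Sage implements these methods; the function names `count_factor_occurrences` and `count_subword_occurrences` are hypothetical, and words are assumed to be ordinary strings. A factor is a contiguous block (overlaps allowed), while a subword is any selection of positions taken in increasing order.

```python
def count_factor_occurrences(factor, word):
    """Count start positions where `factor` appears as a contiguous block."""
    n, m = len(word), len(factor)
    # Slide a window of length m over the word; overlapping matches all count.
    return sum(1 for i in range(n - m + 1) if word[i:i + m] == factor)


def count_subword_occurrences(subword, word):
    """Count increasing position choices in `word` that spell `subword`."""
    # ways[j] = number of ways to spell subword[:j] with the letters read so far.
    ways = [1] + [0] * len(subword)
    for letter in word:
        # Go right to left so one letter of `word` fills at most one slot
        # of the subword in any single embedding.
        for j in range(len(subword), 0, -1):
            if subword[j - 1] == letter:
                ways[j] += ways[j - 1]
    return ways[-1]


w = "baabaaaaaaabbabbbbba"
print(count_factor_occurrences("aa", w))   # 7
print(count_subword_occurrences("aa", w))  # 55
```

The value 55 can be checked by hand: w contains 11 letters 'a', and any pair of them forms the subword 'aa', giving 11 * 10 / 2 = 55 choices.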
Improve the documentation?
The question raises a valid point! The documentation for count should at least point to nb_factor_occurrences_in, maybe with an example such as the one here.
This is now tracked at
This query was motivated by implementing a counter for occurrences of words in a given or random symbolic sequence, e.g. counting the occurrences of ACT in a fixed or random DNA sequence. It is now implemented as
so that
which with the added function
can generate all counts for that class, for example
returns
and
returns