# Finding the frequency of each number in a string

Hi, I have a string of numbers like s=[(11,23),(33,47),(98,20),...,(34,65)] produced by some code in Jupyter. The biggest number is 100. How can I order the numbers by how often they occur in s? For example, for a simple case like s1=[(1,4),(2,4),(4,1)] the result should be

Number 4 frequency 3
Number 1 frequency 2
Number 2 frequency 1

First, your data is a list of tuples, so you have to flatten it to make it a simple list:

sage: s=[(11,23),(33,47),(98,20),(34,65)]
sage: flatten(s)
[11, 23, 33, 47, 98, 20, 34, 65]
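Note that `flatten` is a Sage function. Outside of Sage, one possible equivalent for this particular shape of data (a list of tuples, nested one level deep) is `itertools.chain.from_iterable` from the standard library. This is only a sketch under that assumption, and the name `flat` is just for illustration:

```python
from itertools import chain

s = [(11, 23), (33, 47), (98, 20), (34, 65)]

# chain.from_iterable concatenates the tuples, one level deep only
flat = list(chain.from_iterable(s))
print(flat)
```

Unlike Sage's `flatten`, this does not descend into deeper nesting, which is enough for a list of pairs.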


Then, my favorite way to count occurrences of unknown objects is defaultdict:

sage: from collections import defaultdict
sage: d = defaultdict(int)

sage: for i in flatten(s1):
....:     d[i] += 1
sage: d
defaultdict(<type 'int'>, {1: 2, 2: 1, 4: 3})


Then you can ask for the frequency of the numbers that appeared:

sage: d[4]
3
sage: d[1]
2
sage: d[2]
1


But also for numbers that did not appear:

sage: d[12]
0


Then you can sort the keys of the dictionary according to their values:

sage: sorted(d, key=d.get, reverse=True)
[4, 1, 2, 12]


(note that when we called d[12], it created a "real" entry for 12 in the dictionary, which is why 12 shows up among the keys)
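If that side effect is unwanted, one option is to read missing keys with `.get(key, 0)`, which does not go through the default factory and so does not insert anything. A minimal sketch in plain Python, with the flattened `s1` written out by hand and illustrative variable names:

```python
from collections import defaultdict

d = defaultdict(int)
for i in [1, 4, 2, 4, 4, 1]:  # flattened s1 from the question
    d[i] += 1

# .get() returns the fallback without adding a key
missing = d.get(12, 0)
keys_after_get = sorted(d)

# indexing a missing key calls int() and stores the result under that key
_ = d[12]
keys_after_index = sorted(d)

print(missing, keys_after_get, keys_after_index)
```

Testing membership with `12 in d` is another way to check for a number without creating an entry.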

Then you can do something like:

sage: for k in sorted(d, key=d.get, reverse=True):
....:     print('Number {} has frequency {}'.format(k, d[k]))
Number 4 has frequency 3
Number 1 has frequency 2
Number 2 has frequency 1
Number 12 has frequency 0

Instead of using defaultdict(int), one may use Counter, which is also in collections:

>>> from collections import Counter
>>> c = Counter()
>>> c['a'] += 1
>>> c[2] += 1
>>> c[2] += 1
>>> c
Counter({2: 2, 'a': 1})


or even better:

>>> Counter([2,2,'a'])
Counter({2: 2, 'a': 1})
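Applied to the original question, `Counter` also provides `most_common()`, which returns (element, count) pairs sorted from most to least frequent. The following is a sketch in plain Python, using `itertools.chain.from_iterable` in place of Sage's `flatten` (an assumption that the data is nested one level deep); the names `counts` and `ranked` are illustrative:

```python
from collections import Counter
from itertools import chain

s1 = [(1, 4), (2, 4), (4, 1)]

# count every number across all tuples
counts = Counter(chain.from_iterable(s1))

# list of (number, frequency) pairs, most frequent first
ranked = counts.most_common()

for number, freq in ranked:
    print('Number {} has frequency {}'.format(number, freq))
```

Like `defaultdict(int)`, a `Counter` reports 0 for numbers that never appeared, but looking up a missing key does not add it to the counter.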