The collections library has many wonderful tools, including the Counter object, which greatly simplifies the counting of objects.
Let’s look at several examples:
- counting letters in a string
- counting items in a list
- counting tuples within a tuple
- special functions focused on using the results of a count
We’ll start by looking at three simple Python variables that contain elements we want to count:
>>> mystring = 'python pythonista pythonic the pythons'
>>> mylist = [11, 22, 22, 33, 33, 33, 42]
>>> mytuples = (('first', 'alpha'), ('first', 'alpha'), ('second', 'beta'), ('third', 'gamma'))
We import the Counter object from the collections module:
from collections import Counter
Then we provide the item we want counted as an argument to the Counter class:
>>> mystring = 'python pythonista pythonic the pythons'
>>> chars = Counter(mystring)
>>> chars
Counter({' ': 4,   # NOTE: the space is a character
         'a': 1,   # the entries are not sorted by count here
         'c': 1,
         'e': 1,
         'h': 5,
         'i': 2,
         'n': 4,
         'o': 4,
         'p': 4,
         's': 2,
         't': 6,
         'y': 4})
This works just as well with our other samples:
>>> mylist = [11, 22, 22, 33, 33, 33, 42]
>>> integers = Counter(mylist)
>>> integers
Counter({11: 1, 22: 2, 33: 3, 42: 1})
>>> mytuples = (('first', 'alpha'), ('first', 'alpha'), ('second', 'beta'), ('third', 'gamma'))
>>> tuples = Counter(mytuples)
>>> tuples
Counter({('first', 'alpha'): 2, ('second', 'beta'): 1, ('third', 'gamma'): 1})
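Because a Counter is a dict subclass, an individual count can be read by key. The short sketch below reuses the sample list from above and shows that an item that was never counted reports 0 rather than raising a KeyError; the value 99 is just an illustrative missing key.

```python
from collections import Counter

mylist = [11, 22, 22, 33, 33, 33, 42]
integers = Counter(mylist)

print(integers[33])    # count of an item that was seen: 3
print(integers[99])    # an unseen item counts as 0, no KeyError
print(99 in integers)  # the lookup did not add the key: False
```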
We can display the most common items in a Counter with the .most_common() method. By passing an argument n, we can limit the results to the n most common elements:
>>> chars.most_common(3)
[('t', 6), ('h', 5), ('p', 4)]
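As a small illustration with the integer sample, the sketch below calls .most_common() with no argument, which returns every item ranked from most to least common, and then unpacks the single top entry. The names ranked, top_item, and top_count are only illustrative. On Python 3.7 and later, items with equal counts keep the order in which they were first seen.

```python
from collections import Counter

integers = Counter([11, 22, 22, 33, 33, 33, 42])

# With no argument, every (item, count) pair is returned, most common first.
ranked = integers.most_common()
print(ranked)  # [(33, 3), (22, 2), (11, 1), (42, 1)]

# most_common(1) returns a one-element list; unpack its only tuple.
top_item, top_count = integers.most_common(1)[0]
print(top_item, top_count)  # 33 3
```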
Interestingly, the .elements() method returns every element repeated as many times as its count, so we can see the elements organized into groups, based on the actual character, number, etc.
The .elements() method returns an iterator, so to easily see the contents, it is common to wrap the result in another function, such as list() or sorted().
>>> list(chars.elements())
['p', 'p', 'p', 'p', 'y', 'y', 'y', 'y', 't', 't', 't', 't', 't', 't', 'h', 'h', 'h', 'h', 'h', 'o', 'o', 'o', 'o', 'n', 'n', 'n', 'n', ' ', ' ', ' ', ' ', 'i', 'i', 's', 's', 'a', 'c', 'e']
>>> sorted(chars.elements())
[' ', ' ', ' ', ' ', 'a', 'c', 'e', 'h', 'h', 'h', 'h', 'h', 'i', 'i', 'n', 'n', 'n', 'n', 'o', 'o', 'o', 'o', 'p', 'p', 'p', 'p', 's', 's', 't', 't', 't', 't', 't', 't', 'y', 'y', 'y', 'y']
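One way to think about .elements() is as the reverse of counting. The sketch below, using the integer sample, expands a Counter back into its items and counts them again to get an equal Counter. It also shows that items whose count is zero or less are skipped; setting the count of 42 to 0 is done here purely for illustration.

```python
from collections import Counter

integers = Counter([11, 22, 22, 33, 33, 33, 42])

# Expand the counts back into individual items.
expanded = list(integers.elements())
print(expanded)  # [11, 22, 22, 33, 33, 33, 42]

# Counting the expanded items again gives back an equal Counter.
roundtrip_ok = Counter(expanded) == integers
print(roundtrip_ok)  # True

# Items with a count of zero (or a negative count) are left out.
integers[42] = 0
remaining = list(integers.elements())
print(remaining)  # [11, 22, 22, 33, 33, 33]
```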
Counters come with a variety of other methods, most of which mirror dictionary methods:
['clear', 'copy', 'elements', 'fromkeys', 'get', 'items', 'keys', 'most_common', 'pop', 'popitem', 'setdefault', 'subtract', 'update', 'values']
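Two of the listed methods behave differently from what a plain dictionary would suggest, so a brief sketch may help. Counter.update() adds to existing counts instead of replacing them, and Counter.subtract() lowers counts and can leave zero or negative values. The word list below is made up for illustration.

```python
from collections import Counter

words = Counter(['python', 'python', 'pythonic'])

# update() adds to the existing counts (dict.update() would replace them).
words.update(['python', 'the'])
print(words['python'])  # 3

# subtract() lowers counts and may leave zero or negative values.
words.subtract(['the', 'the'])
print(words['the'])  # -1
```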
Happy coding!