Python Iterator Protocol, Iterable, Iterator

created:

updated:

tags: python

What is the Iterator Protocol?

Python’s Iterator Protocol defines how Python for loops and related expressions traverse the contents of a container type.

When Python sees a statement like for x in foo, it calls iter(foo). The iter built-in function in turn calls the foo.__iter__ special method, which must return an iterator object (an object that implements the __next__ special method). The for loop then repeatedly calls the next built-in function on that iterator until it is exhausted, which the iterator signals by raising a StopIteration exception.
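The steps above can be traced by hand. The following is a minimal sketch, assuming a small list named foo as hypothetical sample data; it calls iter and next directly instead of using a for loop.

```python
# Hypothetical sample data; any iterable would behave the same way.
foo = ["a", "b", "c"]

iterator = iter(foo)      # roughly what `for x in foo` does first
first = next(iterator)    # each next() call hands back one item
second = next(iterator)
third = next(iterator)

try:
    next(iterator)        # nothing left to return
    exhausted = False
except StopIteration:
    exhausted = True      # this is the signal a for loop waits for

print(first, second, third, exhausted)
```

A for loop performs the same sequence of calls and swallows the final StopIteration for us.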

What is an Iterable?

Iterables are anything that we can loop over.

From Python’s perspective, an iterable is anything you can pass to the built-in iter function without a TypeError being raised.
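That definition translates almost directly into code. The helper below is a sketch only; the name is_iterable is hypothetical and not part of the standard library.

```python
def is_iterable(obj):
    """Return True if iter(obj) succeeds (hypothetical helper)."""
    try:
        iter(obj)
    except TypeError:
        # iter() raises TypeError for objects that cannot be iterated
        return False
    return True


print(is_iterable([1, 2, 3]))  # lists are iterable
print(is_iterable("abc"))      # so are strings
print(is_iterable(42))         # integers are not
```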

What is an Iterator?

“An iterator is the thing you get when you pass any iterable to the iter function”.

“An iterator is an iterable that you can loop over using next”.

Iterators are consumed as you ask for items. Once there are no more items left in an iterator, calling next on it will raise a StopIteration exception. Iterators that have been fully consumed are sometimes called exhausted.
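A short sketch of exhaustion, assuming a three-item list as hypothetical sample data: the first pass drains the iterator, so a second pass over the same iterator finds nothing.

```python
numbers = [1, 2, 3]          # hypothetical sample data
iterator = iter(numbers)

first_pass = list(iterator)   # consumes every item
second_pass = list(iterator)  # the iterator is exhausted, so this is empty

# next() accepts a default that is returned instead of raising StopIteration
leftover = next(iterator, None)

print(first_pass, second_pass, leftover)
```

The list itself is untouched; calling iter(numbers) again would produce a fresh iterator.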

How for loops work

Python’s for loops do not rely on indexes. They rely on iterators. We can use the rules of the iterator protocol to re-implement a for loop using a while loop, essentially recreating the work that Python does whenever it evaluates a for loop.

# `for` loop use
def print_each(iterable):
    for item in iterable:
        print(item)
# Equivalent using `while` loop and iterator
def print_each(iterable):
    iterator = iter(iterable)
    while True:
        try:
            item = next(iterator)
        except StopIteration:
            break  # Iterator exhausted: stop the loop
        else:
            print(item)

The while loop runs forever unless the iterator obtained from the input iterable ends, that is, unless it raises StopIteration. Because iterables can be infinitely long, this loop may never terminate.
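As an illustration of an infinite iterable, the sketch below uses itertools.count from the standard library, which never raises StopIteration. Bounding it with itertools.islice is one way (an assumption of this example, not something the loop above does) to take a finite number of items safely.

```python
from itertools import count, islice

evens = count(start=0, step=2)       # 0, 2, 4, ... with no end

# islice stops after five items, so list() terminates
first_five = list(islice(evens, 5))
print(first_five)

# The underlying iterator keeps going from where it left off
following = next(evens)
print(following)
```

Passing evens directly to print_each would print numbers until the process is interrupted.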

Custom Iterable Container Class

We can make our own classes iterable by implementing the __iter__ special method.

# Example of Iterable class
class ReadVisits:
    def __init__(self, data_path):
        self.data_path = data_path

    def __iter__(self):
        # file at data_path contains numeric data in each line
        with open(self.data_path) as f:
            for line in f:
                yield int(line)

When a ReadVisits object is passed to a function such as sum, or used in a for loop, Python calls ReadVisits.__iter__, which creates a new iterator object each time. This is why a function can traverse the same ReadVisits object more than once.

def normalize(numbers):
    total = sum(numbers)
    result = []
    for value in numbers:
        percent = 100 * value / total
        result.append(percent)
    return result

# file at path contains numeric data in each line
visits = ReadVisits(path)
percentages = normalize(visits)
print(percentages)

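# Output, assuming the file contains the lines 15, 35 and 50
# (true division always produces floats):
# [15.0, 35.0, 50.0]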
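The difference between an iterable container and a plain iterator shows up clearly in normalize, which traverses its argument twice. The sketch below is self-contained: Visits is a hypothetical in-memory stand-in for ReadVisits (so no file is needed), and visits_generator is a hypothetical generator function that returns a one-shot iterator.

```python
def normalize(numbers):
    total = sum(numbers)          # first traversal
    result = []
    for value in numbers:         # second traversal
        result.append(100 * value / total)
    return result


class Visits:
    """Hypothetical in-memory stand-in for ReadVisits."""

    def __init__(self, values):
        self.values = values

    def __iter__(self):
        # A new generator (iterator) is created on every call
        for value in self.values:
            yield value


def visits_generator(values):
    # Calling this returns a single iterator that can be consumed once
    for value in values:
        yield value


from_container = normalize(Visits([15, 35, 50]))
from_iterator = normalize(visits_generator([15, 35, 50]))

print(from_container)  # [15.0, 35.0, 50.0]
print(from_iterator)   # []
```

With the generator, sum exhausts the iterator, so the for loop that follows sees no items and the result is empty. The container avoids this because each traversal gets its own iterator.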

How to check whether an object is an Iterator

An iterator is also an iterable

When an iterator is passed to the iter built-in function, iter returns that same iterator.

>>> iter_list = iter([1, 2, 3])
>>> same_iter_list = iter(iter_list)
>>> iter_list is same_iter_list
True
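The same property holds for hand-written iterators, provided __iter__ returns self. The class below is a minimal sketch; Countdown is a hypothetical name chosen for illustration.

```python
class Countdown:
    """Hypothetical iterator that counts down from start to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # An iterator is its own iterator
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration   # signals exhaustion
        value = self.current
        self.current -= 1
        return value


countdown = Countdown(3)
same = iter(countdown) is countdown
items = list(countdown)
again = list(countdown)           # already exhausted

print(same, items, again)
```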

Use the isinstance() function

We can use the isinstance built-in function to test whether an object is an iterator. The collections.abc standard library module defines an Iterator class for this purpose.

from collections.abc import Iterator

if isinstance(an_object, Iterator):
    ...
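A brief sketch of how the check behaves, assuming a list and its iterator as hypothetical sample objects. collections.abc also provides Iterable, which is included here for contrast.

```python
from collections.abc import Iterable, Iterator

a_list = [1, 2, 3]                 # hypothetical sample data
list_iterator = iter(a_list)
a_generator = (n * n for n in a_list)

# A list is iterable but is not itself an iterator
list_is_iterable = isinstance(a_list, Iterable)
list_is_iterator = isinstance(a_list, Iterator)

# The object returned by iter(), and any generator, is an iterator
iter_is_iterator = isinstance(list_iterator, Iterator)
gen_is_iterator = isinstance(a_generator, Iterator)

print(list_is_iterable, list_is_iterator, iter_is_iterator, gen_is_iterator)
```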

References