What is the Iterator Protocol?
Python’s iterator protocol defines how `for` loops and related expressions traverse the contents of a container type.
When Python sees a statement like `for x in foo`, it actually calls `iter(foo)`. The `iter` built-in function calls the `foo.__iter__` special method in turn. The `__iter__` method must return an iterator object (which implements the `__next__` special method). Then, the `for` loop repeatedly calls the `next` built-in function on the iterator object until it’s exhausted (indicated by raising a `StopIteration` exception).
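To make these steps concrete, here is a minimal sketch of a container and its iterator written by hand. The `Countdown` and `CountdownIterator` names are hypothetical and exist only for illustration; real code would more often use a generator.

```python
class CountdownIterator:
    """Iterator: implements __next__, and __iter__ returning itself."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration  # signals exhaustion to the caller
        value = self.current
        self.current -= 1
        return value


class Countdown:
    """Iterable container: __iter__ returns a fresh iterator each time."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        return CountdownIterator(self.start)


countdown = Countdown(3)

# The loop calls iter(countdown) once, then next() until StopIteration.
collected = []
for x in countdown:
    collected.append(x)

print(collected)
```

Because `Countdown.__iter__` builds a new `CountdownIterator` on every call, the same `countdown` object can be looped over more than once.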
What is an Iterable?
Iterables are anything that we can loop over.
From Python’s perspective, an iterable is anything that you can pass to the built-in `iter` function without a `TypeError` being raised.
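That definition can be turned directly into a small check. The sketch below uses a hypothetical helper, `is_iterable`, which simply tries `iter` and reports whether a `TypeError` was raised.

```python
def is_iterable(obj):
    """Return True if iter(obj) succeeds, False if it raises TypeError."""
    try:
        iter(obj)
    except TypeError:
        return False
    return True


results = {
    "list": is_iterable([1, 2, 3]),
    "str": is_iterable("abc"),
    "dict": is_iterable({"a": 1}),
    "int": is_iterable(42),
    "None": is_iterable(None),
}
print(results)
```

Lists, strings, and dictionaries pass the check; plain integers and `None` do not.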
What is an Iterator?
“An iterator is the thing you get when you pass any iterable to the `iter` function”.
“An iterator is an iterable that you can loop over using `next`”.
Iterators are consumed as you ask for items. Once there are no more items left in an iterator, calling `next` on it will raise a `StopIteration` exception. Iterators that have been fully consumed are sometimes called exhausted.
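A short sketch of that consumption, using a two-item list as sample data. The variable names are illustrative only.

```python
numbers = iter([1, 2])

first = next(numbers)   # consumes the first item
second = next(numbers)  # consumes the second item

try:
    next(numbers)        # nothing left: raises StopIteration
    exhausted = False
except StopIteration:
    exhausted = True

# An exhausted iterator stays exhausted: looping over it yields nothing.
leftover = list(numbers)

# The two-argument form of next() returns a default instead of raising.
fallback = next(numbers, "done")

print(first, second, exhausted, leftover, fallback)
```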
How `for` loops work
Python’s `for` loops do not rely on indexes. They rely on iterators. We can use the rules of the iterator protocol to re-implement a `for` loop using a `while` loop, essentially recreating the work that Python does whenever it evaluates a `for` loop.
# Using a `for` loop
def print_each(iterable):
    for item in iterable:
        print(item)

# Equivalent using a `while` loop and an iterator
def print_each(iterable):
    iterator = iter(iterable)
    while True:
        try:
            item = next(iterator)
        except StopIteration:
            break  # Iterator exhausted: stop the loop
        else:
            print(item)
You can see that the `while` loop will go on forever unless the iterator we got from the input iterable ends (and `StopIteration` is raised). It is possible to make infinitely long iterables, so it’s possible for this loop to run forever.
Custom Iterable Container Class
We can make our own classes iterable by implementing the `__iter__` special method.
# Example of an iterable class
class ReadVisits:
    def __init__(self, data_path):
        self.data_path = data_path

    def __iter__(self):
        # The file at data_path contains one number per line
        with open(self.data_path) as f:
            for line in f:
                yield int(line)
When a `ReadVisits` object is passed to a function such as `sum` or used in a `for` loop, Python calls the `ReadVisits.__iter__` method, which allocates a new iterator object each time.
def normalize(numbers):
    total = sum(numbers)  # first pass over the data
    result = []
    for value in numbers:  # second pass over the data
        percent = 100 * value / total
        result.append(percent)
    return result

# The file at path contains one number per line
visits = ReadVisits(path)
percentages = normalize(visits)
print(percentages)
# Prints, for example, [15.0, 35.0, 50.0] if the file holds 15, 35, and 50
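The reason a fresh iterator per call matters is that `normalize` walks the data twice. The sketch below contrasts the container class with a plain generator function, here called `read_visits` (a hypothetical name), using a temporary file with assumed sample values 15, 35, and 50.

```python
import os
import tempfile


def read_visits(data_path):
    """Generator function: returns a one-shot iterator."""
    with open(data_path) as f:
        for line in f:
            yield int(line)


class ReadVisits:
    """Iterable container: each __iter__ call re-reads the file."""

    def __init__(self, data_path):
        self.data_path = data_path

    def __iter__(self):
        with open(self.data_path) as f:
            for line in f:
                yield int(line)


def normalize(numbers):
    total = sum(numbers)                               # first pass
    return [100 * value / total for value in numbers]  # second pass


# Write assumed sample data to a temporary file.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("15\n35\n50\n")
    path = tmp.name

try:
    from_generator = normalize(read_visits(path))  # iterator is used up by sum()
    from_container = normalize(ReadVisits(path))   # new iterator for each pass
finally:
    os.remove(path)

print(from_generator)
print(from_container)
```

With the generator, `sum` exhausts the iterator, so the second pass sees no items and the result is empty. The container produces the full list of percentages.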
How to check whether an object is an iterator
An iterator is also an iterable
When an iterator is passed to the `iter` built-in function, `iter` returns the same iterator that was passed in.
>>> iter_list = iter([1, 2, 3])
>>> same_iter_list = iter(iter_list)
>>> iter_list is same_iter_list
True
Use the `isinstance()` function
We can use the `isinstance` built-in function to test whether an object is an iterator (the `collections.abc` built-in module defines an `Iterator` class).
from collections.abc import Iterator

if isinstance(an_object, Iterator):
    ...
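The sketch below shows how this check behaves for a few common objects, and one way it might be used defensively. The `require_container` helper is hypothetical: it rejects one-shot iterators in functions that need to loop over their argument more than once.

```python
from collections.abc import Iterable, Iterator

values = [1, 2, 3]

checks = {
    "list is Iterable": isinstance(values, Iterable),
    "list is Iterator": isinstance(values, Iterator),
    "iter(list) is Iterator": isinstance(iter(values), Iterator),
    "generator is Iterator": isinstance((v for v in values), Iterator),
}
print(checks)


def require_container(numbers):
    """Reject one-shot iterators, which cannot be looped over twice."""
    if isinstance(numbers, Iterator):
        raise TypeError("Must supply a container, not an iterator")
    return numbers
```

A list is iterable but not an iterator, while the object returned by `iter` and a generator expression are both iterators.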
References
- Effective Python: Item 31: Be Defensive When Iterating Over Arguments
- Python Morsels: The Iterator Protocol