What is the Iterator Protocol?
Python’s Iterator Protocol defines how Python for loops and related expressions traverse the contents of a container type.
When Python sees a statement like for x in foo, it actually calls iter(foo). The iter built-in function in turn calls the foo.__iter__ special method. The __iter__ method must return an iterator object (one that implements the __next__ special method). Then the for loop repeatedly calls the next built-in function on the iterator object until it is exhausted (indicated by raising a StopIteration exception).
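As a minimal sketch of the objects involved, the class below implements both special methods, and the calls that a for loop would make are spelled out by hand. Countdown is a hypothetical name used only for illustration; it is not part of any library.

```python
class Countdown:
    """Hypothetical iterator that counts down from start to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # An iterator returns itself from __iter__.
        return self

    def __next__(self):
        if self.current <= 0:
            # Signal exhaustion, as the protocol requires.
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


iterator = iter(Countdown(3))  # calls Countdown.__iter__
first = next(iterator)         # calls Countdown.__next__ and gives 3
rest = list(iterator)          # list() keeps calling next(): [2, 1]
print(first, rest)
```

Because __iter__ returns self, a Countdown object can be traversed only once; a container that should support repeated loops would return a new iterator from __iter__ instead.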
What is an Iterable?
An iterable is anything that we can loop over.
From Python’s perspective, an iterable is anything that you can pass to the built-in iter function without a TypeError being raised.
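That definition can be turned into a small check. The sketch below assumes a hypothetical helper named is_iterable, which simply tries iter and reports whether a TypeError was raised.

```python
def is_iterable(obj):
    """Return True if iter(obj) succeeds (hypothetical helper)."""
    try:
        iter(obj)
    except TypeError:
        return False
    return True


checks = {
    "list": is_iterable([1, 2, 3]),
    "str": is_iterable("abc"),
    "dict": is_iterable({"a": 1}),
    "int": is_iterable(42),
}
print(checks)  # {'list': True, 'str': True, 'dict': True, 'int': False}
```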
What is an Iterator?
“An iterator is the thing you get when you pass any iterable to the iter function”.
“An iterator is an iterable that you can loop over using next”.
Iterators are consumed as you ask for items. Once there are no more items left in an iterator, calling next on it will raise a StopIteration exception. Iterators that have been fully consumed are sometimes called exhausted.
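A short sketch of exhaustion, using a list iterator; the variable names are illustrative only.

```python
numbers = iter([1, 2])

first = next(numbers)   # 1
second = next(numbers)  # 2

try:
    next(numbers)       # nothing left, so StopIteration is raised
    exhausted = False
except StopIteration:
    exhausted = True

# An exhausted iterator stays empty on any later pass.
leftover = list(numbers)        # []

# next() also accepts a default to return instead of raising.
fallback = next(numbers, None)  # None

print(first, second, exhausted, leftover, fallback)
```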
How for loops work
Python’s for loops do not rely on indexes. They rely on iterators. We can use the rules of the iterator protocol to re-implement a for loop using a while loop, essentially recreating the work that Python does whenever it evaluates a for loop.
# Using a `for` loop
def print_each(iterable):
    for item in iterable:
        print(item)
# Equivalent using `while` loop and iterator
def print_each(iterable):
    iterator = iter(iterable)
    while True:
        try:
            item = next(iterator)
        except StopIteration:
            break  # Iterator exhausted: stop the loop
        else:
            print(item)
You can see that the while loop will go on forever unless the iterator we got from the input iterable ends (and StopIteration is raised). It is possible to make infinitely long iterables, so this loop may never finish.
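For instance, itertools.count from the standard library never raises StopIteration. One way to loop over such an iterable safely is to cap the number of items with itertools.islice; the helper name first_items below is hypothetical.

```python
from itertools import count, islice


def first_items(iterable, limit):
    """Collect at most `limit` items from an iterable (hypothetical helper)."""
    return list(islice(iterable, limit))


evens = count(start=0, step=2)  # 0, 2, 4, ... with no end
print(first_items(evens, 5))    # [0, 2, 4, 6, 8]
```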
Custom Iterable Container Class
We can make our own classes iterable by implementing the __iter__ special method.
# Example of Iterable class
class ReadVisits:
    def __init__(self, data_path):
        self.data_path = data_path
    def __iter__(self):
        # file at data_path contains one number per line
        with open(self.data_path) as f:
            for line in f:
                yield int(line)
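When a ReadVisits object is passed to a function such as sum or used in a for loop, Python calls the ReadVisits.__iter__ method, which allocates a new iterator object. Because every call produces a fresh iterator, the same ReadVisits object can be traversed more than once.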
def normalize(numbers):
    total = sum(numbers)
    result = []
    for value in numbers:
        percent = 100 * value / total
        result.append(percent)
    return result
# file at path contains one number per line
visits = ReadVisits(path)
percentages = normalize(visits)
print(percentages)
# Output: [15.0, 35.0, 50.0]
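To see why the container matters here, the sketch below contrasts a plain generator with an iterable container. It uses in-memory data instead of a file so that it runs as-is; read_visits_gen and VisitsContainer are hypothetical stand-ins, and the sample values are assumed for illustration.

```python
def normalize(numbers):
    total = sum(numbers)                               # first pass
    return [100 * value / total for value in numbers]  # second pass


def read_visits_gen(values):
    """Hypothetical generator: its items can be consumed only once."""
    for value in values:
        yield value


class VisitsContainer:
    """Hypothetical in-memory stand-in for ReadVisits."""

    def __init__(self, values):
        self.values = values

    def __iter__(self):
        # Each call creates a fresh generator, i.e. a new iterator.
        for value in self.values:
            yield value


data = [15, 35, 50]  # assumed sample data
from_generator = normalize(read_visits_gen(data))
from_container = normalize(VisitsContainer(data))
print(from_generator)  # []
print(from_container)  # [15.0, 35.0, 50.0]
```

The generator is already exhausted by sum when the second pass starts, so the result is empty. The container hands out a new iterator for each pass.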
How to check whether an object is an Iterator
An iterator is also an iterable
When an iterator is passed to the iter built-in function, iter returns that same iterator.
>>> iter_list = iter([1, 2, 3])
>>> same_iter_list = iter(iter_list)
>>> iter_list is same_iter_list
True
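This identity property gives one way to tell a single-use iterator from a reusable container. The helper below is a hypothetical sketch, not a standard function.

```python
def is_single_use(iterable):
    """Hypothetical helper: True when iter() hands back the same object."""
    return iter(iterable) is iterable


values = [1, 2, 3]
print(is_single_use(values))         # False: a list creates a new iterator each time
print(is_single_use(iter(values)))   # True: an iterator returns itself
print(iter(values) is iter(values))  # False: two distinct list iterators
```

A function that needs to loop over its argument more than once could use such a check to reject iterators up front.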
Use the isinstance() function
We can use the isinstance built-in function to test whether an object is an Iterator (the collections.abc built-in module defines an Iterator class).
from collections.abc import Iterator
if isinstance(an_object, Iterator):
    ...
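A short sketch of how this check behaves for a few common objects, alongside the related Iterable class from the same module. The sample objects are chosen only for illustration.

```python
from collections.abc import Iterable, Iterator

samples = {
    "list": [1, 2, 3],
    "list iterator": iter([1, 2, 3]),
    "generator": (n for n in range(3)),
    "int": 42,
}

# Map each name to (is it an Iterable?, is it an Iterator?).
results = {
    name: (isinstance(obj, Iterable), isinstance(obj, Iterator))
    for name, obj in samples.items()
}

for name, flags in results.items():
    print(name, flags)
# list (True, False)
# list iterator (True, True)
# generator (True, True)
# int (False, False)
```

Note that isinstance(obj, Iterable) does not detect objects that are iterable only through __getitem__, so calling iter(obj) remains the most reliable test for iterability.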
References
- Effective Python: Item 31: Be Defensive When Iterating Over Arguments
- Python Morsels: The Iterator Protocol