I remember being very confused for a while when I first encountered yield in Python code. Once I learned more about what generators are, I came to think they’re one of the coolest things about Python.
What are Generators?
Introduced with PEP 255, generator functions are a special kind of function that returns a lazy iterator. These are objects that you can loop over like a list. However, unlike lists, lazy iterators do not store their contents in memory.
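To get a feel for the memory difference, here is a minimal sketch that compares a list with an equivalent generator expression. The variable names are mine, and the exact byte counts depend on your Python version and platform.

```python
import sys

# A list holds references to all 100,000 items at once
numbers_list = [n for n in range(100_000)]

# A generator only remembers how to produce the next item
numbers_gen = (n for n in range(100_000))

# Note: getsizeof() reports the size of the container object itself,
# not the integers it refers to, so the real gap is even larger.
list_size = sys.getsizeof(numbers_list)
gen_size = sys.getsizeof(numbers_gen)

print(f"list:      {list_size} bytes")
print(f"generator: {gen_size} bytes")
```

On a typical CPython build the list takes hundreds of kilobytes, while the generator object stays at a couple of hundred bytes no matter how long the range is.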
Example of Reading Large Files
def csv_reader(file_name):
    # Read the entire file into memory, then split it into lines
    with open(file_name) as file:
        result = file.read().split("\n")
    return result
The above snippet opens the given file, reads its entire contents into memory, and splits them into a list of lines. This can be a problem when the file is very large, as it can raise a MemoryError.
Generator way of Reading Large Files
Below is a generator function that we can use to achieve the same thing: open the file and read line by line.
def csv_reader(file_name):
    for row in open(file_name, "r"):
        yield row
The above snippet opens the given file and loops through it line by line. Here, csv_reader() is a generator function that yields each row instead of returning the whole list, so only one line needs to be held in memory at a time. This way, however large the file may be, we’ll be able to loop through it line by line.
- “Using yield will result in a generator object”
- “Using return will result in the first line of the file only.”
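To see the difference between the two statements in practice, here is a small sketch. The function names first_row_only() and all_rows() and the temporary demo file are made up for illustration; they are not part of the original example.

```python
import os
import tempfile
import types

def first_row_only(file_name):
    with open(file_name) as file:
        for row in file:
            # return exits the function on the very first iteration
            return row

def all_rows(file_name):
    with open(file_name) as file:
        for row in file:
            # yield hands back one row and pauses until the next request
            yield row

# Create a tiny throwaway CSV file for the demo (hypothetical data)
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False) as tmp:
    tmp.write("a,b\n1,2\n3,4\n")
    demo_path = tmp.name

returned = first_row_only(demo_path)
gen = all_rows(demo_path)
print(repr(returned))                        # the first line only
print(isinstance(gen, types.GeneratorType))  # a generator object

rows = list(gen)  # consuming the generator reads every row
print(rows)

os.remove(demo_path)
```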
Example of Generating an Infinite Sequence
Our computer’s memory is finite, so a sequence with an infinite number of items can never be stored in a list. Generators come in handy in this case as well:
def infinite_seq():
    num = 0
    while True:
        yield num
        num += 1

for s in infinite_seq():
    print(s, end=" ")
>>> 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29
[... continue until we stop manually]
We can also use next() to iterate over a generator manually:
>>> gen = infinite_seq()
>>> next(gen)
0
>>> next(gen)
1
>>> next(gen)
2
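If we only want the first few values and don’t want to stop the loop by hand, one option is itertools.islice from the standard library. This is a minimal sketch that repeats infinite_seq() so it runs on its own; first_five is just a name I picked.

```python
from itertools import islice

def infinite_seq():
    num = 0
    while True:
        yield num
        num += 1

# islice() stops asking for values after 5 items,
# so the infinite generator is never run to "completion"
first_five = list(islice(infinite_seq(), 5))
print(first_five)  # [0, 1, 2, 3, 4]
```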
Generator Expression
A generator expression (also known as a generator comprehension) has a syntax similar to a list comprehension, but it is written with parentheses instead of square brackets.
csv_gen = (row for row in open(file_name))
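As a quick side-by-side sketch (the variable names here are my own), the only syntactic difference is the brackets, but the generator version produces its values on demand and can be consumed only once:

```python
# List comprehension: builds the whole list right away
squares_list = [n * n for n in range(5)]

# Generator expression: same syntax with parentheses, evaluated lazily
squares_gen = (n * n for n in range(5))

total = sum(squares_gen)      # consumes the generator: 0 + 1 + 4 + 9 + 16
leftover = list(squares_gen)  # the generator is now exhausted

print(squares_list)  # [0, 1, 4, 9, 16]
print(total)         # 30
print(leftover)      # []
```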
What is yield?
yield indicates where a value is sent back to the caller, but unlike return, you don’t exit the function afterward. Instead, the state of the function is remembered. That way, when next() is called on a generator object (either explicitly or implicitly within a for loop), execution resumes right after the yield: in infinite_seq() above, the previously yielded variable num is incremented, and then yielded again. Since generator functions look like other functions and act very similarly to them, you can assume that generator expressions are very similar to other comprehensions available in Python.
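The pause-and-resume behaviour is easier to see with a small trace. In this sketch, two_steps() and the log list are hypothetical helpers I made up to record when each part of the function body actually runs.

```python
def two_steps(log):
    log.append("started")
    yield 1
    log.append("resumed after first yield")
    yield 2
    log.append("finished")

log = []
gen = two_steps(log)

# Calling the generator function runs none of its body yet
log_before = list(log)

first = next(gen)            # runs up to the first yield, then pauses
log_after_first = list(log)

second = next(gen)           # resumes where it left off

try:
    next(gen)                # runs the rest; no more values to yield
    exhausted = False
except StopIteration:
    exhausted = True

print(log_before)       # []
print(log_after_first)  # ['started']
print(first, second)    # 1 2
print(exhausted)        # True
```

A for loop does the same thing under the hood: it keeps calling next() and stops quietly when StopIteration is raised.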
yield from for Nested Generators
yield from lets a generator yield every value from a nested generator (or any other iterable) before it continues with its own code.
Description from Python Docs
PEP 380 adds the yield from expression, allowing a generator to delegate part of its operations to another generator. This allows a section of code containing yield to be factored out and placed in another generator. Additionally, the subgenerator is allowed to return with a value, and the value is made available to the delegating generator.
# Example from Python docs
def g(x):
    yield from range(x, 0, -1)
    yield from range(x)

>>> list(g(5))
[5, 4, 3, 2, 1, 0, 1, 2, 3, 4]
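The docs description above also mentions that a subgenerator can return a value to the delegating generator. Here is a minimal sketch of that part; subtotal(), delegating() and the results list are names I invented for the illustration.

```python
def subtotal(values):
    total = 0
    for value in values:
        yield value
        total += value
    # The return value of a generator travels with StopIteration
    return total

def delegating(values, results):
    # yield from passes through every yielded value, then evaluates
    # to whatever the subgenerator returned
    total = yield from subtotal(values)
    results.append(total)

results = []
yielded = list(delegating([1, 2, 3], results))

print(yielded)  # [1, 2, 3]
print(results)  # [6]
```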
# Example from Effective Python
def child():
    for i in range(1_000_000):
        yield i

def parent_using_yield_from():
    yield from child()

def parent_yield():
    for i in child():
        yield i
Advantages of using yield from
- The code is clearer and more intuitive by using yield from
- yield from provides better performance than iterating nested generators and yielding their outputs
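If you want to check the performance point on your own machine, a rough timeit sketch could look like the one below. The consume() helper, the smaller range, and the repeat count are my own choices, and the timings will vary with hardware and Python version, so treat the numbers as indicative only.

```python
import timeit

def child():
    for i in range(100_000):
        yield i

def parent_using_yield_from():
    yield from child()

def parent_yield():
    for i in child():
        yield i

def consume(gen_func):
    # Exhaust the generator without storing its values
    for _ in gen_func():
        pass

time_yield_from = timeit.timeit(
    lambda: consume(parent_using_yield_from), number=20
)
time_manual = timeit.timeit(lambda: consume(parent_yield), number=20)

print(f"yield from:  {time_yield_from:.3f}s")
print(f"manual loop: {time_manual:.3f}s")
```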
References
- Real Python: Introduction to Python Generators
- Python Docs: PEP 380
- Effective Python: Item 33: Compose Multiple Generators with yield from