List Comprehensions
Python's signature feature. Learn to replace clunky loops with elegant, mathematical one-liners that are not just shorter, but usually faster, because CPython compiles them to more specialized bytecode.
If you show a C++ or Java developer a Python codebase, the first thing that confuses them is the "List Comprehension". It looks like a loop inside a list.
In reality, it is a declarative way to construct a list. Instead of telling the computer how to build the list (step-by-step appending), you describe what the list should be. This shift from Imperative to Declarative thinking is the hallway that leads to becoming a Senior Python Engineer.
What You'll Learn
- Syntax Anatomy: The `[expression for item in iterable]` formula.
- Performance: Why they are faster than `for` loops (Bytecode analysis).
- Filtering: Integrating `if` statements concisely.
- Nested Loops: Flattening matrices in one line.
- Readability: When NOT to use them.
The Mental Model: Assembly Line vs Shopping Cart
To understand why comprehensions are special, let's use an analogy.
1. The "Shopping Cart" (For Loop)
A standard `for` loop is like pushing a shopping cart through a store. You stop at each shelf, pick up an item, and manually place it in the cart (`append`). It's a manual process repeated for every item.
# Imperative Style
cart = []
for item in store_items:
    if item.is_fresh():
        cart.append(item.buy())
2. The "Assembly Line" (Comprehension)
A list comprehension is an industrial assembly line. You define the blueprint at the start. The raw materials (`store_items`) are fed in, defective ones are rejected (`if`), and the final product is stamped out automatically. There is no manual "placing" step.
# Declarative Style
cart = [item.buy() for item in store_items if item.is_fresh()]
Syntax Deep Dive
The syntax can be broken down into three parts: Output, Input, and Predicate.
# [ OUTPUT | INPUT | PREDICATE ]
# [ expression for var in source if condition ]
numbers = [1, 2, 3, 4, 5]
# 1. Basic Mapping
squares = [n * n for n in numbers]
# Output: [1, 4, 9, 16, 25]
# 2. Variable Transformation
# Expression can be a function call, string formatting, etc.
hex_codes = [hex(n) for n in numbers]
# Output: ['0x1', '0x2', '0x3', '0x4', '0x5']
Filtering items (`if` clause)
You can filter items by adding an `if` at the end. Note that this `if` is essentially a "filter" that runs before the expression is calculated.
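Because the filter runs first, it can also protect the expression from inputs it cannot handle. A minimal sketch, assuming a made-up `values` list (not from the examples in this article) that contains zeros:

```python
# Hypothetical input; the zeros would raise ZeroDivisionError if divided by.
values = [4, 0, 5, 0, 10]

# The condition `v != 0` is checked before `20 / v` is evaluated,
# so the division never sees a zero.
ratios = [20 / v for v in values if v != 0]

print(ratios)  # [5.0, 4.0, 2.0]
```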
# Get only even squares
# Note: The if statement does NOT have an 'else' here!
# We are filtering, not choosing values.
even_squares = [n*n for n in numbers if n % 2 == 0]
# Output: [4, 16]
If you are filtering (skipping items), the `if` goes at the end.
If you are choosing (ternary operator), the `if/else` goes at the start (in the expression).
# Choosing (Ternary Operator)
# "Mark even numbers as 0, keep odds as is"
# We keep ALL items, but change their value.
masked = [0 if n % 2 == 0 else n for n in numbers]
# Output: [1, 0, 3, 0, 5]
Performance: Why Comprehensions are Faster
You will often hear that comprehensions are "faster" than loops. But why? They both do O(N) work. The answer lies in how CPython compiles and executes each form: the comprehension does less work per item.
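The claim is easy to measure for yourself. The sketch below uses the standard `timeit` module; the function names, the list size `N`, and the repeat count are arbitrary choices for illustration, and the actual numbers depend on your machine and Python version.

```python
import timeit

def with_loop(n):
    # Imperative version: explicit append on every iteration.
    result = []
    for i in range(n):
        result.append(i * i)
    return result

def with_comprehension(n):
    # Declarative version: same output, built by a comprehension.
    return [i * i for i in range(n)]

N = 10_000  # arbitrary size, chosen only for the demonstration

# Run each version 100 times and report the total elapsed time.
loop_time = timeit.timeit(lambda: with_loop(N), number=100)
comp_time = timeit.timeit(lambda: with_comprehension(N), number=100)

print(f"loop:          {loop_time:.4f}s")
print(f"comprehension: {comp_time:.4f}s")
```

On a typical CPython build the comprehension tends to come out ahead, but treat the gap as something to measure rather than assume.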
Let's look at the Bytecode (what the Python interpreter actually executes).
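You can inspect the bytecode yourself with the standard `dis` module. The sketch below is one way to do it; the function names and the `opnames` helper are made up for this example. The helper walks nested code objects because older CPython versions compile a comprehension into a separate inner code object, while newer ones inline it.

```python
import dis
import types

def build_with_loop(numbers):
    squares = []
    for n in numbers:
        squares.append(n * n)
    return squares

def build_with_comprehension(numbers):
    return [n * n for n in numbers]

def opnames(code):
    """Collect opcode names from a code object and any code objects nested in it."""
    names = {ins.opname for ins in dis.get_instructions(code)}
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            names |= opnames(const)
    return names

loop_ops = opnames(build_with_loop.__code__)
comp_ops = opnames(build_with_comprehension.__code__)

# Only the comprehension is compiled to the dedicated LIST_APPEND opcode.
print("LIST_APPEND" in loop_ops)
print("LIST_APPEND" in comp_ops)

# For the full human-readable listing, call dis.dis(build_with_loop).
```

The exact opcodes around the method call differ between CPython releases, so expect the full `dis.dis` output to vary with your interpreter version.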
1. The For Loop Bytecode
In a loop, Python has to look up the `.append` attribute on the list object every single iteration.
# Looping overhead (exact opcode names vary by CPython version):
# 1. LOAD_NAME 'sq'
# 2. LOAD_METHOD 'append' (Slow attribute lookup!)
# 3. CALL_METHOD
# 4. POP_TOP
2. The Comprehension Bytecode
The compiler knows you are building a list. It optimizes the process using a special bytecode instruction called `LIST_APPEND`. This skips both the attribute lookup and the method call, and appends to the list directly in C.
# Comprehension optimization:
# 1. LIST_APPEND (Fast C-level operation)
Advanced: Flattening & Nested Loops
Here is where it gets tricky. You can put multiple `for` clauses in a single comprehension. They execute from left to right, exactly like nested loops.
matrix = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
]
# ❌ Nested Loop Way
flat = []
for row in matrix:
    for num in row:
        flat.append(num)
# ✅ Comprehension Way
# Read it like English: "num FOR row in matrix FOR num in row"
flat = [num for row in matrix for num in row]
# Output: [1, 2, 3, 4, 5, 6, 7, 8, 9]
Scope Leakage (Python 2 vs 3)
In Python 2, list comprehensions leaked their loop variable into the enclosing scope (the function or module where the comprehension was written). This was a major source of bugs. Python 3 fixed this; the loop variable is local to the comprehension.
x = "Global"
# Python 3
dummy = [x for x in range(5)]
print(x) # Output: "Global" (Safe!)
# Python 2 (Historical)
# print(x) # Output: 4 (The loop overwrote it!)
Common Pitfalls
❌ The "Unreadable One-Liner"
The Trap: Just because you can do it in one line, doesn't mean you should.
# ❌ What does this even do?
result = [x*y for x in range(10) if x > 5 for y in range(5) if x+y < 12]
# ✅ Use a loop for complex logic
result = []
for x in range(10):
    if x > 5:
        for y in range(5):
            if x+y < 12:
                result.append(x*y)
Rule of thumb: If it wraps to a second line, use a real loop.
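When you unroll a dense comprehension into a loop, it is worth checking that the two versions really agree, since the clause order is easy to get wrong. A minimal sketch using the same expression; the names `one_liner` and `unrolled` are just for this example:

```python
# The dense comprehension from the example above.
one_liner = [x*y for x in range(10) if x > 5 for y in range(5) if x+y < 12]

# The unrolled loop: clauses nest in the same left-to-right order.
unrolled = []
for x in range(10):
    if x > 5:
        for y in range(5):
            if x + y < 12:
                unrolled.append(x * y)

# If the clause order was copied correctly, both lists are identical.
print(one_liner == unrolled)  # True
```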
❌ Side Effects in Comprehensions
The Trap: Using comprehensions just to run a function (like `print`) without using the resulting list.
# ❌ Creates a useless list of [None, None, None...]
[print(x) for x in range(10)]
# ✅ Use a loop for actions
for x in range(10):
    print(x)
Modern Python: The Walrus Operator
Python 3.8's assignment expression (`:=`) can be used inside comprehensions. This is useful when you need to calculate a value, check it, and then use it, without calculating it twice.
import random
def expensive_calc(x):
    return x * random.random()
# Without Walrus (Calculates twice!)
# result = [expensive_calc(x) for x in range(10) if expensive_calc(x) > 0.5]
# With Walrus (Calculates once, captures value)
result = [val for x in range(10) if (val := expensive_calc(x)) > 0.5]
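To see that the walrus version really calls the function once per item, here is a sketch that swaps the random `expensive_calc` for a deterministic stand-in and records every call. The `call_log` list and the `x / 10` formula are assumptions made for this illustration only.

```python
call_log = []  # records every argument the function is called with

def expensive_calc(x):
    """Deterministic stand-in for the random version; logs each call."""
    call_log.append(x)
    return x / 10

result = [val for x in range(10) if (val := expensive_calc(x)) > 0.5]

print(result)         # [0.6, 0.7, 0.8, 0.9]
print(len(call_log))  # 10 (one call per item, not two)

# Unlike the loop variable `x`, the walrus target `val` is bound in the
# scope that contains the comprehension, so it is still visible here.
print(val)            # 0.9
```

That last point is worth remembering: the name assigned with `:=` outlives the comprehension, so pick one that will not clash with a variable you already use.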