For Loops & The Iteration Protocol
Forget for (i=0; i<n; i++). Python loops are fundamentally different. They are idiomatic, protocol-driven, and designed to work seamlessly with any custom object. Mastering iteration is mastering Python itself.
In C-style languages (C, Java, JavaScript), a for loop is effectively a while loop with counter management boilerplate. In Python, a for loop is a request to an object: "Give me your next item."
This shift from "index-based" to "iterator-based" logic changes how we write code. It removes a whole class of off-by-one errors, reduces visual noise, and allows for lazy evaluation of infinite streams. In this deep dive, we will explore not just how to loop, but how the Iteration Protocol powers the entire language.
What You'll Learn
- The Protocol: `__iter__`, `__next__`, and `StopIteration`.
- Lazy Evaluation: Why `range(10**9)` doesn't crash your RAM.
- Pythonic Patterns: `enumerate()`, `zip()`, and dictionary unpacking.
- The `else` Block: The most misunderstood feature of Python loops.
- Pitfalls: Modifying a list while iterating over it.
The Iteration Protocol: Under the Hood
When you write for item in my_list:, Python does a lot of work behind the scenes. It doesn't "know" what a list is; it only cares that the object follows the Iterator Protocol.
1. Iterable vs. Iterator
| Concept | Definition | Example |
|---|---|---|
| Iterable | An object that can be looped over. It has an __iter__() method that returns an Iterator. | Lists, Strings, Dicts, Ranges. |
| Iterator | An object that represents a stream of data. It has a __next__() method that gives the next value, and its __iter__() returns the iterator itself. | File objects, enumerate() result. |
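The distinction in the table can be checked directly in the interpreter. The sketch below uses only built-ins; the variable names are illustrative.

```python
colors = ["red", "green", "blue"]   # a list is an iterable, not an iterator

first_pass = iter(colors)    # each iter() call returns a fresh iterator
second_pass = iter(colors)

print(first_pass is second_pass)        # False: two independent iterators
print(iter(first_pass) is first_pass)   # True: an iterator returns itself

print(next(first_pass))    # red
print(next(first_pass))    # green
print(next(second_pass))   # red (unaffected by first_pass)

print(hasattr(colors, "__next__"))      # False: lists have no __next__
```

This is why a list can be looped over many times while an iterator cannot: every loop over the list asks for a new iterator.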
2. Simulating a For Loop
To truly understand loops, let's manually do what the Python interpreter does.
# The High-Level Way
colors = ["red", "green", "blue"]
for color in colors:
    print(color)
# What Python ACTUALLY does (The Low-Level Way)
colors = ["red", "green", "blue"]
# 1. Get the iterator
iterator = iter(colors)  # Calls colors.__iter__()
while True:
    try:
        # 2. Get the next item
        item = next(iterator)  # Calls iterator.__next__()
        print(item)
    except StopIteration:
        # 3. Handle the end of the loop
        break  # The loop finishes gracefully
Implement __iter__ and __next__, and your custom objects work with for loops, list comprehensions, and unpacking automatically.
The Power of `range()`
In Python 2, range(10) created a list: [0, 1, 2, ..., 9]. In Python 3, range is an immutable sequence type that generates numbers on demand (lazy evaluation).
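Because `range` is a sequence and not just a one-shot iterator, it also supports `len()`, indexing, slicing, and membership tests without ever building the numbers. A small sketch of that behaviour (the variable names are illustrative):

```python
big = range(10**9)          # no billion-element list is built

print(len(big))             # 1000000000
print(big[123_456_789])     # 123456789, computed from start and step
print(999_999_999 in big)   # True, answered arithmetically for ints
print(big[-1])              # 999999999

evens = range(0, 20, 2)
print(evens[2:5])           # range(4, 10, 2): slicing yields another range
print(list(evens[2:5]))     # [4, 6, 8]
```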
# 1. Basic: 0 to stop-1
for i in range(5):
    print(i)  # 0, 1, 2, 3, 4
# 2. Start & Stop
for i in range(2, 5):
    print(i)  # 2, 3, 4
# 3. Start, Stop, Step
for i in range(0, 10, 2):
    print(i)  # 0, 2, 4, 6, 8
# 4. Negative Step (Reverse)
for i in range(5, 0, -1):
    print(i)  # 5, 4, 3, 2, 1
# Memory Magic
import sys
# A list of 1 million numbers takes ~8MB (the list itself, not counting the int objects)
print(sys.getsizeof(list(range(1000000))))
# A range object of 1 million numbers takes... 48 bytes (on 64-bit CPython)!
print(sys.getsizeof(range(1000000)))
The Cost of Nesting: Big O Analysis
A single loop runs n times (Linear Time). But what happens when you put a loop inside a loop?
Nested loops are multiplicative. If the outer loop runs 1,000 times and the inner loop runs 1,000 times, the code inside the inner loop runs 1,000,000 times. This is called Quadratic Time O(n²), and it is one of the most common causes of slow Python scripts.
# Task: Find common items in two lists
list_a = range(10000)
list_b = range(5000)
# ❌ The Naive Way (Nested Loops)
common = []
for a in list_a:  # Runs 10,000 times
    for b in list_b:  # Runs 5,000 times PER 'a'
        if a == b:  # Runs 50,000,000 times!
            common.append(a)
# This can take several seconds.
The Optimization: Hash Maps (Dictionaries)
We can reduce this from 50 million comparisons to roughly 15,000 operations (5,000 to build the set plus 10,000 lookups) using a Set or Dictionary. Lookups in sets are O(1) on average.
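The operation counts quoted above can be verified on smaller inputs by counting the work explicitly. This is a rough sketch: the helper names `count_nested` and `count_with_set` are made up for illustration, and "operations" here means equality checks for the nested version and set insertions plus membership tests for the set version.

```python
def count_nested(a_items, b_items):
    """Count equality checks made by the nested-loop approach."""
    comparisons = 0
    common = []
    for a in a_items:
        for b in b_items:
            comparisons += 1
            if a == b:
                common.append(a)
    return common, comparisons


def count_with_set(a_items, b_items):
    """Count set insertions plus membership tests for the set approach."""
    lookup = set(b_items)
    operations = len(lookup)        # one insertion per distinct item
    common = []
    for a in a_items:
        operations += 1             # one hash lookup per item
        if a in lookup:
            common.append(a)
    return common, operations


small_a, small_b = range(100), range(50)
nested_common, nested_ops = count_nested(small_a, small_b)
set_common, set_ops = count_with_set(small_a, small_b)

print(nested_ops)                    # 5000
print(set_ops)                       # 150
print(nested_common == set_common)   # True
```

Scaling both inputs by 100 multiplies the nested count by 10,000 but the set count only by 100, which is the gap between 50,000,000 and 15,000.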
# ✅ The Optimized Way (O(n))
set_b = set(list_b) # O(n) to create set
common = []
for a in list_a:  # Runs 10,000 times
    if a in set_b:  # O(1) average-case lookup
        common.append(a)
# Total operations: ~15,000.
# This runs instantly.
Loop Control Statements
Sometimes you need to intervene in the loop's execution. Python provides three keywords for this:
| Keyword | Action | Analogy |
|---|---|---|
| `break` | Terminates the loop entirely. | "Abort mission!" |
| `continue` | Skips the rest of the current iteration and starts the next one. | "Skip this song." |
| `pass` | Does nothing. It is a placeholder for empty code blocks. | "To be decided." |
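The three keywords in the table can be seen side by side in one short loop. The data and the thresholds below are invented purely for illustration.

```python
readings = [3, -1, 7, 0, 12, 5]
accepted = []

for value in readings:
    if value < 0:
        continue        # skip negative readings, move to the next item
    if value == 0:
        pass            # placeholder: nothing special to do for zero yet
    if value > 10:
        break           # stop the whole loop at the first value above 10
    accepted.append(value)

print(accepted)  # [3, 7, 0]
```

Note that 5 is never examined: `break` ended the loop when 12 was reached.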
The "Placeholder" Pass
Unlike `break` or `continue`, `pass` has no effect on logic. It is vital when syntax rules require a statement but you have nothing to do yet.
for user in users:
    if user.is_banned:
        # TODO: Implement ban logic later
        pass
    else:
        email(user)
Pythonic Looping: No Indexes Allowed!
Coming from C/Java, you might be tempted to write: for i in range(len(items)):. In Python, this is considered an anti-pattern. We have better tools that are cleaner and faster.
1. enumerate(): The Index Tracker
When you need the index, don't fall back to `range(len())`. Use `enumerate()`, which yields pairs of `(index, item)`.
names = ["Alice", "Bob", "Charlie"]
# ❌ The Anti-Pattern (Manual Indexing)
for i in range(len(names)):
    print(f"{i + 1}: {names[i]}")
# ✅ The Pythonic Way (Enumerate)
# start=1 makes it human-readable (1-based index)
for i, name in enumerate(names, start=1):
    print(f"{i}: {name}")
2. zip(): Parallel Iteration
Looping over two lists at once? `zip()` stitches them together.
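One behaviour worth checking before relying on `zip()` is what happens when the inputs differ in length: it stops at the shortest one without complaint. The sketch below contrasts that with `itertools.zip_longest`; the sample data is made up.

```python
from itertools import zip_longest

names = ["Alice", "Bob", "Charlie"]
scores = [85, 92]           # one score is missing

truncated = list(zip(names, scores))
print(truncated)
# [('Alice', 85), ('Bob', 92)]  -> 'Charlie' is silently dropped

padded = list(zip_longest(names, scores, fillvalue=None))
print(padded)
# [('Alice', 85), ('Bob', 92), ('Charlie', None)]
```

On Python 3.10 and later, `zip(names, scores, strict=True)` raises a `ValueError` on mismatched lengths instead of truncating, which is often the safer default when the lists are supposed to line up.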
names = ["Alice", "Bob"]
scores = [85, 92]
# ❌ The Anti-Pattern
for i in range(len(names)):
    print(f"{names[i]}: {scores[i]}")
# ✅ The Pythonic Way
for name, score in zip(names, scores):
    print(f"{name}: {score}")
# Note: zip stops at the shortest list!
# Use itertools.zip_longest if you need to keep going.
3. Dictionary Unpacking
Iterating over a dict yields keys by default. Use `.items()` to get everything.
scores = {"Alice": 85, "Bob": 92}
# Just keys
for name in scores:
    print(name)
# Keys and Values
for name, score in scores.items():
    print(f"{name} scored {score}")
The `else` Block: A Hidden Gem
Python loops allow an `else` clause! It's confusingly named, but powerful. Think of it as "nobreak". It runs only if the loop completes without hitting a `break` statement.
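A self-contained way to watch the "nobreak" behaviour is a divisor search. This is a minimal sketch; `smallest_divisor` is a hypothetical helper and assumes `n` is an integer of at least 2.

```python
def smallest_divisor(n):
    """Return the smallest divisor of n greater than 1, or None if n is prime."""
    for candidate in range(2, int(n ** 0.5) + 1):
        if n % candidate == 0:
            found = candidate
            break               # a divisor exists, so the else is skipped
    else:
        found = None            # loop ran to completion without break
    return found


print(smallest_divisor(91))  # 7
print(smallest_divisor(97))  # None
```

The `else` branch also runs when the loop body never executes at all (for example `smallest_divisor(2)`, where the range is empty), because no `break` was hit.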
def find_user(users, target_id):
    for user in users:
        if user.id == target_id:
            print("Found user!")
            break  # Skips the else block
    else:
        # Runs ONLY if we never found the user
        print("User not found in database.")
        raise ValueError("User missing")
# Without 'else', you need a flag variable:
# found = False
# for ...
#     if ... found = True; break
# if not found: ...
Advanced: Building Your Own Iterator
We used `range()` and lists, but what if you want to make your own objects loopable? You just need to implement the Iterator Protocol.
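Let's build a `Countdown` class that counts down from a starting number, like a reversed `range()`. Unlike `range()`, it can only be looped over once, because it acts as its own iterator.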
class Countdown:
    def __init__(self, start):
        self.current = start
    def __iter__(self):
        # 1. Boilerplate: Return self
        return self
    def __next__(self):
        # 2. Logic: Return value or raise StopIteration
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value
# Usage
# It works in a for loop automatically!
for num in Countdown(3):
    print(num)  # 3, 2, 1
When you write for x in obj, Python calls iter(obj). If that returns an object with __next__, it calls next() repeatedly until StopIteration is raised. That matches exactly what we built above.
Loops vs List Comprehensions
You cannot talk about loops in Python without mentioning List Comprehensions. They are a concise way to create lists using a single line of looping logic.
numbers = [1, 2, 3, 4, 5]
# 1. The Standard Loop Way
squares = []
for n in numbers:
    squares.append(n * n)
# 2. The Comprehension Way
# [expression for item in iterable]
squares = [n * n for n in numbers]
When to Use Which?
| Feature | For Loop | Comprehension |
|---|---|---|
| Readability | Better for complex logic/multiple steps. | Better for simple mapping/filtering. |
| Performance | Slightly slower (attribute lookup and call overhead of .append on every iteration). | Usually somewhat faster (appending is handled by a dedicated bytecode instruction). |
| Side Effects | Designed for side effects (print, save to DB). | Avoid! Don't use for side effects. |
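The performance row is easy to measure on your own machine with `timeit`. The sketch below is only a rough benchmark: the function names are illustrative, and the exact timings (and the size of the gap) depend on the Python version and hardware, so no expected numbers are shown.

```python
import timeit

numbers = list(range(1_000))


def with_loop():
    squares = []
    for n in numbers:
        squares.append(n * n)
    return squares


def with_comprehension():
    return [n * n for n in numbers]


# Run each version 2,000 times and report the total wall-clock time.
loop_time = timeit.timeit(with_loop, number=2_000)
comp_time = timeit.timeit(with_comprehension, number=2_000)

print(f"loop:          {loop_time:.3f}s")
print(f"comprehension: {comp_time:.3f}s")
```

Both functions return the same list; the difference is in how much interpreter work each iteration costs, so readability should usually decide between them.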
Filtering with Comprehensions
Comprehensions can also filter items using an `if` clause at the end.
# Get only even numbers
evens = [n for n in numbers if n % 2 == 0]
# Equivalent to:
# evens = []
# for n in numbers:
#     if n % 2 == 0:
#         evens.append(n)
Advanced Looping with `itertools`
Python's standard library includes a module called itertools that is dedicated to efficient looping. It provides C-optimized iterators for common complex patterns.
1. itertools.product: Flattening Nested Loops
Nested loops that visit every combination of two iterables can be flattened with `product`. It generates the Cartesian product of input iterables. The number of iterations stays the same; only the code gets flatter.
import itertools
ranks = ['2', '3', '4', '5', '6', '7', '8', '9', '10', 'J', 'Q', 'K', 'A']
suits = ['Hearts', 'Diamonds', 'Clubs', 'Spades']
# ❌ Nested Loop Way
deck = []
for rank in ranks:
    for suit in suits:
        deck.append((rank, suit))
# ✅ The itertools Way
# cleaner, flat, and expresses intent better
# (the rightmost iterable, suits, varies fastest, matching the loops above)
deck = list(itertools.product(ranks, suits))
2. itertools.cycle: Infinite Repetition
Need to loop over a list forever? Don't use `while True` with a modulo operator.
colors = ["Red", "Green", "Yellow"]
light = itertools.cycle(colors)
next(light) # Red
next(light) # Green
next(light) # Yellow
next(light) # Red (Starts over!)
3. itertools.chain: Looping Over Multiple Lists
If you need to loop over list A, then list B, then list C, don't concatenate them (`A + B + C`)! That creates a massive new list in memory. Use `chain` to loop over them one by one without copying.
list1 = list(range(1000))  # real lists: range objects do not support +
list2 = list(range(1000))
# ❌ Bad (Creates 3rd list of 2000 items)
for i in list1 + list2: ...
# ✅ Good (No copy of the items is made)
for i in itertools.chain(list1, list2): ...
Memory Mastery: Generators & Laziness
Everything we've discussed so far assumes we have the data ready. But what if the data is massive? What if it's infinite? Enter Generators.
A generator is a function that pauses its execution. Instead of `return`, it uses `yield`. When you loop over a generator, it calculates one value, gives it to you, and waits for you to ask for the next one.
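The pausing described above can be made visible by recording when each part of the function body runs. In this sketch the `log` list and the `countdown` generator are illustrative names, not part of any library.

```python
log = []


def countdown(start):
    log.append("started")
    while start > 0:
        yield start                          # pause here until next() is called
        log.append(f"resumed after {start}")
        start -= 1
    log.append("finished")


gen = countdown(2)
print(log)          # []  -> calling the function runs none of its body
print(next(gen))    # 2
print(log)          # ['started']
print(next(gen))    # 1
print(log)          # ['started', 'resumed after 2']
print(list(gen))    # []  -> runs to the end; no values are left
print(log)          # ['started', 'resumed after 2', 'resumed after 1', 'finished']
```

Nothing runs until the first `next()`, and each later `next()` resumes right after the `yield` where the function last stopped.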
# 1. A Generator Function
def fibonacci():
    a, b = 0, 1
    while True:
        yield a  # Pauses here and gives 'a' to the loop
        a, b = b, a + b
# 2. Consuming it
# This object takes constant memory, even if we loop forever
fib_gen = fibonacci()
for i in range(10):
    print(next(fib_gen))  # Manually getting next items
Generator Expressions
Just like List Comprehensions, but with parentheses (). They create a generator, not a list.
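A generator expression is most useful when it is fed straight into a function that consumes values one at a time, such as `sum()`. The sketch below uses a smaller size than a real workload would, and the variable names are illustrative; exact byte counts vary by Python version, so only the comparison is printed.

```python
import sys

n = 100_000
squares_list = [x * x for x in range(n)]   # all values stored at once
squares_gen = (x * x for x in range(n))    # values produced on demand

print(sys.getsizeof(squares_list) > sys.getsizeof(squares_gen))  # True

total_from_gen = sum(squares_gen)          # consumes the generator
print(total_from_gen == sum(squares_list))  # True

print(sum(squares_gen))   # 0 -> the generator is already exhausted
```

The last line is the trade-off: a generator expression can be consumed only once, while the list can be reused.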
# List Comprehension (Eager)
# Creates a list of 10 million ints up front (hundreds of MB once the int objects are counted)
squares_list = [x**2 for x in range(10000000)]
# Generator Expression (Lazy)
# Creates a small, fixed-size generator object
squares_gen = (x**2 for x in range(10000000))
# We can loop over it just the same!
for s in squares_gen:
    # process s...
    if s > 100:
        break
Looping over Files & Streams
A common task is reading a file line-by-line. Python makes this incredibly efficient because file objects are Iterators.
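The claim that file objects are iterators can be checked without touching the disk. In this sketch `io.StringIO` stands in for a real file opened in text mode, which is an assumption made only to keep the example self-contained.

```python
import io

# io.StringIO behaves like a text file held in memory.
log_file = io.StringIO("first line\nsecond line\nthird line\n")

print(iter(log_file) is log_file)   # True: the object is its own iterator

first = next(log_file)
print(repr(first))                  # 'first line\n'

remaining = [line.rstrip("\n") for line in log_file]
print(remaining)                    # ['second line', 'third line']

leftover = list(log_file)
print(leftover)                     # [] -> exhausted until seek(0)
```

Because the file is its own iterator, a second loop over the same open file yields nothing unless you rewind it with `seek(0)`.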
Lazy vs Eager Reading
If you have a 10GB log file, you cannot load it all into memory.
# ❌ Bad (Eager Pattern)
# Reads the ENTIRE file into a list of strings
with open("massive_log.txt") as f:
    lines = f.readlines()  # 💥 Memory Error!
    for line in lines:
        process(line)
# ✅ Good (Lazy Pattern)
# The file object yields one line at a time
with open("massive_log.txt") as f:
    for line in f:  # Reads line, processes, discards from memory
        process(line)
Looping Like a Pro: Built-in Helpers
Python provides optimized built-in functions to alter how you loop without manually slicing lists or writing complex logic.
1. sorted() vs .sort()
Use `sorted()` to iterate in order without changing the original list.
scores = [88, 50, 99, 70]
# ✅ Loop in ascending order
for score in sorted(scores):
    print(score)  # 50, 70, 88, 99
# ✅ Loop in descending order
for score in sorted(scores, reverse=True):
    print(score)
# ⚠️ Note: sorted() creates a NEW list. It uses O(n) memory.
2. reversed()
To loop backwards, don't use slice notation `[::-1]`, which creates a copy of the list (wasting memory). Use `reversed()`, which returns an iterator (O(1) memory).
# ❌ Slice (Uses Extra Memory)
for item in large_list[::-1]: ...
# ✅ Iterator (No Copy Is Made)
for item in reversed(large_list): ...
3. filter()
Similar to `map()`, `filter()` creates an iterator that yields only items matching a condition.
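One detail to keep in mind: passing `None` as the function makes `filter()` keep only truthy items, which is broader than "not None". The sample values below are chosen to show the difference.

```python
values = [0, 1, None, "", "text", [], 2]

truthy_only = list(filter(None, values))
not_none = list(filter(lambda v: v is not None, values))

print(truthy_only)   # [1, 'text', 2]
print(not_none)      # [0, 1, '', 'text', [], 2]
```

If legitimate values such as `0` or an empty string can appear in the data, an explicit `is not None` test (or a comprehension with that condition) is the safer choice.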
# Get only valid users
users = [user1, None, user2, None]
# ❌ Manual Check
for u in users:
    if u is not None:
        process(u)
# ✅ Filter Iterator
# filter(None, ...) drops every falsy item, not only None
for u in filter(None, users):
    process(u)
Strategies for Safe Modification
One of the most common bugs in beginner Python code is modifying a list while iterating over it. When you remove an item, all subsequent items shift left. The loop iterator, however, just moves the index forward. This causes it to skip the item that just slid into the current spot.
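The skipping is easiest to believe when you record which items the loop actually visits. A minimal reproduction, with made-up data:

```python
nums = [2, 4, 6, 8, 9]
visited = []

for n in nums:
    visited.append(n)
    if n % 2 == 0:
        nums.remove(n)      # shifts every later item one slot to the left

print(visited)  # [2, 6, 9]  -> 4 and 8 were never examined
print(nums)     # [4, 8, 9]  -> two even numbers survived
```

After 2 is removed, 4 slides into index 0, but the loop has already moved on to index 1, so it sees 6 next. The same thing happens again with 8.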
Strategy 1: Iterate Over a Copy
The simplest fix is to slice the list with [:]. This creates a shallow copy. The loop reads the snapshot while you modify the live list.
users = ["Active", "Banned", "Active", "Banned"]
# This works perfectly
for user in users[:]:
    if user == "Banned":
        users.remove(user)
print(users)  # ["Active", "Active"]
Strategy 2: Collection Logic (The Better Way)
Often, "removing items" is better thought of as "keeping valid items". Use a list comprehension to create a new list of only what you want.
users = ["Active", "Banned", "Active", "Banned"]
# Create a new list instead of fighting the old one
users = [u for u in users if u != "Banned"]
Deep Dive: How Enumerate Works
We used `enumerate()` earlier, but how does it actually work? It's not magic. The built-in is implemented in C, but it behaves just like a small generator. Let's re-implement it to see the simple beauty of `yield`.
def my_enumerate(iterable, start=0):
    count = start
    for item in iterable:
        yield count, item  # Pause and return the pair
        count += 1
# Testing our version
colors = ["Red", "Blue"]
for i, color in my_enumerate(colors, 1):
    print(f"{i}: {color}")
# Output:
# 1: Red
# 2: Blue
Common Pitfalls
❌ Modifying While Iterating (The Wrong Way)
Why it's wrong: See the section above! But just to remind you:
# ❌ Buggy Code
for n in nums:
    if n % 2 == 0:
        nums.remove(n)  # SKIPS ITEMS!
❌ Exhausting Iterators
Why it's wrong: Iterators (like `zip` or `map` or file objects) are one-time use. Once consumed, they are empty.
numbers = [1, 2, 3]
squares = map(lambda x: x**2, numbers)
print(list(squares)) # [1, 4, 9]
print(list(squares)) # [] - Empty! The iterator is exhausted.
❌ Shadowing Built-in Names
Why it's wrong: If you use `list`, `str`, or `id` as variable names, you hide Python's built-ins of the same name for the rest of that scope.
# ❌ Naming variable 'list'
list = [1, 2, 3]
# Later in the code...
my_copy = list(range(5)) # TypeError! 'list' is now a list, not a class.
# ✅ Use descriptive names
numbers = [1, 2, 3]
Language Comparison: Python vs The World
| Feature | C / Java (Classic) | JavaScript (Modern) | Python (Idiomatic) |
|---|---|---|---|
| Loop Style | Index-based (i++) | .forEach() or for...of | Iterator-based (in) |
| Simplicity | Verbose | Verbose or callback-heavy | Clean & English-like |
| Laziness | No (with classic index loops) | Possible with generators, but arrays are eager | Yes (generators/range) |
Modern Python: Asynchronous Iteration
In modern web and network programming, you often have to wait for data (e.g., from an API). Blocking the entire loop while waiting is inefficient.
Python 3.5 introduced async for, and Python 3.6 added the async generators (yield inside an async def) used below. This allows the loop to "pause" and let other code run while waiting for the next item.
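Under the hood, `async for` relies on an asynchronous counterpart of the protocol covered earlier: `__aiter__` and `__anext__`, with `StopAsyncIteration` marking the end. The class below is a minimal sketch; `Ticker` is a hypothetical name, and the short `asyncio.sleep` stands in for real network latency.

```python
import asyncio


class Ticker:
    """Hypothetical async iterator that yields 0..count-1 with a short delay."""

    def __init__(self, count, delay=0.01):
        self.count = count
        self.delay = delay
        self.current = 0

    def __aiter__(self):
        return self                      # the object is its own async iterator

    async def __anext__(self):
        if self.current >= self.count:
            raise StopAsyncIteration     # async counterpart of StopIteration
        await asyncio.sleep(self.delay)  # other tasks may run during this wait
        value = self.current
        self.current += 1
        return value


async def collect():
    # An async comprehension drives __anext__ until StopAsyncIteration.
    return [tick async for tick in Ticker(3)]


ticks = asyncio.run(collect())
print(ticks)  # [0, 1, 2]
```

An async generator function is usually the shorter way to get the same behaviour; writing the class by hand mainly helps to see what `async for` is calling.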
import asyncio
async def fetch_data():
    # Simulate network delay
    for i in range(3):
        await asyncio.sleep(1)  # Pause here
        yield f"Data chunk {i}"
async def main():
    # ⚡ The loop yields control while waiting
    async for chunk in fetch_data():
        print(chunk)
# Run the async loop
# asyncio.run(main())
Real World Application: Processing a CSV
Let's combine what we've learned (Files, Enumerate, guard clauses with continue, Unpacking) into one real-world task: parsing a messy CSV file.
# raw_data.csv content:
# id,name,score
# 1,Alice,85
# 2,Bob,90
# 3,Charlie,invalid
# 4,Dave,75
cleaned_data = []
with open("raw_data.csv") as f:
    # 1. Enumerate to track line numbers for error logs
    for line_num, line in enumerate(f, start=1):
        # 2. Guard Clause: Skip header or empty lines
        line = line.strip()
        if not line or line.startswith("id"):
            continue
        # 3. Unpacking
        parts = line.split(",")
        if len(parts) != 3:
            print(f"Error on line {line_num}: Bad format")
            continue
        user_id, name, score_str = parts
        # 4. Data Processing
        if not score_str.isdigit():
            print(f"Warning line {line_num}: Invalid score for {name}")
            continue
        # 5. Success! Add to valid list
        cleaned_data.append({
            "id": int(user_id),
            "name": name,
            "score": int(score_str)
        })
print(f"Successfully loaded {len(cleaned_data)} users.")
Summary: Best Practices Checklist
- Prefer `for` loops over `while` loops whenever possible (safer, cleaner).
- Use `enumerate()` instead of `range(len())` if you need the index.
- Use List Comprehensions for simple mapping/filtering tasks (faster than `.append()`).
- Use Generators (`yield`) when processing large files or infinite data streams.
- Use `itertools` for complex iteration patterns (nested loops, chaining).
- Never modify the list you are currently looping over; iterate over a copy or use a comprehension instead.