Threading & The GIL
Why Python threads can't calculate Pi faster, but can download internet data at lightspeed.
1. The Big Idea (ELI5)
👶 Explain Like I'm 10: The Kitchen Staff
Imagine a Restaurant Kitchen (Your CPU) with only One Stove (The GIL).
- Concurrency (Threading): You have 4 chefs. They can all chop vegetables, wash dishes, and take orders at the same time. But when it comes to cooking (executing Python bytecode), they must take turns using the single Stove.
- The Result: If the work is "chopping/waiting" (I/O Bound), 4 chefs are faster. If the work is "cooking" (CPU Bound), 4 chefs are just as slow as 1, plus they get in each other's way.
2. The GIL (Global Interpreter Lock)
The GIL is Python's most famous "feature". It is a mutex that allows only one native thread to execute Python bytecode at a time. This keeps memory management (reference counting) safe from concurrent corruption, but it limits pure CPU parallelism.
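To see the reference counts the GIL is guarding, you can inspect them directly (a tiny illustration; exact counts vary slightly across CPython versions):

```python
import sys

x = []
print(sys.getrefcount(x))  # typically 2: the name `x` plus the temporary argument reference
y = x
print(sys.getrefcount(x))  # typically 3: `y` now points at the same list
```

If two threads incremented and decremented these counts at the same time without the GIL, objects could be freed too early or leaked.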
```python
import threading
import time

def cpu_task():
    # Attempting to use 2 threads for math will NOT be faster:
    # they fight over the GIL.
    for _ in range(10**7):
        pass

def io_task():
    # Waiting on the network/disk RELEASES the GIL!
    # Other threads can run while this sleeps.
    time.sleep(2)
    print("Download complete")

# This works great!
t1 = threading.Thread(target=io_task)
t2 = threading.Thread(target=io_task)
t1.start()
t2.start()
t1.join()
t2.join()  # Both sleeps overlap: total wall time is ~2s, not 4s
```
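You can verify the CPU-bound half of the claim yourself with a timing sketch (the exact numbers depend on your machine and Python version):

```python
import threading
import time

def cpu_task():
    for _ in range(10**7):
        pass

# Two tasks back to back on one thread
start = time.perf_counter()
cpu_task()
cpu_task()
print(f"Sequential: {time.perf_counter() - start:.2f}s")

# The same two tasks on two threads: no faster, thanks to the GIL
start = time.perf_counter()
threads = [threading.Thread(target=cpu_task) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(f"Threaded:   {time.perf_counter() - start:.2f}s")
```

On a stock CPython build, both timings come out roughly equal (threading sometimes slightly worse, due to lock contention).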
3. The Danger: Race Conditions
Threads share memory. If two threads modify the same variable, they can overwrite each other's work. This is called a Race Condition.
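You can watch the Read/Add/Write steps yourself with the `dis` module (opcode names vary across Python versions; 3.11+ shows BINARY_OP where older versions show INPLACE_ADD):

```python
import dis

balance = 0

def increment():
    global balance
    balance += 1  # one line of Python, several bytecode instructions

dis.dis(increment)
# Roughly: LOAD_GLOBAL balance -> add 1 -> STORE_GLOBAL balance.
# A thread can be preempted between any two of these instructions.
```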
```python
import threading

balance = 0
lock = threading.Lock()

def deposit():
    global balance
    for _ in range(100000):
        # ❌ UNSAFE: `balance += 1` is not atomic!
        # It involves Read, Add, Write; a context switch can happen in between.
        # ✅ SAFE: hold a Lock around the update.
        with lock:
            balance += 1

t1 = threading.Thread(target=deposit)
t2 = threading.Thread(target=deposit)
t1.start()
t2.start()
t1.join()
t2.join()
print(balance)  # With lock: 200000. Without lock: a random smaller number (e.g., 145921)
```
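A lock is not the only fix. A common alternative, sketched below rather than taken from the example above, is to push updates onto a thread-safe `queue.Queue` and let a single thread own the shared variable:

```python
import queue
import threading

updates = queue.Queue()
balance = 0

def deposit():
    for _ in range(100000):
        updates.put(1)  # put() is thread-safe; no manual lock needed

workers = [threading.Thread(target=deposit) for _ in range(2)]
for w in workers:
    w.start()
for w in workers:
    w.join()

# Only the main thread ever touches `balance`, so no race is possible
while not updates.empty():
    balance += updates.get()
print(balance)  # 200000
```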
4. Modern Pattern: ThreadPoolExecutor
Manually creating `Thread` objects is the "old way": return values and exceptions are hard to manage. The modern standard is `concurrent.futures`.
```python
from concurrent.futures import ThreadPoolExecutor
import time

def fetch_url(url):
    time.sleep(1)  # Simulate a network request
    return f"Data from {url}"

urls = ["google.com", "bing.com", "yahoo.com"]

# Automatically manages a pool of 3 worker threads
with ThreadPoolExecutor(max_workers=3) as executor:
    results = executor.map(fetch_url, urls)
    for res in results:
        print(res)
```
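`executor.map` yields results in input order. When you instead want results as they finish, plus a clean place to catch per-task exceptions, the usual pattern is `submit` with `as_completed` (same module). A self-contained sketch using the same simulated fetch:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
import time

def fetch_url(url):
    time.sleep(1)  # Simulate a network request
    return f"Data from {url}"

urls = ["google.com", "bing.com", "yahoo.com"]

with ThreadPoolExecutor(max_workers=3) as executor:
    futures = {executor.submit(fetch_url, url): url for url in urls}
    for future in as_completed(futures):
        try:
            # result() re-raises any exception the worker thread hit
            print(futures[future], "->", future.result())
        except Exception as exc:
            print(futures[future], "failed:", exc)
```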
5. When to use Threading?
| Scenario | Use Threading? | Why? |
|---|---|---|
| Web Scraping (100 sites) | ✅ YES | Waiting for network releases GIL. Huge speedup. |
| Video Processing / Math | ❌ NO | CPU bound. Use `multiprocessing` instead (see the sketch below the table). |
| Background Auto-Save | ✅ YES | Disk I/O releases GIL. UI stays responsive. |
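For the CPU-bound row, the usual escape hatch is `ProcessPoolExecutor`: the same `concurrent.futures` API as above, but each worker is a separate process with its own interpreter and its own GIL. A minimal sketch (the speedup depends on your core count):

```python
from concurrent.futures import ProcessPoolExecutor

def cpu_task(n):
    total = 0
    for i in range(n):
        total += i
    return total

if __name__ == "__main__":  # required for multiprocessing on Windows/macOS
    with ProcessPoolExecutor() as executor:
        # Each chunk runs in its own process, on its own core
        results = list(executor.map(cpu_task, [10**7] * 4))
    print(sum(results))
```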