Threading & The GIL
Why Python threads can't calculate Pi faster, but can download internet data at lightspeed.
1. The Big Idea (ELI5)
👶 Explain Like I'm 10: The Kitchen Staff
Imagine a Restaurant Kitchen (Your CPU) with only One Stove (The GIL).
- Concurrency (Threading): You have 4 chefs. They can all chop vegetables, wash dishes, and take orders at the same time. But when it comes to cooking (executing Python bytecode), they must take turns using the single Stove.
- The Result: If the work is "chopping/waiting" (I/O Bound), 4 chefs are faster. If the work is "cooking" (CPU Bound), 4 chefs are just as slow as 1, plus they get in each other's way.
2. The GIL (Global Interpreter Lock)
The GIL is Python's most famous "feature". It is a mutex that allows only one native thread to execute Python bytecode at a time. This keeps memory management (reference counting) safe from concurrent corruption, but it limits pure CPU parallelism.
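To see the reference counts the GIL is guarding, you can inspect them directly (a tiny illustration; exact counts vary slightly across CPython versions):

```python
import sys

x = []
print(sys.getrefcount(x))  # typically 2: the name `x` plus the temporary argument reference
y = x
print(sys.getrefcount(x))  # typically 3: `y` now points at the same list
```

If two threads incremented and decremented these counts at the same time without the GIL, objects could be freed too early or leaked.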
```python
import threading
import time

def cpu_task():
    # Attempting to use 2 threads for math will NOT be faster:
    # they fight over the GIL.
    for _ in range(10**7):
        pass

def io_task():
    # Waiting on the network/disk RELEASES the GIL!
    # Other threads can run while this sleeps.
    time.sleep(2)
    print("Download complete")

# This works great!
t1 = threading.Thread(target=io_task)
t2 = threading.Thread(target=io_task)
t1.start()
t2.start()
t1.join()
t2.join()  # Both sleeps overlap: total wall time is ~2s, not 4s
```
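You can verify the CPU-bound half of the claim yourself with a timing sketch (the exact numbers depend on your machine and Python version):

```python
import threading
import time

def cpu_task():
    for _ in range(10**7):
        pass

# Two tasks back to back on one thread
start = time.perf_counter()
cpu_task()
cpu_task()
print(f"Sequential: {time.perf_counter() - start:.2f}s")

# The same two tasks on two threads: no faster, thanks to the GIL
start = time.perf_counter()
threads = [threading.Thread(target=cpu_task) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(f"Threaded:   {time.perf_counter() - start:.2f}s")
```

On a stock CPython build, both timings come out roughly equal (threading sometimes slightly worse, due to lock contention).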
3. The Danger: Race Conditions
Threads share memory. If two threads modify the same variable, they can overwrite each other's work. This is called a Race Condition.
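You can watch the Read/Add/Write steps yourself with the `dis` module (opcode names vary across Python versions; 3.11+ shows BINARY_OP where older versions show INPLACE_ADD):

```python
import dis

balance = 0

def increment():
    global balance
    balance += 1  # one line of Python, several bytecode instructions

dis.dis(increment)
# Roughly: LOAD_GLOBAL balance -> add 1 -> STORE_GLOBAL balance.
# A thread can be preempted between any two of these instructions.
```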
```python
import threading

balance = 0
lock = threading.Lock()

def deposit():
    global balance
    for _ in range(100000):
        # ❌ UNSAFE: `balance += 1` is not atomic!
        # It involves Read, Add, Write; a context switch can happen in between.
        # ✅ SAFE: hold a Lock around the update.
        with lock:
            balance += 1

t1 = threading.Thread(target=deposit)
t2 = threading.Thread(target=deposit)
t1.start()
t2.start()
t1.join()
t2.join()
print(balance)  # With lock: 200000. Without lock: a random smaller number (e.g., 145921)
```
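A lock is not the only fix. A common alternative, sketched below rather than taken from the example above, is to push updates onto a thread-safe `queue.Queue` and let a single thread own the shared variable:

```python
import queue
import threading

updates = queue.Queue()
balance = 0

def deposit():
    for _ in range(100000):
        updates.put(1)  # put() is thread-safe; no manual lock needed

workers = [threading.Thread(target=deposit) for _ in range(2)]
for w in workers:
    w.start()
for w in workers:
    w.join()

# Only the main thread ever touches `balance`, so no race is possible
while not updates.empty():
    balance += updates.get()
print(balance)  # 200000
```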
4. Modern Pattern: ThreadPoolExecutor
Manually creating `Thread` objects is the "old way": return values and exceptions are hard to manage. The modern standard is `concurrent.futures`.
```python
from concurrent.futures import ThreadPoolExecutor
import time

def fetch_url(url):
    time.sleep(1)  # Simulate a network request
    return f"Data from {url}"

urls = ["google.com", "bing.com", "yahoo.com"]

# Automatically manages a pool of 3 worker threads
with ThreadPoolExecutor(max_workers=3) as executor:
    results = executor.map(fetch_url, urls)
    for res in results:
        print(res)
```
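`executor.map` yields results in input order. When you instead want results as they finish, plus a clean place to catch per-task exceptions, the usual pattern is `submit` with `as_completed` (same module). A self-contained sketch using the same simulated fetch:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
import time

def fetch_url(url):
    time.sleep(1)  # Simulate a network request
    return f"Data from {url}"

urls = ["google.com", "bing.com", "yahoo.com"]

with ThreadPoolExecutor(max_workers=3) as executor:
    futures = {executor.submit(fetch_url, url): url for url in urls}
    for future in as_completed(futures):
        try:
            # result() re-raises any exception the worker thread hit
            print(futures[future], "->", future.result())
        except Exception as exc:
            print(futures[future], "failed:", exc)
```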
5. When to use Threading?
| Scenario | Use Threading? | Why? |
|---|---|---|
| Web Scraping (100 sites) | ✅ YES | Waiting for network releases GIL. Huge speedup. |
| Video Processing / Math | ❌ NO | CPU bound. Use `multiprocessing` instead (see the sketch below the table). |
| Background Auto-Save | ✅ YES | Disk I/O releases GIL. UI stays responsive. |
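For the CPU-bound row, the usual escape hatch is `ProcessPoolExecutor`: the same `concurrent.futures` API as above, but each worker is a separate process with its own interpreter and its own GIL. A minimal sketch (the speedup depends on your core count):

```python
from concurrent.futures import ProcessPoolExecutor

def cpu_task(n):
    total = 0
    for i in range(n):
        total += i
    return total

if __name__ == "__main__":  # required for multiprocessing on Windows/macOS
    with ProcessPoolExecutor() as executor:
        # Each chunk runs in its own process, on its own core
        results = list(executor.map(cpu_task, [10**7] * 4))
    print(sum(results))
```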