Python Mastery: Complete Beginner to Professional

Data Serialization: JSON & Pickle

Converting Python objects into shareable formats. The art of the "Universal Translator".

1. The Big Idea (ELI5)

👶 Explain Like I'm 5: The Teleporter

Imagine you have a LEGO Castle (Complex Python Object) in your room. You want to send it to your friend in Japan.

  • Serialization (Packing): You can't fit the castle in an envelope. So you take it apart and write an Instruction Manual (JSON String) on how to build it.
  • Transmission: You email the manual (Text) to your friend.
  • Deserialization (Unpacking): Your friend reads the manual and rebuilds the LEGO Castle exactly as it was.

2. JSON (The Universal Standard)

JSON (JavaScript Object Notation) is the language of the web. Python lists become arrays, dictionaries become objects.

PYTHON
import json

data = {
    "name": "Alice",
    "role": "Engineer",
    "skills": ["Python", "React"]
}

# 1. Dump (Serialize) to String
json_string = json.dumps(data, indent=4)
print(json_string)
# Output:
# {
#     "name": "Alice",
#     ...
# }

# 2. Load (Deserialize) from String
new_data = json.loads(json_string)
print(new_data["name"]) # Alice
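The round trip above is lossless because the data only uses types JSON knows about. As a minimal sketch of what happens otherwise (the `original` record below is a made-up example, not part of the lesson's data), a tuple comes back as a list and an integer key comes back as a string:

```python
import json

# Hypothetical sample record: a tuple value and an integer dictionary key.
original = {"point": (1, 2), 1: "one"}

# Serialize to text, then immediately parse it back.
round_tripped = json.loads(json.dumps(original))

print(round_tripped)
# {'point': [1, 2], '1': 'one'}

# The rebuilt object is close to the original, but not identical.
print(round_tripped == original)  # False
```

So the "Instruction Manual" rebuilds the castle faithfully only when every brick has a JSON equivalent: strings, numbers, booleans, `None`, lists, and dictionaries with string keys.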

3. Deep Dive: Custom JSON Encoders

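JSON is simple. Too simple. It has no native representation for Python types such as `datetime` or `Decimal`. If you pass one of them to `json.dumps`, Python raises a `TypeError`. One way to fix this is to write a Custom Encoder: a `json.JSONEncoder` subclass whose `default` method converts those objects into something JSON does understand.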

PYTHON
import json
from datetime import datetime

class DateTimeEncoder(json.JSONEncoder):
    def default(self, obj):
        # If it's a date, convert to string (ISO format)
        if isinstance(obj, datetime):
            return obj.isoformat()
        # Otherwise, let the default encoder handle it
        return super().default(obj)

user = {
    "name": "Bob",
    "joined_at": datetime.now() # JSON can't handle this natively!
}

# Tell json.dumps to use our helper
json_str = json.dumps(user, cls=DateTimeEncoder)
print(json_str) 
# Output: {"name": "Bob", "joined_at": "2023-10-25T14:30:00..."}
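A subclass is not the only option. `json.dumps` also accepts a plain function through its `default` parameter, which is called only for objects the encoder cannot handle on its own. The sketch below is one possible approach, not the only one: the helper name `to_jsonable` and the `order` record are illustrative, and encoding `Decimal` as a string is an assumption made here to keep exact digits. It also shows the return trip, which is manual because JSON does not remember the original types.

```python
import json
from datetime import datetime
from decimal import Decimal

def to_jsonable(obj):
    """Fallback used by json.dumps for objects it cannot serialize itself."""
    if isinstance(obj, datetime):
        return obj.isoformat()
    if isinstance(obj, Decimal):
        # A string keeps the exact digits; float(obj) could lose precision.
        return str(obj)
    # Mirror the standard behaviour for anything we do not recognise.
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")

# Hypothetical record with a fixed timestamp so the output is reproducible.
order = {
    "placed_at": datetime(2023, 10, 25, 14, 30),
    "total": Decimal("19.99"),
}

encoded = json.dumps(order, default=to_jsonable)
print(encoded)
# {"placed_at": "2023-10-25T14:30:00", "total": "19.99"}

# Decoding gives back plain strings, so the types are restored by hand.
decoded = json.loads(encoded)
restored_at = datetime.fromisoformat(decoded["placed_at"])
restored_total = Decimal(decoded["total"])
```

A function is usually enough for one-off calls; the `JSONEncoder` subclass is handy when the same conversion rules are reused in many places.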

4. CSV Processing

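For tabular data (the kind you would open in Excel), CSV is king. Python's `csv.DictReader` is a hidden gem: it maps every row to a dictionary keyed by the column names in the header row. Keep in mind that CSV stores no type information, so every value is read back as a string.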

PYTHON
import csv

# Writing
with open('employees.csv', 'w', newline='') as f:
    fieldnames = ['name', 'salary']
    writer = csv.DictWriter(f, fieldnames=fieldnames)

    writer.writeheader()
    writer.writerow({'name': 'Alice', 'salary': 90000})
    writer.writerow({'name': 'Bob', 'salary': 85000})

# Reading
with open('employees.csv', 'r', newline='') as f:
    reader = csv.DictReader(f)
    for row in reader:
        print(f"{row['name']} earns ${row['salary']}")
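Because `DictReader` returns every field as a string, numeric columns need an explicit conversion before any arithmetic. Here is a small sketch of that step. It uses `io.StringIO` as an in-memory stand-in for `employees.csv` so it runs without touching the disk; that substitution is a choice made for this example only.

```python
import csv
import io

# In-memory stand-in for employees.csv (same header and rows as above).
csv_text = "name,salary\nAlice,90000\nBob,85000\n"

with io.StringIO(csv_text) as f:
    rows = list(csv.DictReader(f))

# The salary is text, not a number.
print(type(rows[0]["salary"]).__name__)  # str

# Convert explicitly before doing math.
total = sum(int(row["salary"]) for row in rows)
print(total)  # 175000
```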

5. Pickle (Python's Dark Magic)

JSON is text. Pickle is binary. Pickle can save almost any Python object: checkpoints of Machine Learning models, complex class instances, even functions and classes (these are stored by their importable name, not as a copy of their code).

PYTHON
import pickle

class Hero:
    def __init__(self, name, level):
        self.name = name
        self.level = level

conan = Hero("Conan", 99)

# Serialize to Binary File (.pkl)
with open("savegame.pkl", "wb") as f:
    pickle.dump(conan, f)

# Deserialize
with open("savegame.pkl", "rb") as f:
    loaded_hero = pickle.load(f)

print(f"{loaded_hero.name} is Level {loaded_hero.level}")
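Pickle does not require a file. `pickle.dumps` and `pickle.loads` work on `bytes` in memory, mirroring `json.dumps` and `json.loads`. The sketch below (the `state` dictionary is a made-up example) shows two things: types that JSON would flatten survive the round trip, and a function with no importable name, such as a lambda, cannot be pickled.

```python
import pickle
from datetime import date

# Hypothetical save state mixing types JSON has no native form for.
state = {
    "visited": {"cave", "castle"},   # set
    "position": (3, 4),              # tuple
    "saved_on": date(2023, 10, 25),  # date
}

blob = pickle.dumps(state)       # bytes, not text
restored = pickle.loads(blob)

print(type(blob).__name__)       # bytes
print(restored == state)         # True

# Functions are pickled by reference to their name, so an anonymous
# lambda has nothing to refer to and the call fails.
try:
    pickle.dumps(lambda x: x + 1)
    lambda_ok = True
except (pickle.PicklingError, AttributeError):
    lambda_ok = False
print(lambda_ok)                 # False
```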

💀 Security Warning: The Pickle Bomb

NEVER load a pickle file from an untrusted source. Pickle is not just a data format; it is a Stack Machine. An attacker can craft a malicious `.pkl` file that, when you `pickle.load()` it, executes code to delete your files or steal your passwords.

Rule: Use JSON for public data. Use Pickle only for your own internal temp files.
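To make the warning concrete, here is a deliberately harmless sketch. The `Payload` class is hypothetical: its `__reduce__` method tells pickle to call `eval` on a simple arithmetic string while loading, where a real attacker would name something destructive. The second half subclasses `pickle.Unpickler` and overrides `find_class`, the hook the standard library documents for restricting what a pickle may import. Treat it as a mitigation to understand, not as a guarantee of safety.

```python
import io
import pickle

class Payload:
    """Harmless stand-in for a malicious object (illustration only)."""
    def __reduce__(self):
        # Means: "to rebuild me, call eval('6 * 7')".
        # An attacker would put a dangerous call here instead.
        return (eval, ("6 * 7",))

blob = pickle.dumps(Payload())

# Loading runs the call chosen by whoever created the bytes.
result = pickle.loads(blob)
print(result)  # 42

class NoGlobalsUnpickler(pickle.Unpickler):
    """Refuses every import, so only primitives and containers can load."""
    def find_class(self, module, name):
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")

try:
    NoGlobalsUnpickler(io.BytesIO(blob)).load()
    blocked = False
except pickle.UnpicklingError:
    blocked = True
print(blocked)  # True
```

Note that `result` is not a `Payload` instance at all. The bytes simply instructed the loader to run a function and hand back whatever it returned, which is why data from outside your own program belongs in JSON.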

What's Next?

We've handled File Content (Text/JSON). But what about File Paths? Navigating directories with string manipulation is prone to errors. Next, we look at Pathlib, the object-oriented way to touch the filesystem.