Data Serialization: JSON & Pickle
Converting Python objects into shareable formats. The art of the "Universal Translator".
1. The Big Idea (ELI5)
👶 Explain Like I'm 5: The Teleporter
Imagine you have a LEGO Castle (Complex Python Object) in your room. You want to send it to your friend in Japan.
- Serialization (Packing): You can't fit the castle in an envelope. So you take it apart and write an Instruction Manual (JSON String) on how to build it.
- Transmission: You email the manual (Text) to your friend.
- Deserialization (Unpacking): Your friend reads the manual and rebuilds the LEGO Castle exactly as it was.
2. JSON (The Universal Standard)
JSON (JavaScript Object Notation) is the language of the web. Python lists become arrays, dictionaries become objects.
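The type mapping is not perfectly symmetric, which a small round trip makes visible. The sketch below uses a made-up dictionary (the keys and values are illustrative assumptions) to show that tuples come back as lists and non-string keys come back as strings.

```python
import json

# Illustrative data: a tuple value and an integer key, neither of which
# exists in JSON.
original = {"point": (1, 2), 1: "one", "active": True, "nickname": None}

text = json.dumps(original)
print(text)
# {"point": [1, 2], "1": "one", "active": true, "nickname": null}

restored = json.loads(text)
print(restored)
# {'point': [1, 2], '1': 'one', 'active': True, 'nickname': None}

# The round trip changed the tuple into a list and the key 1 into "1".
print(restored == original)  # False
```

If a program relies on tuples or integer keys, it has to convert them back itself after loading.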
import json

data = {
    "name": "Alice",
    "role": "Engineer",
    "skills": ["Python", "React"]
}

# 1. Dump (Serialize) to String
json_string = json.dumps(data, indent=4)
print(json_string)
# Output:
# {
#     "name": "Alice",
#     ...
# }

# 2. Load (Deserialize) from String
new_data = json.loads(json_string)
print(new_data["name"])  # Alice

3. Deep Dive: Custom JSON Encoders
JSON is simple. Too simple. It doesn't know what a `datetime` or a `Decimal` is. If you try to serialize one, `json.dumps` raises a `TypeError`. To fix this, we write a Custom Encoder that tells `json` how to convert the unsupported type.
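To see the failure concretely for `Decimal`, here is a minimal sketch that uses the `default=` hook of `json.dumps`, a function-based alternative to subclassing `json.JSONEncoder`. The helper name `encode_extra`, the sample data, and the choice to store the value as a string are assumptions made for illustration.

```python
import json
from decimal import Decimal

order = {"item": "book", "price": Decimal("19.99")}

# Without help, json.dumps refuses the Decimal.
try:
    json.dumps(order)
    error_message = None
except TypeError as exc:
    error_message = str(exc)
print(error_message)


def encode_extra(obj):
    """Hypothetical helper: called only for objects json cannot handle."""
    if isinstance(obj, Decimal):
        # A string keeps every digit; a float could lose precision.
        return str(obj)
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


encoded = json.dumps(order, default=encode_extra)
print(encoded)  # {"item": "book", "price": "19.99"}

# Decoding gives back a plain string, so the caller converts it explicitly.
decoded = json.loads(encoded)
restored_price = Decimal(decoded["price"])
print(restored_price == order["price"])  # True
```

The class-based encoder that follows does the same job for `datetime`; which style to use is mostly a matter of taste.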
import json
from datetime import datetime

class DateTimeEncoder(json.JSONEncoder):
    def default(self, obj):
        # If it's a date, convert to string (ISO format)
        if isinstance(obj, datetime):
            return obj.isoformat()
        # Otherwise, let the default encoder handle it
        return super().default(obj)

user = {
    "name": "Bob",
    "joined_at": datetime.now()  # JSON can't handle this natively!
}

# Tell json.dumps to use our helper
json_str = json.dumps(user, cls=DateTimeEncoder)
print(json_str)
# Output: {"name": "Bob", "joined_at": "2023-10-25T14:30:00..."}

4. CSV Processing
For tabular data (the kind you would open in Excel), CSV is king. Python's `csv.DictReader` is a hidden gem. It maps every row to a dictionary keyed by the header row. Note that every value is read back as a string.
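Because `csv` reads every field back as text, numeric columns need an explicit conversion before any arithmetic. The sketch below shows this with an in-memory `io.StringIO` buffer standing in for a file; the buffer and the sample rows are assumptions for illustration.

```python
import csv
import io

# An in-memory text buffer used in place of a real file.
buffer = io.StringIO(newline="")

writer = csv.DictWriter(buffer, fieldnames=["name", "salary"])
writer.writeheader()
writer.writerows([
    {"name": "Alice", "salary": 90000},
    {"name": "Bob", "salary": 85000},
])

# Rewind and read the same data back.
buffer.seek(0)
rows = list(csv.DictReader(buffer))

print(rows[0])  # {'name': 'Alice', 'salary': '90000'}

# The salary is now the string '90000', so convert before summing.
total = sum(int(row["salary"]) for row in rows)
print(total)  # 175000
```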
import csv

# Writing
with open('employees.csv', 'w', newline='') as f:
    fieldnames = ['name', 'salary']
    writer = csv.DictWriter(f, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerow({'name': 'Alice', 'salary': 90000})
    writer.writerow({'name': 'Bob', 'salary': 85000})

# Reading (newline='' lets the csv module handle line endings itself)
with open('employees.csv', 'r', newline='') as f:
    reader = csv.DictReader(f)
    for row in reader:
        print(f"{row['name']} earns ${row['salary']}")

5. Pickle (Python's Dark Magic)
JSON is text. Pickle is binary. Pickle can save almost any Python object: checkpoints of Machine Learning models, complex class instances, even functions (stored by reference to their importable name, not as code).
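As a rough illustration of "almost anything", the sketch below pickles a dictionary holding a set, a tuple, and a date, none of which JSON supports natively, and then shows one thing pickle refuses: a lambda, which has no importable name. The sample data is an assumption for illustration.

```python
import pickle
from datetime import date

record = {"tags": {"a", "b"}, "point": (1, 2), "born": date(2000, 1, 2)}

# dumps/loads work on bytes in memory instead of a file.
blob = pickle.dumps(record)
restored = pickle.loads(blob)

print(type(blob).__name__)       # bytes
print(restored == record)        # True
print(type(restored["point"]))   # <class 'tuple'>

# Functions are pickled by name, so an anonymous lambda cannot be saved.
try:
    pickle.dumps(lambda x: x)
    lambda_pickled = True
except (pickle.PicklingError, AttributeError):
    lambda_pickled = False
print(lambda_pickled)            # False
```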
import pickle

class Hero:
    def __init__(self, name, level):
        self.name = name
        self.level = level

conan = Hero("Conan", 99)

# Serialize to Binary File (.pkl)
with open("savegame.pkl", "wb") as f:
    pickle.dump(conan, f)

# Deserialize
with open("savegame.pkl", "rb") as f:
    loaded_hero = pickle.load(f)

print(f"{loaded_hero.name} is Level {loaded_hero.level}")

💀 Security Warning: The Pickle Bomb
NEVER load a pickle file from an untrusted source.
Pickle is not just a data format; a pickle stream is a small program for a stack machine, and unpickling runs it. An attacker can craft a malicious `.pkl` file that, when you `pickle.load()` it, executes arbitrary code, for example to delete your files or steal your passwords.
Rule: Use JSON for public data. Use Pickle only for data you created yourself and can trust, such as your own internal temp files.
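The mechanism behind the warning can be shown with a harmless stand-in. In the sketch below, the hypothetical class `Sneaky` uses `__reduce__` to tell pickle which callable to run on load; here it is the built-in `sorted`, where a real attack would name something destructive. The second half follows the "restricting globals" pattern from the Python documentation by overriding `Unpickler.find_class`; the names `NoGlobalsUnpickler` and `restricted_loads` are assumptions, and this reduces the risk without making untrusted pickles safe.

```python
import io
import pickle


class Sneaky:
    """Hypothetical class whose pickle says: call sorted([3, 1, 2]) on load."""

    def __reduce__(self):
        # (callable, args): unpickling will call callable(*args).
        return (sorted, ([3, 1, 2],))


payload = pickle.dumps(Sneaky())

# Loading does not rebuild a Sneaky object; it runs the chosen callable.
result = pickle.loads(payload)
print(result)  # [1, 2, 3]


class NoGlobalsUnpickler(pickle.Unpickler):
    """Refuses to look up any function or class named inside the pickle."""

    def find_class(self, module, name):
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data):
    """Unpickle bytes while rejecting every global reference."""
    return NoGlobalsUnpickler(io.BytesIO(data)).load()


# Plain containers of basic values still load.
print(restricted_loads(pickle.dumps([1, "a"])))  # [1, 'a']

# The sneaky payload is rejected before sorted() is ever called.
try:
    restricted_loads(payload)
    blocked = False
except pickle.UnpicklingError:
    blocked = True
print(blocked)  # True
```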