December 30, 2025 · 18 min read

The Physics of Image Compression: JPEG, PNG, and WebP Fundamentals

Explore the mathematics and physics behind image compression. Learn how JPEG, PNG, and WebP work, when to use each format, and how to achieve optimal file sizes without sacrificing quality.

FileFusion Editorial Team

Introduction: Why Image Compression Matters

Every second, millions of images travel across the internet—from social media photos to product catalogs to medical scans. Behind each of these images lies a fascinating intersection of mathematics, physics, and human perception. Understanding how image compression works isn't just academic curiosity; it's the key to making informed decisions about file sizes, quality, and format selection.

When you use a file size reducer, you're leveraging decades of research into how computers can represent visual information more efficiently. This article explores the fundamental physics and mathematics that make image compression possible.

The Fundamental Problem: Raw Image Data

Before compression, digital images are simply grids of colored pixels. A typical 1920×1080 pixel image contains 2,073,600 individual pixels. If each pixel stores red, green, and blue values (each requiring 8 bits, or 1 byte), the raw image requires:

1920 × 1080 pixels × 3 bytes per pixel = 6,220,800 bytes (≈6 MB)

Without compression, storage fills quickly: a 256 GB phone could hold only about 41,000 raw images of this size. At a typical JPEG compression ratio of around 10:1, the same space holds roughly 400,000.
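The arithmetic above can be checked with a few lines of Python. This is a minimal sketch; the function name `raw_size_bytes` is made up for illustration, and it ignores file headers and row padding.

```python
def raw_size_bytes(width: int, height: int, channels: int = 3, bits_per_channel: int = 8) -> int:
    """Bytes needed for an uncompressed pixel grid (no headers, no padding)."""
    total_bits = width * height * channels * bits_per_channel
    return (total_bits + 7) // 8  # round up to a whole byte


full_hd = raw_size_bytes(1920, 1080)
print(full_hd)                   # 6220800
print(round(full_hd / 1e6, 2))   # 6.22 (megabytes, using 1 MB = 10^6 bytes)
```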

Exploiting Human Visual Perception

The foundation of image compression lies in a crucial insight: human vision is imperfect and predictable. Our eyes and brain process visual information in specific ways that compression algorithms can exploit.

Color Sensitivity: The Luminance-Chrominance Trick

Human eyes are far more sensitive to brightness (luminance) than to color (chrominance). The retina gives a rough sense of the imbalance: it has approximately 120 million rod cells, which register only light intensity, but just 6-7 million cone cells, which provide color vision. The visual pathway also carries color-difference signals at lower spatial resolution than brightness.

Image compression algorithms exploit this by:

Separating luminance from chrominance: Converting from RGB (red, green, blue) to YCbCr (luminance, blue-difference, red-difference) color space
Subsampling chrominance: Storing color information at lower resolution than brightness
Allocating more data to brightness: Preserving detail where our eyes are most sensitive

This technique, called chroma subsampling, can cut the number of stored samples by up to 50% (with 4:2:0) before any further compression, with minimal perceived quality loss. Common subsampling ratios include:

4:4:4 – No subsampling (full color resolution)
4:2:2 – Horizontal color resolution halved
4:2:0 – Color resolution halved both horizontally and vertically (the most common choice in JPEG encoders)
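To see where the savings come from, the sketch below counts the average number of stored samples per pixel for each ratio. It assumes the usual reading of the J:a:b notation (a region J pixels wide and 2 rows tall); the helper name `samples_per_pixel` is illustrative only.

```python
def samples_per_pixel(j: int, a: int, b: int) -> float:
    """Average stored samples per pixel for a J:a:b subsampling scheme.

    The notation describes a region J pixels wide and 2 rows high:
    `a` chroma samples in the first row, `b` further ones in the second.
    """
    luma = 2 * j              # one Y sample for every pixel in the region
    chroma = 2 * (a + b)      # Cb and Cr each store a + b samples
    return (luma + chroma) / (2 * j)


for name, scheme in {"4:4:4": (4, 4, 4), "4:2:2": (4, 2, 2), "4:2:0": (4, 2, 0)}.items():
    spp = samples_per_pixel(*scheme)
    print(f"{name}: {spp} samples/pixel, {1 - spp / 3:.0%} fewer than full RGB")
```

Under these assumptions, 4:2:2 stores 2 samples per pixel and 4:2:0 stores 1.5, against 3 for unsubsampled color.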

Frequency Sensitivity: High vs. Low Details

Our visual system is more sensitive to gradual changes (low frequencies) than to sharp edges and fine details (high frequencies). This is why slightly blurred photos often look "softer" rather than obviously degraded.

Compression algorithms quantize (round off) high-frequency information more aggressively than low-frequency information. This is the mathematical basis for the "lossy" in lossy compression.

JPEG: The Mathematics of Lossy Compression

JPEG (Joint Photographic Experts Group) has been the dominant photo format since 1992. Its success stems from an elegant mathematical approach that balances compression ratio with perceptual quality.

Step 1: Color Space Conversion

JPEG converts RGB pixels to YCbCr format, separating brightness from color information. This allows independent processing of luminance and chrominance channels.
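As a rough illustration, the conversion for a single pixel might look like the following. The weights are the full-range BT.601 values used by the JFIF file format; the function and helper names are chosen for this sketch, and real encoders usually use fixed-point arithmetic instead.

```python
def _clamp8(value: float) -> int:
    """Round to the nearest integer and clamp into the 8-bit range."""
    return max(0, min(255, round(value)))


def rgb_to_ycbcr(r: int, g: int, b: int) -> tuple[int, int, int]:
    """Convert one 8-bit RGB pixel to full-range YCbCr (JFIF / BT.601 weights)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b                      # brightness
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b           # blue-difference
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b           # red-difference
    return _clamp8(y), _clamp8(cb), _clamp8(cr)


print(rgb_to_ycbcr(255, 255, 255))  # (255, 128, 128): white has no color offset
print(rgb_to_ycbcr(255, 0, 0))      # (76, 85, 255): pure red
```

Neutral grays land at Cb = Cr = 128, which is why the chroma planes of most photos are smooth and tolerate subsampling well.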

Step 2: Chroma Subsampling

Color channels (Cb and Cr) are downsampled, typically using 4:2:0 subsampling. This immediately reduces data by approximately 50% with minimal perceptual impact.
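One simple way to perform 4:2:0 subsampling is to average each 2×2 block of a chroma plane, as sketched below. Encoders differ in the exact filter they apply, so treat this as one plausible approach, not the required one; `subsample_420` is a hypothetical helper that assumes even plane dimensions.

```python
def subsample_420(plane: list[list[int]]) -> list[list[int]]:
    """Average each 2x2 block of a chroma plane (width and height assumed even)."""
    height, width = len(plane), len(plane[0])
    return [
        [
            # +2 before the integer division rounds to the nearest value
            (plane[y][x] + plane[y][x + 1] + plane[y + 1][x] + plane[y + 1][x + 1] + 2) // 4
            for x in range(0, width, 2)
        ]
        for y in range(0, height, 2)
    ]


cb = [
    [100, 102, 200, 202],
    [101, 103, 201, 203],
    [50, 50, 60, 60],
    [50, 50, 60, 60],
]
print(subsample_420(cb))  # [[102, 202], [50, 60]]
```

Sixteen chroma values become four; applied to both Cb and Cr, this is where the roughly 50% reduction comes from.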

Step 3: Block-Based Discrete Cosine Transform (DCT)

Here's where the mathematical magic happens. The image is divided into 8×8 pixel blocks, and each block undergoes a Discrete Cosine Transform (DCT)—a mathematical operation that converts spatial information (pixel values) into frequency information (patterns and textures).

Think of the DCT as decomposing a complex signal into a sum of simple cosine waves of different frequencies. After the DCT, each 8×8 block is represented by 64 frequency coefficients:

Top-left coefficient: The block's average value, up to a scale factor (DC coefficient)
Other coefficients: How much each frequency pattern contributes to the block's appearance (AC coefficients)

Low frequencies (smooth gradients) appear in the top-left; high frequencies (sharp edges, noise) appear in the bottom-right.
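The transform itself is short enough to write out directly. The sketch below is a deliberately slow, textbook 2-D DCT-II with the same scaling JPEG uses; production encoders use fast factorizations and also subtract 128 from each sample first, which is omitted here. The name `dct_2d` is illustrative.

```python
import math

N = 8


def dct_2d(block: list[list[float]]) -> list[list[float]]:
    """Orthonormal 2-D DCT-II of an 8x8 block (direct O(N^4) form, for clarity)."""
    def alpha(k: int) -> float:
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

    coeffs = [[0.0] * N for _ in range(N)]
    for u in range(N):          # vertical frequency index
        for v in range(N):      # horizontal frequency index
            total = 0.0
            for y in range(N):
                for x in range(N):
                    total += (
                        block[y][x]
                        * math.cos((2 * y + 1) * u * math.pi / (2 * N))
                        * math.cos((2 * x + 1) * v * math.pi / (2 * N))
                    )
            coeffs[u][v] = alpha(u) * alpha(v) * total
    return coeffs


flat = [[50] * N for _ in range(N)]
print(round(dct_2d(flat)[0][0], 6))  # 400.0: a flat block has only a DC term
```

A perfectly flat block puts all of its energy in the top-left coefficient, while a block that varies only from left to right fills just the top row.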

Step 4: Quantization – Where Compression Happens

Quantization is where JPEG becomes lossy. Each DCT coefficient is divided by a corresponding value in a quantization table, then rounded to the nearest integer.

Quantized_Value = round(DCT_Coefficient / Quantization_Value)

Since high-frequency coefficients are divided by larger quantization values, they're more aggressively rounded—many become zero. This is intentional: our eyes are less sensitive to high-frequency details.
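Here is a small sketch of that step using the example luminance table from Annex K of the JPEG standard. The input block of identical coefficients is artificial, chosen only to show how the same magnitude survives at low frequencies and vanishes at high ones. Note that Python's `round` rounds exact halves to the nearest even integer, which may differ from a given encoder's rounding.

```python
# Example luminance quantization table from Annex K of the JPEG standard.
LUMA_QTABLE = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]


def quantize(coeffs, table=LUMA_QTABLE):
    """Divide each DCT coefficient by its table entry and round (the lossy step)."""
    return [[round(coeffs[u][v] / table[u][v]) for v in range(8)] for u in range(8)]


def dequantize(levels, table=LUMA_QTABLE):
    """Decoder side: multiply back. The rounding error is not recoverable."""
    return [[levels[u][v] * table[u][v] for v in range(8)] for u in range(8)]


coeffs = [[25.0] * 8 for _ in range(8)]   # artificial input: same value everywhere
levels = quantize(coeffs)
zeros = sum(row.count(0) for row in levels)
print(levels[0][0], levels[7][7], zeros)  # top-left survives, bottom-right becomes 0
```

After dequantization the top-left coefficient comes back as 32 instead of 25; that difference is the information the format has discarded.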

Quality settings in JPEG directly control quantization:

Quality 100: Minimal quantization, very little data loss
Quality 85-95: Sweet spot for photos (recommended by most professionals)
Quality 50-70: Heavy quantization, visible artifacts on close inspection
Quality below 50: Severe blocky artifacts, color banding

Step 5: Entropy Coding

After quantization, the data contains many zeros (especially in high-frequency areas). JPEG uses lossless compression techniques—Run-Length Encoding (RLE) and Huffman coding—to efficiently store these patterns.

RLE compresses consecutive zeros into a count ("twenty-three zeros" instead of "0,0,0,0..."), while Huffman coding uses shorter codes for frequently occurring values.
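The sketch below shows the idea on one quantized block. It reads the coefficients in JPEG's zigzag order (low frequencies first, so the zeros cluster at the end) and then emits simplified (zero_run, value) pairs. Real JPEG entropy coding is more involved: it codes the DC term separately, caps runs at 15, and feeds the pairs into Huffman tables. The function names here are illustrative.

```python
def zigzag_order(n: int = 8) -> list[tuple[int, int]]:
    """(row, col) pairs in JPEG zigzag order for an n x n block."""
    order = []
    for s in range(2 * n - 1):                      # s = row + col: one anti-diagonal
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        # Even diagonals run from bottom-left to top-right, odd ones the other way.
        for r in (reversed(rows) if s % 2 == 0 else rows):
            order.append((r, s - r))
    return order


def run_length_encode(values: list[int]) -> list[tuple[int, int]]:
    """Simplified RLE: (number of preceding zeros, nonzero value) pairs."""
    pairs, run = [], 0
    for v in values:
        if v == 0:
            run += 1
        else:
            pairs.append((run, v))
            run = 0
    if run:
        pairs.append((0, 0))   # end-of-block marker: the rest is all zeros
    return pairs


block = [[0] * 8 for _ in range(8)]
block[0][0], block[0][1], block[1][0], block[0][2] = 26, -3, 1, 2
scanned = [block[r][c] for r, c in zigzag_order()]
print(run_length_encode(scanned))  # [(0, 26), (0, -3), (0, 1), (2, 2), (0, 0)]
```

Sixty-four values collapse into five pairs, most of the saving coming from the single end-of-block marker.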

When to Use JPEG

✅ Photographs and natural images with gradual color transitions
✅ Images where slight quality loss is acceptable
✅ Web images where file size is critical
❌ Images with text, line art, or sharp edges (creates artifacts)
❌ Images requiring transparency (JPEG doesn't support alpha channels)
❌ Images that will be edited repeatedly (quality degrades with each save)

PNG: Lossless Compression Through Predictability

PNG (Portable Network Graphics) was created in 1996 as a patent-free alternative to GIF. Unlike JPEG, PNG is lossless—the decompressed image is pixel-perfect identical to the original.

The Filtering Stage: Exploiting Spatial Redundancy

Natural images exhibit spatial coherence—neighboring pixels tend to have similar colors. PNG exploits this by applying prediction filters to each row of pixels:

None: No filtering (useful for random noise)
Sub: Predict each pixel based on the pixel to its left
Up: Predict based on the pixel above
Average: Predict based on the average of left and up pixels
Paeth: Predict using the left, up, or upper-left pixel, whichever is closest to the estimate left + up − upper-left

Instead of storing actual pixel values, PNG stores the prediction error—the difference between the actual value and the predicted value. For smooth gradients, these errors are small, often close to zero.
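Two of the filters are sketched below: the Sub filter applied to a row, and the Paeth predictor as defined in the PNG specification. In a real PNG the filters work on bytes, with "left" meaning one whole pixel (`bpp` bytes) back; the example uses single-byte grayscale pixels to keep things short.

```python
def sub_filter(row: bytes, bpp: int = 1) -> bytes:
    """PNG 'Sub' filter: each byte minus the byte one pixel to its left, mod 256."""
    return bytes((row[i] - (row[i - bpp] if i >= bpp else 0)) & 0xFF for i in range(len(row)))


def paeth_predictor(left: int, up: int, upper_left: int) -> int:
    """Pick whichever neighbor is closest to the estimate left + up - upper_left."""
    estimate = left + up - upper_left
    d_left = abs(estimate - left)
    d_up = abs(estimate - up)
    d_upper_left = abs(estimate - upper_left)
    if d_left <= d_up and d_left <= d_upper_left:
        return left
    if d_up <= d_upper_left:
        return up
    return upper_left


gradient_row = bytes([10, 12, 14, 16, 18])
print(list(sub_filter(gradient_row)))   # [10, 2, 2, 2, 2]
print(paeth_predictor(10, 20, 10))      # 20
```

The filtered gradient is mostly the same small value repeated, which is exactly the kind of data the next stage compresses well.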

DEFLATE Compression: The Final Stage

After filtering, PNG applies DEFLATE compression (the same algorithm used in ZIP files). DEFLATE combines two techniques:

LZ77 sliding window: Finds repeated sequences and replaces them with references to earlier occurrences
Huffman coding: Uses shorter codes for frequently occurring values

The effectiveness of DEFLATE depends on the data's redundancy. Images with large areas of solid color compress excellently; images full of fine, noise-like detail (as most photos are) compress poorly.
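Python's standard `zlib` module implements DEFLATE, so the effect of redundancy is easy to observe. The sketch compares three synthetic byte strings standing in for image data; they are not real images, and the helper name `deflated_size` is made up for this example.

```python
import random
import zlib


def deflated_size(data: bytes) -> int:
    """Size in bytes after DEFLATE at the highest compression level."""
    return len(zlib.compress(data, level=9))


size = 100_000
solid = bytes([200]) * size                                   # one flat color
gradient = bytes((i // 400) % 256 for i in range(size))       # slow ramp
rng = random.Random(0)                                        # fixed seed for repeatability
noise = bytes(rng.randrange(256) for _ in range(size))        # no redundancy at all

for name, data in [("solid", solid), ("gradient", gradient), ("noise", noise)]:
    print(f"{name:>8}: {size} -> {deflated_size(data)} bytes")
```

The solid and gradient inputs shrink to a tiny fraction of their size, while the random bytes come out no smaller than they went in.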

PNG Color Modes: Choosing the Right Bit Depth

PNG supports multiple color modes, each with different storage requirements:

Grayscale (1, 2, 4, 8, 16 bits): For black-and-white images
Indexed color (1, 2, 4, 8 bits): Up to 256 colors from a palette (like GIF)
RGB (8, 16 bits per channel): True color
RGBA (8, 16 bits per channel): True color plus transparency

A logo with 20 colors stored as indexed PNG (8-bit) requires far less space than the same logo as 24-bit RGB.
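The saving is visible before any compression is applied. The sketch below compares the raw pixel data for a hypothetical 500×500 logo in both modes; the image size and helper names are assumptions for illustration, and the palette cost is 3 bytes per color.

```python
def palette_bit_depth(num_colors: int) -> int:
    """Smallest indexed-color bit depth PNG allows (1, 2, 4, or 8) for this many colors."""
    for depth in (1, 2, 4, 8):
        if num_colors <= 2 ** depth:
            return depth
    raise ValueError("indexed PNG supports at most 256 colors")


def raw_pixel_bytes(width: int, height: int, bits_per_pixel: int) -> int:
    """Unfiltered, uncompressed pixel data; each row is padded to a whole byte."""
    row_bytes = (width * bits_per_pixel + 7) // 8
    return row_bytes * height


colors = 20
depth = palette_bit_depth(colors)                        # 20 colors do not fit in 4 bits
indexed = raw_pixel_bytes(500, 500, depth) + 3 * colors  # pixel indices plus palette
truecolor = raw_pixel_bytes(500, 500, 24)
print(depth, indexed, truecolor)                         # 8 250060 750000
```

With 16 colors or fewer the same logo could drop to 4-bit indices and halve the raw data again.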

When to Use PNG

✅ Images with transparency (supports alpha channel)
✅ Screenshots, UI elements, icons
✅ Images with text or sharp edges
✅ Graphics that will be edited multiple times (no generational loss)
✅ Images where perfect quality is required
❌ Photographs (JPEG compresses better with acceptable quality)
❌ Situations requiring maximum compression (JPEG achieves smaller sizes for photos)

WebP: The Modern Hybrid Approach

WebP, developed by Google in 2010, attempts to combine the best of both worlds—JPEG's compression efficiency for photos and PNG's lossless transparency support.

Lossy WebP: Predictive Coding

Where JPEG transforms pixel blocks directly, lossy WebP first predicts each block and transforms only the prediction error, a technique borrowed from the intra-frame coding of the VP8 video codec:

Block prediction: Image divided into 4×4 or 16×16 pixel blocks
Intra-prediction: Each block is predicted from previously decoded blocks using one of several prediction modes (horizontal, vertical, diagonal gradients, etc.)
Transform coding: The prediction error is transformed using DCT or WHT (Walsh-Hadamard Transform)
Quantization: Coefficients are quantized (similar to JPEG)
Entropy coding: Compressed using arithmetic coding (more efficient than Huffman)

The key advantage: WebP adapts its prediction mode per block, achieving better compression than JPEG's fixed 8×8 DCT approach.
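A heavily simplified sketch of per-block mode selection follows. It uses only three modes and picks the one with the smallest sum of absolute residuals; the actual VP8 encoder has more modes and weighs the bit cost as well, so the names and the cost function here are assumptions for illustration.

```python
MODES = ("dc", "vertical", "horizontal")


def predict_block(top: list[int], left: list[int], mode: str) -> list[list[int]]:
    """Predict an n x n block from the already-decoded row above and column to the left."""
    n = len(top)
    if mode == "vertical":        # copy the row above downward
        return [list(top) for _ in range(n)]
    if mode == "horizontal":      # copy the left column across
        return [[left[y]] * n for y in range(n)]
    if mode == "dc":              # fill with the rounded mean of all neighbors
        dc = (sum(top) + sum(left) + n) // (2 * n)
        return [[dc] * n for _ in range(n)]
    raise ValueError(f"unknown mode: {mode}")


def residual(block, prediction):
    """Prediction error: what actually gets transformed and quantized."""
    return [[b - p for b, p in zip(brow, prow)] for brow, prow in zip(block, prediction)]


def best_mode(block, top, left) -> str:
    """Choose the mode whose residual has the smallest sum of absolute values."""
    def cost(mode: str) -> int:
        return sum(abs(v) for row in residual(block, predict_block(top, left, mode)) for v in row)
    return min(MODES, key=cost)


top = [10, 50, 90, 130]                       # vertical stripes continue from above
left = [10, 10, 10, 10]
block = [[10, 50, 90, 130] for _ in range(4)]
mode = best_mode(block, top, left)
print(mode, residual(block, predict_block(top, left, mode)))  # vertical, all zeros
```

When the prediction is this good, the residual is all zeros and costs almost nothing to store.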

Lossless WebP: Advanced Prediction and Entropy Coding

Lossless WebP improves on PNG through:

Predictive coding: More sophisticated prediction than PNG's filters
Color transformation: Converting correlated RGB values to decorrelated values
LZ77 backward references: Like PNG, but with adaptive block sizes
Entropy coding: More efficient than PNG's DEFLATE

Result: According to Google's published comparison, lossless WebP files are typically 26% smaller than equivalent PNGs.
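One of the simplest color transforms in lossless WebP is "subtract green": the green value is subtracted from red and blue, modulo 256. The sketch below shows it on individual pixels together with its inverse; the function names are illustrative, and the format's other, more elaborate transforms are not shown.

```python
def subtract_green(pixel: tuple[int, int, int]) -> tuple[int, int, int]:
    """Forward transform: store red and blue as differences from green (mod 256)."""
    r, g, b = pixel
    return ((r - g) & 0xFF, g, (b - g) & 0xFF)


def add_green(pixel: tuple[int, int, int]) -> tuple[int, int, int]:
    """Inverse transform used by the decoder; exactly undoes subtract_green."""
    r, g, b = pixel
    return ((r + g) & 0xFF, g, (b + g) & 0xFF)


grays = [(120, 120, 120), (121, 121, 121), (200, 200, 200)]
print([subtract_green(p) for p in grays])
# [(0, 120, 0), (0, 121, 0), (0, 200, 0)]
```

Because the channels of natural images tend to move together, red and blue collapse to values near zero, which the entropy coder can then store cheaply.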

Alpha Channel Compression

WebP's handling of transparency is particularly elegant. The alpha channel can be compressed either:

Losslessly: For images requiring perfect transparency edges
Lossily: For photos with transparency where slight alpha variations are imperceptible

This mixed approach (lossy RGB + lossless or lossy alpha) enables WebP to create incredibly small files with transparency—something JPEG cannot do.

When to Use WebP

✅ Modern web applications (supported by all major browsers since 2020)
✅ Photos with transparency requirements
✅ Situations requiring the absolute smallest file size
✅ Responsive web images (better compression = faster loading)
❌ Email attachments (some email clients don't support WebP)
❌ Print workflows (professional print software often lacks WebP support)
❌ Legacy system compatibility (older software and devices)

Comparative Analysis: Compression Efficiency

Here's how the three formats compare for typical use cases:

Scenario 1: High-Resolution Photograph (3000×2000px, landscape)

Uncompressed (24-bit RGB): ~17 MB
PNG (lossless): ~14 MB (minimal compression due to photo complexity)
JPEG (quality 95): ~2.8 MB (83% reduction, imperceptible quality loss)
JPEG (quality 85): ~1.1 MB (93% reduction, slight quality loss on close inspection)
WebP (quality 95): ~2.1 MB (25% smaller than equivalent JPEG)
WebP (quality 85): ~850 KB (23% smaller than equivalent JPEG)

Scenario 2: Screenshot with Text (1920×1080px)

Uncompressed: ~6 MB
PNG (lossless): ~450 KB (excellent compression due to large solid-color areas)
JPEG (quality 95): ~180 KB (smaller, but visible artifacts around text)
WebP (lossless): ~330 KB (27% smaller than PNG, pixel-perfect)

Scenario 3: Logo with Transparency (500×500px)

PNG-8 (indexed, 256 colors): ~25 KB
PNG-24 (true color + alpha): ~85 KB
WebP (lossless): ~18 KB (about 79% smaller than PNG-24)
WebP (lossy, quality 90): ~12 KB (further reduction with minimal quality loss)

The Tradeoffs: Quality, Size, and Compatibility

Choosing the right format requires balancing three factors:

Quality Considerations

Lossless required? Use PNG or lossless WebP
Slight quality loss acceptable? Use JPEG or lossy WebP for much smaller files
Will the image be edited repeatedly? Avoid lossy formats (quality degrades with each save/edit cycle)

File Size Considerations

Bandwidth-constrained? WebP offers the best compression
Storage-limited? Lossy formats (JPEG, lossy WebP) provide massive savings
Need transparency? PNG or WebP only (JPEG doesn't support alpha)

Compatibility Considerations

Universal compatibility? JPEG and PNG work everywhere
Modern web only? WebP is safe (95%+ browser support)
Print/professional workflows? Stick with JPEG and PNG

Practical Guidelines: Format Selection Decision Tree

🎯 Quick Format Selector

Is it a photograph or natural image?

YES →

Need transparency? → WebP (lossy) or PNG
For web (modern browsers)? → WebP (lossy)
Universal compatibility needed? → JPEG (quality 85-95)

NO (it's a graphic, screenshot, or logo) →

Has transparency? → PNG or WebP (lossless)
No transparency, for web? → WebP (lossless)
No transparency, universal compatibility? → PNG
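The same tree can be written as a small function, which is handy in a build script or upload pipeline. This is only a direct transcription of the questions above; the function and parameter names are invented for the example.

```python
def choose_format(is_photo: bool, needs_transparency: bool, modern_web_only: bool) -> str:
    """Return the format suggested by the decision tree."""
    if is_photo:
        if needs_transparency:
            return "WebP (lossy) or PNG"
        if modern_web_only:
            return "WebP (lossy)"
        return "JPEG (quality 85-95)"
    # Graphics, screenshots, and logos
    if needs_transparency:
        return "PNG or WebP (lossless)"
    if modern_web_only:
        return "WebP (lossless)"
    return "PNG"


print(choose_format(is_photo=True, needs_transparency=False, modern_web_only=False))
# JPEG (quality 85-95)
print(choose_format(is_photo=False, needs_transparency=False, modern_web_only=True))
# WebP (lossless)
```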

Advanced Topic: Compression Artifacts

Understanding common compression artifacts helps you identify when you've chosen the wrong format or quality setting:

JPEG Artifacts

Blocking: Visible 8×8 pixel grid patterns (heavy compression)
Ringing: Halos around sharp edges (from high-frequency quantization)
Color bleeding: Colors smearing into adjacent areas (chroma subsampling)
Mosquito noise: Shimmering around high-contrast edges

Solution: Increase quality setting, or use PNG for graphics with sharp edges.

PNG Artifacts

Banding: Visible steps in gradients (from reducing the image to a palette or low bit depth before saving)
Large file sizes: PNG struggles with photographic complexity

Solution: Use JPEG or WebP for photographs; use 16-bit PNG for smooth gradients requiring lossless compression.

WebP Artifacts

Block prediction errors: Visible in flat areas at high compression
Color banding: Similar to JPEG in smooth gradients

Solution: Increase quality setting or use lossless WebP.

The Mathematics of Quality Settings

Quality sliders in image editors aren't arbitrary—they control mathematical precision:

JPEG Quality Scale (libjpeg's mapping; other encoders may differ):
- Quality 100: All quantization values clamp to 1 (only rounding loss)
- Quality 90: Quantization tables multiplied by 0.2
- Quality 50: Quantization tables multiplied by 1.0 (baseline)
- Quality 10: Quantization tables multiplied by 5.0 (severe loss)
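As one concrete mapping, the sketch below follows the scaling rule used by the widely deployed IJG libjpeg encoder, which converts the quality setting into a percentage applied to the base quantization tables. Other encoders and editors define their quality sliders differently, and the Python function names here are made up for illustration.

```python
def libjpeg_scale_percent(quality: int) -> int:
    """Table scaling percentage for a quality setting, following libjpeg's rule."""
    quality = max(1, min(100, quality))
    return 5000 // quality if quality < 50 else 200 - 2 * quality


def scale_entry(base: int, quality: int) -> int:
    """Scale one base table entry and clamp it to the baseline range 1..255."""
    scaled = (base * libjpeg_scale_percent(quality) + 50) // 100
    return max(1, min(255, scaled))


for q in (100, 90, 75, 50, 10):
    print(q, libjpeg_scale_percent(q), scale_entry(16, q))
```

Under this rule, quality 50 leaves the base tables unchanged, quality 10 multiplies them by five, and quality 100 reduces every entry to 1, so only the rounding of the DCT coefficients themselves is lost.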

Diminishing returns: The difference between quality 95 and 100 is often a 3× file size increase with imperceptible quality improvement. The sweet spot for photos is typically 85-92.

Future-Proofing: Next-Generation Formats

While JPEG, PNG, and WebP dominate today, emerging formats offer further improvements:

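AVIF: Based on the AV1 video codec; typically smaller than WebP at comparable quality, though the margin varies widely with content and encoder settings (supported by current versions of all major browsers)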
JPEG XL: Next-generation JPEG with better compression, wider color gamut, and progressive decoding
HEIF/HEIC: An MPEG container format popularized by Apple, usually holding HEVC-coded images; excellent compression but patent-encumbered

However, WebP's widespread browser support and royalty-free status make it the current pragmatic choice for modern web applications.

Practical Compression Workflow

When preparing images for the web or storage:

Start with the highest-quality source: Original camera files, not already-compressed versions
Resize before compressing: Don't store a 4000px image if you only need 800px
Choose the right format: Use the decision tree above
Test quality settings: Compare file sizes at different quality levels; find the sweet spot
Use tools that preview quality: Our File Size Reducer lets you see before-and-after comparisons
Keep originals: Always preserve uncompressed sources; never repeatedly save lossy formats

Conclusion: The Art and Science of Compression

Image compression is a beautiful example of applied mathematics and perceptual psychology. By understanding the physics of how our eyes work and the mathematics of data redundancy, engineers have created algorithms that can reduce image file sizes by 90% or more while maintaining perceptual quality.

The key takeaways:

JPEG: Best for photographs where slight quality loss is acceptable
PNG: Best for graphics, screenshots, and anything requiring transparency or lossless quality
WebP: Best overall for modern web applications, combining JPEG's efficiency with PNG's features

Armed with this understanding, you can make informed decisions about image formats, quality settings, and compression strategies—ensuring your images look great while minimizing file sizes and loading times.