
What Is PDF Compression and How Does It Actually Work?

A clear explanation of how PDF compression works — lossy vs lossless, image downsampling, font subsetting, and why some PDFs compress better than others.

June 1, 2026 · 8 min read

Written by Alex · Developer & Founder

Solo developer based in Adelaide, Australia. Built MyEasyTools to make everyday file and text tasks faster and free for everyone.


PDF compression reduces file size — but "reducing file size" hides a surprisingly complex set of decisions that happen under the hood. Understanding how it works helps you choose the right compression settings and interpret the results you get. It also explains why some PDFs shrink by 80% and others barely budge.


What a PDF actually contains

A PDF file is not a single thing — it's a container format that bundles several different types of content:

  • Text — stored as character codes and glyph positions referencing a font
  • Fonts — the typeface definitions used to render characters
  • Raster images — JPEG or PNG photos and graphics embedded in the document
  • Vector graphics — mathematical path descriptions for shapes and illustrations
  • Document structure — page definitions, bookmarks, annotations, form fields, metadata

Each of these types compresses differently. This is why blanket statements like "compress the PDF" require clarification — you're not applying a single operation to a uniform file.


Lossy vs lossless compression

This distinction is the most important concept in PDF compression.

Lossless compression reorganizes data more efficiently without discarding anything. The original content can be restored perfectly. ZIP compression is lossless: every bit of original data is preserved. In a PDF, lossless compression covers things like deflating content streams, packing objects into object streams, and compacting the cross-reference table.

Lossy compression discards data permanently to achieve larger size reductions. JPEG compression of images is lossy — fine detail is discarded to achieve smaller file sizes, and you cannot recover the original. Once lossy compression is applied, there's no going back.

Most PDF compression tools apply both, in different ways to different content types within the document.
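The difference is easy to see in a few lines of Python. This is a minimal sketch, not what any particular PDF tool does: the byte string stands in for raw image samples, and "keep the top 4 bits" is a deliberately crude stand-in for a real lossy codec.

```python
import zlib

# Stand-in for raw image samples: every byte value, repeated a few times.
original = bytes(range(256)) * 4

# Lossless: deflate round-trips to exactly the same bytes.
packed = zlib.compress(original)
restored = zlib.decompress(packed)
lossless_ok = restored == original

# Lossy (toy example): keep only the top 4 bits of each sample.
# The low 4 bits are discarded and cannot be reconstructed afterwards.
quantized = bytes(b & 0xF0 for b in original)
lossy_ok = quantized == original

print("lossless round trip identical:", lossless_ok)
print("lossy result identical:", lossy_ok)
print("distinct values before/after:", len(set(original)), len(set(quantized)))
```

After the lossy step, 256 distinct sample values have collapsed into 16, and nothing in the output records which original value each one came from.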


How image compression works inside PDFs

Raster images are almost always the largest component of a PDF. A scanned document might be 95% raster image data. A typical business report might have 5–10 JPEG images embedded in it, each several hundred kilobytes.

PDF compression tools target embedded images with two techniques:

Image downsampling (DPI reduction)

Images are stored at a specific resolution measured in dots per inch (DPI). A photo scanned at 600 DPI contains four times as many pixels as the same photo scanned at 300 DPI, and takes roughly four times the storage space.

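For screen viewing, 150 DPI is more than sufficient; a typical monitor displays roughly 72–110 pixels per inch. For standard printing, 300 DPI is the practical ceiling. Because pixel count scales with the square of the resolution, downsampling an embedded image from 600 DPI to 150 DPI leaves one-sixteenth of the original pixels. The saving in bytes is usually less than 16x, since compressed size does not scale linearly with pixel count, but the image data still shrinks substantially.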

The trade-off: downsampled images appear softer when printed at large size or zoomed in significantly. For a scanned contract you'll read on screen, the difference is invisible. For a photography portfolio, it matters.
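The arithmetic behind downsampling is worth checking once. The sketch below assumes a US Letter page (8.5 x 11 inches) purely as an example size; `pixel_count` is a helper made up for this illustration.

```python
def pixel_count(width_in: float, height_in: float, dpi: int) -> int:
    """Pixels needed to cover an area of the given size at the given DPI."""
    return round(width_in * dpi) * round(height_in * dpi)

# Assumed example: a full-page scan of a US Letter sheet.
WIDTH_IN, HEIGHT_IN = 8.5, 11

base = pixel_count(WIDTH_IN, HEIGHT_IN, 600)
for dpi in (600, 300, 150):
    n = pixel_count(WIDTH_IN, HEIGHT_IN, dpi)
    print(f"{dpi} DPI: {n:,} pixels ({n / base:.4f} of the 600 DPI count)")
```

Halving the DPI quarters the pixel count, because both width and height shrink. File size follows the same direction but not the same ratio, since the encoded bytes per pixel depend on the image content.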

Image re-encoding (JPEG quality reduction)

JPEG compression itself is parameterized: you can encode the same image at quality 90 (visually very close to the source) or quality 40 (heavily compressed). A compression tool re-encodes embedded images at a lower quality setting to save space.

Combined with DPI reduction, this can cut image-heavy PDFs by 60–80% while leaving the document readable on screen and printable at standard sizes.
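To see why a lower quality setting saves space, here is a toy model in Python. It is not JPEG: real JPEG quantizes frequency (DCT) coefficients rather than raw samples, and uses its own entropy coder. The sketch only shows the general principle that coarser quantization leaves less information to encode. The synthetic "image", the step sizes, and the `encoded_size` helper are all assumptions made for the illustration.

```python
import random
import zlib

rng = random.Random(0)  # fixed seed so repeated runs give the same numbers

# Synthetic "image": a slow brightness ramp plus noise, one byte per sample.
samples = bytes(
    min(255, max(0, i // 40 + rng.randint(-20, 20))) for i in range(10_000)
)

def encoded_size(data: bytes, step: int) -> int:
    """Round each sample down to a multiple of `step`, then deflate the result."""
    quantized = bytes((b // step) * step for b in data)
    return len(zlib.compress(quantized, 9))

# A larger step plays the role of a lower quality setting.
sizes = {step: encoded_size(samples, step) for step in (1, 8, 32)}
for step, size in sizes.items():
    print(f"step {step:>2}: {size} bytes")
```

Each increase in step size throws away more of the fine variation between neighbouring samples, which is the same trade a lower JPEG quality makes: fewer bytes, less detail.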


Font subsetting and stream compression

Less dramatic but still meaningful:

Font subsetting removes glyph data for characters not used in the document. A full Arial font includes data for thousands of characters. If your PDF only uses the 80 characters that appear in the text, the other 2,000+ can be stripped. This is already done by most PDF creators, but older or poorly made PDFs sometimes contain full, unsubsetted fonts.
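Conceptually, subsetting is a filter over the font's glyph table. The sketch below models a font as a plain dictionary from character to outline bytes, which is a simplification: real fonts map characters to glyph IDs, include ligatures and composite glyphs, and need their internal tables rebuilt after subsetting. The font contents and the `subset_font` helper are hypothetical.

```python
def subset_font(font: dict[str, bytes], text: str) -> dict[str, bytes]:
    """Keep only the glyphs for characters that actually appear in `text`."""
    used = set(text)
    return {ch: outline for ch, outline in font.items() if ch in used}

# Hypothetical font: 2,000 characters, 100 bytes of outline data each.
full_font = {chr(cp): b"\x00" * 100 for cp in range(0x20, 0x20 + 2000)}

document_text = "Compress your PDF before signing, not after."
small_font = subset_font(full_font, document_text)

full_bytes = sum(len(o) for o in full_font.values())
small_bytes = sum(len(o) for o in small_font.values())
print(f"glyphs: {len(full_font)} -> {len(small_font)}")
print(f"outline bytes: {full_bytes} -> {small_bytes}")
```

The shorter and plainer the document's text, the larger the share of the font that can be dropped.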

Stream compression applies deflate (the algorithm behind gzip and ZIP, which PDF calls FlateDecode) to the streams inside the file: page content, form definitions, metadata. Well-formed PDFs from modern software already do this. Old or manually assembled PDFs sometimes don't, and can see 20–40% size reductions from this alone.
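Page content is highly repetitive text, which is exactly what deflate handles well. The sketch below builds a made-up, uncompressed content stream out of standard PDF text operators and runs it through Python's `zlib`; the strings and coordinates are invented for illustration, and a real file contains many other objects that compress differently.

```python
import zlib

# A made-up page content stream: 50 lines of text, each drawn with the
# same operators (BT/ET, Tf, Td, Tj) at a different vertical position.
line = b"BT /F1 12 Tf 72 %d Td (Quarterly revenue by region) Tj ET\n"
content_stream = b"".join(line % (720 - 14 * i) for i in range(50))

deflated = zlib.compress(content_stream, 9)
ratio = len(deflated) / len(content_stream)
print(f"{len(content_stream)} bytes -> {len(deflated)} bytes ({ratio:.0%} of original)")

# Lossless: inflating gives back the exact stream.
round_trip_ok = zlib.decompress(deflated) == content_stream
print("round trip identical:", round_trip_ok)
```

A single stream like this shrinks far more than a whole file does, because most of a typical PDF's bytes are images that are already compressed.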


Why some PDFs don't compress much

If you run a PDF through a compressor and the result is nearly the same size, one of these is usually why:

  1. The PDF is already optimized. Modern PDF exporters (Adobe, Microsoft, Apple Preview) do a good job optimizing on export. Running a well-made PDF through another compressor has little to gain.

  2. It contains mostly text and vectors. Text is tiny — a 10,000-word PDF with no images might be 100 KB. There's nothing to recompress.

  3. The images are already compressed. If the embedded JPEGs were already saved at quality 60, recompressing at quality 60 saves nothing (and may even add overhead).

  4. The images are vectors. Logos and diagrams drawn as vector paths (for example, artwork exported from an SVG source) cannot be lossy-compressed. They're mathematical descriptions, not pixel grids.
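The "already optimized" case is easy to reproduce with deflate alone. This sketch compresses some synthetic text once, then compresses the result again. The vocabulary and word count are arbitrary choices for the demo, and it models only lossless recompression, not JPEG re-encoding.

```python
import random
import zlib

rng = random.Random(1)  # fixed seed for repeatable output

# Synthetic document text built from a small made-up vocabulary.
vocab = [f"word{n}" for n in range(200)]
text = " ".join(rng.choice(vocab) for _ in range(5_000)).encode()

once = zlib.compress(text, 9)    # first pass: plenty of redundancy to remove
twice = zlib.compress(once, 9)   # second pass: input is already compressed

print(f"original:         {len(text)} bytes")
print(f"compressed once:  {len(once)} bytes")
print(f"compressed twice: {len(twice)} bytes")
```

The first pass removes the redundancy; the second has almost nothing left to work with and can only add a little framing overhead, which is also how a "compressed" file can come out slightly larger.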


What "compression levels" mean in practice

Most tools offer Low / Medium / High (or similar), sometimes with a Maximum tier on top. Exact values vary by tool, but they roughly translate to:

Low compression: DPI reduction to ~150, JPEG quality around 80. Visually near-identical. Typical size reduction: 20–40%.

Medium compression: DPI to ~120, JPEG quality around 65. Slightly softer at high zoom. Typical reduction: 40–60%.

High compression: DPI to ~96, JPEG quality around 45. Visible quality loss when printed at large format. Typical reduction: 60–75%.

Maximum compression: DPI to ~72, JPEG quality around 30. Significant quality loss, best only for thumbnail-quality output. Typical reduction: 70–85%.

For most business documents and web delivery, Medium is the right default. Try our PDF Compressor with the Medium setting first — if the result is still too large, step up to High.
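The levels above, and the "start at Medium, step up if needed" advice, can be written down as a small lookup. This is a sketch using the approximate figures from this article, not the configuration of any specific tool; `Preset`, `PRESETS`, `choose_level`, and the example sizes are hypothetical names and numbers.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Preset:
    target_dpi: int    # images above this resolution get downsampled
    jpeg_quality: int  # quality used when re-encoding images

# Approximate values from the level descriptions above, gentlest first.
PRESETS = {
    "low": Preset(target_dpi=150, jpeg_quality=80),
    "medium": Preset(target_dpi=120, jpeg_quality=65),
    "high": Preset(target_dpi=96, jpeg_quality=45),
    "maximum": Preset(target_dpi=72, jpeg_quality=30),
}

def choose_level(result_sizes: dict[str, int], limit_bytes: int,
                 start: str = "medium") -> str:
    """Return the gentlest level, from `start` upward, whose output fits the limit.

    `result_sizes` maps level name to the measured output size in bytes.
    Falls back to the strongest level if nothing fits.
    """
    levels = list(PRESETS)
    for name in levels[levels.index(start):]:
        if result_sizes[name] <= limit_bytes:
            return name
    return levels[-1]

# Made-up output sizes (in KB) for one document at each level.
measured = {"low": 900, "medium": 700, "high": 450, "maximum": 300}
print(choose_level(measured, limit_bytes=500))
print(PRESETS["medium"])
```

Picking the gentlest level that meets the size target keeps as much image quality as the limit allows.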


Does compression affect text quality?

No. PDF text is stored as character codes, font references, and glyph positions: a mathematical description, not a pixel grid. Compressing the images has no effect on it, so text in a PDF stays perfectly sharp at any zoom level regardless of compression level.

This is why scanned PDFs and digital PDFs behave so differently. A scanned PDF is made almost entirely of images: the "text" you see is actually pixels in an embedded image, unless OCR has added a hidden text layer. A digital PDF (created by saving/exporting from Word, for example) has real text data that's immune to image compression.


FAQ

Can I undo PDF compression? Lossy image compression within a PDF cannot be undone: the discarded pixel data is gone. Always keep the original file before compressing. Lossless compression (deflate/ZIP) is technically reversible, but most compression tools don't offer a way to "decompress" a PDF, and doing so would not bring back anything lost to lossy steps.

Will PDF compression affect my digital signatures? Modifying a signed PDF invalidates the signature, since the signature cryptographically hashes the document's contents. Compress your PDF before signing, not after.
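The reason is visible with a plain hash. The sketch below is only an illustration of hash sensitivity: real PDF signatures hash a specific byte range of the file and wrap the result in a cryptographic signature structure, and the byte strings here are placeholders rather than valid PDF data.

```python
import hashlib

# Placeholder bytes standing in for the file as it was when signed.
signed_bytes = b"%PDF-1.7 placeholder contract body"
digest_at_signing = hashlib.sha256(signed_bytes).hexdigest()

# Any rewrite of the file changes the bytes, even if the visible
# content is the same. Here a single extra space stands in for that.
rewritten_bytes = signed_bytes.replace(b"contract", b"contract ")
digest_after = hashlib.sha256(rewritten_bytes).hexdigest()

print(digest_at_signing)
print(digest_after)
print("still matches:", digest_at_signing == digest_after)
```

A verifier recomputes the digest from the file it receives, so a compressed copy no longer matches what was signed.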

Why did my compressed file come out larger? This happens when the PDF is already well-optimized (modern PDF from Word or Adobe), contains mostly text and vectors, or contains images already at low quality. The compressor added overhead without finding much to remove.

What's the maximum size I can compress? File size limits depend on the tool. MyEasyTools PDF Compressor handles up to 25 MB for free users and 200 MB for Pro. There's no page limit.

Is online PDF compression safe? With reputable tools, yes. MyEasyTools processes files in-memory — nothing is written to disk or stored after your session. Always check the privacy policy of any tool you upload documents to, especially if the PDFs contain personal or sensitive information.
