PDF to Markdown

Extract PDF text with heading detection. Best for text-heavy documents.

How heading detection works

pdf.js exposes the font size of every text run. The tool clusters sizes:

  1. Find the most common (modal) size — that's body text
  2. Sizes ~1.8× the body size → H1, ~1.4× → H2, ~1.2× → H3
  3. Anything below body size → footnotes / small text
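
The steps above can be sketched roughly as follows. This is a minimal illustration, not the tool's actual code: the function names, the 0.5 pt rounding, and the cutoff values (midpoints between the stated ratios) are all assumptions. It also assumes the font size is read from the vertical scale of a text item's `transform` matrix, which is one common way to derive it from pdf.js text content.

```typescript
// Hypothetical sketch: names, rounding, and cutoffs are illustrative assumptions.
type Level = "h1" | "h2" | "h3" | "body" | "small";

// Minimal shape of a text run; pdf.js text items carry a 6-element transform.
interface TextRun {
  transform: number[]; // [a, b, c, d, e, f]
}

// Assumed derivation: font size as the vertical scale of the transform.
function fontSizeOf(run: TextRun): number {
  const c = run.transform[2];
  const d = run.transform[3];
  return Math.sqrt(c * c + d * d);
}

// Step 1: the most common size, rounded to 0.5 pt to absorb float noise.
function modalSize(sizes: number[]): number {
  const counts: Record<string, number> = {};
  for (let i = 0; i < sizes.length; i++) {
    const key = String(Math.round(sizes[i] * 2) / 2);
    counts[key] = (counts[key] || 0) + 1;
  }
  let best = 0;
  let bestCount = 0;
  const keys = Object.keys(counts);
  for (let i = 0; i < keys.length; i++) {
    if (counts[keys[i]] > bestCount) {
      best = Number(keys[i]);
      bestCount = counts[keys[i]];
    }
  }
  return best; // 0 when the input is empty
}

// Steps 2 and 3: classify a run by its ratio to the body size.
function classify(size: number, body: number): Level {
  if (body <= 0) return "body"; // no body size known, so do not guess
  const ratio = size / body;
  if (ratio >= 1.6) return "h1"; // around 1.8x
  if (ratio >= 1.3) return "h2"; // around 1.4x
  if (ratio >= 1.1) return "h3"; // around 1.2x
  if (ratio < 0.95) return "small"; // below body size
  return "body";
}
```

With a body size of 10, sizes 18, 14, and 12 fall into H1, H2, and H3 under these cutoffs, and 8 is treated as small text. Real documents rarely hit the ratios exactly, which is why the sketch uses ranges rather than exact matches.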

This works well for documents with a consistent typographic hierarchy. Magazines, decorative layouts, and PDFs assembled from inconsistently styled sources may need manual cleanup.

What's preserved

What's not

Privacy

The PDF is parsed in your browser via pdf.js. Nothing is uploaded.