Run OCR on every page and output a PDF that looks the same but has a hidden, searchable text layer.
Drop a scanned PDF here or click to browse
The first run downloads language data (~10-30 MB); subsequent runs are faster.
A searchable PDF looks identical to the original scan but has invisible text positioned over each scanned word. Search tools (Ctrl-F in Acrobat / Preview / browsers), copy-paste, and screen readers can all access the text.
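As a rough illustration of what "positioned over each scanned word" involves, here is a minimal sketch of the coordinate conversion such a tool might perform. It is not taken from this tool's source. It assumes the OCR engine reports each word as a pixel box with a top-left origin (the `x0`/`y0`/`x1`/`y1` shape is an assumption), and that the page is placed in the PDF at a known scan resolution. The names `PixelBox`, `TextPlacement`, and `toPdfPlacement` are hypothetical.

```typescript
// Word bounding box in image pixels, origin at the top-left corner.
// (Assumed shape; the real OCR output may differ.)
interface PixelBox {
  x0: number; // left edge
  y0: number; // top edge
  x1: number; // right edge
  y1: number; // bottom edge
}

// Where to draw the invisible word in PDF user space:
// units are points (1/72 inch), origin at the bottom-left corner.
interface TextPlacement {
  x: number;        // left edge of the word
  y: number;        // bottom edge of the word
  width: number;    // target width the text should span
  fontSize: number; // approximated by the box height
}

function toPdfPlacement(
  box: PixelBox,
  imageHeightPx: number,
  dpi: number,
): TextPlacement {
  // Convert pixels to points: 72 points per inch, `dpi` pixels per inch.
  const pxToPt = (px: number): number => (px * 72) / dpi;

  return {
    x: pxToPt(box.x0),
    // Flip the vertical axis: image y grows downward, PDF y grows upward.
    y: pxToPt(imageHeightPx - box.y1),
    width: pxToPt(box.x1 - box.x0),
    fontSize: pxToPt(box.y1 - box.y0),
  };
}

// Example: a 300 dpi scan of an 11-inch-tall page (3300 px high).
const placement = toPdfPlacement(
  { x0: 300, y0: 300, x1: 600, y1: 350 },
  3300,
  300,
);
console.log(placement);
```

A PDF writer would then draw the recognized word at `(x, y)` without painting it (the PDF format has an invisible text rendering mode for this purpose), stretching or shrinking it to `width` so that selection highlights line up with the scanned word.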
Tesseract.js runs entirely in your browser via WebAssembly. Language model files come from cdnjs. Your PDF and the OCR'd text never leave the page.