Convert · Image to text (OCR)

Receipt photo in, searchable text out.

Drop in a JPG / PNG of a printed receipt and the tool runs Tesseract OCR locally in your browser. Get clean plain text, Markdown, or structured JSON with auto-detected vendor, dates, amounts, tax lines, and card-ending fields.

No signup, ever · 100% local, nothing uploaded · Auto-detected fields
9 · Language packs
Auto · Fields detected
Local · 100% in browser
Free · Always, no signup

01 — What you create

Receipt photo in, searchable text out.

Tesseract OCR runs locally in your browser, returns the recognised text, and the tool auto-detects the receipt fields most expense systems care about — vendor, date, amounts, tax, card ending. Export as plain text, Markdown, or structured JSON.

OCR Form · English · receipt mode

Source: IMG_2031.jpg · 3024×4032 px · 1.8 MB
Language: English
Post-processing: Receipt mode (tidy + group breaks)
Output format: Plain text (.txt)
Confidence: 92% · 142 words
Detected vendor: BOMBAY CANTEEN
Detected total: INR 7,209.80
File-name base: receipt-2026-05-23

Output · 142 words · 5 fields

receipt-2026-05-23.txt · Searchable

142 words · 92% confidence · 5 detected fields

eng · receipt mode

EXTRACTED TEXT

BOMBAY CANTEEN
Kamala Mills, Mumbai 400013
GSTIN 27ABCDE1234F1Z9

Bill no: BC-09812
Date: 05-May-2026  Table 14

Tasting menu          1   4,200.00
Wine pairing          2     980.00
Sparkling water       1     250.00
Service charge       --     680.00

Subtotal                  6,110.00
CGST 9%                     549.90
SGST 9%                     549.90
TOTAL                     7,209.80

Card ending 4421
Paid · 05-May-2026 21:14

DETECTED FIELDS

Vendor: BOMBAY CANTEEN
Date: 05-May-2026
Total: INR 7,209.80
Tax: CGST 9% · SGST 9%
Card: ···· 4421

Need more power?

When this tool isn't enough, pdfFiller takes over.

Scanned invoices, multi-page batches, multi-currency stacks, and direct push into your accounting system. Free for 30 days, no card required.


02 — How it works

From image bytes to readable text.

Most "OCR" tools want a signup. This one runs the open-source Tesseract recognition engine entirely in your browser via WebAssembly. The image, the recognition, and the output text all stay on your machine — useful for receipts that you don’t want sitting in a third party’s logs.

01

Drop the receipt

Drag a JPG / PNG of a printed receipt — phone photo, scan, screenshot — into the picker. The image stays on your machine.

02

Pick language + extract

Choose the language (English by default; English plus a second language is available for multilingual receipts) and tap "Extract". Tesseract OCR runs locally and shows a progress bar while it works.

03

Copy or download

The extracted text appears immediately, with auto-detected vendor, date, amounts, and card-ending fields shown alongside. Copy to clipboard or download as .txt / .md / .json.

03 — Built for receipts

Read the receipt — properly.

In-browser Tesseract OCR

Open-source Tesseract recognition engine runs entirely client-side via WebAssembly. The image bytes never touch a server.

9 language packs

English by default, plus French / German / Spanish / Italian / Hindi / Portuguese / Japanese / Simplified Chinese paired with English for multilingual receipts.
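
As a rough illustration of the pairing, the sketch below maps the nine languages listed above to their standard Tesseract language codes and builds an "English + second language" string. The `languageString` helper is hypothetical and not taken from the tool; the `+` separator follows Tesseract's usual convention for combining languages.

```typescript
// Standard Tesseract language codes for the nine languages listed above.
const LANGUAGE_CODES = {
  English: "eng",
  French: "fra",
  German: "deu",
  Spanish: "spa",
  Italian: "ita",
  Hindi: "hin",
  Portuguese: "por",
  Japanese: "jpn",
  "Simplified Chinese": "chi_sim",
} as const;

type Language = keyof typeof LANGUAGE_CODES;

// Hypothetical helper: English alone by default, or English paired
// with one second language, joined with "+" as Tesseract expects.
function languageString(second?: Language): string {
  if (second === undefined || second === "English") {
    return LANGUAGE_CODES.English;
  }
  return `${LANGUAGE_CODES.English}+${LANGUAGE_CODES[second]}`;
}
```

For a Hindi small-shop receipt, `languageString("Hindi")` gives `eng+hin`.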

Auto-detected fields

Quick-and-pragmatic regex passes surface vendor (first non-numeric line), dates, currency amounts, tax lines, and card-ending digits.
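
A minimal sketch of what such regex passes could look like follows. Every pattern and name here (`detectFields`, `DATE_RE`, `AMOUNT_RE`, `TAX_RE`, `CARD_RE`) is an assumption made for illustration; the tool's actual expressions are not published on this page.

```typescript
interface DetectedFields {
  vendor: string | null;
  dates: string[];
  amounts: string[];
  taxLines: string[];
  cardEnding: string | null;
}

// Hypothetical patterns, deliberately simple.
const DATE_RE = /\b\d{1,2}[-\/.](?:[A-Za-z]{3,9}|\d{1,2})[-\/.]\d{2,4}\b/g;
const AMOUNT_RE = /\b\d[\d,]*\.\d{2}\b/g;
const TAX_RE = /\b(?:CGST|SGST|IGST|GST|VAT|tax)\b/i;
const CARD_RE = /\bending(?:\s+in)?\s+(\d{4})\b/i;

// Keep the first occurrence of each value, preserving order.
function unique(values: string[]): string[] {
  return values.filter((v, i) => values.indexOf(v) === i);
}

function detectFields(text: string): DetectedFields {
  const lines = text
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0);

  return {
    // Vendor: the first line that contains no digits at all.
    vendor: lines.find((line) => !/\d/.test(line)) ?? null,
    dates: unique(text.match(DATE_RE) ?? []),
    amounts: text.match(AMOUNT_RE) ?? [],
    taxLines: lines.filter((line) => TAX_RE.test(line)),
    cardEnding: text.match(CARD_RE)?.[1] ?? null,
  };
}
```

Patterns this loose will misfire on unusual layouts, which is why the detected fields are best treated as suggestions.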

Receipt-aware tidying

Default post-processing collapses Tesseract's noisy whitespace runs and groups blank-line breaks so the output reads like the source receipt did.
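
The tidying step can be sketched roughly as below. `tidyReceiptText` is a hypothetical name, and the exact rules are assumptions: each run of spaces or tabs becomes a single space, and consecutive blank lines become one. The real tool may treat column alignment differently.

```typescript
// Hypothetical sketch of receipt-aware tidying.
function tidyReceiptText(raw: string): string {
  const lines = raw
    .split(/\r?\n/)
    // Collapse runs of spaces and tabs, then trim each line.
    .map((line) => line.replace(/[ \t]+/g, " ").trim());

  const out: string[] = [];
  for (const line of lines) {
    const previousBlank = out.length > 0 && out[out.length - 1] === "";
    // Skip leading blank lines and keep at most one blank line between groups.
    if (line === "" && (out.length === 0 || previousBlank)) continue;
    out.push(line);
  }

  // Drop a trailing blank line, if one remains.
  if (out.length > 0 && out[out.length - 1] === "") out.pop();
  return out.join("\n");
}
```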

Three output formats

Plain text (.txt) for copy-paste, Markdown (.md) for readable archives with structured field summaries, or JSON (.json) for machine consumption.
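
To make the difference between the three formats concrete, here is a small sketch that renders one result in each. The `ReceiptResult` shape, the `renderOutput` name, and the Markdown layout are all assumptions for illustration; the tool's real file layouts may differ.

```typescript
type OutputFormat = "txt" | "md" | "json";

// Hypothetical result shape: recognised text plus detected fields.
interface ReceiptResult {
  text: string;
  confidence: number;
  fields: Record<string, string>;
}

function renderOutput(result: ReceiptResult, format: OutputFormat): string {
  switch (format) {
    case "txt":
      // Plain text: just the recognised text, ready for copy-paste.
      return result.text;
    case "md": {
      // Markdown: a field summary followed by the full text.
      const rows = Object.keys(result.fields).map(
        (name) => `- **${name}:** ${result.fields[name]}`
      );
      return ["## Detected fields", "", ...rows, "", "## Extracted text", "", result.text].join("\n");
    }
    case "json":
      // JSON: the whole result object, for machine consumption.
      return JSON.stringify(result, null, 2);
  }
}
```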

Confidence score

Tesseract returns a 0–100 confidence score for each recognition. The tool shows it as green above 80, amber between 60 and 80, and red below 60, so you can tell at a glance whether the OCR is trustworthy.
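
The colour banding amounts to two threshold checks, sketched below. `confidenceBand` is a hypothetical name, and the handling of the exact boundary values (80 counts as amber, 60 as red) is an assumption, since the description only says "above".

```typescript
type ConfidenceBand = "green" | "amber" | "red";

// Hypothetical mapping from a 0-100 confidence score to a colour band.
// Assumption: exactly 80 is amber and exactly 60 is red.
function confidenceBand(confidence: number): ConfidenceBand {
  if (confidence > 80) return "green";
  if (confidence > 60) return "amber";
  return "red";
}
```

The 92% score in the sample output above would land in the green band.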

pdfFiller · 30-Day Free Trial

When one-off documents aren't enough.

Bulk OCR, batch invoicing, multi-party e-signing, redaction, audit logs — pdfFiller picks up where Sonchoy ends. Free for 30 days, no credit card.


Batch & bulk

Run 100+ invoices, statements, or conversions in one go.

OCR scanned PDFs

Turn paper invoices into searchable, exportable data.

E-sign & request

Multi-party signatures with full audit trails.

Redact & approve

Mask sensitive ledger lines before sending to auditors.

04 — Common questions

Everything about receipt OCR.

01 · Why is the first run slow?

The first "Extract" downloads the Tesseract OCR engine (~3 MB of WebAssembly) and the language data (~5 MB for English; multilingual packs are larger) from a public CDN. After that, the engine is cached in the browser and subsequent runs are fast (typically 2–6 seconds per receipt depending on image size).

02 · How accurate is the recognition?

For clean, well-lit, printed receipts, confidence is typically 85–95%. Phone-camera photos taken under good lighting do well. Accuracy drops substantially on crumpled receipts, thermal paper whose print has faded, and handwritten notes; Tesseract is not great at handwriting. The confidence score on the output indicates how much to trust the result.

03 · Does it handle handwritten receipts?

Poorly. Tesseract is trained on printed text; handwriting recognition is a separate, harder problem that needs different models, such as those behind Google Cloud Vision, the Microsoft Read API, or Amazon Textract. For handwritten receipts, the pdfFiller premium tier uses cloud-grade OCR with much better handwriting support.

04 · Which language pack should I pick?

English is the right default for most receipts worldwide, since brand names and the labels around the numbers are often in English. Switch to "English + <language>" when the receipt has substantial non-English text: French for Paris cafe receipts, German for Berlin restaurant receipts, Hindi for some Indian small-shop receipts. Multilingual packs are bigger and slower on first load.

05 · Are the detected fields always correct?

No — they're quick-and-pragmatic regex passes, not a fine-tuned receipt parser. Treat them as suggestions to pre-fill an expense report row, not as the authoritative answer. Always glance at the full extracted text before using the detected vendor / total / date.

06 · Does my data leave the browser?

Tesseract.js runs entirely in your browser via WebAssembly. The OCR engine itself is loaded from a public CDN (jsDelivr) on first use — that's a one-time engine download, not an image upload. Your receipt image bytes, the recognition pass, the detected fields, and the output text all stay on your machine.
