Photo of an invoice in,
structured .xlsx out.

Drop a JPG / PNG of a printed invoice. Tesseract OCR runs in your browser, then a field detector pulls out invoice number, PO ref, dates, vendor, buyer, tax IDs, subtotal, tax, and total — all into a clean three-sheet .xlsx (Summary · Amounts · Raw text).

No signup, ever · 100% local, nothing uploaded · Three-sheet workbook

9 language packs · Header fields auto-extracted · 100% in browser · Always free, no signup

01 — What you create

Invoice photo in, three-sheet workbook out.

Tesseract OCR runs locally, a field detector pulls out invoice number / PO / dates / vendor / buyer / tax IDs / subtotal / tax / total, and the result is a three-sheet .xlsx workbook (Summary, Amounts, Raw text) — ready for an accountant to review.

Invoice OCR Form (English · 91% OCR)

Source: INV-2026-0042.jpg · 2480×3508 · 1.4 MB
Language: English
Invoice number: INV-2026-0042 (detected)
Issue date: 23 May 2026 (detected)
Vendor: Sonchoy Studio Pvt Ltd
Total: INR 6,69,720
Tax ID: GST 29ABCDE1234F1Z5
Output base: invoice-INV-2026-0042
Output: 3-sheet .xlsx
OUTPUT.XLSX · Accountant-ready

invoice-INV-2026-0042.xlsx

3 sheets · Summary · Amounts · Raw text

field-fill 82% · OCR 91%

SUMMARY SHEET

Invoice number: INV-2026-0042
PO number: PO-NWB-019
Issue date: 23 May 2026
Due date: 06 Jun 2026
Vendor: Sonchoy Studio Pvt Ltd
Buyer: Northwind Books Pvt Ltd
Tax ID: GST 29ABCDE1234F1Z5
Subtotal: INR 5,68,200
Tax: INR 1,01,520
Total: INR 6,69,720

02 — How it works

From paper invoice to working workbook.

Most vendor invoices arrive as PDFs you can copy text from; for those, the standard Invoice PDF → Excel tool is the right fit. This tool is for the harder case: a scanned or photographed paper invoice with no text layer. OCR plus a field detector gets you roughly 80% of the way; spot-check the rest.

01

Drop the invoice image

JPG / PNG of a printed invoice — phone photo or scan. The image stays on your machine.

02

OCR + detect fields

Tesseract reads the text locally; a field detector pulls out invoice #, dates, vendor, buyer, tax IDs, subtotal / tax / total. A confidence score is shown for each of the two passes.

03

Export the workbook

One click writes a three-sheet .xlsx: Summary (header fields), Amounts (every detected currency value), Raw text (full OCR output, one line per row).

03 — Built for AP teams

Paper invoice — digital row.

In-browser OCR

Tesseract runs locally via WebAssembly. Your invoice image bytes never touch a server.

Invoice-aware field detector

Regex passes for invoice number, PO ref, issue date, due date, vendor, buyer, GST / VAT / EIN / TIN / PAN tax IDs, contact info, subtotal, tax, and total.
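
As a rough illustration of what a regex pass of this kind could look like, here is a minimal Python sketch. The tool itself runs in the browser and its actual patterns are not published, so every pattern, every field name, and the `detect_fields` helper below are assumptions made for the example. Only a handful of the listed fields are covered.

```python
import re

# Hypothetical label patterns, one per field. Each captures the value in group 1.
# These are illustrative only; they are not the tool's real patterns.
FIELD_PATTERNS = {
    "invoice_number": re.compile(
        r"invoice\s*(?:#|no\.?|number)\s*:?\s*([A-Z0-9][A-Z0-9/-]*)", re.I),
    "po_number": re.compile(
        r"\bPO\b\s*(?:#|no\.?|number|ref\.?)?\s*:?\s*([A-Z0-9][A-Z0-9/-]*)", re.I),
    "issue_date": re.compile(
        r"^(?:invoice\s+|issue\s+)?date\s*:?\s*(\d{1,2}\s+[A-Za-z]{3,9}\s+\d{4})", re.I),
    "due_date": re.compile(
        r"^due\s+date\s*:?\s*(\d{1,2}\s+[A-Za-z]{3,9}\s+\d{4})", re.I),
    "tax_id": re.compile(
        r"\b(?:GSTIN|GST|VAT|EIN|TIN|PAN)\b\s*(?:no\.?|number|#)?\s*:?\s*([A-Z0-9][A-Z0-9-]{4,})",
        re.I),
    "total": re.compile(r"^(?:grand\s+)?total\s*:?\s*(\S.*)$", re.I),
}

def detect_fields(raw_text: str) -> dict:
    """Scan OCR text line by line; the first match for each field wins."""
    found = {}
    for line in raw_text.splitlines():
        line = line.strip()
        for name, pattern in FIELD_PATTERNS.items():
            if name in found:
                continue
            match = pattern.search(line)
            if match:
                found[name] = match.group(1).strip()
    return found

# Sample OCR output, modelled on the example invoice shown earlier on this page.
SAMPLE_TEXT = """Sonchoy Studio Pvt Ltd
Invoice #: INV-2026-0042
PO Ref: PO-NWB-019
Date: 23 May 2026
Due date: 06 Jun 2026
GSTIN: 29ABCDE1234F1Z5
Subtotal: INR 5,68,200
Total: INR 6,69,720"""

for field, value in detect_fields(SAMPLE_TEXT).items():
    print(f"{field}: {value}")
```

Label-anchored patterns like these only find a field when the invoice prints a recognisable label next to it, which is why layouts with unusual labels or positions tend to need manual review.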

Three-sheet workbook

Summary (header fields ready to import), Amounts (every detected currency value with position), Raw text (one row per OCR line for audit).
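
To make the three-sheet layout concrete, the sketch below assembles the sheets as plain Python rows. The column headers, the `build_sheets` helper, and the row shapes are assumptions for illustration; the tool's real column layout is not documented here.

```python
def build_sheets(summary: dict, amounts: list, raw_text: str) -> dict:
    """Return sheet name -> list of rows, in the order Summary, Amounts, Raw text."""
    return {
        # One row per header field.
        "Summary": [["Field", "Value"]] + [[k, v] for k, v in summary.items()],
        # One row per detected currency value, with its position (here: line number).
        "Amounts": [["Line", "Currency", "Value"]] + [list(row) for row in amounts],
        # One row per OCR line, kept verbatim for audit.
        "Raw text": [[line] for line in raw_text.splitlines()],
    }

sheets = build_sheets(
    summary={"Invoice number": "INV-2026-0042", "Total": 669720},
    amounts=[(2, "INR", 669720)],
    raw_text="Invoice #: INV-2026-0042\nTotal: INR 6,69,720",
)
for name, rows in sheets.items():
    print(name, len(rows))
```

A spreadsheet library would then write each entry as one worksheet. Which library the tool uses for that step is not stated on this page.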

Number-typed cells

Subtotal, tax, and total land as real number cells in the Summary sheet, so SUM and AVERAGE formulas just work.
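
Getting a real number out of OCR text such as "INR 6,69,720" means separating the currency from the digits and dropping the grouping commas. The Python sketch below shows one way to do that. The `split_amount` helper and its pattern are hypothetical, and it assumes commas are grouping separators and a dot is the decimal mark, which does not hold for every locale.

```python
import re
from decimal import Decimal

# Currency prefix (3-letter code or a common symbol), then digits with optional
# comma grouping and an optional decimal part. Illustrative pattern only.
AMOUNT_RE = re.compile(r"^\s*([A-Z]{3}|[$€£₹])\s*(\d[\d,]*(?:\.\d+)?)\s*$")

def split_amount(text: str):
    """Split 'INR 6,69,720' into a currency string and a Decimal value."""
    match = AMOUNT_RE.match(text)
    if match is None:
        raise ValueError(f"not a currency-prefixed amount: {text!r}")
    currency, digits = match.groups()
    # Removing commas handles both 1,234,567 and Indian-style 12,34,567 grouping.
    return currency, Decimal(digits.replace(",", ""))

subtotal = split_amount("INR 5,68,200")[1]
tax = split_amount("INR 1,01,520")[1]
total = split_amount("INR 6,69,720")[1]
print(subtotal + tax == total)
```

Keeping the currency as text in a neighbouring cell and the value as a number is what lets spreadsheet formulas operate on the amount directly.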

Two confidence scores

OCR confidence (how confident Tesseract is about the recognition) and field-fill confidence (how many of the key invoice fields were populated).
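
A field-fill score of this kind can be sketched as the share of key fields that came back non-empty. The field list and the equal weighting below are assumptions; the page does not document which fields count as "key" or how the tool weighs them.

```python
# Hypothetical list of key fields; the tool's actual list may differ.
KEY_FIELDS = (
    "invoice_number", "po_number", "issue_date", "due_date", "vendor",
    "buyer", "tax_id", "subtotal", "tax", "total",
)

def field_fill_confidence(summary: dict) -> int:
    """Percentage of key fields that were populated, rounded to a whole number."""
    filled = 0
    for name in KEY_FIELDS:
        value = summary.get(name)
        if value is not None and str(value).strip() != "":
            filled += 1
    return round(100 * filled / len(KEY_FIELDS))

detected = {
    "invoice_number": "INV-2026-0042",
    "issue_date": "23 May 2026",
    "due_date": "06 Jun 2026",
    "vendor": "Sonchoy Studio Pvt Ltd",
    "tax_id": "29ABCDE1234F1Z5",
    "subtotal": 568200,
    "tax": 101520,
    "total": 669720,
    "buyer": "",  # detected label but empty value: counts as unfilled
}
print(field_fill_confidence(detected))
```

Note that a score like this measures coverage, not correctness: a field filled with the wrong value still counts as filled, so the Summary sheet needs a human review either way.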

100% in browser

Image, OCR, field detection, and workbook assembly all run locally. Tesseract.js loads from a public CDN on first use; that's the only network step.

04 — Common questions

Everything about invoice OCR.

01 How is this different from Invoice PDF → Excel?

Invoice PDF → Excel works on PDFs with a text layer (most digitally created invoices). It's fast and accurate. This OCR tool works on images and scanned invoices that have no text layer, so it has to read the pixels first. OCR is slower and less accurate than reading a text layer, so use the PDF tool whenever possible and only fall back to this one for true paper invoices.

02 How accurate is the field detection?

On clean printed invoices, 80–90% of key fields land correctly on the first pass. Invoice number, issue date, vendor, and total are usually right. Buyer, tax IDs, and PO refs depend heavily on the invoice layout; some templates put them in places the regex passes don't look. Always review the Summary sheet before importing into accounting.

03 What invoice layouts work best?

Clean printed invoices with labelled fields ("Invoice #:", "Date:", "Total:") work very well. Highly stylised "designy" invoices with non-standard labels score lower. Scanned faxed invoices score lower still. Phone-photo invoices under good lighting are fine; tilted, blurry, or shadow-heavy phone photos significantly hurt OCR accuracy.

04 Are subtotal / tax / total real number cells in the output?

Yes. The Summary sheet stores them as number-typed cells, so SUM and AVERAGE formulas work. The currency is captured in the adjacent cell as text. If the detector got confused (e.g., picked up a line-item total instead of the grand total), the value will be wrong, but the cell will still be number-typed.

05 What does the Amounts sheet contain?

Every currency-prefixed value the OCR detected, with its source position in the raw text. Useful for cross-checking: if the Summary sheet shows the wrong total, look at the Amounts sheet to find the correct one and copy it over.
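
The cross-check described above depends on listing every currency-prefixed value along with where it appeared. A minimal Python sketch of such a scan follows. The short currency list, the use of the line number as the "position", and the `find_amounts` helper are all assumptions for illustration.

```python
import re

# A small, hypothetical set of currency prefixes. Restricting to known codes
# avoids treating labels such as "GST 29ABC..." as amounts.
AMOUNT_RE = re.compile(
    r"(?<![A-Za-z])(INR|USD|EUR|GBP|[$€£₹])\s?(\d+(?:,\d+)*(?:\.\d+)?)"
)

def find_amounts(raw_text: str) -> list:
    """Return (line_number, currency, value_text) for every prefixed amount."""
    rows = []
    for line_no, line in enumerate(raw_text.splitlines(), start=1):
        for match in AMOUNT_RE.finditer(line):
            rows.append((line_no, match.group(1), match.group(2)))
    return rows

OCR_LINES = """GSTIN: 29ABCDE1234F1Z5
Design retainer INR 5,00,000
Subtotal: INR 5,68,200
Tax: INR 1,01,520
Total: INR 6,69,720"""

for row in find_amounts(OCR_LINES):
    print(row)
```

With every candidate listed next to its line number, a reviewer can see at a glance whether the Summary sheet picked a line-item amount instead of the grand total.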

06 Does my data leave the browser?

Tesseract.js runs entirely in your browser via WebAssembly. The OCR engine itself is loaded from a public CDN (jsDelivr) on first use — that's a one-time engine download, not an image upload. Your invoice image, the recognition pass, the field detection, and the output workbook all stay on your machine.
