
Every table in a PDF, extracted
with types intact.

Drop any text-based PDF and the tool detects every table on every page — preserving headers, column types, and numeric formatting. Export to a proper .xlsx workbook (one sheet per page, or flattened) ready to sort, pivot, and edit.

No signup, ever · 100% local, nothing uploaded · Number-typed cells
Auto: Number cells detected
Multi: Sheet-per-page mode
Local: 100% in browser
Free: Always, no signup

01 — What you create

Tabular PDF in, editable workbook out.

Auto-detect rows and columns from any text-based PDF, then export a real .xlsx — not just a CSV with a different extension. Numbers come through as number cells; one sheet per page or all pages flattened, your choice.

XLSX Form
All pages · auto numbers
Source PDF: vendor-ledger-q1.pdf · 7 pages
Page mode: All pages → one sheet
Row tolerance: Normal (4pt)
Column tolerance: Normal (10pt)
Header row: First detected row
Number cells: Auto-detect (recommended)
Output base: vendor-ledger-q1
Output: 1 .xlsx · 2 sheets (142 rows × 6 cols + _meta)
OUTPUT.XLSX
Editable

vendor-ledger-q1.xlsx

Extracted from 7-page PDF · 142 rows · 6 columns · numbers as numbers

Sheet: Extracted

Date      | Vendor            | Invoice #   |  Amount |    GST |   Total
02-Apr-26 | Westline Hardware | WL-2604-022 | 142,200 | 25,596 | 167,796
04-Apr-26 | BlueDart Surface  | BD-0408-117 |   4,500 |    810 |   5,310
08-Apr-26 | Crossword Books   | CW-0418-088 |   2,240 |      0 |   2,240
12-Apr-26 | IndiGo Airlines   | IG-7741     |   8,420 |  1,180 |   9,600
15-Apr-26 | Trident Hotels    | TR-2025-44  |  18,900 |  2,268 |  21,168
18-Apr-26 | Adobe Inc         | ADOBE-4421  |   1,240 |    223 |   1,463
22-Apr-26 | AWS Marketplace   | AWS-MAY-26  |  11,200 |  2,016 |  13,216

+ 135 more rows · numeric cells are real numbers (SUM, AVG, sort all work)

Need more power?

When this tool isn't enough, pdfFiller takes over.

Scanned invoices, multi-page batches, multi-currency stacks, and direct push into your accounting system. Free for 30 days, no card required.


02 — How it works

From tabular PDF to working spreadsheet.

The difference between a CSV export and a proper .xlsx is whether the recipient can immediately SUM a column. This tool coerces numbers to number cells so the workbook is genuinely editable — pivot it, sort it, chart it without manual re-typing.

01

Drop the PDF

Drag a ledger, statement, report, or vendor invoice in. The tool reads its text layer locally with pdfjs — nothing uploads.

02

Tune the detection

Live preview shows the table the moment you change settings. Cells flagged in green become real number cells in the output.

03

Export the workbook

One click writes a proper .xlsx — one sheet per page or all pages combined — with a _meta sheet recording how the extraction was configured.

03 — Built for accounting

Real .xlsx — not CSV in disguise.

Real number cells

Currency-prefixed, comma-thousands, EU-decimal, accounting-negative, percentage — all detected and upgraded to number-typed cells. SUM and AVG work without manual cleanup.
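
As a rough illustration of the kind of coercion described above, here is a minimal TypeScript sketch. The function name `coerceNumber` and its exact rules are assumptions made for this example, not the tool's actual code. It handles currency prefixes, comma thousands, EU decimals, accounting negatives, and percentages, and returns `null` for anything it cannot read confidently, so the cell would stay a string.

```typescript
// Hypothetical helper (not the tool's real implementation): returns a number
// when the cell text looks numeric, or null when it should stay a string.
function coerceNumber(raw: string): number | null {
  let s = raw.trim();
  let negative = false;

  // Accounting negative: "(1,200.00)" means -1200.
  const paren = s.match(/^\((.+)\)$/);
  if (paren) {
    negative = true;
    s = paren[1].trim();
  }

  // Percentage: "12.5%" becomes 0.125, the way spreadsheets store percent cells.
  let percent = false;
  if (s.charAt(s.length - 1) === "%") {
    percent = true;
    s = s.slice(0, -1).trim();
  }

  // Currency prefix: a symbol, or a three-letter code followed by whitespace.
  s = s.replace(/^(?:[A-Z]{3}\s+|[$€£₹¥]\s*)/, "");

  if (s.charAt(0) === "-") {
    negative = !negative;
    s = s.slice(1).trim();
  }

  // Only digits and separators may remain.
  if (!/^\d[\d.,]*$/.test(s)) return null;

  const lastDot = s.lastIndexOf(".");
  const lastComma = s.lastIndexOf(",");
  let normalized = s;
  if (lastDot !== -1 && lastComma !== -1) {
    // Both separators present: whichever comes last is the decimal mark.
    normalized = lastComma > lastDot
      ? s.replace(/\./g, "").replace(",", ".") // EU style: "1.234,56"
      : s.replace(/,/g, "");                   // US style: "4,521.50"
  } else if (lastComma !== -1) {
    // Commas only: groups of three are thousands, otherwise a decimal comma.
    normalized = /^\d{1,3}(,\d{3})+$/.test(s)
      ? s.replace(/,/g, "")
      : s.replace(",", ".");
  }

  // Anything still ambiguous (for example "1.234.567") is left as a string.
  if (!/^\d+(\.\d+)?$/.test(normalized)) return null;

  const n = Number(normalized);
  const value = percent ? n / 100 : n;
  return negative ? -value : value;
}
```

With these rules, "INR 4,521.50" coerces to 4521.5 and "(1,200.00)" to -1200, while an invoice number such as "WL-2604-022" stays text. A real detector would likely also need a policy for inputs like "1.200", which this sketch reads as 1.2.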

Sheet-per-page or combined

Render every page as its own named sheet (great for multi-section reports), or flatten everything into one sheet (great for long tables that span pages).
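
The two page modes can be pictured with a short TypeScript sketch. Everything here is assumed for illustration: the `PageTable` shape, the `layoutSheets` name, the "Page N" sheet names, and the choice to drop a header row that repeats on later pages when flattening. Only the "Extracted" sheet name comes from the sample output shown earlier.

```typescript
type Row = string[];

interface PageTable {
  pageNo: number; // 1-based page number
  rows: Row[];    // rows detected on that page
}

interface Sheet {
  name: string;
  rows: Row[];
}

// Hypothetical layout step: one sheet per page, or all pages flattened.
function layoutSheets(pages: PageTable[], mode: "per-page" | "combined"): Sheet[] {
  const ordered = pages.slice().sort((a, b) => a.pageNo - b.pageNo);

  if (mode === "per-page") {
    return ordered.map((p) => ({ name: "Page " + p.pageNo, rows: p.rows }));
  }

  // Combined: keep the first row seen as the header and skip identical
  // header rows repeated at the top of later pages (an assumption).
  const combined: Row[] = [];
  let header: Row | null = null;
  for (const page of ordered) {
    for (let i = 0; i < page.rows.length; i++) {
      const row = page.rows[i];
      if (header === null) {
        header = row;
        combined.push(row);
      } else if (i === 0 && row.join("\u0000") === header.join("\u0000")) {
        continue; // repeated header at the top of a continuation page
      } else {
        combined.push(row);
      }
    }
  }
  return [{ name: "Extracted", rows: combined }];
}
```

Sorting by page number first means the flattened sheet follows document order even if pages were processed out of order.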

Live preview with flagged cells

See the first page rendered as a table the moment you tune tolerances. Cells in green will become real numbers in the output — adjust before exporting.

Custom range support

Skip the cover or appendices by extracting just "2-5, 7". Column anchors are computed per page, so each section's columns line up.
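
A range string such as "2-5, 7" could be expanded into page numbers along the following lines. This is a sketch only: `parsePageRange` and its validation rules are assumptions, since the page does not say how the tool treats malformed or out-of-bounds input.

```typescript
// Hypothetical parser: expands "2-5, 7" into [2, 3, 4, 5, 7].
// Throws on malformed tokens or pages outside 1..pageCount.
function parsePageRange(spec: string, pageCount: number): number[] {
  const pages: number[] = [];
  for (const part of spec.split(",")) {
    const token = part.trim();
    if (token === "") continue; // tolerate stray commas

    const m = token.match(/^(\d+)(?:\s*-\s*(\d+))?$/);
    if (!m) throw new Error('Invalid page range: "' + token + '"');

    const start = Number(m[1]);
    const end = m[2] !== undefined ? Number(m[2]) : start;
    if (start < 1 || end < start || end > pageCount) {
      throw new Error('Page range out of bounds: "' + token + '"');
    }
    for (let p = start; p <= end; p++) {
      if (pages.indexOf(p) === -1) pages.push(p); // ignore duplicates
    }
  }
  return pages.sort((a, b) => a - b);
}
```

For the 7-page sample ledger, `parsePageRange("2-5, 7", 7)` yields pages 2, 3, 4, 5, and 7.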

_meta sheet for audit

Every output workbook includes a tiny _meta sheet recording the source file, mode, tolerance, header setting, and number-coercion choice. Reproducibility for free.
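
To make the idea concrete, the sketch below builds the rows such a sheet might hold. The key names, the `ExtractionSettings` shape, and the two-column layout are all hypothetical; the page only says which settings are recorded, not how they are laid out.

```typescript
// Hypothetical settings record; the field names are assumptions.
interface ExtractionSettings {
  sourceFile: string;
  pageCount: number;
  pageMode: string;
  rowTolerancePt: number;
  columnTolerancePt: number;
  headerRow: string;
  numberCells: string;
}

// Builds key/value rows for a _meta sheet. An array-of-arrays like this is
// the shape SheetJS accepts in XLSX.utils.aoa_to_sheet, so it could be
// appended to the workbook as the last sheet.
function buildMetaRows(s: ExtractionSettings): (string | number)[][] {
  return [
    ["key", "value"],
    ["source_file", s.sourceFile],
    ["page_count", s.pageCount],
    ["page_mode", s.pageMode],
    ["row_tolerance_pt", s.rowTolerancePt],
    ["column_tolerance_pt", s.columnTolerancePt],
    ["header_row", s.headerRow],
    ["number_cells", s.numberCells],
  ];
}

// Settings taken from the sample form shown earlier on this page.
const sampleMeta = buildMetaRows({
  sourceFile: "vendor-ledger-q1.pdf",
  pageCount: 7,
  pageMode: "All pages → one sheet",
  rowTolerancePt: 4,
  columnTolerancePt: 10,
  headerRow: "First detected row",
  numberCells: "Auto-detect",
});
```

Keeping tolerances as numbers rather than strings means the _meta sheet itself carries typed cells.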

100% in browser

PDFs and the assembled workbook never touch the network. Extraction runs via pdfjs, output assembled via SheetJS — entirely locally.

pdfFiller · 30-Day Free Trial

When one-off documents aren't enough.

Bulk OCR, batch invoicing, multi-party e-signing, redaction, audit logs — pdfFiller picks up where Sonchoy ends. Free for 30 days, no credit card.


Batch & bulk

Run 100+ invoices, statements, or conversions in one go.

OCR scanned PDFs

Turn paper invoices into searchable, exportable data.

E-sign & request

Multi-party signatures with full audit trails.

Redact & approve

Mask sensitive ledger lines before sending to auditors.

04 — Common questions

Everything about extracting workbooks.

01 How is this different from PDF to CSV?

Both use the same table detector under the hood. CSV is a flat text format; .xlsx is a real workbook with typed cells, multiple sheets, column widths, and number formats. Use PDF to CSV when the downstream tool expects CSV (most accounting systems). Use PDF to Excel when you want a real spreadsheet you can open, sort, pivot, and edit in Excel / Numbers / Google Sheets without further cleanup.

02 Why are some cells flagged green in the preview?

Those are the cells the number-coercion step will upgrade to actual number cells in the output. Currency-prefixed values ("INR 4,521.50"), accounting negatives ("(1,200.00)"), and percentages ("12.5%") all get detected and converted. Cells in default colour stay as strings.

03 Does this work on scanned PDFs?

No — this tool needs a text layer. Scanned (image-only) PDFs need an OCR step first; the pdfFiller premium tier handles that. Most modern PDFs (statements, invoices, GST returns) have text layers and work fine.

04 What's the _meta sheet?

A small one-page sheet at the end of the workbook that records the source PDF name, page count, mode, tolerance, header setting, and number-coercion choice. Useful for audit trails ("how was this extracted?") and for sharing with colleagues who need to know whether to trust the numbers.

05 My table came out badly — what should I tweak first?

Two knobs: row tolerance (if too tight, items that should share a row land on separate rows; loosen it) and column tolerance (if too tight, single columns get split into many; loosen it). The live preview shows the result instantly so you can iterate without exporting.
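
The effect of the two knobs can be seen in a small TypeScript sketch of tolerance-based grouping. This is an assumed, simplified version of such a detector, not the tool's actual algorithm. The `TextItem` shape is hypothetical; with pdfjs, the x and y values would typically come from elements 4 and 5 of each text item's `transform` array.

```typescript
// Hypothetical positioned text fragment (coordinates in PDF points).
interface TextItem {
  str: string;
  x: number;
  y: number;
}

// Groups items into rows: an item joins the current row when its y is within
// rowTolerance of the row's first item. PDF y grows upward, so sort descending.
function groupIntoRows(items: TextItem[], rowTolerance: number): TextItem[][] {
  const sorted = items.slice().sort((a, b) => b.y - a.y);
  const rows: TextItem[][] = [];
  let anchorY = 0;
  for (const item of sorted) {
    if (rows.length > 0 && Math.abs(item.y - anchorY) <= rowTolerance) {
      rows[rows.length - 1].push(item);
    } else {
      rows.push([item]);
      anchorY = item.y;
    }
  }
  for (const row of rows) row.sort((a, b) => a.x - b.x); // left to right
  return rows;
}

// Finds column anchors: a left edge starts a new column when it lies more
// than columnTolerance to the right of the previous anchor.
function columnAnchors(items: TextItem[], columnTolerance: number): number[] {
  const xs = items.map((i) => i.x).sort((a, b) => a - b);
  const anchors: number[] = [];
  for (const x of xs) {
    if (anchors.length === 0 || x - anchors[anchors.length - 1] > columnTolerance) {
      anchors.push(x);
    }
  }
  return anchors;
}

// Two fragments on one visual row with slightly different baselines,
// plus one fragment on the next row.
const sampleItems: TextItem[] = [
  { str: "02-Apr-26", x: 72, y: 700 },
  { str: "142,200", x: 300, y: 698 },
  { str: "04-Apr-26", x: 75, y: 680 },
];
const looseRows = groupIntoRows(sampleItems, 4);  // 2 rows
const tightRows = groupIntoRows(sampleItems, 1);  // 3 rows: first row splits
const looseCols = columnAnchors(sampleItems, 10); // [72, 300]
const tightCols = columnAnchors(sampleItems, 2);  // [72, 75, 300]: column splits
```

In this sample, a 4pt row tolerance absorbs the 2pt baseline difference, while a 1pt tolerance pushes the amount onto its own row. Likewise, a 2pt column tolerance treats left edges at 72 and 75 as separate columns.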

06 Does my data leave the browser?

Never. The PDF is read with pdfjs locally, tabularised in JavaScript, and serialised to .xlsx with SheetJS locally. The browser triggers the download. No upload, no third-party API, no logging.
