Core Technology

AI Reads Unstructured Data

Contracts, receipt screenshots, email confirmations, timesheet exports, scanned documents, photos: throw everything at it. The AI reads PDFs and images, understands context, and extracts clean invoice data. No organizing, no templates, no formatting required.

See It In Action

How AI Extraction Works

Our AI doesn't just scan for keywords; it reads each document in context. Here's the 5-step process:

Document Upload

You upload contracts, receipts, timesheets, or invoices. Supported formats: PDF, PNG, JPG, JPEG, and most image types. Maximum 10MB per file.

Behind the scenes: Files are uploaded to secure temporary storage (Vercel Blob) and encrypted in transit via TLS.
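The limits above can be checked before any bytes leave the browser. A minimal sketch, assuming a hypothetical validateUpload helper and an extension-based check; the full list of accepted image types is not given here, so only the four named formats are included, and "10MB" is read as 10 MiB.

```typescript
// Hypothetical pre-upload check; names and structure are illustrative only.
const MAX_BYTES = 10 * 1024 * 1024; // "Maximum 10MB per file", read as 10 MiB (assumption)
const ALLOWED_EXTENSIONS: string[] = ["pdf", "png", "jpg", "jpeg"]; // formats named above

interface UploadCandidate {
  name: string;      // original file name, e.g. "receipt.JPG"
  sizeBytes: number; // file size in bytes
}

type UploadCheck = { ok: true } | { ok: false; reason: string };

function validateUpload(file: UploadCandidate): UploadCheck {
  // Take everything after the last dot as the extension, case-insensitively.
  const dot = file.name.lastIndexOf(".");
  const ext = dot >= 0 ? file.name.slice(dot + 1).toLowerCase() : "";
  if (!ALLOWED_EXTENSIONS.includes(ext)) {
    return { ok: false, reason: `Unsupported file type: .${ext || "(none)"}` };
  }
  if (file.sizeBytes > MAX_BYTES) {
    return { ok: false, reason: "File is larger than 10MB" };
  }
  return { ok: true };
}
```

An extension check is only a first filter; a server would still need to inspect the actual file contents.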

OCR for Images & Scanned PDFs

If you uploaded photos or scanned documents, our OCR (Optical Character Recognition) engine converts the images into searchable text. It uses Tesseract.js for character recognition.

Tech details: Tolerates moderately poor image quality, rotated documents, and even JPEG 2000 compressed images. Accuracy: 85-90% on scanned docs, 80%+ on photos, depending on image quality.

Semantic AI Extraction (GPT-5-nano)

This is where the magic happens. OpenAI's GPT-5-nano model reads your documents like a human would. It understands context: "Bill To" vs "From", "rate per hour" vs "total amount", "due in 30 days" vs "paid on".

What it extracts: Client name, client address, vendor name, vendor address, invoice number, dates, line items (description, quantity, rate, amount), payment terms, notes, reimbursable expenses.

Structured Output (JSON)

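The extracted data is returned as a structured JSON object and validated against a Zod schema. This keeps the output consistent and rejects malformed or missing fields. It does not guarantee that every value is correct, which is what the review step is for.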

{ "client_name": "Acme Corp", "amount": 5000, "line_items": [...] }
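To make the shape above concrete, here is a sketch of the object as a TypeScript type with a hand-written runtime check. The product uses Zod for this; the plain type guard below only stands in for it so the example has no dependencies. The fields inside each line item are assumptions based on the field list in the previous step, and parseInvoice is a hypothetical name.

```typescript
// Assumed shape: only client_name, amount and line_items appear in the sample above.
interface LineItem {
  description: string;
  quantity: number;
  rate: number;
  amount: number;
}

interface ExtractedInvoice {
  client_name: string;
  amount: number;
  line_items: LineItem[];
}

function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

function isLineItem(v: unknown): v is LineItem {
  return (
    isRecord(v) &&
    typeof v.description === "string" &&
    typeof v.quantity === "number" &&
    typeof v.rate === "number" &&
    typeof v.amount === "number"
  );
}

// Parse model output and reject anything that does not match the expected shape.
function parseInvoice(json: string): ExtractedInvoice | null {
  let data: unknown;
  try {
    data = JSON.parse(json);
  } catch {
    return null; // not valid JSON at all
  }
  if (
    isRecord(data) &&
    typeof data.client_name === "string" &&
    typeof data.amount === "number" &&
    Array.isArray(data.line_items) &&
    data.line_items.every(isLineItem)
  ) {
    return data as unknown as ExtractedInvoice;
  }
  return null; // valid JSON, wrong shape
}
```

A shape check like this confirms structure only. A well-formed object can still carry a wrong client name or amount.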

Review & Edit (Optional)

You can review the extracted data before generating the PDF. Click any field to edit. The AI is 95%+ accurate on typed documents, but you have full control.

Note: Most users skip the review step because the extraction is usually right. It's there if you need it, and it's worth a look for scans and photos.

The Real Value: It Reads Messy, Unstructured Data

Traditional tools require clean, formatted input. Instant Invoice reads the chaos you actually have:

📸 Screenshots

Toggl timesheet screenshot, Harvest export, time tracking app photo

📧 Email Confirmations

Flight booking emails, hotel confirmations, Uber receipt emails

📄 Scanned PDFs

Photocopied contracts, scanned restaurant receipts, faxed documents

📱 Phone Photos

Crumpled receipts, handwritten notes, whiteboard time logs

💬 Slack/Text Messages

Screenshot of client approval messages, scope clarifications, rate confirmations

📊 Excel/CSV Exports

Time logs, expense reports, project hour summaries (save as PDF first)

Don't organize. Don't rename files. Don't format anything.

Just dump it all in one upload. The AI figures it out.

What Documents Can It Read?

Contracts & Agreements

• Retainer agreements
• SOWs (Statements of Work)
• Engagement letters
• Freelance contracts
• Consulting agreements

What it extracts: Client name, rate, payment terms, scope

Receipts & Expenses

• Uber/Lyft receipts
• Flight confirmations
• Hotel invoices
• Restaurant bills
• Software subscriptions

What it extracts: Vendor, date, amount, currency, purpose

Timesheets & Logs

• Time tracking exports
• Work logs
• Project hour summaries
• Daily activity reports

What it extracts: Hours, dates, task descriptions

Scanned Documents

• Photos of paper documents
• Scanned PDFs
• Screenshots
• Faxes (yes, really)

Note: OCR accuracy depends on image quality

How Accurate Is It?

95%+

Typed Documents

Native PDFs, Word docs saved as PDF, clear digital documents

85-90%

Scanned PDFs

Scanned documents, depends on scan quality

80%+

Photos & Handwriting

Camera photos, some handwritten text (limited)

What Can Go Wrong?

• Poor image quality (blurry, low resolution)
• Unusual document layouts (AI trained on standard formats)
• Non-English documents (currently English-only)
• Handwritten text (limited OCR support)
• Complex tables (extracted correctly about 80% of the time)

Good news: You can always edit extracted data before generating the PDF. The AI does 95% of the work; you verify the last 5%.

Technical Deep Dive

For developers and technical users who want to know how it works under the hood:

AI Model: OpenAI GPT-5-nano

We use OpenAI's GPT-5-nano model with structured outputs via the Chat Completions API. This model was specifically chosen for:

Speed: Processes invoices in 3-5 seconds (faster than GPT-4)
Cost: $0.15 per 1M input tokens (80% cheaper than GPT-4)
Accuracy: 95%+ on structured data extraction tasks
Reasoning: Understands context better than keyword extraction

Parameters: reasoning_effort=minimal, temperature=0 (deterministic), max_tokens=16000
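The pricing figure above translates into a per-document cost with simple arithmetic. A rough sketch, assuming the quoted $0.15 per 1M input tokens and ignoring output tokens, whose price is not stated here. The function name and the 4,000-token example are hypothetical.

```typescript
const USD_PER_MILLION_INPUT_TOKENS = 0.15; // figure quoted above

// Estimated input-side cost in USD for a document of the given token count.
function estimateInputCostUSD(inputTokens: number): number {
  if (!Number.isFinite(inputTokens) || inputTokens < 0) {
    throw new RangeError("inputTokens must be a non-negative number");
  }
  return (inputTokens / 1_000_000) * USD_PER_MILLION_INPUT_TOKENS;
}

// Example: a hypothetical 4,000-token contract.
// 4,000 / 1,000,000 * 0.15 = 0.0006 USD, well under a tenth of a cent.
const exampleCost = estimateInputCostUSD(4_000);
```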

OCR: Tesseract.js

For image-to-text conversion, we use Tesseract.js (WASM port of Tesseract OCR):

Language: English (eng) trained data
PSM: Page Segmentation Mode 3 (automatic)
Performance: Runs in browser Web Worker (non-blocking)
Fallback: If Tesseract fails, we use PDF.js for native PDF text extraction

Schema Validation: Zod

We use Zod for runtime type validation and OpenAI's zodResponseFormat for structured outputs:

Type safety: Guarantees extracted data matches expected schema
Prevents malformed output: The AI can't return JSON that breaks the schema (values can still be wrong, so review remains useful)
Self-documenting: Schema defines all extractable fields
Validation: Catches errors before PDF generation
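Schema validation checks shape, not arithmetic. One kind of pre-PDF check that could sit alongside it is confirming that each line item's quantity times rate matches its amount. This is a sketch under assumptions: the LineItemAmounts type and findInconsistentLineItems helper are hypothetical, not the product's code, and amounts are compared in whole cents to sidestep floating-point noise.

```typescript
interface LineItemAmounts {
  description: string;
  quantity: number;
  rate: number;
  amount: number;
}

// Convert a currency value to whole cents so comparisons are exact.
const toCents = (value: number): number => Math.round(value * 100);

// Return the indexes of line items where quantity * rate does not equal amount.
function findInconsistentLineItems(items: LineItemAmounts[]): number[] {
  const bad: number[] = [];
  items.forEach((item, index) => {
    if (toCents(item.quantity * item.rate) !== toCents(item.amount)) {
      bad.push(index);
    }
  });
  return bad;
}
```

Flagged indexes could be highlighted in the review step rather than blocking PDF generation outright.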

PDF Extraction: PDF.js + unpdf

For native PDFs (not scanned), we extract text directly:

PDF.js: Mozilla's PDF renderer, extracts text with positioning
unpdf: Server-side PDF parsing for metadata
Hybrid approach: Text extraction + OCR for images within PDFs
HTML output: Preserves document structure for better AI understanding
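One way to picture the hybrid approach: try direct text extraction first, and bring in OCR when a page yields little or no text or contains embedded images. The sketch below is an assumption about how such routing could look. The choosePageStrategy function, the PageInfo shape, and the 20-character threshold are all hypothetical and not taken from the product.

```typescript
interface PageInfo {
  extractedText: string;  // text returned by direct PDF text extraction
  embeddedImages: number; // number of raster images found on the page
}

type PageStrategy = "text" | "ocr" | "text+ocr";

const MIN_TEXT_CHARS = 20; // hypothetical threshold for "this page has real text"

function choosePageStrategy(page: PageInfo): PageStrategy {
  const hasText = page.extractedText.trim().length >= MIN_TEXT_CHARS;
  if (!hasText) return "ocr"; // likely a scanned page: OCR the whole page
  // Native text is present; also OCR any embedded images, if there are some.
  return page.embeddedImages > 0 ? "text+ocr" : "text";
}
```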

JPEG 2000 Decoding: OpenJPEG

Some PDFs contain JPEG 2000 compressed images that browsers can't decode. We handle this:

Detection: Magic header check (JP2 box signature, codestream markers)
Decoder: @abasb75/openjpeg (WASM-based OpenJPEG)
Fallback: createImageBitmap for Safari (when supported)
Normalization: 16-bit → 8-bit with min-max scaling
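The detection and normalization steps can be sketched without any decoder. The signatures below come from the JPEG 2000 format itself: the 12-byte JP2 signature box, and the SOC marker followed by SIZ for a raw codestream. The function names are hypothetical, and the real pipeline may differ in details such as how it treats a flat image where every sample is equal.

```typescript
// JP2 signature box: length 12, type "jP  ", then 0D 0A 87 0A.
const JP2_SIGNATURE = [0x00, 0x00, 0x00, 0x0c, 0x6a, 0x50, 0x20, 0x20, 0x0d, 0x0a, 0x87, 0x0a];
// Raw codestream: SOC marker (FF 4F) immediately followed by SIZ (FF 51).
const J2K_CODESTREAM = [0xff, 0x4f, 0xff, 0x51];

function startsWithBytes(bytes: Uint8Array, prefix: number[]): boolean {
  if (bytes.length < prefix.length) return false;
  return prefix.every((b, i) => bytes[i] === b);
}

function isJpeg2000(bytes: Uint8Array): boolean {
  return startsWithBytes(bytes, JP2_SIGNATURE) || startsWithBytes(bytes, J2K_CODESTREAM);
}

// Min-max scaling: map the darkest sample to 0 and the brightest to 255.
function normalizeTo8Bit(samples: Uint16Array): Uint8Array {
  const out = new Uint8Array(samples.length);
  if (samples.length === 0) return out;
  let min = 65535;
  let max = 0;
  for (let i = 0; i < samples.length; i++) {
    if (samples[i] < min) min = samples[i];
    if (samples[i] > max) max = samples[i];
  }
  const range = max - min;
  if (range === 0) return out; // flat image: left as all zeros (assumption)
  for (let i = 0; i < samples.length; i++) {
    out[i] = Math.round(((samples[i] - min) / range) * 255);
  }
  return out;
}
```

Min-max scaling stretches whatever range the image actually uses across the full 8-bit range, which helps OCR on low-contrast scans but discards absolute brightness.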

Privacy & Security

Encryption: All documents are encrypted in transit via TLS. No unencrypted storage.

Data retention: Documents are automatically deleted after 24 hours. PDF previews expire in 24 hours.

AI training: We do NOT train AI models on your data. By default, OpenAI does not use API inputs for training.

Access logs: Server logs contain anonymized event data only (no document content).

Third parties: OpenAI (AI extraction), Vercel (infrastructure). No other data sharing.

Read our full Privacy Policy for details.

See AI Extraction In Action

Upload a contract and receipts. Watch the AI extract everything in about 5 seconds.

Try It Free

No signup • No credit card • 5 free invoices per month