From Text to Tables
Turn messy text (invoices, emails, reports) into clean tables or JSON using a local LLM via Ollama.

Overview
This project uses prompt-engineered extraction with a constrained schema to convert unstructured text into structured outputs. It validates fields, handles edge cases (missing/ambiguous values), and exports to CSV/Parquet for analytics.
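The validation and export step could look roughly like the sketch below. It is only an illustration: the invoice fields in `SCHEMA` and the helper names `validate_row` and `rows_to_csv` are hypothetical and are not taken from the project. Missing or unparseable values become `None` and are reported rather than guessed. Only CSV export is shown, using the standard library; Parquet output would need an extra dependency such as pandas with pyarrow.

```python
import csv
import io
from typing import Any

# Hypothetical schema for illustration: field name -> (type, required)
SCHEMA = {
    "invoice_id": (str, True),
    "vendor": (str, True),
    "total": (float, True),
    "due_date": (str, False),
}


def validate_row(raw: dict[str, Any]) -> tuple[dict[str, Any], list[str]]:
    """Coerce one extracted row to SCHEMA and return (clean_row, problems)."""
    clean: dict[str, Any] = {}
    problems: list[str] = []
    for name, (typ, required) in SCHEMA.items():
        value = raw.get(name)
        if value is None or value == "":
            # Missing value: keep the column, record a problem if required.
            clean[name] = None
            if required:
                problems.append(f"missing required field: {name}")
            continue
        try:
            clean[name] = typ(value)
        except (TypeError, ValueError):
            # Ambiguous or unparseable value: null it out and report it.
            clean[name] = None
            problems.append(f"bad value for {name}: {value!r}")
    return clean, problems


def rows_to_csv(rows: list[dict[str, Any]]) -> str:
    """Serialize validated rows to CSV text; None is written as an empty cell."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(SCHEMA))
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```

For example, `validate_row({"invoice_id": "A-1", "vendor": "Acme", "total": "12.50"})` returns a row whose `total` is the float `12.5` and whose `due_date` is `None`, with an empty problem list.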

Pipeline
- Source ingestion (PDF/TXT/HTML) → text normalization.
- Schema-guided LLM extraction (Ollama) → JSON rows.
- Validation & deduplication → tabular output.
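The three stages above can be sketched as follows. This is a minimal sketch under several assumptions, not the project's actual code: the input is already plain text (PDF and HTML parsing are omitted), Ollama is listening on its default local endpoint, a model named `llama3` has already been pulled, and the `FIELDS` list and all function names are hypothetical. The LLM call is passed in as a parameter so the rest of the pipeline can be exercised without a running server.

```python
import json
import re
import urllib.request
from typing import Any, Callable

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint
MODEL = "llama3"  # assumption: any locally pulled model name works here
FIELDS = ["invoice_id", "vendor", "total"]  # hypothetical target schema


def normalize(text: str) -> str:
    """Collapse all whitespace runs to single spaces."""
    return re.sub(r"\s+", " ", text).strip()


def build_prompt(text: str) -> str:
    """Schema-guided prompt: name the exact keys and how to mark gaps."""
    return (
        "Extract the following fields from the text and reply with a single "
        f"JSON object with exactly these keys: {', '.join(FIELDS)}. "
        "Use null for any value that is missing or ambiguous.\n\n"
        f"Text: {text}"
    )


def call_ollama(prompt: str) -> str:
    """Send one non-streaming request to Ollama and return the model's text."""
    payload = json.dumps(
        {"model": MODEL, "prompt": prompt, "format": "json", "stream": False}
    ).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]


def extract(text: str, llm: Callable[[str], str] = call_ollama) -> dict[str, Any]:
    """Normalize, prompt, parse, and keep only the schema's keys."""
    raw = json.loads(llm(build_prompt(normalize(text))))
    return {key: raw.get(key) for key in FIELDS}


def dedupe(rows: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Drop exact duplicate rows while preserving first-seen order."""
    seen: set[str] = set()
    unique = []
    for row in rows:
        fingerprint = json.dumps(row, sort_keys=True)
        if fingerprint not in seen:
            seen.add(fingerprint)
            unique.append(row)
    return unique
```

With Ollama running, `dedupe([extract(doc) for doc in documents])` would yield one JSON row per distinct document. Passing a stub as `llm` makes it possible to test the parsing and deduplication logic offline.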

Use Cases
- Invoice and receipt parsing
- Customer email mining
- Survey/free-text response structuring