Extract structured invoice JSON from PDFs with Mistral OCR and an LLM API
Last update a month ago
Categories
N8N AI LLM Unstructured Invoice data PDF OCR recognition to JSON output API
What this workflow does
- Accepts a PDF or image upload via Webhook as binary property "data"
- Runs OCR with the Mistral OCR node
- Normalizes OCR text
- Sends OCR text to an LLM to extract structured JSON
- Cleans and normalizes the JSON
- Returns either:
  - status: ok
  - status: review_needed
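The "cleans and normalizes the JSON" step could look roughly like the Python sketch below. It is not the workflow's actual node code. It assumes the LLM may wrap its reply in a Markdown code fence or surround it with prose, and it uses a hypothetical `total` field with comma thousands separators purely for illustration.

```python
import json
import re

FENCE = "`" * 3  # Markdown code-fence marker, built programmatically
FENCED_REPLY = re.compile(r"^" + FENCE + r"(?:json)?\s*(.*?)\s*" + FENCE + r"$", re.DOTALL)


def parse_llm_json(raw: str) -> dict:
    """Extract a JSON object from an LLM reply, tolerating code fences and extra prose."""
    text = raw.strip()
    # Drop a surrounding Markdown code fence if the model added one.
    fenced = FENCED_REPLY.match(text)
    if fenced:
        text = fenced.group(1)
    # Fall back to the outermost {...} span when prose surrounds the object.
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in LLM output")
    return json.loads(text[start:end + 1])


def normalize_invoice(data: dict) -> dict:
    """Trim string values and coerce the hypothetical 'total' field to a float."""
    cleaned = {k: v.strip() if isinstance(v, str) else v for k, v in data.items()}
    total = cleaned.get("total")
    if isinstance(total, str):
        try:
            # Assumption: commas are thousands separators, e.g. "1,234.50".
            cleaned["total"] = float(total.replace(",", ""))
        except ValueError:
            cleaned["total"] = None  # leave unparseable amounts for review
    return cleaned
```

Parsing defensively matters here because a reply that is not valid JSON would otherwise fail the whole request instead of being flagged for review.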
Setup
- Import the workflow JSON into n8n
- Create/attach Mistral AI credentials on the "Mistral OCR" node
- Create/attach credentials for your LLM of choice on the OCR text to JSON conversion node
- Activate the workflow
- POST a file to:
/webhook/ocr-to-json
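As a rough illustration of the upload step, the following Python sketch posts a file to the webhook using only the standard library. The base URL is an assumption (`localhost` on n8n's default port 5678), and so is the idea that a multipart form field named `data` ends up as the binary property "data"; check how your Webhook node is configured. `build_multipart` and `post_invoice` are hypothetical helper names.

```python
import mimetypes
import urllib.request
import uuid
from pathlib import Path
from typing import Optional, Tuple

# Assumption: n8n is reachable here; adjust to your own host.
N8N_BASE_URL = "http://localhost:5678"
WEBHOOK_PATH = "/webhook/ocr-to-json"  # path given in the setup steps


def build_multipart(field_name: str, filename: str, content: bytes,
                    boundary: Optional[str] = None) -> Tuple[bytes, str]:
    """Encode one file as multipart/form-data; return (body, Content-Type header)."""
    boundary = boundary or uuid.uuid4().hex
    mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; filename="{filename}"\r\n'
        f"Content-Type: {mime}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + content + tail, f"multipart/form-data; boundary={boundary}"


def post_invoice(path: str, base_url: str = N8N_BASE_URL) -> bytes:
    """POST the file at `path` to the webhook and return the raw response body."""
    file_path = Path(path)
    # Assumption: the form field "data" maps to the binary property "data".
    body, content_type = build_multipart("data", file_path.name, file_path.read_bytes())
    request = urllib.request.Request(
        base_url + WEBHOOK_PATH,
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.read()

# Example call (needs a running, activated workflow):
# print(post_invoice("invoice.pdf").decode("utf-8"))
```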
Notes
- This starter is tuned for invoices/documents but can be adapted for receipts, purchase orders, or forms.
- Depending on your installed n8n version, the Mistral node parameter names may need minor adjustment after import.
- The workflow returns review_needed when confidence is below 0.5.
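The threshold rule in the last note can be sketched as follows. The 0.5 cutoff comes from the note; the response shape (`status`, `confidence`, `data`) and the function name `build_response` are assumptions for illustration, as is the idea that confidence arrives as a float between 0 and 1.

```python
REVIEW_THRESHOLD = 0.5  # cutoff stated in the note above


def build_response(invoice: dict, confidence: float) -> dict:
    """Label the result 'review_needed' when confidence is below the threshold."""
    status = "review_needed" if confidence < REVIEW_THRESHOLD else "ok"
    # Hypothetical response shape; the real workflow's fields may differ.
    return {"status": status, "confidence": confidence, "data": invoice}
```

A confidence of exactly 0.5 yields `ok` here, since the note only flags values below 0.5.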