Our OCR engine extracts structured data from Aadhaar, PAN, passport, driving licence, and voter ID — with field-level confidence scores and built-in checksum validation.
Trained on millions of real-world Indian documents — including regional variants, older formats, and digitally issued versions.
Physical cards, e-Aadhaar PDFs, m-Aadhaar screenshots. Front + back parsing with QR validation.
All PAN variants — old laminated, new QR-enabled, e-PAN (Digital). Father's name and signature extraction.
MRZ line + visual zone parsing. Auto-detects expired passports and diplomatic variants.
Supports formats from all 36 Indian states and union territories. Auto-extracts vehicle class and expiry.
EPIC cards with constituency extraction, relation-name parsing, and age verification.
For merchant onboarding: extracts GSTIN, validates checksum, and cross-verifies against embedded PAN.
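As an illustration of the GSTIN check described above, here is a minimal Python sketch. It assumes the commonly documented base-36 checksum over the first 14 characters, and the fact that characters 3 to 12 of a GSTIN carry the holder's PAN. The function names are hypothetical, and whether the engine follows this exact procedure internally is an assumption.

```python
# Sketch only: GSTIN checksum and embedded-PAN comparison.
# Function names are hypothetical and not part of the API.
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def gstin_check_char(body14: str) -> str:
    """Compute the 15th character from the first 14 characters."""
    total = 0
    for i, ch in enumerate(body14):
        # Weights alternate 1, 2, 1, 2, ... from the left.
        product = ALPHABET.index(ch) * (1 if i % 2 == 0 else 2)
        # Add the base-36 "digits" of the product.
        total += product // 36 + product % 36
    return ALPHABET[(36 - total % 36) % 36]


def validate_gstin(gstin: str) -> bool:
    """True if the GSTIN has 15 valid characters and a matching check character."""
    gstin = gstin.strip().upper()
    if len(gstin) != 15 or any(c not in ALPHABET for c in gstin):
        return False
    return gstin_check_char(gstin[:14]) == gstin[14]


def embedded_pan(gstin: str) -> str:
    """Characters 3 to 12 of a GSTIN are the holder's PAN."""
    return gstin.strip().upper()[2:12]


def pan_matches(gstin: str, pan: str) -> bool:
    """Cross-verify a separately extracted PAN against the GSTIN."""
    return validate_gstin(gstin) and embedded_pan(gstin) == pan.strip().upper()
```

A caller would pass the GSTIN read from the certificate and the PAN read from the PAN card to `pan_matches`, and treat a `False` result as a reason to hold the onboarding for review.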
No free-text dumps. Each field is extracted as a typed value with a confidence score — ready to persist in your database without post-processing.
Upload a document. Get structured JSON. That's the whole API.

    # Extract fields from an Aadhaar card
    curl -X POST https://api.recspace.in/v1/ocr/extract \
      -H "Authorization: Bearer $API_KEY" \
      -F "document=@aadhaar_front.jpg" \
      -F "type=aadhaar" \
      -F "mask_pii=true"

    # → 200 OK · 680ms
    {
      "request_id": "ocr_8B3A2F",
      "document_type": "aadhaar_front",
      "confidence": 0.994,
      "verhoeff_valid": true,
      "fields": {
        "name": { "value": "Priya Sharma", "confidence": 0.998 },
        "dob": { "value": "1992-03-14", "confidence": 0.992 },
        "gender": { "value": "F", "confidence": 0.999 },
        "aadhaar_number": { "value": "XXXX XXXX 9012", "confidence": 0.997 },
        "address": {
          "house": "42-B",
          "street": "Saket Road",
          "city": "New Delhi",
          "pincode": 110017
        }
      },
      "processing_time_ms": 680
    }

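On the client side, a response of this shape can be loaded into typed values with a few lines of Python. The sketch below is illustrative only: `ExtractedField` and `parse_scored_fields` are hypothetical names, and the payload is a trimmed copy of the sample response. Fields without a per-field confidence, such as `address` in the sample, are skipped here and would need their own handling.

```python
# Sketch only: turn a response payload into typed, scored fields.
import json
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ExtractedField:  # hypothetical name, not part of the API
    value: str
    confidence: float


def parse_scored_fields(payload: dict) -> dict[str, ExtractedField]:
    """Collect every field that carries both a value and a confidence."""
    scored = {}
    for name, body in payload.get("fields", {}).items():
        if isinstance(body, dict) and "value" in body and "confidence" in body:
            scored[name] = ExtractedField(str(body["value"]), float(body["confidence"]))
    return scored


# Trimmed copy of the sample response.
SAMPLE = json.loads("""
{
  "document_type": "aadhaar_front",
  "verhoeff_valid": true,
  "fields": {
    "name": {"value": "Priya Sharma", "confidence": 0.998},
    "dob": {"value": "1992-03-14", "confidence": 0.992},
    "address": {"house": "42-B", "street": "Saket Road",
                "city": "New Delhi", "pincode": 110017}
  }
}
""")

fields = parse_scored_fields(SAMPLE)
# The sample returns dates in ISO format, so they parse directly.
dob = date.fromisoformat(fields["dob"].value)
```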
Each extracted field ships with a confidence score, so you can auto-accept high-confidence values and route low-confidence ones to manual review.
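One way to gate on these scores is a simple three-way decision. The sketch below is a Python illustration; the thresholds 0.98 and 0.90 and the function name `gate` are assumptions chosen for the example, not recommended values.

```python
# Sketch only: route a document by its per-field confidence scores.
# Thresholds are illustrative assumptions; tune them on your own data.
def gate(confidences: dict[str, float],
         accept_at: float = 0.98,
         review_at: float = 0.90) -> tuple[str, list[str]]:
    """Return a decision and the fields that fell below accept_at."""
    low = {name: c for name, c in confidences.items() if c < accept_at}
    if not low:
        return "accept", []
    if min(low.values()) < review_at:
        # At least one field is too uncertain to trust at all.
        return "reject", sorted(low)
    return "review", sorted(low)
```

With the scores from the sample response, every field clears 0.98, so `gate` would return `("accept", [])`. A single field at 0.95 would send the document to review with that field flagged.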
Verhoeff checksum for Aadhaar, Luhn for card numbers, and MRZ check digits for passports, all validated before the response is returned.
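For reference, both the Verhoeff and Luhn checks are short enough to sketch in Python. The tables below are the standard published Verhoeff tables. The function names are hypothetical, and nothing here is meant to describe how the engine implements the checks internally.

```python
# Sketch only: standard Verhoeff and Luhn validation.
# Multiplication table of the dihedral group D5.
D = [
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
    [1, 2, 3, 4, 0, 6, 7, 8, 9, 5],
    [2, 3, 4, 0, 1, 7, 8, 9, 5, 6],
    [3, 4, 0, 1, 2, 8, 9, 5, 6, 7],
    [4, 0, 1, 2, 3, 9, 5, 6, 7, 8],
    [5, 9, 8, 7, 6, 0, 4, 3, 2, 1],
    [6, 5, 9, 8, 7, 1, 0, 4, 3, 2],
    [7, 6, 5, 9, 8, 2, 1, 0, 4, 3],
    [8, 7, 6, 5, 9, 3, 2, 1, 0, 4],
    [9, 8, 7, 6, 5, 4, 3, 2, 1, 0],
]
# Position-dependent permutation table.
P = [
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
    [1, 5, 7, 6, 2, 8, 3, 0, 9, 4],
    [5, 8, 0, 3, 7, 9, 6, 1, 4, 2],
    [8, 9, 1, 6, 0, 4, 3, 5, 2, 7],
    [9, 4, 5, 3, 1, 2, 6, 8, 7, 0],
    [4, 2, 8, 6, 5, 7, 3, 9, 0, 1],
    [2, 7, 9, 3, 8, 0, 6, 4, 1, 5],
    [7, 0, 4, 6, 9, 1, 3, 2, 5, 8],
]


def _digits(text: str) -> list[int]:
    """Keep only the digits, so spaced input like '1234 5678' works."""
    return [int(ch) for ch in text if ch.isdigit()]


def verhoeff_valid(number: str) -> bool:
    """True if the trailing digit is a correct Verhoeff check digit."""
    digits = _digits(number)
    if len(digits) < 2:
        return False
    c = 0
    for i, digit in enumerate(reversed(digits)):
        c = D[c][P[i % 8][digit]]
    return c == 0


def luhn_valid(number: str) -> bool:
    """True if the number passes the Luhn mod-10 check."""
    digits = _digits(number)
    if len(digits) < 2:
        return False
    total = 0
    for i, digit in enumerate(reversed(digits)):
        if i % 2 == 1:  # double every second digit from the right
            digit = digit * 2 - 9 if digit > 4 else digit * 2
        total += digit
    return total % 10 == 0
```

Verhoeff catches all single-digit errors and all adjacent transpositions, which is one reason it suits hand-typed and OCR-read identifiers.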
Where documents contain machine-readable zones, we decode them alongside visual OCR and cross-verify.
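For passports, the cross-check can be sketched as follows in Python. The check-digit rule (weights 7, 3, 1 with `<` counted as 0 and letters as 10 to 35) and the field positions follow the ICAO 9303 layout for the second line of a passport MRZ. The function names, the dictionary keys, and the assumption that visually read dates have already been normalised to YYMMDD are all illustrative.

```python
# Sketch only: MRZ check digits and MRZ-vs-visual comparison for a
# passport (TD3) second line. Names and keys are hypothetical.
def mrz_check_digit(field: str) -> int:
    """ICAO 9303 check digit: weights 7, 3, 1 repeating, sum mod 10."""
    weights = (7, 3, 1)
    total = 0
    for i, ch in enumerate(field):
        if ch.isdigit():
            value = int(ch)
        elif ch == "<":
            value = 0
        else:
            value = ord(ch) - ord("A") + 10  # A=10 ... Z=35
        total += value * weights[i % 3]
    return total % 10


def parse_td3_line2(line: str) -> dict:
    """Pull the checked fields out of the 44-character second MRZ line."""
    if len(line) != 44:
        raise ValueError("TD3 MRZ lines are 44 characters long")
    parts = {
        "document_number": (line[0:9], line[9]),
        "dob": (line[13:19], line[19]),
        "expiry": (line[21:27], line[27]),
    }
    parsed = {name: value.rstrip("<") for name, (value, _) in parts.items()}
    parsed["checks_ok"] = all(
        check.isdigit() and mrz_check_digit(value) == int(check)
        for value, check in parts.values()
    )
    return parsed


def cross_verify(mrz: dict, visual: dict) -> list[str]:
    """Return the names of fields where the visual zone disagrees with the MRZ."""
    names = ("document_number", "dob", "expiry")
    return [name for name in names if visual.get(name) != mrz[name]]
```

An empty list from `cross_verify` together with `checks_ok` being true means the two readings agree. Any name in the list points at a field worth re-reading or sending to review.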
Pre-processing corrects glare, low light, skew, fold lines, and lamination reflection, because field officers take imperfect photos.
Send us a few samples. We'll process them through the live API and send back the results with confidence scores.