OCR · v3.2 · PRODUCTION

Read any Indian ID
with forensic accuracy.

Our OCR engine extracts structured data from Aadhaar, PAN, passport, driving licence, voter ID, and GSTIN documents, with field-level confidence scores and built-in checksum validation.

99.6% · Field accuracy
680 ms · Avg response
6 formats · ID types supported
OCR EXTRACTION · LIVE
आधार (Aadhaar)
Government of India
Unique Identification Authority
Name: Priya Sharma
DOB: 14/03/1992
Gender: Female
1234 5678 9012
FIELDS EXTRACTED 14/14
name: "Priya Sharma" · 99.8%
dob: "1992-03-14" · 99.2%
gender: "F" · 99.9%
aadhaar: "...9012" · 99.7%
issued: "2018-06" · 98.4%
verhoeff: ✓ valid · PASS
Supported documents

Every ID that regulators accept.

Trained on millions of real-world Indian documents — including regional variants, older formats, and digitally issued versions.

UIDAI

Aadhaar

Physical cards, e-Aadhaar PDFs, m-Aadhaar screenshots. Front + back parsing with QR validation.

12-digit · Verhoeff · 99.8%
INCOME TAX

PAN Card

All PAN variants: old laminated, new QR-enabled, and digital e-PAN. Father's name and signature extraction.

10-char alphanumeric · 99.9%
MEA

Passport

MRZ line + visual zone parsing. Auto-detects expired passports and diplomatic variants.

MRZ · ICAO 9303 · 99.5%
STATE RTO

Driving Licence

Supports formats from all 36 Indian states and union territories. Auto-extracts vehicle class and expiry.

DL formats · 36 states and UTs · 98.9%
ECI

Voter ID

EPIC cards with constituency extraction, relation-name parsing, and age verification.

EPIC · alphanumeric · 99.1%
GSTN

GSTIN & PAN-GST

For merchant onboarding: extracts GSTIN, validates checksum, and cross-verifies against embedded PAN.

15-char · statewise · 99.6%
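
The GSTIN checks described for merchant onboarding can be sketched in Python. This is a minimal illustration, not the engine's implementation. It assumes the commonly documented GSTIN layout (2-digit state code, then the 10-character PAN, then three further characters ending in a check character) and the widely used mod-36 checksum. The function names and the sample GSTIN string are illustrative only.

```python
import re

# Character set used by the mod-36 checksum (assumed scheme, see lead-in).
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"
# Basic PAN shape: five letters, four digits, one letter.
PAN_RE = re.compile(r"[A-Z]{5}[0-9]{4}[A-Z]")


def gstin_check_char(first14: str) -> str:
    """Compute the 15th character from the first 14 characters."""
    total = 0
    for i, ch in enumerate(first14):
        # Weights alternate 1, 2, 1, 2, ... from the leftmost character.
        product = ALPHABET.index(ch) * (1 if i % 2 == 0 else 2)
        total += product // 36 + product % 36
    return ALPHABET[(36 - total % 36) % 36]


def gstin_checksum_valid(gstin: str) -> bool:
    """True if the string is 15 valid characters and the check character matches."""
    gstin = gstin.strip().upper()
    if len(gstin) != 15 or any(ch not in ALPHABET for ch in gstin):
        return False
    return gstin_check_char(gstin[:14]) == gstin[14]


def embedded_pan(gstin: str) -> str:
    """Characters 3 to 12 of a GSTIN carry the holder's PAN."""
    return gstin.strip().upper()[2:12]


def gstin_matches_pan(gstin: str, pan: str) -> bool:
    """Cross-verify a GSTIN against a separately extracted PAN."""
    pan = pan.strip().upper()
    return (
        gstin_checksum_valid(gstin)
        and PAN_RE.fullmatch(pan) is not None
        and embedded_pan(gstin) == pan
    )
```

A caller would typically run `gstin_matches_pan` on the GSTIN and PAN values returned by two separate extractions and treat a mismatch as a reason for manual review.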
Field coverage · Aadhaar

Fourteen fields.
Every one structured.

No free-text dumps. Each field is extracted as a typed value with a confidence score — ready to persist in your database without post-processing.

name · STRING · Full name, Unicode-safe
dob · DATE · ISO 8601 date
gender · ENUM · M, F, or T (third gender)
aadhaar_number · MASKED · Last 4 digits by default
address · OBJECT · Multi-line, structured
pincode · INTEGER · 6-digit Indian PIN
state · STRING · ISO 3166-2:IN
district · STRING · Administrative district
issue_date · DATE · Card issuance month
photo_bbox · BASE64 · Cropped face region
qr_payload · OBJECT · XML-decoded QR contents
verhoeff_valid · BOOLEAN · Aadhaar checksum check
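
To illustrate what "typed values" can look like on the client side, here is a minimal Python sketch that maps a few of these fields onto native types. The `AadhaarRecord` class and `record_from_fields` helper are hypothetical, and the input shape is assumed to match the `fields` object in the sample response on this page (each scalar field wrapped as `{"value": ..., "confidence": ...}`, with `pincode` nested under `address`).

```python
from dataclasses import dataclass
from datetime import date
from typing import Literal


@dataclass(frozen=True)
class AadhaarRecord:
    """Hypothetical client-side record covering a subset of the fields."""
    name: str                        # STRING
    dob: date                        # DATE (ISO 8601)
    gender: Literal["M", "F", "T"]   # ENUM
    aadhaar_number: str              # MASKED, e.g. "XXXX XXXX 9012"
    pincode: int                     # INTEGER


def record_from_fields(fields: dict) -> AadhaarRecord:
    """Convert the response's `fields` object into a typed record."""
    gender = fields["gender"]["value"]
    if gender not in ("M", "F", "T"):
        raise ValueError(f"unexpected gender code: {gender!r}")
    return AadhaarRecord(
        name=fields["name"]["value"],
        dob=date.fromisoformat(fields["dob"]["value"]),
        gender=gender,
        aadhaar_number=fields["aadhaar_number"]["value"],
        pincode=int(fields["address"]["pincode"]),
    )
```

Because the record is a frozen dataclass, it can be handed to a persistence layer without risk of accidental mutation.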
Integration

Simpler than you'd expect.

Upload a document. Get structured JSON. That's the whole API.

POST /v1/ocr/extract
cURL
# Extract fields from an Aadhaar card
curl -X POST https://api.recspace.in/v1/ocr/extract \
  -H "Authorization: Bearer $API_KEY" \
  -F "document=@aadhaar_front.jpg" \
  -F "type=aadhaar" \
  -F "mask_pii=true"

# → 200 OK · 680ms
{
  "request_id": "ocr_8B3A2F",
  "document_type": "aadhaar_front",
  "confidence": 0.994,
  "verhoeff_valid": true,
  "fields": {
    "name": { "value": "Priya Sharma", "confidence": 0.998 },
    "dob": { "value": "1992-03-14", "confidence": 0.992 },
    "gender": { "value": "F", "confidence": 0.999 },
    "aadhaar_number": { "value": "XXXX XXXX 9012", "confidence": 0.997 },
    "address": { "house": "42-B", "street": "Saket Road", "city": "New Delhi", "pincode": 110017 }
  },
  "processing_time_ms": 680
}
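
For Python clients, the same call can be sketched with the standard library alone. This is an illustration under assumptions, not an official SDK: `build_extract_request` and `extract` are hypothetical helper names, the form fields mirror the cURL example above, and the file part is sent as `application/octet-stream` because the expected content type is not stated here.

```python
import json
import os
import urllib.request
import uuid

API_URL = "https://api.recspace.in/v1/ocr/extract"  # endpoint from the cURL example


def build_extract_request(api_key, image_bytes, filename, doc_type="aadhaar", mask_pii=True):
    """Build (but do not send) a multipart/form-data POST mirroring the cURL call."""
    boundary = uuid.uuid4().hex
    text_fields = {"type": doc_type, "mask_pii": "true" if mask_pii else "false"}
    body = b""
    for name, value in text_fields.items():
        body += (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n"
        ).encode()
    # File part: headers, raw bytes, then the closing boundary.
    body += (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="document"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode()
    body += image_bytes + b"\r\n" + f"--{boundary}--\r\n".encode()
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


def extract(api_key, path, **options):
    """Send the request and decode the JSON response (performs a network call)."""
    with open(path, "rb") as fh:
        request = build_extract_request(api_key, fh.read(), os.path.basename(path), **options)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Calling `extract(os.environ["API_KEY"], "aadhaar_front.jpg")` would then return the parsed JSON as a dictionary, assuming the response has the shape shown above.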
What makes it work

Engineered for production scale.

Field-level confidence

Each extracted field ships with a confidence score, so you can auto-accept high-confidence values and route low-confidence ones to manual review.
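
As a rough illustration of confidence gating, the Python sketch below lists the fields whose score falls under a threshold. The helper name `fields_needing_review` and the 0.995 threshold are assumptions chosen for the example. Entries that carry no `confidence` key (such as `address` in the sample response) are skipped here, which is a design choice a real integration might make differently.

```python
def fields_needing_review(fields: dict, threshold: float = 0.995) -> list[str]:
    """Return names of scored fields whose confidence is below `threshold`."""
    flagged = []
    for name, field in fields.items():
        confidence = field.get("confidence")
        if confidence is None:
            continue  # unscored entry: skipped in this sketch (assumption)
        if confidence < threshold:
            flagged.append(name)
    return flagged


# Values taken from the sample response on this page.
sample_fields = {
    "name": {"value": "Priya Sharma", "confidence": 0.998},
    "dob": {"value": "1992-03-14", "confidence": 0.992},
    "gender": {"value": "F", "confidence": 0.999},
    "aadhaar_number": {"value": "XXXX XXXX 9012", "confidence": 0.997},
    "address": {"house": "42-B", "street": "Saket Road", "city": "New Delhi", "pincode": 110017},
}

print(fields_needing_review(sample_fields))  # prints ['dob']
```

An empty list would mean every scored field cleared the threshold and the record could proceed without review.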

Built-in checksum

Verhoeff algorithm for Aadhaar, Luhn for cards, MRZ check digits for passports. Every checksum is validated before the response is returned.
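
For readers who want to re-check an Aadhaar number on their own side, here is a minimal Python sketch of the standard Verhoeff algorithm using its published lookup tables. The function names are illustrative, and this is a reference sketch rather than the engine's internal code.

```python
# Multiplication table of the dihedral group D5.
D = (
    (0, 1, 2, 3, 4, 5, 6, 7, 8, 9),
    (1, 2, 3, 4, 0, 6, 7, 8, 9, 5),
    (2, 3, 4, 0, 1, 7, 8, 9, 5, 6),
    (3, 4, 0, 1, 2, 8, 9, 5, 6, 7),
    (4, 0, 1, 2, 3, 9, 5, 6, 7, 8),
    (5, 9, 8, 7, 6, 0, 4, 3, 2, 1),
    (6, 5, 9, 8, 7, 1, 0, 4, 3, 2),
    (7, 6, 5, 9, 8, 2, 1, 0, 4, 3),
    (8, 7, 6, 5, 9, 3, 2, 1, 0, 4),
    (9, 8, 7, 6, 5, 4, 3, 2, 1, 0),
)
# Position-dependent permutation table.
P = (
    (0, 1, 2, 3, 4, 5, 6, 7, 8, 9),
    (1, 5, 7, 6, 2, 8, 3, 0, 9, 4),
    (5, 8, 0, 3, 7, 9, 6, 1, 4, 2),
    (8, 9, 1, 6, 0, 4, 3, 5, 2, 7),
    (9, 4, 5, 3, 1, 2, 6, 8, 7, 0),
    (4, 2, 8, 6, 5, 7, 3, 9, 0, 1),
    (2, 7, 9, 3, 8, 0, 6, 4, 1, 5),
    (7, 0, 4, 6, 9, 1, 3, 2, 5, 8),
)
# Inverse of each element under D.
INV = (0, 4, 3, 2, 1, 5, 6, 7, 8, 9)


def verhoeff_check_digit(payload: str) -> int:
    """Compute the check digit to append to a string of digits."""
    c = 0
    for i, ch in enumerate(reversed(payload)):
        c = D[c][P[(i + 1) % 8][int(ch)]]
    return INV[c]


def verhoeff_valid(number: str) -> bool:
    """Validate a digit string (spaces allowed) whose last digit is the check digit."""
    digits = number.replace(" ", "")
    if not digits.isdigit():
        return False
    c = 0
    for i, ch in enumerate(reversed(digits)):
        c = D[c][P[i % 8][int(ch)]]
    return c == 0


print(verhoeff_check_digit("236"))  # prints 3
print(verhoeff_valid("2363"))       # prints True
```

For a 12-digit Aadhaar number, the first 11 digits are the payload and the last digit is the check digit, so `verhoeff_valid` can be applied to the full number as printed on the card.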

QR + MRZ decoding

Where documents contain machine-readable zones, we decode them alongside visual OCR and cross-verify.
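
The cross-verification idea can be sketched for a passport date of birth: verify the MRZ check digit using the ICAO 9303 weighting (7, 3, 1 repeating), then compare the MRZ value against the visually extracted date. The function names are hypothetical, and the assumed inputs are a `YYMMDD` MRZ field and an ISO 8601 date from the visual zone.

```python
def mrz_check_digit(field: str) -> int:
    """ICAO 9303 check digit: weights 7, 3, 1 repeating, sum modulo 10."""
    weights = (7, 3, 1)
    total = 0
    for i, ch in enumerate(field):
        if "0" <= ch <= "9":
            value = int(ch)
        elif "A" <= ch <= "Z":
            value = ord(ch) - ord("A") + 10  # A=10 ... Z=35
        elif ch == "<":
            value = 0  # filler character
        else:
            raise ValueError(f"invalid MRZ character: {ch!r}")
        total += value * weights[i % 3]
    return total % 10


def mrz_field_valid(field: str, check: str) -> bool:
    """True if `check` is the correct check digit for `field`."""
    return len(check) == 1 and "0" <= check <= "9" and mrz_check_digit(field) == int(check)


def dob_cross_verified(mrz_dob: str, mrz_check: str, visual_dob_iso: str) -> bool:
    """Accept the date only if the MRZ checksum passes and both zones agree."""
    if not mrz_field_valid(mrz_dob, mrz_check):
        return False
    year, month, day = visual_dob_iso.split("-")
    return mrz_dob == year[2:] + month + day


# Values from the ICAO 9303 specimen document.
print(mrz_check_digit("L898902C3"))                     # prints 6
print(dob_cross_verified("740812", "2", "1974-08-12"))  # prints True
```

Because the MRZ stores only a two-digit year, this comparison cannot tell centuries apart; a fuller implementation would need a rule for resolving that.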

Handles bad scans

Pre-processing corrects for glare, low light, skew, fold lines, and lamination reflection, because field officers take imperfect photos.

Ready to run it against your documents?

Send us a few samples. We'll process them through the live API and send back the results with confidence scores.