OCR · v3.2 · PRODUCTION

Read any Indian ID
with forensic accuracy.

Our OCR engine extracts structured data from Aadhaar, PAN, passport, driving licence, and voter ID — with field-level confidence scores and built-in checksum validation.

Book a demo→ API reference

99.6%

Field accuracy

680ms

Avg response

6 formats

ID types supported

OCR EXTRACTION · LIVE Processing

आधार

Government of India

Unique Identification Authority

Name: Priya Sharma

DOB: 14/03/1992

Gender: Female

1234 5678 9012

FIELDS EXTRACTED 14/14

name: "Priya Sharma" · 99.8%

dob: "1992-03-14" · 99.2%

gender: "F" · 99.9%

aadhaar: "...9012" · 99.7%

issued: "2018-06" · 98.4%

verhoeff: ✓ valid · PASS

Supported documents

Every ID that regulators accept.

Trained on millions of real-world Indian documents — including regional variants, older formats, and digitally issued versions.

UIDAI

Aadhaar

Physical cards, e-Aadhaar PDFs, m-Aadhaar screenshots. Front + back parsing with QR validation.

12-digit · Verhoeff 99.8%

INCOME TAX

PAN Card

All PAN variants — old laminated, new QR-enabled, e-PAN (Digital). Father's name and signature extraction.

10-char alphanumeric 99.9%

MEA

Passport

MRZ line + visual zone parsing. Auto-detects expired passports and diplomatic variants.

MRZ · ICAO 9303 99.5%

STATE RTO

Driving Licence

Supports formats from all 36 Indian states and union territories. Auto-extracts vehicle class and expiry.

DL formats · 36 states and UTs 98.9%

ECI

Voter ID

EPIC cards with constituency extraction, relation-name parsing, and age verification.

EPIC · alphanumeric 99.1%

GSTN

GSTIN & PAN-GST

For merchant onboarding: extracts GSTIN, validates checksum, and cross-verifies against embedded PAN.

15-char · statewise 99.6%
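As a rough illustration of what "validates checksum and cross-verifies against embedded PAN" can mean, here is a minimal Python sketch. It assumes the commonly described GSTIN layout (2-digit state code, 10-character PAN, entity code, a fixed letter, one check character) and the widely documented mod-36 check-character scheme. The function names are hypothetical, and whether the engine uses exactly this logic is an assumption.

```python
import re

# Base-36 alphabet used by the commonly described GSTIN check-character scheme.
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"
# PAN layout: five letters, four digits, one letter.
PAN_RE = re.compile(r"[A-Z]{5}[0-9]{4}[A-Z]")


def gstin_check_char(first14: str) -> str:
    """Compute the 15th character from the first 14 (mod-36, factors 1,2,1,2...)."""
    total = 0
    for i, ch in enumerate(first14):
        product = ALPHABET.index(ch) * (1 if i % 2 == 0 else 2)
        total += product // 36 + product % 36
    return ALPHABET[(36 - total % 36) % 36]


def verify_gstin(gstin, expected_pan=None):
    """Hypothetical helper: checksum check plus PAN cross-check."""
    gstin = gstin.strip().upper()
    if len(gstin) != 15 or any(ch not in ALPHABET for ch in gstin):
        return {"well_formed": False, "checksum_valid": False,
                "embedded_pan": None, "pan_matches": None}
    embedded_pan = gstin[2:12]  # characters 3 to 12 carry the PAN
    pan_matches = None
    if expected_pan is not None:
        pan_matches = embedded_pan == expected_pan.strip().upper()
    return {
        "well_formed": gstin[:2].isdigit() and bool(PAN_RE.fullmatch(embedded_pan)),
        "checksum_valid": gstin_check_char(gstin[:14]) == gstin[14],
        "embedded_pan": embedded_pan,
        "pan_matches": pan_matches,
    }
```

Passing the PAN read from a separately uploaded PAN card as `expected_pan` gives a simple merchant-onboarding consistency check; a mismatch would normally be routed to manual review rather than rejected outright.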

Field coverage · Aadhaar

Fourteen fields.
Every one structured.

No free-text dumps. Each field is extracted as a typed value with a confidence score — ready to persist in your database without post-processing.

name            STRING   Full name, unicode-safe
dob             DATE     ISO 8601 date
gender          ENUM     M · F · T (third gender)
aadhaar_number  MASKED   Last 4 digits by default
address         OBJECT   Multi-line, structured
pincode         INTEGER  6-digit Indian PIN
state           STRING   ISO 3166-2:IN
district        STRING   Administrative district
issue_date      DATE     Card issuance month
photo_bbox      BASE64   Cropped face region
qr_payload      OBJECT   XML-decoded QR contents
verhoeff_valid  BOOLEAN  Aadhaar checksum check
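To make "typed value with a confidence score" concrete, the sketch below maps a response onto a small Python dataclass. It assumes the JSON shape shown in the Integration sample (`fields.<name>.value`, a nested `address.pincode`, a top-level `verhoeff_valid`). The `AadhaarRecord` class and `parse_aadhaar` function are hypothetical names, not part of the API.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class AadhaarRecord:
    """Hypothetical typed view over a subset of the extracted fields."""
    name: str
    dob: date
    gender: str            # one of M, F, T
    aadhaar_number: str    # masked by default, e.g. "XXXX XXXX 9012"
    pincode: Optional[int]
    verhoeff_valid: bool


def parse_aadhaar(response: dict) -> AadhaarRecord:
    fields = response["fields"]
    gender = fields["gender"]["value"]
    if gender not in {"M", "F", "T"}:
        raise ValueError(f"unexpected gender value: {gender!r}")
    # Assumption: pincode lives inside the address object, as in the sample.
    pincode = fields.get("address", {}).get("pincode")
    return AadhaarRecord(
        name=fields["name"]["value"],
        dob=date.fromisoformat(fields["dob"]["value"]),  # ISO 8601 string to date
        gender=gender,
        aadhaar_number=fields["aadhaar_number"]["value"],
        pincode=int(pincode) if pincode is not None else None,
        verhoeff_valid=bool(response["verhoeff_valid"]),
    )
```

Parsing into a frozen dataclass means a malformed date or an unexpected enum value fails loudly at the boundary instead of reaching the database.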

Integration

Simpler than you'd expect.

Upload a document. Get structured JSON. That's the whole API.

POST /v1/ocr/extract
cURL

# Extract fields from an Aadhaar card
curl -X POST https://api.recspace.in/v1/ocr/extract \
  -H "Authorization: Bearer $API_KEY" \
  -F "document=@aadhaar_front.jpg" \
  -F "type=aadhaar" \
  -F "mask_pii=true"

# → 200 OK · 680ms
{
  "request_id": "ocr_8B3A2F",
  "document_type": "aadhaar_front",
  "confidence": 0.994,
  "verhoeff_valid": true,
  "fields": {
    "name": { "value": "Priya Sharma", "confidence": 0.998 },
    "dob": { "value": "1992-03-14", "confidence": 0.992 },
    "gender": { "value": "F", "confidence": 0.999 },
    "aadhaar_number": { "value": "XXXX XXXX 9012", "confidence": 0.997 },
    "address": { "house": "42-B", "street": "Saket Road", "city": "New Delhi", "pincode": 110017 }
  },
  "processing_time_ms": 680
}
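For readers who prefer Python, the same call can be sketched with only the standard library. The endpoint, header, and form fields (`document`, `type`, `mask_pii`) are taken from the cURL example above; the helper name `build_extract_request` and the hand-rolled multipart encoding are illustrative assumptions, not an official client.

```python
import urllib.request
import uuid

API_URL = "https://api.recspace.in/v1/ocr/extract"  # from the cURL example


def build_extract_request(api_key, image_bytes, filename,
                          doc_type="aadhaar", mask_pii=True):
    """Build (but do not send) a multipart/form-data POST request."""
    boundary = uuid.uuid4().hex
    parts = []
    # Plain text form fields, mirroring -F "type=..." and -F "mask_pii=...".
    for name, value in (("type", doc_type),
                        ("mask_pii", "true" if mask_pii else "false")):
        parts.append(
            (f"--{boundary}\r\n"
             f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
             f"{value}\r\n").encode()
        )
    # File part, mirroring -F "document=@aadhaar_front.jpg".
    parts.append(
        (f"--{boundary}\r\n"
         f'Content-Disposition: form-data; name="document"; filename="{filename}"\r\n'
         "Content-Type: application/octet-stream\r\n\r\n").encode()
        + image_bytes + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())
    return urllib.request.Request(
        API_URL,
        data=b"".join(parts),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


# Sending it (requires network access and a real key):
#   import json
#   with open("aadhaar_front.jpg", "rb") as fh:
#       req = build_extract_request(API_KEY, fh.read(), "aadhaar_front.jpg")
#   with urllib.request.urlopen(req, timeout=10) as resp:
#       result = json.load(resp)
```

Separating request construction from sending keeps the encoding testable offline; in practice a higher-level HTTP client would handle the multipart body for you.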
      

What makes it work

Engineered for production scale.

Field-level confidence

Each extracted field ships with a confidence score so you can gate downstream actions appropriately.
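One way to gate downstream actions on these scores is sketched below. The response shape follows the Integration sample; the threshold values, the function names, and the three routing outcomes are hypothetical policy choices for illustration only.

```python
def fields_below_threshold(response, threshold=0.99):
    """Return names of fields whose confidence is under the threshold.

    Fields without a "confidence" key (such as the address object in the
    sample response) are skipped; that is an assumption of this sketch.
    """
    low = []
    for name, field in response.get("fields", {}).items():
        confidence = field.get("confidence") if isinstance(field, dict) else None
        if confidence is not None and confidence < threshold:
            low.append(name)
    return low


def route(response, threshold=0.99):
    """Hypothetical policy: reject, send to manual review, or auto-approve."""
    if not response.get("verhoeff_valid", False):
        return "reject"
    if fields_below_threshold(response, threshold):
        return "manual_review"
    return "auto_approve"
```

A stricter threshold trades more manual review for fewer silent errors, so it is usually tuned per field and per use case rather than set once globally.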

Built-in checksum

Verhoeff for Aadhaar, Luhn for card numbers, and MRZ check digits for passports, all validated before the response is returned.
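For reference, the Verhoeff check itself is short. The sketch below is a generic implementation of the published algorithm (dihedral-group multiplication table plus a position-dependent permutation), not the engine's own code. The test values use the textbook example 236 with check digit 3 rather than a real Aadhaar number.

```python
# Multiplication table of the dihedral group D5.
D = [
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
    [1, 2, 3, 4, 0, 6, 7, 8, 9, 5],
    [2, 3, 4, 0, 1, 7, 8, 9, 5, 6],
    [3, 4, 0, 1, 2, 8, 9, 5, 6, 7],
    [4, 0, 1, 2, 3, 9, 5, 6, 7, 8],
    [5, 9, 8, 7, 6, 0, 4, 3, 2, 1],
    [6, 5, 9, 8, 7, 1, 0, 4, 3, 2],
    [7, 6, 5, 9, 8, 2, 1, 0, 4, 3],
    [8, 7, 6, 5, 9, 3, 2, 1, 0, 4],
    [9, 8, 7, 6, 5, 4, 3, 2, 1, 0],
]
# Permutation table: row i is the base permutation applied i times.
BASE_PERM = [1, 5, 7, 6, 2, 8, 3, 0, 9, 4]
P = [list(range(10))]
for _ in range(7):
    P.append([P[-1][BASE_PERM[x]] for x in range(10)])
# Inverse of each element in D5.
INV = [0, 4, 3, 2, 1, 5, 6, 7, 8, 9]


def _digits(number: str):
    cleaned = number.replace(" ", "")
    if not (cleaned.isascii() and cleaned.isdigit()):
        raise ValueError("expected digits and spaces only")
    return [int(ch) for ch in cleaned]


def verhoeff_valid(number: str) -> bool:
    """True if the last digit is a correct Verhoeff check digit."""
    c = 0
    for i, digit in enumerate(reversed(_digits(number))):
        c = D[c][P[i % 8][digit]]
    return c == 0


def verhoeff_check_digit(payload: str) -> int:
    """Check digit to append to a payload that has none yet."""
    c = 0
    for i, digit in enumerate(reversed(_digits(payload))):
        c = D[c][P[(i + 1) % 8][digit]]
    return INV[c]
```

Unlike a plain mod-10 sum, this scheme detects all single-digit errors and all adjacent transpositions, which are the most common mistakes when a 12-digit number is typed or misread.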

QR + MRZ decoding

Where documents contain machine-readable zones, we decode them alongside visual OCR and cross-verify.
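As an example of the machine-readable side, the MRZ check digit defined in ICAO 9303 can be computed as follows. This is a generic sketch of the standard's weighting scheme (7, 3, 1 repeating); the function names are hypothetical, and how the engine combines this with visual OCR is not shown.

```python
def mrz_char_value(ch: str) -> int:
    """Map an MRZ character to its numeric value per ICAO 9303."""
    if "0" <= ch <= "9":
        return int(ch)
    if "A" <= ch <= "Z":
        return ord(ch) - ord("A") + 10  # A=10 ... Z=35
    if ch == "<":
        return 0                        # filler character
    raise ValueError(f"invalid MRZ character: {ch!r}")


def mrz_check_digit(field: str) -> int:
    """Weighted sum with weights 7, 3, 1 repeating, taken modulo 10."""
    weights = (7, 3, 1)
    total = sum(mrz_char_value(ch) * weights[i % 3] for i, ch in enumerate(field))
    return total % 10


def mrz_field_ok(field: str, check_char: str) -> bool:
    """True if the printed check character matches the computed digit."""
    return check_char.isdigit() and int(check_char) == mrz_check_digit(field)
```

A field that passes its check digit but disagrees with the visually read value is a useful signal: either the visual OCR misread a character or the document has been altered.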

Handles bad scans

Pre-processing for glare, low-light, skew, fold lines, and lamination reflection. Field officers take imperfect photos.


Ready to run it against your documents?
