Shipsy’s Address Intelligence Service (AIS) normalizes unstructured, mistyped, and incomplete address strings at ingest, assigns a confidence score per address, and auto-corrects 70%+ of address defects before a shipment ever enters planning. Addresses below the confidence threshold are routed to a targeted enrichment loop (either Clara-driven consignee outreach or an ops review queue), so no undeliverable address gets silently scheduled for a doomed attempt.
Why we built this
In every large logistics network we’ve audited, 20–40% of incoming addresses are defective in some way: a missing landmark, a swapped city and state, a non-existent pincode, free-text instructions embedded in the street field, or a misspelled locality. In many regions (especially postal networks across atolls, islands, and unstructured urban geographies), the share is much higher. Every defective address is a latent non-delivery report (NDR): a wasted attempt, a frustrated consignee, a driver who lost 20 minutes.
Enterprises were handling this with brittle regex rules and city-pincode lookup tables. That approach fails on the long tail — typos, transliterations, landmark references, and the “building name only” addresses common in dense geographies. We built AIS to handle the long tail.
How it works
AIS runs as a multi-stage enrichment pipeline at ingest time. Each stage has a specific job and produces a confidence delta:
Stage 1 — Tokenization and segmentation. The raw address string is tokenized and tagged into canonical fields (building, street, locality, landmark, city, pincode, country) using a sequence model trained on billions of addresses across Shipsy’s deployment footprint. Multi-language addresses (mixed-script, transliterated) are handled natively.
Stage 2 — Spell-correction and canonical matching. Tokens are matched against a canonical gazetteer of localities, pincodes, landmarks, and streets — per country, with fuzzy matching, phonetic matching (for transliterations), and region-aware abbreviation expansion (e.g., “Blr” to “Bangalore”).
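As a rough illustration of the Stage 2 idea, the sketch below expands a known abbreviation and then falls back to fuzzy matching against a gazetteer. It is a minimal sketch, not AIS itself: the `ABBREVIATIONS` and `LOCALITIES` tables, the `canonicalize` function, and the 0.8 similarity cutoff are all assumptions made for this example, and phonetic matching is left out.

```python
from difflib import get_close_matches
from typing import Optional

# Hypothetical, tiny lookup tables for illustration only; a real gazetteer
# would be per country and far larger.
ABBREVIATIONS = {"blr": "Bangalore"}
LOCALITIES = ["Bangalore", "Koramangala", "Indiranagar", "Whitefield"]

# Lower-cased index so matching is case-insensitive.
_BY_LOWER = {name.lower(): name for name in LOCALITIES}


def canonicalize(token: str, cutoff: float = 0.8) -> Optional[str]:
    """Return the canonical locality for a token, or None if nothing matches."""
    key = token.strip().lower().rstrip(".")
    # 1. Abbreviation expansion, e.g. "Blr" -> "Bangalore".
    if key in ABBREVIATIONS:
        return ABBREVIATIONS[key]
    # 2. Exact (case-insensitive) gazetteer hit.
    if key in _BY_LOWER:
        return _BY_LOWER[key]
    # 3. Fuzzy match; the cutoff value is an assumed tuning parameter.
    matches = get_close_matches(key, list(_BY_LOWER), n=1, cutoff=cutoff)
    return _BY_LOWER[matches[0]] if matches else None


if __name__ == "__main__":
    for raw in ["Blr", "Koramangla", "Indranagar", "Xyzzy"]:
        print(raw, "->", canonicalize(raw))
```

Returning `None` rather than a best guess keeps an unmatched token visible to later stages, which is one way an "unverified locality" defect could surface.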
Stage 3 — Geocode triangulation. The normalized address is geocoded against multiple providers and Shipsy’s own learned geocode cache (built from past successful deliveries at the same address). If three geocode signals converge to the same 100m polygon, the address is high-confidence. If they diverge, the address is flagged.
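The convergence check in Stage 3 can be approximated as follows. This is a sketch under stated assumptions: the "100m polygon" is simplified to "every pair of geocode signals lies within 100 m of each other", and the function names and the sample coordinates are hypothetical.

```python
import math
from itertools import combinations
from typing import Sequence, Tuple

LatLon = Tuple[float, float]
EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(a: LatLon, b: LatLon) -> float:
    """Great-circle distance in metres between two (lat, lon) points."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat, dlon = lat2 - lat1, lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def signals_converge(points: Sequence[LatLon],
                     max_spread_m: float = 100.0,
                     min_signals: int = 3) -> bool:
    """True if at least `min_signals` points all lie within `max_spread_m` of each other."""
    if len(points) < min_signals:
        return False
    return all(haversine_m(p, q) <= max_spread_m for p, q in combinations(points, 2))


if __name__ == "__main__":
    # Hypothetical results from two providers plus a learned cache.
    tight = [(12.9352, 77.6245), (12.9353, 77.6246), (12.9351, 77.6244)]
    spread = [(12.9352, 77.6245), (12.9353, 77.6246), (12.9452, 77.6245)]
    print(signals_converge(tight))   # signals agree
    print(signals_converge(spread))  # one signal is roughly a kilometre away
```

A pairwise-distance test is a simplification; a production system might instead test containment in a stored polygon.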
Stage 4 — Historical delivery learning. For every address ingested, AIS checks whether Shipsy has previously delivered successfully to this address or a very similar one. If so, the proven delivery polygon is treated as authoritative and overrides any geocoder disagreement. This step dramatically lifts confidence for repeat-delivery addresses.
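The override rule in Stage 4 amounts to a precedence order: a proven delivery location wins over the geocoders. The sketch below shows that ordering only. It assumes an exact lookup on a normalized address key and a single point rather than a polygon; the "very similar address" matching is not modelled, and `GeocodeResult`, `resolve_location`, and the sample data are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

LatLon = Tuple[float, float]


@dataclass(frozen=True)
class GeocodeResult:
    lat: float
    lon: float
    source: str  # "history" or "geocoders"


def resolve_location(address_key: str,
                     delivery_history: Dict[str, LatLon],
                     geocoder_consensus: Optional[LatLon]) -> Optional[GeocodeResult]:
    """Prefer a proven past delivery location over the geocoder consensus."""
    proven = delivery_history.get(address_key)
    if proven is not None:
        # A past successful delivery overrides whatever the geocoders say.
        return GeocodeResult(*proven, source="history")
    if geocoder_consensus is not None:
        return GeocodeResult(*geocoder_consensus, source="geocoders")
    return None  # nothing trustworthy; leave the address flagged


if __name__ == "__main__":
    history = {"12 palm villa, koramangala, bangalore 560034": (12.9352, 77.6245)}
    key = "12 palm villa, koramangala, bangalore 560034"
    print(resolve_location(key, history, (12.9400, 77.6300)))
```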
Stage 5 — Confidence scoring and routing. Each address exits the pipeline with a confidence score (0–100) and a defect tag if below threshold (missing landmark, ambiguous pincode, unverified locality, geocode spread too wide). Above 80: passes to planning. 60–80: passes to planning with a driver-app hint (e.g., “verify building name on arrival”). Below 60: routed to Clara-driven consignee outreach for confirmation before the shipment is dispatched.
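The routing tiers in Stage 5 map directly onto a small function. This is a minimal sketch: the `Route` enum and `route_address` are hypothetical names, and treating scores of exactly 60 and exactly 80 as part of the middle tier is an assumption about the boundaries.

```python
from enum import Enum


class Route(Enum):
    PLANNING = "planning"
    PLANNING_WITH_HINT = "planning_with_driver_hint"
    CONSIGNEE_OUTREACH = "consignee_outreach"


def route_address(score: float) -> Route:
    """Map a 0-100 confidence score to a routing decision."""
    if not 0 <= score <= 100:
        raise ValueError(f"score must be in [0, 100], got {score}")
    if score > 80:
        return Route.PLANNING
    if score >= 60:  # assumed: 60 and 80 both fall in the middle tier
        return Route.PLANNING_WITH_HINT
    return Route.CONSIGNEE_OUTREACH


if __name__ == "__main__":
    for s in (95, 80, 60, 42):
        print(s, route_address(s).value)
```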
[Figure: the bulk pipeline at a glance]
Early results
Enterprises deploying AIS at ingest typically report, within 60 days:
- 70%+ of address defects auto-corrected with no human in the loop.
- NDRs attributed to “incomplete address” drop 40–60%, which compounds into a higher first-attempt delivery rate (FADR) and a lower cost per shipment.
- Driver productivity rises 5–10% because drivers no longer lose time hunting for unclear addresses.
- Consignee-side address confirmation loops close in hours via Clara, versus days of NDR ping-pong previously.
A national postal operator serving 15+ atoll offices and 172 postal agencies uses AIS to make unstructured atoll addresses deliverable at the first attempt — a problem many national posts have given up on.
What’s next
Three upgrades are next: AIS for inbound ecommerce APIs, so merchants see address issues at checkout rather than at dispatch; language-native AIS for Arabic, Thai, and CJK, with first-class handling of non-Latin scripts; and geofence-learning feedback loops, in which the GPS fix from every successful delivery automatically tightens the polygon for that address.