japananimetime / ai-address-normalizer
AI-powered address normalizer using multiple AI providers (Claude, ChatGPT, Gemini, Ollama)
pkg:composer/japananimetime/ai-address-normalizer
Requires
- php: ^8.1
- illuminate/cache: ^10.0|^11.0
- illuminate/database: ^10.0|^11.0
- illuminate/http: ^10.0|^11.0
- illuminate/support: ^10.0|^11.0
This package is not auto-updated.
Last update: 2026-01-28 19:03:53 UTC
README
AI-powered address normalizer. Cleans, standardizes, and extracts geo entities from raw address strings using configurable AI providers.
Install
composer require japananimetime/ai-address-normalizer
Or install from the GitLab repository over HTTPS by adding it to composer.json:
{
    "repositories": [
        {
            "type": "vcs",
            "url": "https://gitlab.com/japananimetime/ai-address-normalizer.git"
        }
    ]
}
Setup
php artisan migrate
Set API keys in .env:
GEMINI_API_KEY=your-key
ANTHROPIC_API_KEY=your-key
OPENAI_API_KEY=your-key
OLLAMA_HOST=http://localhost:11434
Provider configuration (priority, model, enabled flag) lives in the ai_normalizer_services database table. The migration seeds four default providers: Ollama, Gemini, Claude, and ChatGPT.
Seed cities
Define cities in your published config, then seed:
// config/ai-normalizer.php
'cities' => [
    ['name' => 'Алматы', 'alternatives' => ['Алма-Ата', 'Almaty', 'Alma-Ata']],
    ['name' => 'Астана', 'alternatives' => ['Нур-Султан', 'Astana']],
],
php artisan ai-normalizer:seed-cities
Usage
Basic
use Japananimetime\AiAddressNormalizer\Facades\AiNormalizer;
// Normalize with best available provider (priority fallback)
$result = AiNormalizer::normalize('123 Main St, apt 4B, New York');
// With explicit city context (scopes alias/district lookups to this city)
$result = AiNormalizer::normalize('Main St 15', city: 'New York');
The city parameter is a city name string. It is resolved against the ai_normalizer_cities table (canonical name or alternative names). The resolved city ID is used internally for scoping DB queries (aliases, cache, suggestions).
If city is omitted, auto-detection scans the address text against all cities in the ai_normalizer_cities table.
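The two lookup modes described above can be sketched in plain PHP. This is a minimal illustration, not the package's code: the in-memory $cities array stands in for the ai_normalizer_cities table, and the function names resolveCityId() and detectCityId() are hypothetical. Case-insensitive matching is also an assumption.

```php
<?php
declare(strict_types=1);

// Hypothetical in-memory stand-in for the ai_normalizer_cities table.
$cities = [
    ['id' => 1, 'name' => 'Алматы', 'alternatives' => ['Алма-Ата', 'Almaty', 'Alma-Ata']],
    ['id' => 2, 'name' => 'Астана', 'alternatives' => ['Нур-Султан', 'Astana']],
];

/** Resolve an explicit city name (canonical or alternative) to a city ID. */
function resolveCityId(array $cities, string $city): ?int
{
    $needle = mb_strtolower(trim($city));
    foreach ($cities as $row) {
        foreach ([$row['name'], ...$row['alternatives']] as $candidate) {
            if (mb_strtolower($candidate) === $needle) {
                return $row['id'];
            }
        }
    }
    return null;
}

/** Auto-detection: return the first city whose name occurs in the address text. */
function detectCityId(array $cities, string $address): ?int
{
    foreach ($cities as $row) {
        foreach ([$row['name'], ...$row['alternatives']] as $candidate) {
            if (mb_stripos($address, $candidate) !== false) {
                return $row['id'];
            }
        }
    }
    return null;
}
```

Passing city explicitly skips the scan, which matters when an address mentions a street named after another city.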
Using the factory directly
use Japananimetime\AiAddressNormalizer\Services\AiNormalizerFactory;
$factory = app(AiNormalizerFactory::class);
// Priority fallback (same as facade)
$result = $factory->normalize($address);
// With city context
$result = $factory->normalize($address, city: 'Алматы');
// Force a specific provider (throws on failure, no fallback)
$result = $factory->normalizeWithProvider('gemini', $address, city: 'Алматы');
Response structure
normalize() always returns a NormalizationResult:
$result->originalAddress; // string — raw input
$result->normalizedAddress; // string — cleaned address
$result->confidence; // float — 0.0–1.0
$result->provider; // string — provider name or failure reason
$result->fromCache; // bool — true if served from cache
$result->changes; // string[] — human-readable changes ["expanded ул.→улица", ...]
$result->suggestion; // ?AliasSuggestion — new alias if AI found a pattern
$result->geoEntities; // ?ExtractedGeoEntities — structured location parts (see below)
// Helper methods
$result->wasModified(); // bool — true if address was actually changed
$result->isUsable(); // bool — true if confidence > 0.3 and address not empty
$result->getCityId(); // ?int — city ID from extracted geo entities
// Immutable builders (for enrichers)
$result->withGeoEntities($geoEntities); // new instance with different geo entities
$result->withSuggestion($suggestion); // new instance with different suggestion
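To make the helper semantics concrete, here is a small stand-in class written under stated assumptions. ResultSketch is hypothetical and is not the package's NormalizationResult; the strict string comparison in wasModified() and the trim() in isUsable() are guesses, while the 0.3 threshold comes from the comment above. The builder shows why "immutable" means a new object comes back.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in, not the package's NormalizationResult class.
final class ResultSketch
{
    public function __construct(
        public readonly string $originalAddress,
        public readonly string $normalizedAddress,
        public readonly float $confidence,
        public readonly ?array $geoEntities = null,
    ) {
    }

    public function wasModified(): bool
    {
        return $this->normalizedAddress !== $this->originalAddress;
    }

    public function isUsable(): bool
    {
        return $this->confidence > 0.3 && trim($this->normalizedAddress) !== '';
    }

    // readonly properties cannot be reassigned, so return a fresh instance.
    public function withGeoEntities(?array $geoEntities): self
    {
        return new self(
            $this->originalAddress,
            $this->normalizedAddress,
            $this->confidence,
            $geoEntities,
        );
    }
}

$a = new ResultSketch('ул Абая 15', 'улица Абая, 15', 0.92);
$b = $a->withGeoEntities(['street' => 'Абая']); // $a itself is left untouched
```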
Geo entities
When available, $result->geoEntities contains structured location data:
$geo = $result->geoEntities;
$geo->region; // ?GeoEntity - region (область)
$geo->district; // ?GeoEntity - district within a region (район области)
$geo->city; // ?GeoEntity - city or village (город/село)
$geo->cityDistrict; // ?GeoEntity - district within a city (район города)
$geo->microraion; // ?GeoEntity - microdistrict (микрорайон)
$geo->zhkComplex; // ?GeoEntity - residential complex (жилой комплекс)
$geo->street; // ?GeoEntity - street or avenue (улица/проспект)
$geo->houseNumber; // ?string
$geo->apartment; // ?string
// Each GeoEntity has:
$geo->city->name; // string — original name from address
$geo->city->normalizedName; // string — normalized name
$geo->city->confidence; // float — 0.0–1.0
$geo->city->matchedDbId; // ?int — host-app DB ID (set by enricher)
// Immutable builders
$geo->city->withDbId(42); // new GeoEntity with matchedDbId set
$geo->withCity($newCity); // new ExtractedGeoEntities with different city
$geo->withCityDistrict($dist); // ...and so on for all entity types
// Helpers
$geo->hasGeocodableData(); // bool — has street, complex, or microraion
$geo->getMostSpecific(); // ?GeoEntity — most specific entity for fallback geocoding
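The two helpers can be pictured with plain arrays. In this sketch the geocodable set (street, complex, microraion) comes from the comment above, but the specificity ordering is an assumption, and the function names and array shape are hypothetical rather than the package's ExtractedGeoEntities API.

```php
<?php
declare(strict_types=1);

// Assumed ordering, most specific first; the package's real order may differ.
const SPECIFICITY = ['street', 'zhkComplex', 'microraion', 'cityDistrict', 'city', 'district', 'region'];

/** Return the key of the most specific level that is present, or null. */
function mostSpecificLevel(array $geo): ?string
{
    foreach (SPECIFICITY as $level) {
        if (($geo[$level] ?? null) !== null) {
            return $level;
        }
    }
    return null;
}

/** True when a street, residential complex, or microraion is present. */
function hasGeocodableData(array $geo): bool
{
    foreach (['street', 'zhkComplex', 'microraion'] as $level) {
        if (($geo[$level] ?? null) !== null) {
            return true;
        }
    }
    return false;
}
```

A host app might geocode the full address when hasGeocodableData() is true and otherwise fall back to the most specific level that was found.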
Passthrough results
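On failure (no providers available, all providers failed, or garbage input), normalize() does not throw. It returns a passthrough result with the address unchanged and confidence 0.0. Only normalizeWithProvider() throws on failure.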
$result = AiNormalizer::normalize('15'); // too short / numeric
$result->normalizedAddress; // "15" (unchanged)
$result->confidence; // 0.0
$result->provider; // "too_short_or_numeric"
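The never-throws behavior can be sketched as a priority loop that swallows provider errors. Everything here is illustrative: normalizeWithFallback() is a hypothetical function, results are plain arrays rather than NormalizationResult objects, lower priority numbers are assumed to run first, the input guard is a guess at the real rule, and the reason strings no_providers and all_providers_failed are invented for the example.

```php
<?php
declare(strict_types=1);

/**
 * @param array<int, array{name: string, priority: int, call: callable}> $providers
 */
function normalizeWithFallback(array $providers, string $address): array
{
    $passthrough = fn (string $reason): array => [
        'normalizedAddress' => $address, // input returned unchanged
        'confidence' => 0.0,
        'provider' => $reason,
    ];

    // Assumed guard; the package's exact rule is not documented here.
    $trimmed = trim($address);
    if (mb_strlen($trimmed) < 3 || is_numeric($trimmed)) {
        return $passthrough('too_short_or_numeric');
    }
    if ($providers === []) {
        return $passthrough('no_providers'); // hypothetical reason string
    }

    // Assumption: a lower priority number is tried first.
    usort($providers, fn (array $a, array $b): int => $a['priority'] <=> $b['priority']);

    foreach ($providers as $provider) {
        try {
            return ['provider' => $provider['name']] + ($provider['call'])($address);
        } catch (Throwable $e) {
            continue; // this provider failed, fall through to the next one
        }
    }

    return $passthrough('all_providers_failed'); // hypothetical reason string
}

$providers = [
    ['name' => 'gemini', 'priority' => 2, 'call' => fn (string $a): array => [
        'normalizedAddress' => strtoupper($a),
        'confidence' => 0.8,
    ]],
    ['name' => 'ollama', 'priority' => 1, 'call' => function (string $a): array {
        throw new RuntimeException('offline');
    }],
];

$ok = normalizeWithFallback($providers, 'Main St 15'); // ollama fails, gemini answers
$short = normalizeWithFallback($providers, '15');      // rejected before any provider runs
```

Callers can therefore branch on the result (for example via isUsable()) instead of wrapping every call in try/catch.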
Built-in Providers
- Ollama (local, qwen2.5:14b)
- Gemini (Google AI)
- Claude (Anthropic)
- ChatGPT (OpenAI)
Extensibility
Add new providers or multiple models without modifying the library.
Custom provider
- Create a class extending AbstractNormalizer:
namespace App\Services\AI;

use Japananimetime\AiAddressNormalizer\Providers\AbstractNormalizer;
use Japananimetime\AiAddressNormalizer\Dto\NormalizationResult;
use Illuminate\Support\Facades\Http;

class MistralNormalizer extends AbstractNormalizer
{
    protected string $model = 'mistral-large-latest';

    public function getProviderName(): string
    {
        return 'mistral';
    }

    public function normalize(string $address, array $context): NormalizationResult
    {
        // $this->config is set by the factory from DB config + env fallback
        $apiKey = $this->config['api_key'] ?? config('services.mistral.key');

        // Build prompts (inherited from AbstractNormalizer)
        $systemPrompt = $this->buildSystemPrompt($context);
        $userPrompt = $this->buildUserPrompt($address);

        // Call your AI API
        $response = Http::withHeaders(['Authorization' => "Bearer {$apiKey}"])
            ->post('https://api.mistral.ai/v1/chat/completions', [
                'model' => $this->model, // set by factory from DB model column
                'messages' => [
                    ['role' => 'system', 'content' => $systemPrompt],
                    ['role' => 'user', 'content' => $userPrompt],
                ],
                'response_format' => ['type' => 'json_object'],
            ]);

        $content = $response->json('choices.0.message.content', '');

        // extractJsonFromResponse() handles markdown code blocks, raw JSON, etc.
        $parsed = $this->extractJsonFromResponse($content);
        if (!$parsed) {
            return NormalizationResult::passthrough($address, 'parse_error');
        }

        // parseResponse() converts the AI JSON into a NormalizationResult DTO.
        // Pass $context['city_id'] so geo entities can resolve city references.
        return $this->parseResponse($address, $parsed, $context['city_id'] ?? null);
    }
}
- Add a DB row:
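-- Note: MySQL treats backslashes in string literals as escapes by default; double them there ('App\\Services\\AI\\MistralNormalizer').
INSERT INTO ai_normalizer_services (name, provider_class, priority, is_active, model, config)
VALUES ('mistral', 'App\Services\AI\MistralNormalizer', 5, true, 'mistral-large-latest', '{"api_key":"sk-..."}');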
Multiple models from same provider
-- A second Claude model next to the seeded default, with its own priority
INSERT INTO ai_normalizer_services (name, provider_class, priority, is_active, model)
VALUES ('claude-haiku', NULL, 1, true, 'claude-3-5-haiku-20241022');
-- A second Ollama model next to the seeded default (include host in config)
INSERT INTO ai_normalizer_services (name, provider_class, priority, is_active, model, config)
VALUES ('ollama-large', NULL, 0, true, 'qwen2.5:32b', '{"host":"http://localhost:11434"}');
When provider_class is NULL, the factory strips the suffix and resolves via built-in ProviderEnum:
- claude-haiku → base name claude → ClaudeNormalizer
- ollama-large → base name ollama → OllamaNormalizer
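The name-to-class resolution above can be sketched as follows. This is only an illustration: the BUILT_IN_PROVIDERS map stands in for the package's ProviderEnum and lists just the two classes named here, resolveBuiltIn() is a hypothetical function, and treating everything before the first hyphen as the base name is an assumption about how the suffix is stripped.

```php
<?php
declare(strict_types=1);

// Hypothetical map standing in for the package's ProviderEnum.
const BUILT_IN_PROVIDERS = [
    'claude' => 'ClaudeNormalizer',
    'ollama' => 'OllamaNormalizer',
];

/** Map a service row name to a built-in normalizer class name, or null. */
function resolveBuiltIn(string $serviceName): ?string
{
    // Assumption: the base name is everything before the first hyphen.
    $base = explode('-', $serviceName, 2)[0];

    return BUILT_IN_PROVIDERS[$base] ?? null;
}
```

Under this reading, a row name that does not start with a built-in base name needs an explicit provider_class.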
What AbstractNormalizer gives you for free
If your custom provider extends AbstractNormalizer, you inherit:
| Method | What it does |
|---|---|
| buildSystemPrompt($context) | Builds the full AI prompt with rules, aliases, geo entities |
| buildUserPrompt($address) | Wraps address in a user message |
| extractJsonFromResponse($text) | Extracts JSON from raw text, markdown blocks, etc. |
| parseResponse($address, $parsed, $cityId) | Converts AI JSON into NormalizationResult with geo entities and alias suggestions |
| setModel() / setConfig() | Model and config injection (called by factory) |
You only need to implement normalize() (call API) and getProviderName().
Custom context builder
The default ContextBuilder uses only package-owned tables (ai_normalizer_cities, street_name_aliases). To inject additional context from your host app (e.g., city districts, microraions), extend ContextBuilder:
namespace App\Services\AI;

use Japananimetime\AiAddressNormalizer\Services\ContextBuilder;
use Illuminate\Support\Facades\DB;

class MyContextBuilder extends ContextBuilder
{
    public function build(string $address, ?string $city = null): array
    {
        $context = parent::build($address, $city);

        // Add your host-app data to the context
        if ($context['city_id']) {
            $context['geo_entities']['city_districts'] = DB::table('city_districts')
                ->where('city_id', $context['city_id'])
                ->pluck('name')
                ->toArray();
        }

        return $context;
    }
}
Bind it in your AppServiceProvider:
use Japananimetime\AiAddressNormalizer\Contracts\ContextBuilderInterface;

public function register(): void
{
    $this->app->singleton(ContextBuilderInterface::class, \App\Services\AI\MyContextBuilder::class);
}
The package binds the default ContextBuilder via singletonIf(), so your binding takes precedence.
Result enricher
The ResultEnricherInterface is a post-processing hook called AFTER AI parsing, BEFORE caching. Host apps use it to match AI-extracted geo entities against their own database tables.
namespace App\Services\AI;

use Japananimetime\AiAddressNormalizer\Contracts\ResultEnricherInterface;
use Japananimetime\AiAddressNormalizer\Dto\NormalizationResult;
use Illuminate\Support\Facades\DB;

class MyResultEnricher implements ResultEnricherInterface
{
    public function enrich(NormalizationResult $result, array $context): NormalizationResult
    {
        $geo = $result->geoEntities;
        if (!$geo) {
            return $result;
        }

        // Match city against your DB.
        // ILIKE is PostgreSQL-specific; use LIKE on MySQL or SQLite.
        if ($geo->city && !$geo->city->matchedDbId) {
            $city = DB::table('cities')
                ->where('name', 'ILIKE', "%{$geo->city->normalizedName}%")
                ->first();
            if ($city) {
                $geo = $geo->withCity($geo->city->withDbId($city->id));
            }
        }

        // Match city district
        if ($geo->cityDistrict && !$geo->cityDistrict->matchedDbId) {
            $district = DB::table('city_districts')
                ->where('name', 'ILIKE', "%{$geo->cityDistrict->normalizedName}%")
                ->first();
            if ($district) {
                $geo = $geo->withCityDistrict($geo->cityDistrict->withDbId($district->id));
            }
        }

        return $result->withGeoEntities($geo);
    }
}
Bind it in your AppServiceProvider:
use Japananimetime\AiAddressNormalizer\Contracts\ResultEnricherInterface;

public function register(): void
{
    $this->app->singleton(ResultEnricherInterface::class, \App\Services\AI\MyResultEnricher::class);
}
The default NullResultEnricher returns the result unchanged.
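An enricher that feeds AI-extracted names into a LIKE or ILIKE pattern may want to escape wildcard characters first, since a name containing % or _ would otherwise match more rows than intended. The helper below is a sketch; escapeLike() is a hypothetical function that is not part of the package, and it assumes the database uses backslash as the LIKE escape character (the default in PostgreSQL and MySQL).

```php
<?php
declare(strict_types=1);

/** Escape LIKE/ILIKE wildcards so the value is matched literally. */
function escapeLike(string $value): string
{
    // Prefix backslash, percent, and underscore with a backslash.
    return addcslashes($value, '\\%_');
}

// Wrap the escaped value in wildcards for a "contains" match.
$pattern = '%' . escapeLike('50%_Avenue') . '%';
```

The resulting $pattern can be passed as the bound value of the where() clause in place of the interpolated string.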
Artisan Commands
# Normalize a single address
php artisan ai-normalizer:normalize "ул Абая 15" --city=Алматы
# Seed cities from config
php artisan ai-normalizer:seed-cities
php artisan ai-normalizer:seed-cities --dry-run
# Seed abbreviations (global)
php artisan ai-normalizer:seed-abbreviations
# Seed historical aliases (city-specific, uses city ID from ai_normalizer_cities)
php artisan ai-normalizer:seed-historical --city=2
Config Reference
| Env Variable | Description |
|---|---|
| GEMINI_API_KEY | Google Gemini API key |
| ANTHROPIC_API_KEY | Anthropic Claude API key |
| OPENAI_API_KEY | OpenAI ChatGPT API key |
| OLLAMA_HOST | Ollama server URL (default: http://localhost:11434) |
| AI_NORMALIZER_CACHE_ENABLED | Enable response caching (default: true) |
| AI_NORMALIZER_CACHE_TTL | Cache TTL in seconds (default: 30 days, i.e. 2592000) |
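As a worked example of the cache defaults, the sketch below reads the two cache variables from an array of environment values and applies the documented defaults. cacheSettings() is a hypothetical helper, not the package's config loader, and the way the package parses these values is an assumption. It mainly shows that 30 days expressed in seconds is 30 * 24 * 60 * 60 = 2592000.

```php
<?php
declare(strict_types=1);

/**
 * @param array<string, string> $env raw environment values
 * @return array{enabled: bool, ttl: int}
 */
function cacheSettings(array $env): array
{
    // "true"/"false"/"1"/"0" style strings are accepted by FILTER_VALIDATE_BOOLEAN.
    $enabled = filter_var($env['AI_NORMALIZER_CACHE_ENABLED'] ?? 'true', FILTER_VALIDATE_BOOLEAN);

    // Default TTL: 30 days in seconds.
    $ttl = (int) ($env['AI_NORMALIZER_CACHE_TTL'] ?? 30 * 24 * 60 * 60);

    return ['enabled' => $enabled, 'ttl' => $ttl];
}
```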