illuma-law / laravel-llm-router
A resilient AI routing and circuit-breaker system for LLM requests in Laravel.
Requires
- php: ^8.3
- illuminate/contracts: ^11.0||^12.0||^13.0
- illuminate/support: ^11.0||^12.0||^13.0
- spatie/laravel-package-tools: ^1.16
Requires (Dev)
- larastan/larastan: ^3.1
- laravel/ai: ^0.6.0
- laravel/pint: ^1.21
- nunomaduro/collision: ^8.6
- orchestra/testbench: ^10.1
- pestphp/pest: ^4.0
- pestphp/pest-plugin-arch: ^4.0
- pestphp/pest-plugin-laravel: ^4.0
- phpstan/extension-installer: ^1.4
- phpstan/phpstan-deprecation-rules: ^2.0
- phpstan/phpstan-phpunit: ^2.0
- spatie/laravel-ray: ^1.39
README
A production-grade, highly resilient LLM routing and failover system for Laravel applications.
When building AI-powered features, relying on a single API provider (like OpenAI or Anthropic) introduces significant risk due to rate limits and outages. The Laravel LLM Router solves this by providing a fluent API to execute LLM requests with automatic circuit-breaking, multi-provider fallback chains, smart error classification, and tenant-aware routing.
Features
- Fluent Request API: Execute LLM calls with a clean, builder-like syntax using the
LLMRouterfacade. - Resilient Failover: Automatically attempt alternative providers/models when primary ones fail (e.g., fallback from OpenAI to Anthropic to Google).
- Smart Retries: Intelligently distinguishes between transient network errors (retry the same provider) and rate limits (failover immediately to the next provider).
- Tenant Isolation: Support for "Sovereign" tenants where requests must stay on-premise (e.g., forcing Ollama models).
- Seamless Laravel AI SDK Integration: First-class support for
laravel/aiagents and simple prompting. - Extensible Configuration: Define your own model tiers (Small, Large, etc.) via Enums, and load chains via config or database.
- Reusable Support Primitives: Generic provider availability checks, chain row validation, and cooldown state tracking can be reused in host apps without domain coupling.
Support Utilities
The package now ships generic support classes for host applications that want to keep AI routing concerns out of domain code:
IllumaLaw\LlmRouter\Support\ConfigProviderAvailability- Checks provider credentials and enablement flags using configurable config paths.
IllumaLaw\LlmRouter\Support\ChainRowValidator- Normalizes provider/model rows, validates provider/model combinations, and removes adjacent duplicates.
IllumaLaw\LlmRouter\Support\CooldownStore- Tracks retryable failure counters and cooldown windows in cache.
IllumaLaw\LlmRouter\Contracts\ProviderAvailability- Contract for provider availability checks, bound to
ConfigProviderAvailabilityby default.
- Contract for provider availability checks, bound to
These are container-registered by LLMRouterServiceProvider, so they can be injected directly or wrapped by app-specific adapters.
Installation
Require this package with composer:
composer require illuma-law/laravel-llm-router
Publish the configuration file:
php artisan vendor:publish --tag="llm-router-config"
Configuration
The config/llm-router.php file defines your fallback tiers and retry behavior:
return [ 'enabled' => env('LLM_ROUTER_ENABLED', true), // Define chains of providers to try in order 'tiers' => [ 'large' => [ ['provider' => 'anthropic', 'model' => 'claude-3-5-sonnet-latest'], ['provider' => 'openai', 'model' => 'gpt-4o'], ['provider' => 'gemini', 'model' => 'gemini-1.5-pro'], ], 'small' => [ ['provider' => 'anthropic', 'model' => 'claude-3-5-haiku-latest'], ['provider' => 'openai', 'model' => 'gpt-4o-mini'], ['provider' => 'gemini', 'model' => 'gemini-1.5-flash'], ], ], // Override model for on-prem/sovereign routing 'priority_override' => [ 'provider' => 'ollama', 'model' => 'llama3.1:70b', ], // How many times to retry the same provider on a 5xx error 'max_same_provider_retries' => 1, 'retry_delay_ms' => 150, ];
Usage & Integration
Seamless Laravel AI SDK Integration
The router is designed to work perfectly with the official laravel/ai SDK.
Simple Prompting
You can execute a simple prompt across a failover chain with one line. If Claude fails, it will automatically try OpenAI:
use IllumaLaw\LlmRouter\Facades\LLMRouter; use IllumaLaw\LlmRouter\Enums\DefaultTier; $response = LLMRouter::tier(DefaultTier::Large)->prompt('Explain quantum computing.'); echo $response->text;
Agent Integration
You can pass a laravel/ai agent instance directly to run its specific logic through the router's failover chain:
use App\Agents\LegalAnalystAgent; use IllumaLaw\LlmRouter\Facades\LLMRouter; use IllumaLaw\LlmRouter\Enums\DefaultTier; $agent = new LegalAnalystAgent(); $response = LLMRouter::forAgent($agent) ->tier(DefaultTier::Large) ->prompt('Analyze this contract...', ['context' => '...']);
Custom Execution Closure
If you need full control over the execution, use the run() method with a closure:
use IllumaLaw\LlmRouter\Facades\LLMRouter; use IllumaLaw\LlmRouter\Enums\DefaultTier; use Laravel\Ai\Facades\Ai; $result = LLMRouter::tier(DefaultTier::Large) ->run(function (string $provider, string $model) use ($prompt) { // This closure is executed for each hop in the chain until success return Ai::provider($provider) ->model($model) ->prompt($prompt) ->generate(); });
Priority & Tenant Overrides
For enterprise applications, some customers might require their data to never leave their infrastructure. You can register a "priority resolver" in your AppServiceProvider:
use IllumaLaw\LlmRouter\Facades\LLMRouter; // AppServiceProvider.php public function boot(): void { LLMRouter::resolvePriorityUsing(function ($tenant) { // If the tenant is flagged as sovereign, trigger the priority_override (e.g. Ollama) return $tenant->is_sovereign === true; }); }
Then pass the tenant context to the router:
$result = LLMRouter::tier(DefaultTier::Large) ->forTenant($team) // Automatically switches to local models if $team is sovereign ->prompt('Summarize this document.');
Abstracting Chain Resolution
If your application stores fallback chains dynamically in a database, you can implement the ChainRepository interface.
use IllumaLaw\LlmRouter\Contracts\ChainRepository; class DatabaseChainRepository implements ChainRepository { public function getChain(?string $tier = null, ?string $operation = null): ?array { return DB::table('ai_chains')->where('tier', $tier)->get()->toArray(); } public function getAgentOverride(string $agent): ?array { return null; } }
Register it in your Service Provider:
LLMRouter::useRepository(new DatabaseChainRepository());
Error Handling
The router automatically classifies errors:
- Transient Errors (503, Timeouts, Network): Retries the same provider.
- Exhaustion Errors (429 Rate Limits): Immediately fails over to the next provider in the chain.
- Terminal Errors (401 Unauthorized, 400 Bad Request): Fails fast and propagates the exception without retrying.
If the entire chain is exhausted, a ChainExhaustedException is thrown:
use IllumaLaw\LlmRouter\Exceptions\ChainExhaustedException; use IllumaLaw\LlmRouter\Facades\LLMRouter; try { $result = LLMRouter::tier('large')->prompt('...'); } catch (ChainExhaustedException $e) { // Inspect what went wrong at each step $attempts = $e->getAttempts(); Log::critical('All AI providers failed.', ['history' => $attempts]); }
Testing
The package includes a comprehensive test suite using Pest PHP.
composer test
License
The MIT License (MIT). Please see License File for more information.