andrecorugda/ai-openrouter-gateway

A self-hostable, OpenRouter-backed AI gateway for Laravel — named integrations, versioned prompts, telemetry, rate & cost limiting, a Sanctum HTTP API, and a Filament admin UI with an AI-assisted prompt builder.

Maintainers

Package info

github.com/andrecorugda/ai-openrouter-gateway

pkg:composer/andrecorugda/ai-openrouter-gateway

Statistics

Installs: 38

Dependents: 0

Suggesters: 0

Stars: 0

Open Issues: 0

v1.1.0 2026-06-27 05:00 UTC

This package is auto-updated.

Last update: 2026-06-27 05:00:20 UTC


README

Latest Version on Packagist Tests License

A self-hostable, OpenRouter-backed AI gateway for Laravel. One API key reaches every model — Anthropic Claude, OpenAI GPT, Google Gemini, DeepSeek, and more — behind one service, one audit log, one cost view.

Each AI use case is a named integration with a versioned prompt template, a declared variable schema, generation params, and guardrails. Call it from PHP (AiGateway::invoke('my_use_case', [...])) or over an authenticated HTTP API. A bundled Filament admin UI lets non-developers create and tune integrations — complete with an AI-assisted prompt builder.

Your prompts, your customer data, and your OpenRouter key never leave your app. No third-party SaaS in the trust boundary.

Why this exists

AI features usually start with a prompt hardcoded in a controller and a single model id baked into the code. Then reality hits: the prompt needs tuning weekly, you want to try a cheaper model, and three different apps need the same capability. Every change becomes a code edit, a pull request, a review, and a deploy.

This package moves that whole surface out of code and into a managed integration you tune at runtime:

  • Stop hardcoding prompts. Prompts, variables, model choice, and generation params live in the database and are edited in the admin UI — not in your source tree.
  • Fine-tune on the fly — no PR, no redeploy. Tweak a prompt, bump temperature, or swap the model and save. The next call uses it immediately. Every save mints a new version, so you can roll back by loading an old one into the form.
  • Test across models in seconds. Pick any model from the live OpenRouter catalog, hit Test, and compare output, tokens, latency, and cost — without touching code. Switch the production model when you find a better/cheaper one.
  • One integration, many callers. Define a use case once and invoke it from anywhere — any PHP service or job (AiGateway::invoke('lead_summary', [...])) and any external app, language, or codebase over HTTPS (POST /api/ai/lead_summary/chat with a scoped token). One source of truth, reused across platforms.
  • Governance built in. Every call is logged with tokens, cost, latency, and status; per-integration rate limits and daily cost caps stop runaway spend before it happens.

In short: prompts and models become configuration, not code — owned by the people who tune them, observable, and callable from everything.

Features

  • 🔌 One key, every model — OpenRouter under the hood; switch models per-integration with no code change.
  • 🧩 Named integrations — register a use case once, invoke it by slug everywhere.
  • 🗂️ Versioned prompts — every edit mints a new version; activate/roll back without losing history.
  • 🧮 Telemetry built in — tokens, cost, latency, model, and status for every call in ai_invocations.
  • 🚦 Rate limiting — per-integration, per-caller, per-minute.
  • 💰 Cost limiting — per-integration daily USD budget, enforced before each call.
  • 🌐 HTTP APIPOST /api/ai/{integration}/chat, Sanctum-authenticated, toggleable at runtime.
  • 💬 Conversation threads — opt-in multi-turn memory with /start + /converse, per-caller ownership, TTL expiry, and a prune command.
  • 🔑 API token management — mint and revoke scoped tokens from the admin UI.
  • 🔎 OpenRouter server tools — per-version web_search / web_fetch.
  • AI prompt builder — describe what you want; a fast Haiku drafts the template + variables.
  • 🗂️ Live model catalog — searchable model picker from OpenRouter's /models, with per-model generation params and caching eligibility.
  • 📊 Invocations browser — read-only telemetry with status/caller/date filters, cost + token Σ summaries, and per-call detail.
  • 🎛️ Filament admin UI — integration CRUD, versions (load-into-form), a live test panel, an interactive prompt editor, settings.
  • ⚙️ Fully configurable — connection, table names, route prefix/middleware, cache store, limits, models.

Requirements

  • PHP 8.2+
  • Laravel 11, 12, or 13
  • An OpenRouter API key
  • Filament 3.2+ (optional — only for the admin UI; use a Filament release that supports your Laravel version)

Installation

composer require andrecorugda/ai-openrouter-gateway

Publish and run the migrations:

php artisan vendor:publish --tag="ai-openrouter-gateway-migrations"
php artisan migrate

Optionally publish the config:

php artisan vendor:publish --tag="ai-openrouter-gateway-config"

Add your key to .env:

OPENROUTER_API_KEY=sk-or-v1-...

That's the only required variable — referer/title default to your APP_URL / APP_NAME.

Quickstart

1. Create an integration

Either through the Filament UI (recommended) or in code:

use Andre\AiGateway\Models\AiIntegration;
use Andre\AiGateway\Services\AiIntegrationService;

$integration = AiIntegration::create([
    'slug' => 'expense_extract',
    'name' => 'Expense Extractor',
    'visibility' => 'internal',
]);

app(AiIntegrationService::class)->saveVersion($integration, [
    'system_prompt' => 'Extract the merchant, total, and date from this receipt:\n\n{{receipt_text}}',
    'models' => ['anthropic/claude-sonnet-4', 'openai/gpt-4o'], // primary + fallback
    'default_params' => ['max_tokens' => 512, 'temperature' => 0.1],
    'prompt_args' => [
        ['name' => 'receipt_text', 'type' => 'string', 'required' => true],
    ],
]);

2. Invoke it from PHP

use Andre\AiGateway\Facades\AiGateway;

$result = AiGateway::invoke('expense_extract', [
    'receipt_text' => $ocrText,
]);

$result->text;        // the assistant's reply
$result->model_used;  // 'anthropic/claude-sonnet-4'
$result->cost_usd;    // 0.0021
$result->usage;       // ['prompt_tokens' => ..., 'completion_tokens' => ...]

Multi-turn chat layers messages on top of the templated system prompt:

$result = AiGateway::invoke('support_assistant',
    args: ['kb_version' => 'v3'],
    messages: [
        ['role' => 'user', 'content' => 'How do I reset my password?'],
    ],
);

3. Or call it over HTTP

The HTTP API and the API Tokens admin page are authenticated with Laravel Sanctum. One-time setup in your app:

composer require laravel/sanctum
php artisan migrate            # creates personal_access_tokens

Then add the HasApiTokens trait to your User model:

use Laravel\Sanctum\HasApiTokens;

class User extends Authenticatable
{
    use HasApiTokens;
    // ...
}

Now call the endpoint:

curl -X POST https://your-app.test/api/ai/expense_extract/chat \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{"args": {"receipt_text": "..."}, "options": {"max_tokens": 256}}'

Mint the token from the admin UI (API Tokens page) or in code:

$token = $user->createToken('integration-client', ['ai-gateway:invoke'])->plainTextToken;

The admin UI (Filament)

Register the plugin on your panel:

use Andre\AiGateway\Filament\AiGatewayPlugin;

public function panel(Panel $panel): Panel
{
    return $panel
        // ...
        ->plugin(AiGatewayPlugin::make());
}

You get:

  • AI Integrations — create/edit integrations with a searchable model picker from the live OpenRouter catalog, generation params that auto-populate per model, a model-aware prompt-caching control, an interactive prompt editor (click a declared variable to insert {{name}}), a Versions action that loads any past version back into the form, and a Test panel that runs it live and shows tokens/cost/latency.
  • Draft with AI — describe the use case in plain language; the prompt builder fills the template and variable schema for you.
  • Invocations — a read-only telemetry browser: filter by status / caller / integration / date, with cost + token Σ summaries and a per-call detail modal (usage, error, OpenRouter generation link).
  • General settings — toggle the HTTP API, toggle the prompt builder, and pick the helper model (also a catalog-backed Select).
  • API Tokens — mint and revoke scoped invocation tokens.

Screenshots

Integration form — catalog model picker, per-model params, caching, prompt editor Create integration
Integrations list Integrations
Invocations — telemetry with Σ summaries Invocations
Versions — load a past version into the form Versions
General settings General settings
API tokens API tokens

Conversations (multi-turn threads)

Flag an integration conversational (UI toggle, or is_conversational + conversation_ttl_minutes) to get server-side memory: the gateway persists each turn, so clients send only the next message — no replaying history.

From PHP:

use Andre\AiGateway\Facades\AiGateway;

$first  = AiGateway::converse('support', null, 'My order is late');      // null → new thread
$id     = $first->conversation_id;                                       // keep this
$second = AiGateway::converse('support', $id, 'Order #4471');            // continues with full history

Over HTTP (two calls, à la a chatbot /start then /chat):

# 1) open a thread
curl -X POST https://your-app.test/api/ai/support/start \
  -H "Authorization: Bearer <token>"
# → { "data": { "conversation_id": "0779…", "expires_at": "…" } }

# 2) send turns
curl -X POST https://your-app.test/api/ai/support/converse \
  -H "Authorization: Bearer <token>" -H "Content-Type: application/json" \
  -d '{"conversation_id": "0779…", "message": "Order #4471"}'

Threads are owned by their caller (a guessed id returns 404), expire after the TTL, and link each turn to its telemetry row. Prune expired threads on a schedule:

// routes/console.php
Schedule::command('ai-gateway:prune-conversations')->daily();

Rate & cost limiting

Set ceilings per integration (UI → Limits, or the rate_limit_per_minute / max_daily_cost_usd columns). Blank falls back to the config default; a null default means unlimited.

// config/ai-gateway.php
'rate_limit' => [
    'enabled' => true,
    'default_per_minute' => 60,   // null = unlimited
],
'cost_limit' => [
    'enabled' => true,
    'default_daily_usd' => 25.0,  // null = uncapped
    'window_hours' => 24,
],

When a caller exceeds a limit the gateway throws RateLimitExceededException (HTTP 429) or CostLimitExceededException (HTTP 402) before any spend occurs.

Configuration highlights

Everything in config/ai-gateway.php is overridable. Common knobs:

Key Purpose
openrouter.api_key Your OpenRouter key (OPENROUTER_API_KEY).
default_model Model pre-filled on new integrations.
database.connection / database.tables Relocate / rename the package's tables.
models.* Swap any Eloquent model for an app subclass.
cache.store / cache.ttl_seconds Integration-resolver cache.
api.enabled / api.prefix / api.middleware / api.token_ability HTTP API surface.
prompt_builder.model Helper model (defaults to anthropic/claude-haiku-4.5).
filament.navigation_group / filament.authorize Admin UI placement & access gate.

Observability

Every call writes one row to ai_invocations (success and failure both):

use Andre\AiGateway\Models\AiInvocation;

// Per-integration spend over the last 24h
AiInvocation::where('ai_integration_id', $id)
    ->where('created_at', '>=', now()->subDay())
    ->sum('cost_usd');

Each row links to https://openrouter.ai/activity/{generation_id} for full provider-side cost forensics.

Testing

composer install
vendor/bin/pest

Security

The gateway never sends data to anyone but OpenRouter. Rotate your key by updating OPENROUTER_API_KEY and redeploying. If you discover a vulnerability, please email andre.corugda@ins-global.com.

Credits

Built by Andre Corugda.

License

The MIT License (MIT). See LICENSE.