octopus-llm/laravel

OpenAI-compatible AI gateway for Laravel with multi-key rotation, circuit breaker, and zero-cost free tier management.

Package info

github.com/iskandar221201/octopus-laravel

pkg:composer/octopus-llm/laravel

v1.0.0 2026-06-05 04:35 UTC

AI gateway with multi-key rotation, circuit breaker, and zero-cost free tier management.

Octopus LLM Gateway allows you to seamlessly integrate multiple LLM providers (like Groq, OpenRouter, Cerebras) into your Laravel application with robust fallbacks, load-balancing key rotation, circuit breaking, and automated recovery, maximizing uptime and cost efficiency.

🚀 Key Features

  • 🔄 Multi-Key Rotation: Automatically rotates API keys per provider using a Least Recently Used (LRU) algorithm to maximize rate limit usage.
  • ⚡ Circuit Breaker: Disables a key automatically after successive HTTP failures (e.g., 500, timeout) and triggers events.
  • 🛡️ Token Validation Guard: Estimates input token counts and blocks oversized requests before they reach the remote APIs.
  • 🔄 Fallback & Retry Mechanism: Automatically falls back to lower-priority providers if a provider's keys are completely exhausted or rate-limited.
  • 🤖 Automated Ping Recovery: Background tasks periodically test inactive keys against /models endpoints and reactivate them upon recovery.
  • 💻 Interactive Artisan CLI: Commands to monitor gateway status, validate credentials, benchmark latencies, test chats, and recover keys.
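The rotation and circuit-breaker behaviour described above can be pictured with a small standalone sketch. This is not the package's implementation: `KeyPool` and its method names are hypothetical, and the only values taken from this README are the LRU selection rule and the default failure threshold of 3.

```php
<?php
declare(strict_types=1);

// Hypothetical illustration of LRU key rotation with a consecutive-failure breaker.
final class KeyPool
{
    /** @var array<int, array{key: string, lastUsed: float, failures: int, active: bool}> */
    private array $slots = [];

    public function __construct(array $keys, private int $failureThreshold = 3)
    {
        foreach ($keys as $key) {
            $this->slots[] = ['key' => $key, 'lastUsed' => 0.0, 'failures' => 0, 'active' => true];
        }
    }

    /** Returns the index of the active key used least recently, or null if every key is disabled. */
    public function next(float $now): ?int
    {
        $best = null;
        foreach ($this->slots as $i => $slot) {
            if (!$slot['active']) {
                continue;
            }
            if ($best === null || $slot['lastUsed'] < $this->slots[$best]['lastUsed']) {
                $best = $i;
            }
        }
        if ($best !== null) {
            $this->slots[$best]['lastUsed'] = $now; // mark as most recently used
        }
        return $best;
    }

    /** A success resets the consecutive-failure counter. */
    public function reportSuccess(int $i): void
    {
        $this->slots[$i]['failures'] = 0;
    }

    /** A failure increments the counter and disables the key once the threshold is reached. */
    public function reportFailure(int $i): void
    {
        if (++$this->slots[$i]['failures'] >= $this->failureThreshold) {
            $this->slots[$i]['active'] = false;
        }
    }

    public function isActive(int $i): bool
    {
        return $this->slots[$i]['active'];
    }
}
```

Passing the clock in as `$now` keeps the sketch deterministic; a real gateway would also need to persist this state between requests, which is what the storage settings in the configuration are for.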

📦 Installation

Important

This package requires PHP 8.3 or higher and Laravel 11.0 or higher.

To install the package, run the following command in your Laravel project:

composer require octopus-llm/laravel

Publish the configuration file:

php artisan vendor:publish --tag=octopus-config

โš™๏ธ Configuration

The published configuration file is located at config/octopus.php. Below is a breakdown of the configuration keys and their environment variables:

return [

    /*
    |--------------------------------------------------------------------------
    | LLM Providers
    |--------------------------------------------------------------------------
    | Define the list of providers in order of priority.
    | Lowest priority number (e.g., 1) is tried first.
    |
    */
    'providers' => [
        [
            'id'       => 'groq',
            'baseURL'  => 'https://api.groq.com/openai/v1',
            'model'    => env('GROQ_MODEL', 'llama-3.1-8b-instant'),
            'keys'     => explode(',', env('GROQ_KEYS', '')),
            'priority' => 1,
            'cooldown' => 60, // Key cooldown time in seconds before recovery
        ],
        [
            'id'           => 'openrouter',
            'baseURL'      => 'https://openrouter.ai/api/v1',
            'model'        => env('OPENROUTER_MODEL', 'mistralai/mistral-7b-instruct:free'),
            'keys'         => explode(',', env('OPENROUTER_KEYS', '')),
            'priority'     => 2,
            'cooldown'     => 120,
            'extraHeaders' => ['HTTP-Referer' => env('APP_URL', 'http://localhost')],
        ],
    ],

    /*
    |--------------------------------------------------------------------------
    | Request & Input Guards
    |--------------------------------------------------------------------------
    */
    'guard' => [
        'temperature'       => env('OCTOPUS_TEMPERATURE', 0.7),
        'timeout_ms'        => env('OCTOPUS_TIMEOUT_MS', 10000),
        'max_input_tokens'  => env('OCTOPUS_MAX_INPUT_TOKENS', 4000),
        'max_retries'       => env('OCTOPUS_MAX_RETRIES', 2),
        'max_output_tokens' => env('OCTOPUS_MAX_OUTPUT_TOKENS', 1000),
    ],

    /*
    |--------------------------------------------------------------------------
    | Circuit Breaker
    |--------------------------------------------------------------------------
    */
    'circuit_breaker' => [
        'failure_threshold' => env('OCTOPUS_CB_THRESHOLD', 3), // Consecutive fails to deactivate a key
    ],

    /*
    |--------------------------------------------------------------------------
    | State Storage
    |--------------------------------------------------------------------------
    */
    'storage'          => env('OCTOPUS_STORAGE', 'cache'),
    'storage_class'    => null, // Custom class implementing StorageInterface
    'cache_key_prefix' => env('OCTOPUS_CACHE_PREFIX', 'octopus_llm_state'),
    'cache_ttl'        => env('OCTOPUS_CACHE_TTL', 86400),

    'streaming' => env('OCTOPUS_STREAMING', true),

];
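One detail of the `keys` entries is worth knowing: in PHP, `explode(',', '')` returns an array containing a single empty string, so a provider whose `*_KEYS` variable is unset still carries one blank key. The sketch below shows one way such a provider list could be cleaned and ordered by priority. `normalizeProviders` is a hypothetical helper written for this illustration; whether the package performs an equivalent step internally is not stated in this README.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: trims keys, drops blank ones, removes providers
 * left without keys, and sorts the rest so the lowest priority number comes first.
 */
function normalizeProviders(array $providers): array
{
    foreach ($providers as &$provider) {
        $provider['keys'] = array_values(array_filter(
            array_map('trim', $provider['keys'] ?? []),
            static fn (string $key): bool => $key !== ''
        ));
    }
    unset($provider); // break the reference left by the foreach

    $providers = array_values(array_filter(
        $providers,
        static fn (array $p): bool => $p['keys'] !== []
    ));

    usort($providers, static fn (array $a, array $b): int => $a['priority'] <=> $b['priority']);

    return $providers;
}
```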

๐Ÿ“ Environment Variables (.env)

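Add the following environment variables to your application's .env file to configure the gateway. The CEREBRAS_* variables take effect only once a matching provider entry exists in config/octopus.php; the configuration shown above defines only groq and openrouter.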

# LLM Providers API Keys (comma-separated for key rotation)
GROQ_KEYS=key-1,key-2
OPENROUTER_KEYS=key-or-1
CEREBRAS_KEYS=key-cerebras-1

# Optional Model Customization
GROQ_MODEL=llama-3.1-8b-instant
OPENROUTER_MODEL=mistralai/mistral-7b-instruct:free
CEREBRAS_MODEL=llama-3.1-8b

# Optional Gateway Parameter Overrides
OCTOPUS_TEMPERATURE=0.7
OCTOPUS_TIMEOUT_MS=10000
OCTOPUS_MAX_INPUT_TOKENS=4000
OCTOPUS_MAX_RETRIES=2
OCTOPUS_MAX_OUTPUT_TOKENS=1000
OCTOPUS_CB_THRESHOLD=3
OCTOPUS_PING_TIMEOUT=5
OCTOPUS_STORAGE=cache
OCTOPUS_CACHE_PREFIX=octopus_llm_state
OCTOPUS_CACHE_TTL=86400
OCTOPUS_STREAMING=true
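For the `CEREBRAS_KEYS` and `CEREBRAS_MODEL` variables to be read, the `providers` array needs a Cerebras entry. The fragment below is a sketch that mirrors the shape of the groq entry. The `baseURL`, `priority`, and `cooldown` values are assumptions made for illustration and should be checked against the provider's documentation and your own needs.

```php
// config/octopus.php, inside the 'providers' array (illustrative entry)
[
    'id'       => 'cerebras',
    'baseURL'  => 'https://api.cerebras.ai/v1', // assumption: verify the endpoint before use
    'model'    => env('CEREBRAS_MODEL', 'llama-3.1-8b'),
    'keys'     => explode(',', env('CEREBRAS_KEYS', '')),
    'priority' => 3,  // assumption: tried after groq (1) and openrouter (2)
    'cooldown' => 60, // assumption: same cooldown as the groq entry
],
```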

๐Ÿ› ๏ธ Usage

💬 Sending Chat Requests

Use the OctopusLLM facade to send chat completion requests.

use OctopusLLM\Laravel\Facades\OctopusLLM;

$response = OctopusLLM::chat([
    ['role' => 'user', 'content' => 'What is the speed of light?']
]);

echo $response->content; // Response text
echo $response->provider; // 'groq'
echo $response->model; // 'llama-3.1-8b-instant'
echo $response->latencyMs; // Latency in milliseconds
echo $response->attempts; // Attempts taken (e.g., 1)
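Before a request like this is sent, the token guard compares an estimate of the input size with `max_input_tokens`. The README does not say how the estimate is computed, so the sketch below uses a common rule of thumb of roughly four bytes per token. Both function names are hypothetical and the heuristic is an assumption, not the package's algorithm.

```php
<?php
declare(strict_types=1);

/** Rough estimate: total byte length of all message contents divided by 4, rounded up. */
function estimateTokens(array $messages): int
{
    $bytes = 0;
    foreach ($messages as $message) {
        $bytes += strlen((string) ($message['content'] ?? ''));
    }
    return (int) ceil($bytes / 4);
}

/** Throws before any network call if the estimate exceeds the configured limit. */
function assertWithinLimit(array $messages, int $maxInputTokens): void
{
    $estimate = estimateTokens($messages);
    if ($estimate > $maxInputTokens) {
        throw new LengthException(
            "Estimated {$estimate} input tokens exceeds the limit of {$maxInputTokens}."
        );
    }
}
```

Rejecting oversized input locally avoids spending a rate-limited request on a call the provider would refuse anyway.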

🔀 Forcing a Specific Provider

You can bypass the priority ordering and force a specific provider for a single call:

$response = OctopusLLM::chat(
    [['role' => 'user', 'content' => 'Hello']],
    ['forceProvider' => 'openrouter']
);
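The relationship between priority-based fallback and `forceProvider` can be sketched in plain PHP. This is an illustration of the idea rather than the gateway's code: `chatWithFallback` and the `$send` callable are hypothetical, and the returned array only imitates the `provider` and `attempts` fields shown earlier.

```php
<?php
declare(strict_types=1);

/**
 * Tries providers in ascending priority order and returns the first success.
 * When $forceProvider is set, only that provider is attempted.
 * $send is a caller-supplied callable that performs the request or throws RuntimeException.
 */
function chatWithFallback(array $providers, callable $send, ?string $forceProvider = null): array
{
    if ($forceProvider !== null) {
        $providers = array_values(array_filter(
            $providers,
            static fn (array $p): bool => $p['id'] === $forceProvider
        ));
    }
    usort($providers, static fn (array $a, array $b): int => $a['priority'] <=> $b['priority']);

    $attempts = 0;
    $lastError = null;
    foreach ($providers as $provider) {
        $attempts++;
        try {
            return [
                'content'  => $send($provider),
                'provider' => $provider['id'],
                'attempts' => $attempts,
            ];
        } catch (RuntimeException $e) {
            $lastError = $e; // remember the failure and move on to the next provider
        }
    }

    throw new RuntimeException('All providers failed.', 0, $lastError);
}
```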

🌊 Streaming Responses

To stream completions token-by-token, specify the streaming option and pass an onChunk callback:

OctopusLLM::chat(
    [['role' => 'user', 'content' => 'Write a short story.']],
    [
        'streaming' => true,
        'onChunk' => function (string $chunk) {
            echo $chunk;
            flush();
        }
    ]
);
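OpenAI-compatible endpoints stream completions as server-sent events: each `data:` line carries a JSON object whose `choices[0].delta.content` holds the next text fragment, and the stream ends with `data: [DONE]`. The sketch below shows how such a stream could be turned into `onChunk` calls. `emitSseChunks` is a hypothetical function for illustration; the package's own parser may work differently.

```php
<?php
declare(strict_types=1);

/** Parses a raw SSE body and passes each non-empty text fragment to $onChunk. */
function emitSseChunks(string $rawStream, callable $onChunk): void
{
    foreach (preg_split('/\r?\n/', $rawStream) as $line) {
        if (!str_starts_with($line, 'data:')) {
            continue; // skip blank separators and non-data fields
        }
        $payload = trim(substr($line, 5));
        if ($payload === '[DONE]') {
            return; // end-of-stream sentinel
        }
        $event = json_decode($payload, true);
        $text = $event['choices'][0]['delta']['content'] ?? null;
        if (is_string($text) && $text !== '') {
            $onChunk($text);
        }
    }
}
```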

📊 Monitoring Status

To inspect the current active/inactive status of all providers and API keys:

use OctopusLLM\Laravel\Facades\OctopusLLM;

$status = OctopusLLM::getStatus();

// Totals across all providers
echo "Total Active Keys: " . $status->totalActive . PHP_EOL;
echo "Total Inactive Keys: " . $status->totalInactive . PHP_EOL;

// Per-provider, per-key breakdown
foreach ($status->providers as $provider) {
    echo "Provider: " . $provider->id . PHP_EOL;
    foreach ($provider->keys as $key) {
        echo "Key Index: " . $key->index . " Status: " . $key->status . PHP_EOL;
    }
}

🤖 Manual Recovery & Ping

You can manually trigger pings or run the recovery engine from application code:

// Ping a specific key
$isAlive = OctopusLLM::ping('groq', 0); // returns boolean

// Run the full recovery process (checks cooldowns and pings inactive keys)
$report = OctopusLLM::runRecovery();
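The cooldown check mentioned in the comment above can be illustrated with a small filter: a key becomes eligible for a recovery ping only after its provider's `cooldown` has elapsed since it was disabled. The function name and the `deactivatedAt` field are hypothetical; the README does not describe the internal state format.

```php
<?php
declare(strict_types=1);

/**
 * Returns the inactive keys whose cooldown has elapsed.
 * Each entry is assumed to carry a 'deactivatedAt' Unix timestamp (hypothetical field).
 */
function keysDueForPing(array $inactiveKeys, int $cooldownSeconds, int $now): array
{
    return array_values(array_filter(
        $inactiveKeys,
        static fn (array $key): bool => ($now - $key['deactivatedAt']) >= $cooldownSeconds
    ));
}
```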

💻 Artisan Commands

Octopus LLM comes with a complete suite of Artisan command-line tools:

| Command | Description |
|---|---|
| php artisan octopus:validate | Validates the API key configuration in .env |
| php artisan octopus:status | Shows a table of status, failure counts, and last-used times for all keys (use --json for raw data) |
| php artisan octopus:benchmark | Benchmarks request latency for each provider (use --samples=5 to customize) |
| php artisan octopus:test | Sends a prompt request to the gateway to test end-to-end connectivity |
| php artisan octopus:recover | Manually triggers the background key recovery checks |
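If you want the recovery check to run on a fixed interval, one option is Laravel's scheduler. The snippet below is a sketch for routes/console.php in a Laravel 11 application. It assumes the package does not already register its own schedule, which this README does not confirm, and the one-minute interval is an arbitrary choice.

```php
<?php

use Illuminate\Support\Facades\Schedule;

// routes/console.php
// Assumption: recovery is not already scheduled by the package itself.
Schedule::command('octopus:recover')
    ->everyMinute()
    ->withoutOverlapping(); // skip a run if the previous one is still in progress
```

The scheduler itself still needs the usual `php artisan schedule:run` cron entry on the server.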

🧪 Testing

Run the PHPUnit test suite to verify code correctness and coverage:

vendor/bin/phpunit

📄 License

The MIT License (MIT). Please see License File for more information.