README

Official PHP client for Glitchward Shield - LLM Prompt Injection Protection API.

Protect your AI agents from prompt injection attacks, jailbreaks, and malicious inputs before they reach your LLM provider.

Requirements

PHP 8.1 or higher
JSON extension

Installation

composer require glitchward/shield

Quick Start

<?php

use Glitchward\Shield\ShieldClient;

// Initialize the client with your API token
$shield = new ShieldClient('your_api_token');

// Validate a prompt before sending to your LLM
$messages = [
    ['role' => 'system', 'content' => 'You are a helpful assistant.'],
    ['role' => 'user', 'content' => 'Hello, how are you?']
];

$result = $shield->validate($messages);

if ($result->isSafe()) {
    // Safe to send to your LLM provider ($openai is an OpenAI PHP client created elsewhere)
    $response = $openai->chat()->create([
        'model' => 'gpt-4',
        'messages' => $messages
    ]);
} else {
    echo "Blocked! Risk score: {$result->riskScore}\n";
    foreach ($result->matches as $match) {
        echo "  - {$match->category}: {$match->pattern}\n";
    }
}

Usage with OpenAI PHP

<?php

use Glitchward\Shield\ShieldClient;

$openai = OpenAI::client('your-openai-key');
$shield = new ShieldClient('your_shield_token');

function safeChat(array $messages): string
{
    global $openai, $shield;

    // Validate with Shield first
    $result = $shield->validate($messages);

    if (!$result->isSafe()) {
        throw new RuntimeException("Prompt blocked by Shield. Risk: {$result->riskScore}");
    }

    // Safe to proceed
    $response = $openai->chat()->create([
        'model' => 'gpt-4',
        'messages' => $messages
    ]);

    return $response->choices[0]->message->content;
}

// Usage
try {
    $response = safeChat([
        ['role' => 'user', 'content' => "What's the weather like?"]
    ]);
    echo $response;
} catch (RuntimeException $e) {
    echo "Request blocked: " . $e->getMessage();
}

Usage in Laravel

<?php

namespace App\Services;

use Glitchward\Shield\ShieldClient;
use Illuminate\Support\Facades\Log;

class AIService
{
    private ShieldClient $shield;

    public function __construct()
    {
        $this->shield = new ShieldClient(config('services.shield.token'));
    }

    public function chat(array $messages): string
    {
        // Validate with Shield
        $result = $this->shield->validate($messages);

        if (!$result->isSafe()) {
            Log::warning('Prompt blocked by Shield', [
                'risk_score' => $result->riskScore,
                'matches' => array_map(fn($m) => $m->toArray(), $result->matches)
            ]);

            throw new \Exception('Your message was blocked by our security filter.');
        }

        // Proceed with the LLM call; callOpenAI() is your own method (not shown)
        return $this->callOpenAI($messages);
    }
}

Laravel Middleware

<?php

namespace App\Http\Middleware;

use Closure;
use Illuminate\Http\Request;
use Glitchward\Shield\ShieldClient;

class ValidatePrompt
{
    private ShieldClient $shield;

    public function __construct()
    {
        $this->shield = new ShieldClient(config('services.shield.token'));
    }

    public function handle(Request $request, Closure $next)
    {
        if ($request->has('messages')) {
            $result = $this->shield->validate($request->input('messages'));

            if (!$result->isSafe()) {
                return response()->json([
                    'error' => 'Prompt rejected by security filter',
                    'risk_score' => $result->riskScore,
                ], 400);
            }
        }

        return $next($request);
    }
}

Batch Validation

Validate multiple prompts efficiently in a single request:

<?php

use Glitchward\Shield\ShieldClient;

$shield = new ShieldClient('your_api_token');

// Validate up to 100 prompts at once
$batchResult = $shield->validateBatch([
    ['messages' => [['role' => 'user', 'content' => 'First question']]],
    ['messages' => [['role' => 'user', 'content' => 'Second question']]],
    ['messages' => [['role' => 'user', 'content' => 'Third question']]],
]);

// Check if all prompts are safe
if ($batchResult->allSafe()) {
    echo "All prompts are safe!\n";
} else {
    $blocked = $batchResult->getBlockedIndices();
    echo "Blocked prompts at indices: " . implode(', ', $blocked) . "\n";
}

// Process individual results
foreach ($batchResult->results as $i => $result) {
    // Booleans interpolate as "1" or "", so print them explicitly
    $safe = $result->safe ? 'true' : 'false';
    echo "Prompt {$i}: safe={$safe}, risk={$result->riskScore}\n";
}
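
When a batch contains blocked items, a common follow-up is to forward only the safe ones. The sketch below assumes that getBlockedIndices() returns zero-based positions in the array passed to validateBatch(). The helper name withoutBlocked is hypothetical and not part of the client, and the blocked list is stand-in data so the snippet runs on its own.

```php
<?php

/**
 * Return the prompts whose positions are not listed in $blockedIndices.
 * Original keys are preserved so results can still be matched to inputs.
 *
 * @param array $prompts        The array passed to validateBatch()
 * @param int[] $blockedIndices Assumed output of getBlockedIndices()
 */
function withoutBlocked(array $prompts, array $blockedIndices): array
{
    return array_diff_key($prompts, array_flip($blockedIndices));
}

// Stand-in data; in real code $blocked would come from $batchResult->getBlockedIndices().
$prompts = [
    ['messages' => [['role' => 'user', 'content' => 'First question']]],
    ['messages' => [['role' => 'user', 'content' => 'Second question']]],
    ['messages' => [['role' => 'user', 'content' => 'Third question']]],
];
$blocked = [1];

$safePrompts = withoutBlocked($prompts, $blocked);
echo implode(', ', array_keys($safePrompts)), "\n"; // prints "0, 2"
```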

Validation Result

The ValidationResult object contains:

Property	Type	Description
`safe`	`bool`	Whether the prompt is safe
`blocked`	`bool`	Whether the prompt was blocked
`riskScore`	`float`	Risk score from 0.0 to 1.0
`processingTimeMs`	`int`	Processing time in milliseconds
`matches`	`ValidationMatch[]`	List of detected injection patterns

Each ValidationMatch contains:

category: Type of injection (e.g., "system_prompt_override", "jailbreak")
pattern: The pattern that was matched
severity: Severity level ("low", "medium", "high", "critical")
matchedText: The text that matched (optional)
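
Because the severity levels are ordered, it can be useful to reduce a list of matches to its worst level, for example to decide between logging and alerting. This is a minimal sketch, assuming severity is readable as a public string property as listed above. The helper name highestSeverity is hypothetical, and the matches are plain stand-in objects rather than real ValidationMatch instances.

```php
<?php

/**
 * Return the most severe level found in a list of matches, or null if empty.
 * Each match only needs a public string property named `severity`.
 */
function highestSeverity(iterable $matches): ?string
{
    // Rank the documented levels from least to most severe.
    $rank = ['low' => 0, 'medium' => 1, 'high' => 2, 'critical' => 3];
    $worst = null;

    foreach ($matches as $match) {
        $level = $match->severity;
        if (!isset($rank[$level])) {
            continue; // Skip levels this sketch does not know about.
        }
        if ($worst === null || $rank[$level] > $rank[$worst]) {
            $worst = $level;
        }
    }

    return $worst;
}

// Stand-in objects; real code would pass $result->matches.
$matches = [
    (object) ['category' => 'jailbreak', 'severity' => 'medium'],
    (object) ['category' => 'system_prompt_override', 'severity' => 'critical'],
];

echo highestSeverity($matches), "\n"; // prints "critical"
```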

Error Handling

<?php

use Glitchward\Shield\ShieldClient;
use Glitchward\Shield\Exceptions\ShieldException;
use Glitchward\Shield\Exceptions\ShieldAuthException;
use Glitchward\Shield\Exceptions\ShieldRateLimitException;
use Glitchward\Shield\Exceptions\ShieldValidationException;

$shield = new ShieldClient('your_api_token');

try {
    $result = $shield->validate($messages);
} catch (ShieldAuthException $e) {
    echo "Invalid API token\n";
} catch (ShieldRateLimitException $e) {
    echo "Rate limited. Retry after: {$e->getRetryAfter()}s\n";
} catch (ShieldValidationException $e) {
    echo "Invalid request: {$e->getMessage()}\n";
} catch (ShieldException $e) {
    echo "API error: {$e->getMessage()}\n";
}
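
The value from getRetryAfter() can drive a simple retry loop. The following is a sketch, not part of the client: callWithRetry and FakeRateLimit are hypothetical names, and the helper detects a rate-limit error by checking for a getRetryAfter() method so that the snippet runs without the package installed. With the real client you would catch ShieldRateLimitException directly. It also assumes getRetryAfter() returns a number of seconds, as the message above suggests.

```php
<?php

/**
 * Call $operation, retrying when it throws an exception that exposes
 * getRetryAfter(). Any other exception is rethrown immediately.
 *
 * @param callable      $operation   e.g. fn () => $shield->validate($messages)
 * @param int           $maxAttempts Total attempts, including the first one
 * @param callable|null $sleep       Receives the wait in seconds; injectable for tests
 */
function callWithRetry(callable $operation, int $maxAttempts = 3, ?callable $sleep = null): mixed
{
    $sleep ??= static fn (int $seconds) => sleep($seconds);

    for ($attempt = 1; ; $attempt++) {
        try {
            return $operation();
        } catch (Throwable $e) {
            // Stand-in check; with the client, catch ShieldRateLimitException instead.
            $rateLimited = method_exists($e, 'getRetryAfter');
            if (!$rateLimited || $attempt >= $maxAttempts) {
                throw $e;
            }
            // Wait at least one second before the next attempt.
            $sleep(max(1, (int) $e->getRetryAfter()));
        }
    }
}

// Hypothetical stand-in for the client's rate-limit exception.
final class FakeRateLimit extends RuntimeException
{
    public function getRetryAfter(): int
    {
        return 2;
    }
}

$calls = 0;
$waits = [];
$result = callWithRetry(
    function () use (&$calls) {
        // Fail twice, then succeed.
        if (++$calls < 3) {
            throw new FakeRateLimit('rate limited');
        }
        return 'ok';
    },
    maxAttempts: 3,
    sleep: function (int $seconds) use (&$waits) {
        $waits[] = $seconds; // Record the wait instead of sleeping.
    }
);

echo "{$result} after {$calls} calls\n"; // prints "ok after 3 calls"
```

Injecting the sleep callable keeps the example fast to run and makes the wait times easy to inspect.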

Configuration

<?php

use Glitchward\Shield\ShieldClient;

// Custom configuration
$shield = new ShieldClient(
    apiToken: 'your_api_token',
    baseUrl: 'https://custom.endpoint.com/api/shield', // Optional
    timeout: 30 // Request timeout in seconds
);

Environment Variables

GLITCHWARD_SHIELD_TOKEN="your_api_token"

<?php

use Glitchward\Shield\ShieldClient;

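// getenv() returns false when the variable is unset, so fail early
$token = getenv('GLITCHWARD_SHIELD_TOKEN')
    ?: throw new RuntimeException('GLITCHWARD_SHIELD_TOKEN is not set');

$shield = new ShieldClient($token);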
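
The Laravel examples above read the token with config('services.shield.token'). One way to wire that key to the environment variable is an entry in config/services.php, sketched below. The file location and the env() helper follow standard Laravel conventions; the 'shield' key is taken from the examples rather than from any published package configuration.

```php
<?php

// config/services.php (Laravel)
return [
    // Other service credentials go here.

    'shield' => [
        'token' => env('GLITCHWARD_SHIELD_TOKEN'),
    ],
];
```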

Detection Categories

Shield detects these types of attacks:

System Prompt Override - Attempts to ignore or override system instructions
Jailbreak Attempts - DAN prompts, character roleplay exploits
Data Exfiltration - Attempts to extract system prompts or sensitive data
Encoding Attacks - Base64, hex, unicode obfuscation
Invisible Characters - Zero-width chars, control characters
Code Injection - SQL, shell commands, script injection

Rate Limits

API Rate Limit: 100 requests per minute
Batch Size Limit: 100 items per batch request
Monthly Limit: Based on your plan
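
Given the batch size limit, larger sets of prompts need to be split before calling validateBatch(). The sketch below uses PHP's array_chunk for this. The helper name chunkForBatch and the constant SHIELD_MAX_BATCH are hypothetical, and the prompts are generated stand-in data.

```php
<?php

const SHIELD_MAX_BATCH = 100; // Batch size limit stated above

/**
 * Split prompts into batches no larger than $size.
 * Keys are reset inside each chunk, so chunk $c, position $i
 * corresponds to original index $c * $size + $i.
 */
function chunkForBatch(array $prompts, int $size = SHIELD_MAX_BATCH): array
{
    return array_chunk(array_values($prompts), $size);
}

// Build 250 stand-in prompts.
$prompts = [];
for ($n = 0; $n < 250; $n++) {
    $prompts[] = ['messages' => [['role' => 'user', 'content' => "Question {$n}"]]];
}

$chunks = chunkForBatch($prompts);
echo count($chunks), " batches: ", implode(', ', array_map('count', $chunks)), "\n";
// prints "3 batches: 100, 100, 50"

// Each chunk would then be passed to $shield->validateBatch($chunk).
```

Each batch call still counts against the per-minute request limit, so very large sets may also need pacing between calls.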

License

MIT License - see LICENSE for details.
