README

A composable input/output safety layer for PHP applications that call LLMs. Redact PII before it leaves your server, catch prompt-injection and jailbreak attempts, scan for leaked secrets, and moderate model output — all in one pipeline.

Why this exists (the 2026 gap)

The PHP AI ecosystem grew up fast in early 2026 — Prism PHP, Neuron AI, and the official Laravel AI SDK now cover the happy path: making calls, RAG, agents, and structured output. What none of them own is the safety boundary.

In the Python world this is a solved, well-funded problem: LiteLLM guardrails and pydantic-ai-shields ship PII filtering, prompt-injection defense, and secret redaction out of the box. PHP teams shipping AI to production in 2026 have no native equivalent — they either roll fragile regexes by hand or proxy everything through a Python sidecar.

Ykachala Guardrails closes that gap with a framework-agnostic, provider-agnostic safety pipeline that sits between your app and any LLM client.

What it does

| Guard | Stage | What it catches |
| --- | --- | --- |
| `PiiGuard` | input + output | Emails, phones, SSNs, credit cards, IBANs, IPs; reversibly tokenized so you can restore them after the round-trip |
| `InjectionGuard` | input | "Ignore previous instructions", role-override, delimiter-escape, and encoded jailbreak patterns (heuristic + optional LLM classifier) |
| `SecretGuard` | input + output | API keys, bearer tokens, private keys, `.env`-shaped values |
| `ModerationGuard` | output | Toxicity, banned topics, configurable category thresholds |
| `SchemaGuard` | output | Rejects responses that break an expected shape before they reach users |
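
To make the heuristic side of `InjectionGuard` more concrete, here is a minimal sketch of pattern-based detection that also looks inside base64-encoded runs. It is not the package's implementation: the function name `sketchDetectInjection`, the rule names, and the three patterns are all assumptions chosen for illustration, and a real rule pack would be far larger.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical heuristic check, NOT the package's InjectionGuard.
 * Returns the name of the first rule that matches, or null if none does.
 */
function sketchDetectInjection(string $text): ?string
{
    // Illustrative patterns only (assumed, not taken from the package).
    $rules = [
        'ignore_instructions' => '/\bignore\s+(all\s+)?(previous|prior|above)\s+instructions\b/i',
        'role_override'       => '/\byou\s+are\s+now\s+(a|an|the)\b/i',
        'delimiter_escape'    => '/<\/?(system|assistant)>|\[\/?INST\]/i',
    ];

    // Scan the raw text plus the decoded form of any base64-looking run,
    // so that simple encoded payloads are checked against the same rules.
    $candidates = [$text];
    if (preg_match_all('/[A-Za-z0-9+\/]{24,}={0,2}/', $text, $matches)) {
        foreach ($matches[0] as $blob) {
            $decoded = base64_decode($blob, true);
            if ($decoded !== false) {
                $candidates[] = $decoded;
            }
        }
    }

    foreach ($candidates as $candidate) {
        foreach ($rules as $name => $pattern) {
            if (preg_match($pattern, $candidate) === 1) {
                return $name;
            }
        }
    }

    return null;
}
```

Heuristics like these are cheap to run on every request but easy to evade, which is presumably why the table lists an optional LLM classifier alongside them.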

Each guard returns a verdict (allow / redact / block) with a reason, so you decide the policy: fail closed, fail open, or sanitize-and-continue.
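
The difference between the three policies is easiest to see in code. The sketch below assumes PHP 8.1+ and uses made-up names (`SketchAction`, `SketchVerdict`, `sketchApplyPolicy`); the package's real `Verdict` class is not documented here and may look quite different.

```php
<?php
declare(strict_types=1);

// Hypothetical names throughout; not the package's Verdict API.
enum SketchAction: string
{
    case Allow = 'allow';
    case Redact = 'redact';
    case Block = 'block';
}

final class SketchVerdict
{
    public function __construct(
        public readonly SketchAction $action,
        public readonly string $text,          // original or already-redacted text
        public readonly ?string $reason = null,
    ) {}
}

/**
 * Applies a caller-chosen policy to a verdict.
 *   $failClosed = true  -> a block verdict throws, so nothing is sent.
 *   $failClosed = false -> a block verdict is logged and the text passes through.
 * Allow and redact verdicts always continue with the verdict's text
 * (redact is the "sanitize-and-continue" path).
 */
function sketchApplyPolicy(SketchVerdict $verdict, bool $failClosed, callable $log): string
{
    if ($verdict->action !== SketchAction::Block) {
        return $verdict->text;
    }

    $reason = $verdict->reason ?? 'no reason given';
    if ($failClosed) {
        throw new RuntimeException('Blocked: ' . $reason);
    }

    $log('guard blocked, but policy is fail-open: ' . $reason);
    return $verdict->text;
}
```

Failing closed is usually the safer default for input guards; failing open can make sense while tuning thresholds, as long as the blocked verdicts are logged and reviewed.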

Install

composer require ykachala/guardrails

Quick start

use Ykachala\Guardrails\Pipeline;
use Ykachala\Guardrails\Guard\PiiGuard;
use Ykachala\Guardrails\Guard\InjectionGuard;
use Ykachala\Guardrails\Guard\SecretGuard;

$pipeline = Pipeline::make()
    ->input(new InjectionGuard(action: 'block'))
    ->input(new SecretGuard(action: 'block'))
    ->input(new PiiGuard(action: 'redact'))   // reversible
    ->output(new PiiGuard(action: 'restore')); // put real values back

// 1. Sanitize what you send to the model
$safe = $pipeline->inbound($userMessage);
if ($safe->blocked()) {
    // UnsafeInputException is not imported above; import it or substitute your own exception class.
    throw new UnsafeInputException($safe->reason());
}

$response = $yourLlmClient->chat($safe->text());

// 2. Sanitize what you show the user
$clean = $pipeline->outbound($response, context: $safe);
echo $clean->text();

Reversible PII tokenization

// "Email john@acme.com about order 4111-1111-1111-1111"
$safe = $pipeline->inbound($text);
// -> "Email <PII_EMAIL_1> about order <PII_CC_1>"   (sent to the model)

$restored = $pipeline->outbound($modelReply, context: $safe);
// model's "<PII_EMAIL_1>" placeholders are swapped back to john@acme.com
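
The round-trip above depends on keeping a map from placeholder to original value. As a rough illustration of that idea, and not of how `PiiGuard` is actually built, here is a sketch that handles emails only. The class name `SketchEmailTokenizer` and the simplified email pattern are assumptions; the placeholder format is copied from the example above.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical email-only tokenizer; NOT the package's PiiGuard.
 */
final class SketchEmailTokenizer
{
    /** @var array<string, string> placeholder => original value */
    private array $map = [];

    /** Replaces each email with a numbered placeholder and remembers the mapping. */
    public function redact(string $text): string
    {
        $result = preg_replace_callback(
            // Deliberately simplified email pattern, for illustration only.
            '/[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/',
            function (array $m): string {
                // Reuse the placeholder when the same address appears again.
                $existing = array_search($m[0], $this->map, true);
                if ($existing !== false) {
                    return $existing;
                }
                $token = '<PII_EMAIL_' . (count($this->map) + 1) . '>';
                $this->map[$token] = $m[0];
                return $token;
            },
            $text
        );

        // preg_replace_callback returns null on a regex engine error.
        return $result ?? $text;
    }

    /** Swaps every known placeholder back to its original value. */
    public function restore(string $text): string
    {
        return strtr($text, $this->map);
    }
}

$tokenizer = new SketchEmailTokenizer();
$sent = $tokenizer->redact('Email john@acme.com about the invoice');
// $sent is "Email <PII_EMAIL_1> about the invoice"
$shown = $tokenizer->restore('I will write to <PII_EMAIL_1> today.');
// $shown is "I will write to john@acme.com today."
```

In this sketch the map lives on the object for a single request. Anything longer-lived, such as the "reversible tokenization store" on the roadmap, would need to treat the map itself as sensitive data.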

Framework bridges

Laravel: `GuardrailsServiceProvider`, config publish, and a `guarded()` macro on the Prism/Laravel-AI pending request.
Symfony: a `GuardrailsBundle` with a tagged-service guard registry and middleware.
Vanilla: the `Pipeline` is pure PHP with zero framework deps.

Architecture

src/
├── Pipeline.php            # orchestrates inbound/outbound passes
├── Verdict.php             # allow | redact | block + reason + redaction map
├── Guard/
│   ├── GuardInterface.php
│   ├── PiiGuard.php
│   ├── InjectionGuard.php
│   ├── SecretGuard.php
│   └── ModerationGuard.php
├── Detector/               # low-level detectors (regex + pluggable ML)
└── Bridge/                 # Laravel + Symfony integration
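
One way the pieces in this layout could fit together is for the pipeline to run guards in order, carry redactions forward, and stop at the first block. The sketch below shows that flow under assumed names (`SketchResult`, `SketchGuard`, `SketchPipeline`, and the `inspect()` method); the real `GuardInterface` and `Verdict` signatures are not shown in this README.

```php
<?php
declare(strict_types=1);

// Hypothetical shapes; NOT the package's GuardInterface, Verdict, or Pipeline.
final class SketchResult
{
    public function __construct(
        public readonly string $text,
        public readonly bool $blocked = false,
        public readonly ?string $reason = null,
    ) {}
}

interface SketchGuard
{
    /** Returns the (possibly redacted) text, or a blocked result with a reason. */
    public function inspect(string $text): SketchResult;
}

final class SketchPipeline
{
    /** @param list<SketchGuard> $guards run in the order given */
    public function __construct(private array $guards) {}

    public function run(string $text): SketchResult
    {
        foreach ($this->guards as $guard) {
            $result = $guard->inspect($text);
            if ($result->blocked) {
                // Short-circuit: later guards never see blocked text.
                return $result;
            }
            // Pass any redactions on to the next guard.
            $text = $result->text;
        }

        return new SketchResult($text);
    }
}
```

With this ordering, placing blocking guards before redacting ones (as the quick start does) means a rejected message is never tokenized at all.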

Roadmap

Core pipeline + verdict model
Regex detector pack (PII, secrets) with locale-aware rules
Heuristic injection detector + optional LLM classifier adapter
Reversible tokenization store
Laravel & Symfony bridges
Pluggable model-based detectors (toxicity, semantic PII)

See CLAUDE.md for the full phase plan and conventions.

License

MIT
