ykachala / guardrails
Input/output safety layer for PHP LLM apps: PII redaction, prompt-injection detection, secret scanning, and content moderation as a composable pipeline.
Requires
- php: >=8.3
- psr/log: ^3.0
Requires (Dev)
- friendsofphp/php-cs-fixer: ^3.0
- phpstan/phpstan: ^2.0
- phpunit/phpunit: ^11.0
Suggests
- guzzlehttp/guzzle: Required only if you enable the optional LLM-based injection classifier
- ykachala/meter: Track the token/cost overhead added by LLM-based guards
This package is auto-updated.
Last update: 2026-06-02 09:06:51 UTC
README
A composable input/output safety layer for PHP applications that call LLMs. Redact PII before it leaves your server, catch prompt-injection and jailbreak attempts, scan for leaked secrets, and moderate model output — all in one pipeline.
Why this exists (the 2026 gap)
The PHP AI ecosystem grew up fast in early 2026 — Prism PHP, Neuron AI, and the official Laravel AI SDK now cover the happy path: making calls, RAG, agents, and structured output. What none of them own is the safety boundary.
In the Python world this is a solved, well-funded problem: LiteLLM guardrails and pydantic-ai-shields ship PII filtering, prompt-injection defense, and secret redaction out of the box. PHP teams shipping AI to production in 2026 have no native equivalent — they either roll fragile regexes by hand or proxy everything through a Python sidecar.
Ykachala Guardrails closes that gap with a framework-agnostic, provider-agnostic safety pipeline that sits between your app and any LLM client.
What it does
| Guard | Stage | What it catches |
|---|---|---|
PiiGuard |
input + output | Emails, phones, SSNs, credit cards, IBANs, IPs — reversibly tokenized so you can restore them after the round-trip |
InjectionGuard |
input | "Ignore previous instructions", role-override, delimiter-escape, and encoded jailbreak patterns (heuristic + optional LLM classifier) |
SecretGuard |
input + output | API keys, bearer tokens, private keys, .env-shaped values |
ModerationGuard |
output | Toxicity, banned topics, configurable category thresholds |
SchemaGuard |
output | Rejects responses that break an expected shape before they reach users |
Each guard returns a verdict (allow / redact / block) with a reason, so you decide
the policy: fail closed, fail open, or sanitize-and-continue.
Install
composer require ykachala/guardrails
Quick start
use Ykachala\Guardrails\Pipeline; use Ykachala\Guardrails\Guard\PiiGuard; use Ykachala\Guardrails\Guard\InjectionGuard; use Ykachala\Guardrails\Guard\SecretGuard; $pipeline = Pipeline::make() ->input(new InjectionGuard(action: 'block')) ->input(new SecretGuard(action: 'block')) ->input(new PiiGuard(action: 'redact')) // reversible ->output(new PiiGuard(action: 'restore')); // put real values back // 1. Sanitize what you send to the model $safe = $pipeline->inbound($userMessage); if ($safe->blocked()) { throw new UnsafeInputException($safe->reason()); } $response = $yourLlmClient->chat($safe->text()); // 2. Sanitize what you show the user $clean = $pipeline->outbound($response, context: $safe); echo $clean->text();
Reversible PII tokenization
// "Email john@acme.com about order 4111-1111-1111-1111" $safe = $pipeline->inbound($text); // -> "Email <PII_EMAIL_1> about order <PII_CC_1>" (sent to the model) $restored = $pipeline->outbound($modelReply, context: $safe); // model's "<PII_EMAIL_1>" placeholders are swapped back to john@acme.com
Framework bridges
- Laravel —
GuardrailsServiceProvider, config publish, and aguarded()macro on the Prism/Laravel-AI pending request. - Symfony — a
GuardrailsBundlewith a tagged-service guard registry and middleware. - Vanilla — the
Pipelineis pure PHP with zero framework deps.
Architecture
src/
├── Pipeline.php # orchestrates inbound/outbound passes
├── Verdict.php # allow | redact | block + reason + redaction map
├── Guard/
│ ├── GuardInterface.php
│ ├── PiiGuard.php
│ ├── InjectionGuard.php
│ ├── SecretGuard.php
│ └── ModerationGuard.php
├── Detector/ # low-level detectors (regex + pluggable ML)
└── Bridge/ # Laravel + Symfony integration
Roadmap
- Core pipeline + verdict model
- Regex detector pack (PII, secrets) with locale-aware rules
- Heuristic injection detector + optional LLM classifier adapter
- Reversible tokenization store
- Laravel & Symfony bridges
- Pluggable model-based detectors (toxicity, semantic PII)
See CLAUDE.md for the full phase plan and conventions.
License
MIT