sellinnate / rag-engine
Enterprise Retrieval-Augmented Generation engine for Laravel: ingestion, parsing, chunking, embedding, vector store, retrieval, reranking, BYOK security and multi-tenancy.
Requires
- php: ^8.2
- illuminate/contracts: ^11.0||^12.0||^13.0
- spatie/laravel-package-tools: ^1.16
Requires (Dev)
- aws/aws-sdk-php: ^3.300
- larastan/larastan: ^3.0
- laravel/pint: ^1.14
- nunomaduro/collision: ^8.8
- orchestra/testbench: ^11.0.0||^10.0.0||^9.0.0
- pestphp/pest: ^4.0
- pestphp/pest-plugin-arch: ^4.0
- pestphp/pest-plugin-laravel: ^4.0
- phpstan/extension-installer: ^1.4
- phpstan/phpstan-deprecation-rules: ^2.0
- phpstan/phpstan-phpunit: ^2.0
- smalot/pdfparser: ^2.0
Suggests
- aws/aws-sdk-php: Use AWS KMS as the BYOK key-management driver (kms.aws)
- smalot/pdfparser: Parse text-based PDF documents
README
RAG Engine for Laravel
Full documentation: laravel-rag-engine.selli.io
Add semantic search and AI answers over your own content to any Laravel app. RAG Engine owns the whole Retrieval-Augmented Generation pipeline: ingesting documents, splitting them, turning them into searchable vectors, and retrieving the most relevant passages for any query. Writing a final answer with an LLM is an optional layer on top.
Infrastructure, not a feature. The engine owns ingestion → retrieval; generation is optional and decoupled. Vertical packages, internal agents and search modules build on top without re-implementing ingestion, chunking, embedding or retrieval.
What you can build
- Semantic search: a search box that matches by meaning, not keywords.
- AI Q&A / chatbots: LLM answers grounded in your content, with citations.
- "Ask your docs / tickets / wiki" features inside an existing app.
- Similarity / recommendations: "find records like this one".
Use just the search half (no LLM, no AI bill) or add generation later: same code, one config switch.
Table of contents
- Features
- Requirements
- Installation
- Quick start
- Indexing Eloquent models
- Asking questions with an LLM
- Configuration
- Supported drivers
- Security & multi-tenancy
- Documentation
- Testing & development
- License
Features
- Multi-format ingestion: raw text, file uploads, URLs (SSRF-guarded), cloud storage and Eloquent records. Safely parses Markdown, HTML, XML, CSV, JSON, DOCX and PDF.
- Pluggable everything: parsing, chunking, embedding, vector store, reranking and LLM are swappable drivers behind stable contracts.
- 10 embedding providers: OpenAI, Azure OpenAI, Mistral, Jina, Voyage, Cohere, Gemini, Hugging Face, Ollama, plus a deterministic fake driver for tests.
- Powerful retrieval: metadata filters, hybrid (semantic + keyword) search with RRF, MMR diversification, reranking, relevance thresholds and small-to-big (parent-child) context expansion.
- Embeddable Eloquent models: make any model searchable via one contract; recursive composition of relations, auto-sync on change, and vector-to-model trace-back.
- Security by design: BYOK envelope encryption with a KMS abstraction (local + AWS KMS), crypto-shredding for "right to erasure", and PII redaction on by default.
- OCR for scanned PDFs: pluggable OCR (Tesseract) kicks in when a PDF has no text layer.
- Quality evaluation: measure recall@k, precision@k, hit-rate and MRR over a labelled dataset (rag:evaluate).
- Resilient providers: LLM, reranker and embedder HTTP calls retry transient failures with exponential backoff.
- Multi-tenancy: automatic, fail-closed per-tenant scoping of every query.
- Operations from day one: immutable (WORM) audit log, cost tracking, lifecycle events, queued/batchable ingestion and Artisan commands.
- EU-resident by default: content and embeddings stay in the EU unless you explicitly opt into a non-EU provider.
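The evaluation metrics named in the feature list (recall@k, precision@k, MRR) have standard definitions, and seeing them computed by hand can help when reading the output of rag:evaluate. The following is a minimal sketch of those definitions for a single query, assuming each result and each label is a unique chunk id; the function names are hypothetical and this is not the package's own implementation.

```php
<?php

declare(strict_types=1);

// Hypothetical helpers for illustration only; not part of the package API.

/**
 * recall@k: share of the relevant ids that appear in the top-k results.
 *
 * @param list<string> $ranked   result ids, best first (assumed unique)
 * @param list<string> $relevant ids labelled relevant for the query
 */
function recallAtK(array $ranked, array $relevant, int $k): float
{
    $relevant = array_unique($relevant);
    if ($relevant === [] || $k <= 0) {
        return 0.0;
    }
    $topK = array_slice($ranked, 0, $k);

    return count(array_intersect($relevant, $topK)) / count($relevant);
}

/** precision@k: share of the top-k results that are relevant. */
function precisionAtK(array $ranked, array $relevant, int $k): float
{
    if ($k <= 0) {
        return 0.0;
    }
    $topK = array_slice($ranked, 0, $k);

    return count(array_intersect($topK, $relevant)) / $k;
}

/** Reciprocal rank: 1 / position of the first relevant result, or 0 if none. */
function reciprocalRank(array $ranked, array $relevant): float
{
    foreach (array_values($ranked) as $index => $id) {
        if (in_array($id, $relevant, true)) {
            return 1.0 / ($index + 1);
        }
    }

    return 0.0;
}

$ranked = ['c3', 'c1', 'c9'];   // what the search returned
$relevant = ['c1', 'c7'];       // what the labelled dataset expects

$recall = recallAtK($ranked, $relevant, 3);
$precision = precisionAtK($ranked, $relevant, 3);
$rr = reciprocalRank($ranked, $relevant);
```

MRR is the mean of the reciprocal rank over all queries in the dataset, and hit-rate is the share of queries with at least one relevant id in the top k.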
Requirements
| Requirement | Version |
|---|---|
| PHP | 8.2+ |
| Laravel | 11, 12 or 13 |
| A database | any Laravel-supported (SQLite is fine to start) |
A dedicated vector database is not required to begin: the default store is
in-memory, and the database store works on plain Postgres/MySQL/SQLite. Use
native pgvector or Qdrant at larger scale.
Installation
    composer require sellinnate/rag-engine
    php artisan vendor:publish --tag="rag-engine-config"
    php artisan vendor:publish --tag="rag-engine-migrations"
    php artisan migrate
The service provider and Rag facade auto-register via package discovery. Out of
the box the package uses zero-network, deterministic drivers (fake embedder,
in-memory store, local KMS) so your test suite runs offline.
Important
The fake embedder is for tests only: it doesn't understand meaning. For
a real search feature, configure a real embedder (see
Configuration).
Quick start
    use Sellinnate\RagEngine\Facades\Rag;

    // 1. INGEST: register content as a Document (stored & encrypted, not yet searchable).
    $document = Rag::ingest(
        Rag::source()->text('Refunds are issued within 14 business days of an approved request.')
    );

    // 2. PROCESS: run the pipeline: parse → clean & redact PII → chunk → embed → store.
    Rag::process($document); // or: ProcessDocumentJob::dispatch(...) on a queue

    // 3. SEARCH: find the most relevant chunks by meaning.
    $hits = Rag::search('how long until I get my money back?')->topK(3)->get();

    $hits[0]->content;                // "Refunds are issued within 14 business days..."
    $hits[0]->score;                  // relevance score
    $hits[0]->metadata['source_ref']; // provenance: where it came from

    // 4. (OPTIONAL) ASK: let an LLM write a cited answer from the retrieved chunks.
    $answer = Rag::ask('how long do refunds take?')->using('openai')->generate();

    $answer->answer;    // "Refunds take 14 business days. [1]"
    $answer->citations; // [['index' => 1, 'document_id' => '…', 'chunk_id' => '…']]
Refine retrieval fluently:
    $hits = Rag::search('envelope encryption')
        ->topK(5)
        ->threshold(0.4)      // drop weak matches
        ->where('tag', 'docs') // metadata filter
        ->hybrid()            // semantic + keyword (RRF)
        ->rerank()            // precision pass
        ->expandParents()     // small-to-big context
        ->get();
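The hybrid() step above merges a semantic ranking and a keyword ranking with RRF (Reciprocal Rank Fusion). As a rough sketch of the general technique, not of the package's internals: each id scores the sum of 1 / (k + rank) over every list it appears in. The function name is hypothetical, and k = 60 is the constant commonly used in the literature; whether the package uses the same value is an assumption.

```php
<?php

declare(strict_types=1);

/**
 * Reciprocal Rank Fusion over any number of ranked id lists.
 * Hypothetical helper for illustration; not part of the package API.
 *
 * @param list<list<string>> $rankings each inner list is ordered best first
 * @return array<string, float> fused score per id, highest first
 */
function reciprocalRankFusion(array $rankings, int $k = 60): array
{
    $scores = [];

    foreach ($rankings as $ranking) {
        foreach (array_values($ranking) as $position => $id) {
            // $position is 0-based, so the 1-based rank is $position + 1.
            $scores[$id] = ($scores[$id] ?? 0.0) + 1.0 / ($k + $position + 1);
        }
    }

    arsort($scores); // sort by score, descending, keeping ids as keys

    return $scores;
}

$semantic = ['a', 'b', 'c']; // e.g. vector-similarity order
$keyword = ['b', 'd', 'a'];  // e.g. full-text order

$fused = reciprocalRankFusion([$semantic, $keyword]);
$order = array_keys($fused); // ['b', 'a', 'd', 'c']
```

Ids that rank well in both lists (here b and a) rise above ids that appear in only one, which is why RRF needs no score normalisation between the two retrievers.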
Indexing Eloquent models
If the content you want to search already lives in your database, make the model embeddable: it then stays in sync automatically as rows change, and every vector traces back to its model.
    use Illuminate\Database\Eloquent\Model;
    use Sellinnate\RagEngine\Concerns\HasEmbeddings;
    use Sellinnate\RagEngine\Contracts\Embeddable;
    use Sellinnate\RagEngine\Eloquent\EmbeddableDefinition;

    class Article extends Model implements Embeddable
    {
        use HasEmbeddings; // auto-indexes on save, removes on delete

        public function toEmbeddable(): EmbeddableDefinition
        {
            return EmbeddableDefinition::make()
                ->add('Title', $this->title)
                ->add('Body', $this->body)
                ->include($this->author, 'author')           // compose a related model
                ->includeMany($this->comments, 'comments');  // recursively
        }
    }

    // Trace a search hit back to its model:
    $article = Rag::models()->resolve($hits[0]); // App\Models\Article instance, or null
Model file fields (a PDF/DOCX upload) can be embedded too: addFile() parses
the file to text and folds it into the model's embedding; non-embeddable binaries
(zip/exe…) are skipped or rejected per policy. See
docs/concepts/eloquent-models.md.
Asking questions with an LLM
Search returns the relevant chunks; an LLM turns them into a written, cited
answer via Rag::ask(). This layer is optional and decoupled: with the default
null driver, ask() returns the sources with an empty answer, so search-only
apps carry no LLM dependency.
The package ships two LLM drivers: anthropic (Claude) and openai
(OpenAI and any OpenAI-compatible API: Mistral, Ollama, Groq, OpenRouter…).
    # .env: use Anthropic Claude to answer questions
    RAG_LLM=anthropic
    RAG_ANTHROPIC_API_KEY=sk-ant-...
    RAG_ANTHROPIC_MODEL=claude-sonnet-4-6 # or claude-opus-4-8 / claude-haiku-4-5-...

    $result = Rag::ask('What is our refund policy?')
        ->topK(5)
        ->using('anthropic') // or omit to use the default RAG_LLM
        ->generate();

    $result->answer;    // "Refunds are issued within 14 business days. [1]"
    $result->citations; // [['index' => 1, 'document_id' => '…', 'chunk_id' => '…']]
    $result->sources;   // the SearchHits the answer was built from
Note
Anthropic has no embedding API, so anthropic is a generation-only
driver. Keep a real RAG_EMBEDDER (Mistral, OpenAI, Ollama…) for the search
side. A common combo is Mistral/Ollama embeddings + Claude answers.
Retrieved content is treated as untrusted: the default prompt fences it and tells the model not to follow instructions inside it (prompt-injection hardening). Full guide: docs/concepts/generation.md.
Configuration
Configuration lives in config/rag-engine.php and works like Laravel's
config/database.php: you define named connections per subsystem and pick a
default. Switching provider = changing one name in .env.
    # .env: switch to a real embedder (Ollama is free & local)
    RAG_EMBEDDER=ollama
    RAG_OLLAMA_BASE_URL=http://localhost:11434

    # ...or a hosted provider:
    RAG_EMBEDDER=openai
    RAG_OPENAI_API_KEY=sk-...
Note
API keys go in .env, never in the committed config. A copy-ready list of
every variable ships as .env.example. See
docs/getting-started/configuration.md.
Supported drivers
Embedders (RAG_EMBEDDER)
| Driver | Provider | Residency |
|---|---|---|
| openai | OpenAI | global |
| azure-openai | Azure OpenAI | EU (EU region) |
| mistral | Mistral | EU |
| jina | Jina AI | EU |
| voyage | Voyage AI | global |
| cohere | Cohere | global |
| gemini | Google Gemini | global |
| huggingface | Hugging Face / self-hosted TEI | global / self-host |
| ollama | Ollama (BGE/E5/Nomic) | self-hosted |
| fake | deterministic (tests) | local |
Vector stores (RAG_VECTOR_STORE): memory (tests/dev) · database
(portable SQL: Postgres/MySQL/SQLite, brute-force) · pgvector (native
Postgres ANN: vector column + HNSW + <=>) · qdrant (EU self-hostable, ANN at
scale). Full setup, including where to configure the Postgres connection, is
in the Vector stores guide.
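To make the difference between the stores concrete: a brute-force store compares the query vector with every stored vector, while pgvector's <=> operator computes cosine distance (1 minus cosine similarity) and lets an HNSW index avoid the full scan. The sketch below shows what a brute-force cosine search amounts to in plain PHP. It is an illustration under assumptions, with hypothetical function names, and not the code of the database driver.

```php
<?php

declare(strict_types=1);

/**
 * Cosine similarity of two equal-length vectors (1.0 = same direction).
 * Hypothetical helper for illustration; not part of the package API.
 *
 * @param list<float> $a
 * @param list<float> $b
 */
function cosineSimilarity(array $a, array $b): float
{
    $a = array_values($a);
    $b = array_values($b);

    if ($a === [] || count($a) !== count($b)) {
        throw new InvalidArgumentException('Vectors must be non-empty and of equal length.');
    }

    $dot = 0.0;
    $normA = 0.0;
    $normB = 0.0;

    foreach ($a as $i => $value) {
        $dot += $value * $b[$i];
        $normA += $value * $value;
        $normB += $b[$i] * $b[$i];
    }

    if ($normA === 0.0 || $normB === 0.0) {
        return 0.0; // a zero vector has no direction; treat it as unrelated
    }

    return $dot / (sqrt($normA) * sqrt($normB));
}

/**
 * Brute-force top-k: score every stored vector, then keep the k best.
 *
 * @param list<float> $query
 * @param array<string, list<float>> $vectors id => embedding
 * @return array<string, float> id => similarity, best first
 */
function bruteForceTopK(array $query, array $vectors, int $k): array
{
    $scores = [];
    foreach ($vectors as $id => $vector) {
        $scores[$id] = cosineSimilarity($query, $vector);
    }

    arsort($scores);

    return array_slice($scores, 0, max(0, $k), true);
}

$top = bruteForceTopK(
    [1.0, 0.0],
    ['x' => [1.0, 0.0], 'y' => [0.0, 1.0], 'z' => [1.0, 1.0]],
    2
);
```

The cost grows linearly with the number of stored vectors, which is why an ANN index becomes worthwhile at larger scale.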
LLMs (RAG_LLM, for ask()): anthropic (Claude) · openai (OpenAI and
any OpenAI-compatible API: Mistral, Ollama, Groq, OpenRouter…) · null/fake.
Anthropic is generation-only (no embeddings). Answers can be streamed with
Rag::ask(...)->stream().
Rerankers (RAG_RERANKER, optional cross-encoder pass): cohere · jina
(EU) · null/fake.
KMS (RAG_KMS, BYOK key management): local (dev) · aws (AWS KMS,
production).
OCR (RAG_OCR, scanned-PDF fallback): null · tesseract.
Parsers: plain text · Markdown · HTML · XML · CSV/TSV · JSON · DOCX · PDF (+ OCR for scans).
Chunkers: recursive (default) · sentence · markdown · fixed
(char- or token-based), with optional parent-child and contextual headers.
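As an illustration of the simplest strategy in that list, here is a sketch of character-based fixed-size chunking. The overlap parameter is an assumption added for the example (a common way to keep context across chunk boundaries), and the function name is hypothetical; the package's fixed chunker may behave differently.

```php
<?php

declare(strict_types=1);

/**
 * Split text into chunks of at most $size characters, where each chunk
 * repeats the last $overlap characters of the previous one.
 * Hypothetical helper for illustration; not part of the package API.
 *
 * @return list<string>
 */
function chunkFixed(string $text, int $size, int $overlap = 0): array
{
    if ($size <= 0 || $overlap < 0 || $overlap >= $size) {
        throw new InvalidArgumentException('Require size > 0 and 0 <= overlap < size.');
    }

    $chunks = [];
    $length = mb_strlen($text);
    $step = $size - $overlap; // how far the window advances each time

    for ($start = 0; $start < $length; $start += $step) {
        $chunks[] = mb_substr($text, $start, $size);

        if ($start + $size >= $length) {
            break; // this chunk already reached the end of the text
        }
    }

    return $chunks;
}

$chunks = chunkFixed('abcdefghij', 4, 1); // ['abcd', 'defg', 'ghij']
```

Fixed-size splitting ignores sentence and heading boundaries, which is the gap the recursive, sentence and markdown chunkers are meant to close.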
All drivers share one contract: switching backends needs no code changes, and you can register your own (see docs/guides/custom-drivers.md).
Security & multi-tenancy
- BYOK envelope encryption: content is encrypted at rest with per-item DEKs wrapped by a tenant KEK in a KMS; the plaintext key never persists.
- Crypto-shredding: honour "right to erasure" by destroying the key, making data unrecoverable everywhere (including DB backups) at once.
- PII redaction: emails, cards (Luhn), IBANs (mod-97), Italian fiscal codes and phone numbers are redacted before indexing, by default.
- Fail-closed multi-tenancy: every query is automatically scoped to the current tenant; scope can never be widened from a query (a tested invariant).
- Tamper-evident audit log: append-only with database-level WORM triggers.
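The Luhn and mod-97 checks named in the PII bullet are standard checksums. A redactor can presumably use them to confirm that a run of characters is a plausible card number or IBAN before masking it, which reduces false positives on arbitrary digit strings. The sketch below shows both checks in plain PHP; the function names are hypothetical and this is not the package's own redaction code.

```php
<?php

declare(strict_types=1);

/**
 * Luhn check used by payment card numbers.
 * Hypothetical helper for illustration; not part of the package API.
 */
function passesLuhn(string $candidate): bool
{
    $digits = preg_replace('/\D/', '', $candidate) ?? '';
    if (strlen($digits) < 2) {
        return false;
    }

    $sum = 0;
    $double = false; // the rightmost digit is not doubled

    for ($i = strlen($digits) - 1; $i >= 0; $i--) {
        $digit = (int) $digits[$i];
        if ($double) {
            $digit *= 2;
            if ($digit > 9) {
                $digit -= 9; // same as adding the two digits of the product
            }
        }
        $sum += $digit;
        $double = !$double;
    }

    return $sum % 10 === 0;
}

/**
 * ISO 13616 mod-97 check used by IBANs: move the first four characters to
 * the end, map letters to numbers (A = 10 ... Z = 35), expect remainder 1.
 */
function passesIbanMod97(string $candidate): bool
{
    $iban = strtoupper(preg_replace('/\s+/', '', $candidate) ?? '');
    if (preg_match('/^[A-Z]{2}\d{2}[A-Z0-9]{11,30}$/', $iban) !== 1) {
        return false;
    }

    $rearranged = substr($iban, 4) . substr($iban, 0, 4);
    $remainder = 0;

    // Fold the long number into a running remainder to avoid big integers.
    foreach (str_split($rearranged) as $char) {
        $value = ctype_digit($char) ? (int) $char : ord($char) - 55;
        $remainder = ($value < 10 ? $remainder * 10 + $value : $remainder * 100 + $value) % 97;
    }

    return $remainder === 1;
}

$cardLooksReal = passesLuhn('4111 1111 1111 1111');          // widely used test number
$ibanLooksReal = passesIbanMod97('GB82 WEST 1234 5698 7654 32'); // common documentation example
```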
    use Sellinnate\RagEngine\Facades\Rag;

    // Run work scoped to a tenant (previous tenant restored afterwards):
    Rag::forTenant('tenant-7', fn () => Rag::search('q')->get());

    // Right to erasure: crypto-shred a tenant:
    Rag::kms()->destroyKey('tenant-7');
See docs/concepts/security.md and docs/concepts/multi-tenancy.md.
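To show why destroying a key is enough to erase data, here is a minimal sketch of envelope encryption using PHP's OpenSSL extension with AES-256-GCM. It rests on loud assumptions: the function names are hypothetical, the cipher choice is illustrative, and a random 32-byte string stands in for a KEK that would really stay inside a KMS. It is not the package's implementation.

```php
<?php

declare(strict_types=1);

// Illustrative only: hypothetical names, and an in-memory string plays the KEK.

/** @return array{iv: string, tag: string, ciphertext: string} */
function aesGcmSeal(string $plaintext, string $key): array
{
    $iv = random_bytes(12); // 96-bit nonce, fresh for every encryption
    $tag = '';
    $ciphertext = openssl_encrypt($plaintext, 'aes-256-gcm', $key, OPENSSL_RAW_DATA, $iv, $tag);

    if ($ciphertext === false) {
        throw new RuntimeException('Encryption failed.');
    }

    return ['iv' => $iv, 'tag' => $tag, 'ciphertext' => $ciphertext];
}

/** @param array{iv: string, tag: string, ciphertext: string} $box */
function aesGcmOpen(array $box, string $key): string
{
    $plaintext = openssl_decrypt($box['ciphertext'], 'aes-256-gcm', $key, OPENSSL_RAW_DATA, $box['iv'], $box['tag']);

    if ($plaintext === false) {
        throw new RuntimeException('Decryption failed (wrong key or tampered data).');
    }

    return $plaintext;
}

/** Encrypt content with a fresh DEK, then wrap that DEK with the tenant KEK. */
function envelopeEncrypt(string $content, string $kek): array
{
    $dek = random_bytes(32); // per-item data-encryption key, never stored in the clear

    return [
        'content' => aesGcmSeal($content, $dek),
        'wrapped_dek' => aesGcmSeal($dek, $kek),
    ];
}

/** Unwrap the DEK with the KEK, then decrypt the content. */
function envelopeDecrypt(array $record, string $kek): string
{
    $dek = aesGcmOpen($record['wrapped_dek'], $kek);

    return aesGcmOpen($record['content'], $dek);
}

$tenantKek = random_bytes(32); // stand-in for a key held by the KMS
$record = envelopeEncrypt('Refunds are issued within 14 business days.', $tenantKek);
$roundTrip = envelopeDecrypt($record, $tenantKek);
```

Every stored record, including copies in backups, holds only ciphertext and a wrapped DEK. Once the KEK is destroyed no DEK can be unwrapped, so all of that tenant's content becomes unreadable at once.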
Documentation
The full documentation is hosted at
laravel-rag-engine.selli.io. The
sources live in docs/ and are built into a static site with
docmd:
    npm install
    npm run docs:dev    # local preview
    npm run docs:build  # static site into ./site
Start here:
- What is RAG? Concepts + glossary, from zero.
- Quickstart: a complete worked example.
- Architecture: how the pieces fit together.
- Ingesting content · Retrieval & search · Generation
- Vector stores & configuration: incl. pgvector Postgres setup.
- Evaluating quality: recall@k, MRR, rag:evaluate.
- Contracts reference · Custom drivers
Testing & development
    composer test     # run the Pest suite (429 tests)
    composer analyse  # PHPStan, level 8
    composer format   # Laravel Pint (code style)

    # Coverage (needs a coverage driver, e.g. Xdebug/PCOV):
    XDEBUG_MODE=coverage vendor/bin/pest --coverage --min=90

Quality gates kept green on every change: 429 tests, PHPStan level 8, Pint clean, ≥90% coverage.
License
MIT. See LICENSE.md.
