sellinnate/rag-engine

Enterprise Retrieval-Augmented Generation engine for Laravel: ingestion, parsing, chunking, embedding, vector store, retrieval, reranking, BYOK security and multi-tenancy.

Maintainers

Package info

github.com/Sellinnate/laravel-ultimate-rag

Documentation

pkg:composer/sellinnate/rag-engine

Fund package maintenance!

:vendor_name

Statistics

Installs: 0

Dependents: 0

Suggesters: 0

Stars: 0

Open Issues: 0

v1.2.0 2026-06-27 09:08 UTC

This package is auto-updated.

Last update: 2026-06-27 09:41:27 UTC


README

RAG Engine for Laravel β€” semantic search & AI answers

RAG Engine for Laravel

Tests Coverage PHPStan PHP Laravel License Docs

πŸ“– Full documentation: laravel-rag-engine.selli.io

Add semantic search and AI answers over your own content to any Laravel app. RAG Engine owns the whole Retrieval-Augmented Generation pipeline β€” ingesting documents, splitting them, turning them into searchable vectors, and retrieving the most relevant passages for any query. Writing a final answer with an LLM is an optional layer on top.

Infrastructure, not a feature. The engine owns ingestion β†’ retrieval; generation is optional and decoupled. Vertical packages, internal agents and search modules build on top without re-implementing ingestion, chunking, embedding or retrieval.

What you can build

  • πŸ”Ž Semantic search β€” a search box that matches by meaning, not keywords.
  • πŸ€– AI Q&A / chatbots β€” LLM answers grounded in your content, with citations.
  • πŸ“š "Ask your docs / tickets / wiki" features inside an existing app.
  • 🧭 Similarity / recommendations β€” "find records like this one".

Use just the search half (no LLM, no AI bill) or add generation later β€” same code, one config switch.

Table of contents

Features

  • Multi-format ingestion β€” raw text, file uploads, URLs (SSRF-guarded), cloud storage and Eloquent records. Safely parses Markdown, HTML, XML, CSV, JSON, DOCX and PDF.
  • Pluggable everything β€” parsing, chunking, embedding, vector store, reranking and LLM are swappable drivers behind stable contracts.
  • 10 embedding providers β€” OpenAI, Azure OpenAI, Mistral, Jina, Voyage, Cohere, Gemini, Hugging Face, Ollama, plus a deterministic fake driver for tests.
  • Powerful retrieval β€” metadata filters, hybrid (semantic + keyword) search with RRF, MMR diversification, reranking, relevance thresholds and small-to-big (parent-child) context expansion.
  • Embeddable Eloquent models β€” make any model searchable via one contract; recursive composition of relations, auto-sync on change, and vectorβ†’model trace-back.
  • Security by design β€” BYOK envelope encryption with a KMS abstraction (local + AWS KMS), crypto-shredding for "right to erasure", and PII redaction on by default.
  • OCR for scanned PDFs β€” pluggable OCR (Tesseract) kicks in when a PDF has no text layer.
  • Quality evaluation β€” measure recall@k, precision@k, hit-rate and MRR over a labelled dataset (rag:evaluate).
  • Resilient providers β€” LLM, reranker and embedder HTTP calls retry transient failures with exponential backoff.
  • Multi-tenancy β€” automatic, fail-closed per-tenant scoping of every query.
  • Operations from day one β€” immutable (WORM) audit log, cost tracking, lifecycle events, queued/batchable ingestion and Artisan commands.
  • EU-resident by default β€” content and embeddings stay in the EU unless you explicitly opt into a non-EU provider.

Requirements

Requirement Version
PHP 8.2+
Laravel 11, 12 or 13
A database any Laravel-supported (SQLite is fine to start)

A dedicated vector database is not required to begin: the default store is in-memory, and the database store works on plain Postgres/MySQL/SQLite. Use native pgvector or Qdrant at larger scale.

Installation

composer require sellinnate/rag-engine

php artisan vendor:publish --tag="rag-engine-config"
php artisan vendor:publish --tag="rag-engine-migrations"
php artisan migrate

The service provider and Rag facade auto-register via package discovery. Out of the box the package uses zero-network, deterministic drivers (fake embedder, in-memory store, local KMS) so your test suite runs offline.

Important

The fake embedder is for tests only β€” it doesn't understand meaning. For a real search feature, configure a real embedder (see Configuration).

Quick start

use Sellinnate\RagEngine\Facades\Rag;

// 1. INGEST β€” register content as a Document (stored & encrypted, not yet searchable).
$document = Rag::ingest(
    Rag::source()->text('Refunds are issued within 14 business days of an approved request.')
);

// 2. PROCESS β€” run the pipeline: parse β†’ clean & redact PII β†’ chunk β†’ embed β†’ store.
Rag::process($document);            // or: ProcessDocumentJob::dispatch(...) on a queue

// 3. SEARCH β€” find the most relevant chunks by meaning.
$hits = Rag::search('how long until I get my money back?')->topK(3)->get();

$hits[0]->content;                  // "Refunds are issued within 14 business days..."
$hits[0]->score;                    // relevance score
$hits[0]->metadata['source_ref'];   // provenance: where it came from

// 4. (OPTIONAL) ASK β€” let an LLM write a cited answer from the retrieved chunks.
$answer = Rag::ask('how long do refunds take?')->using('openai')->generate();
$answer->answer;                    // "Refunds take 14 business days. [1]"
$answer->citations;                 // [['index' => 1, 'document_id' => '…', 'chunk_id' => '…']]

Refine retrieval fluently:

$hits = Rag::search('envelope encryption')
    ->topK(5)
    ->threshold(0.4)        // drop weak matches
    ->where('tag', 'docs')  // metadata filter
    ->hybrid()              // semantic + keyword (RRF)
    ->rerank()              // precision pass
    ->expandParents()       // small-to-big context
    ->get();

Indexing Eloquent models

If the content you want to search already lives in your database, make the model embeddable β€” it then stays in sync automatically as rows change, and every vector traces back to its model.

use Sellinnate\RagEngine\Concerns\HasEmbeddings;
use Sellinnate\RagEngine\Contracts\Embeddable;
use Sellinnate\RagEngine\Eloquent\EmbeddableDefinition;

class Article extends Model implements Embeddable
{
    use HasEmbeddings; // auto-indexes on save, removes on delete

    public function toEmbeddable(): EmbeddableDefinition
    {
        return EmbeddableDefinition::make()
            ->add('Title', $this->title)
            ->add('Body', $this->body)
            ->include($this->author, 'author')          // compose a related model
            ->includeMany($this->comments, 'comments');  // recursively
    }
}

// Trace a search hit back to its model:
$article = Rag::models()->resolve($hits[0]); // App\Models\Article instance, or null

Model file fields (a PDF/DOCX upload) can be embedded too β€” addFile() parses the file to text and folds it into the model's embedding; non-embeddable binaries (zip/exe…) are skipped or rejected per policy. See docs/concepts/eloquent-models.md.

Asking questions with an LLM

Search returns the relevant chunks; an LLM turns them into a written, cited answer via Rag::ask(). This layer is optional and decoupled β€” with the default null driver, ask() returns the sources with an empty answer, so search-only apps carry no LLM dependency.

The package ships two LLM drivers: anthropic (Claude) and openai (OpenAI and any OpenAI-compatible API β€” Mistral, Ollama, Groq, OpenRouter…).

# .env β€” use Anthropic Claude to answer questions
RAG_LLM=anthropic
RAG_ANTHROPIC_API_KEY=sk-ant-...
RAG_ANTHROPIC_MODEL=claude-sonnet-4-6     # or claude-opus-4-8 / claude-haiku-4-5-...
$result = Rag::ask('What is our refund policy?')
    ->topK(5)
    ->using('anthropic')   // or omit to use the default RAG_LLM
    ->generate();

$result->answer;     // "Refunds are issued within 14 business days. [1]"
$result->citations;  // [['index' => 1, 'document_id' => '…', 'chunk_id' => '…']]
$result->sources;    // the SearchHits the answer was built from

Note

Anthropic has no embedding API, so anthropic is a generation-only driver. Keep a real RAG_EMBEDDER (Mistral, OpenAI, Ollama…) for the search side. A common combo is Mistral/Ollama embeddings + Claude answers.

Retrieved content is treated as untrusted: the default prompt fences it and tells the model not to follow instructions inside it (prompt-injection hardening). Full guide: docs/concepts/generation.md.

Configuration

Configuration lives in config/rag-engine.php and works like Laravel's config/database.php: you define named connections per subsystem and pick a default. Switching provider = changing one name in .env.

# .env β€” switch to a real embedder (Ollama is free & local)
RAG_EMBEDDER=ollama
RAG_OLLAMA_BASE_URL=http://localhost:11434

# ...or a hosted provider:
RAG_EMBEDDER=openai
RAG_OPENAI_API_KEY=sk-...

Note

API keys go in .env, never in the committed config. A copy-ready list of every variable ships as .env.example. See docs/getting-started/configuration.md.

Supported drivers

Embedders (RAG_EMBEDDER)

Driver Provider Residency
openai OpenAI global
azure-openai Azure OpenAI EU (EU region)
mistral Mistral EU
jina Jina AI EU
voyage Voyage AI global
cohere Cohere global
gemini Google Gemini global
huggingface Hugging Face / self-hosted TEI global / self-host
ollama Ollama (BGE/E5/Nomic) self-hosted
fake deterministic (tests) local

Vector stores (RAG_VECTOR_STORE): memory (tests/dev) Β· database (portable SQL: Postgres/MySQL/SQLite, brute-force) Β· pgvector (native Postgres ANN: vector column + HNSW + <=>) Β· qdrant (EU self-hostable, ANN at scale). Full setup, including where to configure the Postgres connection, is in the Vector stores guide.

LLMs (RAG_LLM, for ask()): anthropic (Claude) Β· openai (OpenAI and any OpenAI-compatible API: Mistral, Ollama, Groq, OpenRouter…) Β· null/fake. Anthropic is generation-only (no embeddings). Answers can be streamed with Rag::ask(...)->stream().

Rerankers (RAG_RERANKER, optional cross-encoder pass): cohere Β· jina (EU) Β· null/fake.

KMS (RAG_KMS, BYOK key management): local (dev) Β· aws (AWS KMS, production).

OCR (RAG_OCR, scanned-PDF fallback): null Β· tesseract.

Parsers: plain text Β· Markdown Β· HTML Β· XML Β· CSV/TSV Β· JSON Β· DOCX Β· PDF (+ OCR for scans).

Chunkers: recursive (default) Β· sentence Β· markdown Β· fixed (char- or token-based), with optional parent-child and contextual headers.

All drivers share one contract β€” switching backends needs no code changes, and you can register your own (see docs/guides/custom-drivers.md).

Security & multi-tenancy

  • BYOK envelope encryption β€” content is encrypted at rest with per-item DEKs wrapped by a tenant KEK in a KMS; the plaintext key never persists.
  • Crypto-shredding β€” honour "right to erasure" by destroying the key, making data unrecoverable everywhere (including DB backups) at once.
  • PII redaction β€” emails, cards (Luhn), IBANs (mod-97), Italian fiscal codes and phone numbers are redacted before indexing, by default.
  • Fail-closed multi-tenancy β€” every query is automatically scoped to the current tenant; scope can never be widened from a query (a tested invariant).
  • Tamper-evident audit log β€” append-only with database-level WORM triggers.
use Sellinnate\RagEngine\Facades\Rag;

// Run work scoped to a tenant (previous tenant restored afterwards):
Rag::forTenant('tenant-7', fn () => Rag::search('q')->get());

// Right to erasure β€” crypto-shred a tenant:
Rag::kms()->destroyKey('tenant-7');

See docs/concepts/security.md and docs/concepts/multi-tenancy.md.

Documentation

The full documentation is hosted at laravel-rag-engine.selli.io. The sources live in docs/ and are built into a static site with docmd:

npm install
npm run docs:dev     # local preview
npm run docs:build   # static site into ./site

Start here:

Testing & development

composer test         # run the Pest suite (429 tests)
composer analyse      # PHPStan, level 8
composer format       # Laravel Pint (code style)

# Coverage (needs a coverage driver, e.g. Xdebug/PCOV):
XDEBUG_MODE=coverage vendor/bin/pest --coverage --min=90

Quality gates kept green on every change: 429 tests, PHPStan level 8, Pint clean, β‰₯90% coverage.

License

MIT β€” see LICENSE.md.