hamzi / nativerag
A production-ready, privacy-first Local AI Controller and Retrieval-Augmented Generation (RAG) engine for Laravel 11, 12, and 13. Supports Ollama, LM Studio, zero-infra vector search, SSE streaming, and full conversational memory, all with 100% data residency.
Requires
- php: ^8.2|^8.5
- illuminate/database: ^11.0|^12.0|^13.0
- illuminate/http: ^11.0|^12.0|^13.0
- illuminate/support: ^11.0|^12.0|^13.0
- symfony/http-foundation: ^7.0|^8.0
Requires (Dev)
- laravel/pint: ^1.0
- orchestra/testbench: ^9.0|^10.0|^11.0
- phpstan/phpstan: ^2.0
- phpunit/phpunit: ^10.0|^11.0
README
Laravel NativeRAG
A world-class, production-ready Local AI & RAG Engine for Laravel 11, 12 & 13
Laravel NativeRAG lets you run fully local, privacy-first AI workflows using models hosted in Ollama or LM Studio, directly from your Laravel application.
No OpenAI keys. No Pinecone. No cloud data leaks. 100% data residency. Zero external dependencies.
Features
| Feature | Details |
|---|---|
| Multi-Driver LLM | Switch between Ollama & LM Studio via Laravel's Manager pattern |
| Zero-Infra Vector Search | Cosine similarity powered by PHP + native SQL. No Pinecone needed |
| SSE Streaming | Real-time token streaming to Alpine.js / Livewire frontends |
| Auto-Embedding Trait | Add `Embeddable` to any Eloquent model for automatic vector indexing |
| Persistent Memory | Multi-turn chat history with sliding-window pruning |
| Payload Encryption | Encrypt stored chat history using Laravel's App Key |
| Strict Types | PHP 8.2+ with `declare(strict_types=1)`, readonly DTOs, enums |
| pgvector Support | Native PostgreSQL pgvector cosine distance queries |
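The README does not say how sliding-window pruning is implemented. A minimal sketch of the idea, assuming the window counts messages and that system prompts are always kept; `pruneHistory` is a hypothetical helper, not part of the package's API:

```php
<?php
declare(strict_types=1);

/**
 * Keep every system message plus the most recent $maxMessages other messages.
 * Hypothetical helper for illustration only.
 *
 * @param list<array{role: string, content: string}> $messages
 * @return list<array{role: string, content: string}>
 */
function pruneHistory(array $messages, int $maxMessages): array
{
    if ($maxMessages < 1) {
        throw new InvalidArgumentException('maxMessages must be at least 1.');
    }

    $system = array_values(array_filter(
        $messages,
        fn (array $m): bool => $m['role'] === 'system',
    ));
    $rest = array_values(array_filter(
        $messages,
        fn (array $m): bool => $m['role'] !== 'system',
    ));

    // A negative offset takes the last $maxMessages entries.
    return array_merge($system, array_slice($rest, -$maxMessages));
}

$history = [
    ['role' => 'system', 'content' => 'You are a senior Laravel engineer.'],
    ['role' => 'user', 'content' => 'First question'],
    ['role' => 'assistant', 'content' => 'First answer'],
    ['role' => 'user', 'content' => 'Second question'],
    ['role' => 'assistant', 'content' => 'Second answer'],
];

$pruned = pruneHistory($history, 2);
```

Here the oldest user/assistant pair falls out of the window while the system prompt stays at the front, so the model keeps its instructions as the conversation grows.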
Compatibility
| Laravel | PHP | Status |
|---|---|---|
| 13.x | 8.2, 8.3, 8.4, 8.5 | Fully Supported |
| 12.x | 8.2, 8.3, 8.4, 8.5 | Fully Supported |
| 11.x | 8.2, 8.3, 8.4, 8.5 | Fully Supported |
Installation

    composer require hamzi/nativerag

Publish configuration and migrations:

    php artisan vendor:publish --tag="nativerag-config"
    php artisan vendor:publish --tag="nativerag-migrations"
    php artisan migrate
Configuration
Set your driver settings in .env:

    NATIVE_RAG_DRIVER=ollama

    # Ollama
    OLLAMA_BASE_URL=http://localhost:11434
    OLLAMA_CHAT_MODEL=llama3
    OLLAMA_EMBEDDING_MODEL=nomic-embed-text

    # LM Studio
    LMSTUDIO_BASE_URL=http://localhost:1234
    LMSTUDIO_CHAT_MODEL=meta-llama-3-8b-instruct

    # Chunking & Retrieval
    NATIVE_RAG_CHUNK_SIZE=1000
    NATIVE_RAG_CHUNK_OVERLAP=200
    NATIVE_RAG_MIN_SCORE=0.35

    # Security
    NATIVE_RAG_ENCRYPT_PAYLOADS=false
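To see how the chunk size and overlap settings interact, here is a minimal sketch of overlapping chunking. It assumes both values are measured in characters, which the README does not state, and `chunkText` is a hypothetical helper rather than the package's own chunker:

```php
<?php
declare(strict_types=1);

/**
 * Split text into fixed-size chunks where each chunk repeats the last
 * $overlap characters of the previous one. Hypothetical helper.
 *
 * @return list<string>
 */
function chunkText(string $text, int $size = 1000, int $overlap = 200): array
{
    if ($size <= 0 || $overlap < 0 || $overlap >= $size) {
        throw new InvalidArgumentException('Require size > 0 and 0 <= overlap < size.');
    }

    $length = mb_strlen($text);
    $step = $size - $overlap; // how far the window advances each time
    $chunks = [];

    for ($start = 0; $start < $length; $start += $step) {
        $chunks[] = mb_substr($text, $start, $size);

        if ($start + $size >= $length) {
            break; // this chunk already reaches the end of the text
        }
    }

    return $chunks;
}

$chunks = chunkText(str_repeat('a', 2500), 1000, 200);
```

With the default-style values above, a 2,500-character document yields three chunks starting at offsets 0, 800, and 1,600. A larger overlap preserves more context across chunk boundaries at the cost of more embeddings to compute and store.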
Usage
1. Chat Completions

    use Hamzi\NativeRag\Facades\NativeRag;

    $response = NativeRag::chat([
        ['role' => 'system', 'content' => 'You are a senior Laravel engineer.'],
        ['role' => 'user', 'content' => 'Explain service containers briefly.'],
    ]);

    echo $response->content;          // The generated text
    echo $response->promptTokens;     // Input tokens used
    echo $response->completionTokens; // Output tokens generated
2. Real-Time SSE Streaming

    use Hamzi\NativeRag\Facades\NativeRag;
    use Illuminate\Support\Facades\Route;

    // Registered as GET because the browser's EventSource API only issues GET requests.
    Route::get('/api/ai/stream', function () {
        return NativeRag::stream([
            ['role' => 'user', 'content' => 'Write a comprehensive guide on Eloquent ORM.'],
        ]);
    });
Consume in JavaScript (Alpine.js / Vanilla):

    const source = new EventSource('/api/ai/stream');

    source.onmessage = ({ data }) => {
        const { content, done } = JSON.parse(data);

        if (done) {
            source.close();
            return;
        }

        document.querySelector('#output').insertAdjacentText('beforeend', content);
    };
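The client above expects each event's `data` field to hold a JSON object with `content` and `done` keys. A minimal sketch of what such frames look like on the wire, assuming that payload shape; `sseFrame` is a hypothetical helper and not necessarily how the package builds its response:

```php
<?php
declare(strict_types=1);

/**
 * Format one Server-Sent Events frame: a "data:" line followed by a blank line.
 * Hypothetical helper for illustration only.
 */
function sseFrame(string $content, bool $done = false): string
{
    // json_encode escapes newlines, so the payload always fits on one data line.
    $json = json_encode(['content' => $content, 'done' => $done], JSON_THROW_ON_ERROR);

    return "data: {$json}\n\n";
}

$frames = [];
foreach (['Hel', 'lo'] as $token) {
    $frames[] = sseFrame($token);
}
$frames[] = sseFrame('', true); // final frame tells the client to close

echo implode('', $frames);
```

The blank line after each `data:` line is what delimits one event from the next, which is why every frame ends in two newlines.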
3. Embeddable Models (Auto-Indexing)
Attach the `Embeddable` trait to any Eloquent model. Whenever the model is saved, its content is automatically chunked and embedded locally.

    namespace App\Models;

    use Illuminate\Database\Eloquent\Model;
    use Hamzi\NativeRag\Traits\Embeddable;

    class Article extends Model
    {
        use Embeddable;

        /**
         * Define the text payload the AI engine will index.
         */
        public function toEmbeddableString(): string
        {
            return "Title: {$this->title}\n\nContent: {$this->content}";
        }
    }
4. Semantic Vector Search

    use Hamzi\NativeRag\Services\VectorSearchEngine;
    use Hamzi\NativeRag\Facades\NativeRag;

    // 1. Embed the user's question
    $queryVector = NativeRag::embedding()->embed('How does quantum physics relate to computing?');

    // 2. Search the local database for the closest chunks
    $engine = new VectorSearchEngine();
    $results = $engine->search($queryVector, limit: 5, minScore: 0.50);

    foreach ($results as $chunk) {
        echo $chunk->chunk_content; // Matching text passage
        echo $chunk->similarity;    // Score: 0.0 to 1.0
    }
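For intuition about the similarity score and the `minScore` cutoff, here is a minimal pure-PHP sketch of cosine similarity with threshold filtering. The function names `cosineSimilarity` and `rankChunks` are hypothetical, and the package's engine may compute scores differently (for example in SQL or via pgvector):

```php
<?php
declare(strict_types=1);

/**
 * Cosine similarity between two equal-length vectors.
 *
 * @param list<float> $a
 * @param list<float> $b
 */
function cosineSimilarity(array $a, array $b): float
{
    if ($a === [] || count($a) !== count($b)) {
        throw new InvalidArgumentException('Vectors must be non-empty and of equal length.');
    }

    $dot = 0.0;
    $normA = 0.0;
    $normB = 0.0;

    foreach ($a as $i => $value) {
        $dot += $value * $b[$i];
        $normA += $value * $value;
        $normB += $b[$i] * $b[$i];
    }

    if ($normA === 0.0 || $normB === 0.0) {
        return 0.0; // a zero vector has no direction; treat it as no match
    }

    return $dot / (sqrt($normA) * sqrt($normB));
}

/**
 * Score each stored vector, drop scores below $minScore, return the best first.
 *
 * @param list<float> $query
 * @param array<string, list<float>> $chunks chunk text => embedding
 * @return array<string, float> chunk text => score
 */
function rankChunks(array $query, array $chunks, int $limit, float $minScore): array
{
    $scored = [];
    foreach ($chunks as $text => $vector) {
        $score = cosineSimilarity($query, $vector);
        if ($score >= $minScore) {
            $scored[$text] = $score;
        }
    }

    arsort($scored); // highest similarity first, keys preserved

    return array_slice($scored, 0, $limit, true);
}

$results = rankChunks(
    [1.0, 0.0],
    ['exact' => [2.0, 0.0], 'partial' => [1.0, 1.0], 'unrelated' => [0.0, 1.0]],
    limit: 5,
    minScore: 0.50,
);
```

Cosine similarity ignores vector length and compares direction only, so `exact` scores 1.0 even though its magnitude differs from the query, while the orthogonal `unrelated` vector scores 0.0 and is filtered out.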
5. Switch Driver at Runtime

    use Hamzi\NativeRag\Facades\NativeRag;

    // Use LM Studio for this specific call
    $response = NativeRag::driver('lmstudio')->chat([
        ['role' => 'user', 'content' => 'Summarize this document.'],
    ]);
Security & Privacy
- 100% On-Premise: All inference runs against Ollama/LM Studio on your own hardware. When those servers run on the same machine, as with the default localhost URLs, no inference traffic leaves your server.
- Payload Encryption: Set `NATIVE_RAG_ENCRYPT_PAYLOADS=true` to encrypt all stored chat content and metadata using Laravel's native AES-256-CBC encryption.
- SQL Injection Safe: Uses Laravel's parameterized query builder exclusively, with no raw string interpolation.
- Change Detection: MD5 content hashing skips re-embedding of unchanged documents, eliminating redundant local API calls.
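The change-detection idea can be sketched as a hash comparison before embedding. This assumes the previous hash is stored alongside the document; `needsEmbedding` is a hypothetical helper, not the package's API:

```php
<?php
declare(strict_types=1);

/**
 * Decide whether content must be (re-)embedded by comparing its MD5 hash
 * with the hash stored from the last indexing run. Hypothetical helper.
 */
function needsEmbedding(string $content, ?string $storedHash): bool
{
    // No stored hash means the document was never indexed.
    return $storedHash === null || !hash_equals($storedHash, md5($content));
}

$stored = md5('Title: Hello');

$unchanged = needsEmbedding('Title: Hello', $stored);        // same content
$edited = needsEmbedding('Title: Hello, world', $stored);    // content changed
$neverIndexed = needsEmbedding('Title: Hello', null);        // first save
```

MD5 is adequate here because the hash only detects edits; it is not being used as a security control.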
Testing & Code Quality

    # Run tests
    composer test

    # Run code style fixer
    composer lint

    # Run static analysis (PHPStan Level 6)
    composer analyse
Contributing
Please read CONTRIBUTING.md for details on our code of conduct and the process for submitting pull requests.
Security Vulnerabilities
Please review the SECURITY.md policy to learn how to responsibly report a vulnerability.
License
The MIT License (MIT). Please see LICENSE.md for more information.