AI-powered text-to-query system with RAG, Neo4j, and Qdrant

Maintainers

Details

github.com/condoedge/AI

Source

Issues

Installs: 4

Dependents: 0

Suggesters: 0

Security: 0

Stars: 0

Watchers: 0

Forks: 0

Open Issues: 0

pkg:composer/condoedge/ai

v0.0.1 2025-11-21 15:22 UTC

This package is not auto-updated.

Last update: 2026-01-03 14:31:49 UTC


README

A Laravel package providing intelligent text-to-Cypher query generation with RAG (Retrieval-Augmented Generation), dual-storage coordination (Neo4j + Qdrant), and auto-discovery from Eloquent models.

PHP Version Laravel License: MIT

Table of Contents

Overview

What This Package Does

The AI package transforms natural language questions into executable Neo4j Cypher queries using RAG-powered LLMs. It automatically discovers entity configurations from your existing Eloquent models, eliminating manual setup while maintaining dual-storage synchronization between Neo4j (graph relationships) and Qdrant (vector embeddings).

Core Value Proposition:

  • Zero configuration for most use cases (convention over configuration)
  • Automatic discovery from Eloquent models - no duplication
  • Dual-storage coordination with consistency guarantees
  • Production-ready security (injection protection, retry logic, circuit breakers)
  • RAG-powered intelligent query generation

Key Technologies:

  • Neo4j: Graph database for relationship storage and pattern matching
  • Qdrant: Vector database for semantic similarity search
  • Laravel: PHP framework integration with Eloquent ORM
  • OpenAI/Anthropic: LLM providers for query generation and embeddings

Example

// 1. Make your model Nodeable (zero config needed)
class Customer extends Model implements Nodeable
{
    use HasNodeableConfig;

    protected $fillable = ['name', 'email', 'status'];

    public function scopeActive($query) {
        return $query->where('status', 'active');
    }
}

// 2. Data auto-syncs on create/update/delete
$customer = Customer::create(['name' => 'John Doe', 'email' => 'john@example.com']);
// Automatically stored in Neo4j + Qdrant

// 3. Ask questions in natural language
$response = AI::chat("How many active customers do we have?");
// Generated Cypher: MATCH (n:Customer) WHERE n.status = 'active' RETURN count(n)
// Response: "You have 1,250 active customers in the system."

Architecture

High-Level Overview

┌─────────────────────────────────────────────────────────────────────┐
│                         User Question                                │
│                "Show active customers in USA"                        │
└────────────────────────────┬────────────────────────────────────────┘
                             │
                             ▼
┌─────────────────────────────────────────────────────────────────────┐
│                     ChatOrchestrator                                 │
│  ┌─────────────────────────────────────────────────────────┐        │
│  │ 1. Context Retrieval (RAG)                              │        │
│  │    - Vector search for similar past queries             │        │
│  │    - Fetch graph schema from Neo4j                      │        │
│  │    - Retrieve example entities                          │        │
│  └─────────────────────────────────────────────────────────┘        │
│  ┌─────────────────────────────────────────────────────────┐        │
│  │ 2. Query Generation                                     │        │
│  │    - Build LLM prompt with context                      │        │
│  │    - Generate Cypher query                              │        │
│  │    - Validate (injection check, complexity, safety)     │        │
│  └─────────────────────────────────────────────────────────┘        │
│  ┌─────────────────────────────────────────────────────────┐        │
│  │ 3. Query Execution                                      │        │
│  │    - Execute against Neo4j with timeout                 │        │
│  │    - Format results                                     │        │
│  └─────────────────────────────────────────────────────────┘        │
│  ┌─────────────────────────────────────────────────────────┐        │
│  │ 4. Response Generation                                  │        │
│  │    - Transform to natural language                      │        │
│  │    - Extract insights (trends, outliers, patterns)      │        │
│  │    - Suggest visualizations                             │        │
│  └─────────────────────────────────────────────────────────┘        │
└────────────────────────────┬────────────────────────────────────────┘
                             │
                             ▼
┌─────────────────────────────────────────────────────────────────────┐
│        Natural Language Answer + Insights + Suggestions             │
└─────────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────────┐
│                      Data Ingestion Flow                             │
└─────────────────────────────────────────────────────────────────────┘

Eloquent Model Event (create/update/delete)
           │
           ▼
   HasNodeableConfig Trait
   (Auto-sync listener)
           │
           ▼
   EntityAutoDiscovery
   - Reflect on model
   - Extract properties from $fillable, $casts
   - Discover relationships (belongsTo)
   - Detect text fields for embedding
   - Convert scopes to Cypher patterns
           │
           ▼
   DataIngestionService
   ┌──────────────────────────────────┐
   │ Compensating Transaction Pattern │
   └──────────────────────────────────┘
           │
           ├─────────────────────┬──────────────────────┐
           ▼                     ▼                      ▼
    Generate Embedding    Neo4j Store           Qdrant Store
    (OpenAI/Anthropic)   (Graph + Relations)   (Vector + Metadata)
           │                     │                      │
           └─────────────────────┴──────────────────────┘
                                 │
                     ┌───────────┴──────────┐
                     │  Success?            │
                     │  - Both stores OK    │
                     │  - Rollback on fail  │
                     └──────────────────────┘

Core Components

1. Auto-Discovery System

  • EntityAutoDiscovery: Introspects Eloquent models using PHP Reflection
  • CypherScopeAdapter: Converts Eloquent scopes to Cypher patterns
  • SchemaInspector: Extracts database schema hints
  • ConfigCache: Caches expensive discovery operations

2. Dual-Storage Coordination

  • DataIngestionService: Orchestrates writes to both stores with compensating transactions
  • Neo4j (GraphStore): Node storage, relationships, pattern matching
  • Qdrant (VectorStore): Vector embeddings, semantic search, metadata filtering
  • Auto-Sync: Automatic synchronization via Laravel model events

3. RAG System

  • ContextRetriever: Fetches similar queries + schema + examples
  • PatternLibrary: Pre-defined query patterns for common questions
  • QueryGenerator: LLM-powered Cypher generation with validation
  • QueryExecutor: Safe query execution with timeouts and limits
  • ResponseGenerator: Natural language explanations with insights

4. Security Layer

  • CypherSanitizer: Injection prevention for labels, types, property keys
  • RetryPolicy: Exponential backoff with jitter
  • CircuitBreaker: Fail-fast pattern for cascading failure prevention
  • SensitiveDataSanitizer: API key and credential redaction in logs

Key Features

Auto-Discovery from Eloquent Models

Eliminates duplication by extracting entity configuration directly from your models:

  • Properties: Auto-detected from $fillable, $casts, $dates
  • Relationships: Auto-discovered from belongsTo() methods
  • Scopes: Auto-converted from scopeX() methods to Cypher patterns
  • Embed Fields: Text fields automatically identified for vector embeddings
  • Aliases: Generated from table names for semantic matching

Three-tier fallback: Explicit config > Legacy config file > Auto-discovery

Dual-Storage Coordination

Synchronized writes to Neo4j (graph) and Qdrant (vector) with consistency guarantees:

  • Compensating Transactions: Automatic rollback on partial failure
  • No Orphaned Data: Either both stores succeed or both roll back
  • Independent Resilience: One store failing doesn't break the other
  • Batch Operations: Efficient bulk ingestion

RAG-Powered Query Generation

Intelligent Cypher generation using retrieval-augmented generation:

  • Context-Aware: Similar past queries inform new query generation
  • Schema-Aware: Graph structure guides query construction
  • Example-Based: Sample data provides reference patterns
  • Pattern Matching: Pre-defined templates for common queries
  • Validation: Syntax, safety, and complexity checks

Auto-Sync Capabilities

Zero-boilerplate synchronization via Laravel events:

  • Event-Driven: Automatic sync on create, update, delete
  • Async Support: Optional queue processing
  • Configurable: Per-model, per-operation granularity
  • Error Handling: Silent failure with logging or exception throwing

Extensible Prompt Builders

Both the Query Generator and Response Generator use extensible section-based architectures that allow you to customize how prompts are built.

SemanticPromptBuilder (Query Generation)

The SemanticPromptBuilder constructs prompts for Cypher query generation using a pipeline of sections:

Section Priority Purpose
project_context 10 Project name, description, domain, business rules
generic_context 15 Current date/time
schema 20 Graph schema (labels, relationships, properties)
relationships 30 Entity relationships with exact directions
example_entities 40 Sample data showing actual types/formats
similar_queries 50 RAG: similar past queries for few-shot learning
detected_entities 60 Entities detected in user's question
detected_scopes 65 Business concepts detected in question
pattern_library 70 Available query patterns
query_rules 75 Query generation rules
question 80 User's question
task_instructions 90 Final task instructions

Extension Methods:

use Condoedge\Ai\Services\SemanticPromptBuilder;

// Global extension (applies to all new instances)
SemanticPromptBuilder::extendBuild(function($builder) {
    $builder->setProjectContext([
        'name' => 'My CRM',
        'description' => 'Customer relationship management system',
        'domain' => 'Sales',
        'business_rules' => [
            'All dates are stored as ISO strings',
            'Active customers have status = "active"',
        ],
    ]);
});

// Instance-level extensions
$builder = app(SemanticPromptBuilder::class);

// Add custom section
$builder->addSection(new CustomContextSection());

// Remove a section
$builder->removeSection('similar_queries');

// Replace a section
$builder->replaceSection('project_context', new MyProjectContextSection());

// Extend with callbacks (before/after sections)
$builder->extendAfter('schema', function($question, $context, $options) {
    return "\n=== CUSTOM INFO ===\n\nAdditional context here\n\n";
});

// Convenience methods
$builder->addBusinessRule('Orders cannot be deleted once shipped');
$builder->addQueryRule('PERFORMANCE', 'Always use indexed properties');
$builder->setMaxSimilarQueries(5);

Creating Custom Sections:

use Condoedge\Ai\Contracts\PromptSectionInterface;
use Condoedge\Ai\Services\PromptSections\BasePromptSection;

class DomainTermsSection extends BasePromptSection
{
    protected string $name = 'domain_terms';
    protected int $priority = 25; // After schema, before relationships

    public function format(string $question, array $context, array $options = []): string
    {
        return $this->header('DOMAIN TERMINOLOGY') .
               "- 'Client' and 'Customer' are synonyms\n" .
               "- 'Active' means status = 'active' or 'enabled'\n\n";
    }

    public function shouldInclude(string $question, array $context, array $options = []): bool
    {
        // Only include if question mentions domain terms
        return str_contains(strtolower($question), 'client') ||
               str_contains(strtolower($question), 'active');
    }
}

ResponseGenerator (Response Generation)

The ResponseGenerator uses the same extensible pattern for building prompts that explain query results:

Section Priority Purpose
system 10 System prompt (LLM role)
project_context 20 Project context for explanations
question 30 Original user question
query 40 Executed Cypher query
data 50 Query results
statistics 60 Execution statistics
guidelines 70 Response guidelines (style, format)
task 80 Final task instruction

Extension Methods:

use Condoedge\Ai\Services\ResponseGenerator;

// Global extension
ResponseGenerator::extendBuild(function($generator) {
    $generator->setSystemPrompt(
        "You are a friendly data analyst who explains results clearly.\n\n"
    );
    $generator->addGuideline('Always mention the total count first');
});

// Instance-level extensions
$generator = app(ResponseGenerator::class);

// Add/remove/replace sections
$generator->addSection(new CustomAnalysisSection());
$generator->removeSection('statistics');

// Convenience methods
$generator->setProjectContext(['name' => 'My App', 'domain' => 'E-commerce']);
$generator->setMaxDataItems(20);

// Extend with callbacks
$generator->extendAfter('data', function($context, $options) {
    $count = count($context['data']);
    return "\nNote: Showing {$count} results.\n\n";
});

Security Architecture

The package implements defense-in-depth security with multiple layers of protection. All security features are enabled by default with no configuration required.

1. Injection Protection

Cypher Injection Prevention (CypherSanitizer):

  • Validates all labels, relationship types, and property keys against strict patterns
  • Regex: [a-zA-Z_][a-zA-Z0-9_]* (alphanumeric + underscore, must start with letter)
  • Blocks reserved Cypher keywords (MATCH, DELETE, DROP, CREATE, etc.)
  • Maximum length validation (255 characters)
  • Backtick escaping as additional defense layer

SQL Injection Prevention (SchemaInspector):

  • Table/index name validation in SQLite PRAGMA queries
  • Prevents malicious identifiers in auto-discovery schema introspection
  • Parameter binding for MySQL/PostgreSQL

Example Protection:

// Automatic injection protection
CypherSanitizer::validateLabel("User}); DELETE (n) //");
// Throws: CypherInjectionException

CypherSanitizer::validateLabel("User_Profile");
// Returns: "User_Profile" ✓

2. Data Consistency Guarantees

Compensating Transactions (DataIngestionService):

  • Two-phase commit pattern for dual-store operations
  • Automatic rollback on vector store failure
  • Automatic restoration on deletion failure
  • Critical error logging when compensation fails

Transaction Flow:

1. Write to Neo4j → Success
2. Write to Qdrant → Failure
3. Rollback Neo4j → Success
4. Throw DataConsistencyException

3. Resilience & Fault Tolerance

Circuit Breaker Pattern (CircuitBreaker):

  • States: CLOSED → OPEN → HALF_OPEN → CLOSED
  • Prevents cascading failures
  • Configurable failure threshold (default: 5 failures)
  • Configurable recovery timeout (default: 30 seconds)
  • Fail-fast when circuit open

Retry Policy (RetryPolicy):

  • Exponential backoff with jitter
  • Prevents thundering herd problem
  • Configurable max attempts (default: 3-5 depending on operation)
  • Separate policies for API calls, database operations, network requests

Example:

// Automatic retry and circuit breaking
$neo4j = new Neo4jStore(); // Includes retry + circuit breaker
$result = $neo4j->createNode('User', $properties);
// Retries on transient failures, fails fast if circuit open

4. Sensitive Data Protection

Log Sanitization (SensitiveDataSanitizer):

  • Automatic redaction of API keys, passwords, tokens, secrets
  • Pattern detection for multiple formats:
    • OpenAI keys: sk-...
    • Anthropic keys: sk-ant-...
    • AWS credentials
    • Bearer tokens
    • Database passwords
  • Stack trace sanitization
  • Absolute path removal

Protected Patterns:

// Automatic sanitization
Log::error('API failed', SensitiveDataSanitizer::forLogging([
    'api_key' => 'sk-abc123...', // Logged as: ***REDACTED***
    'error' => $exception->getMessage(),
]));

5. Recursion & Resource Protection

Auto-Discovery Guards:

  • Maximum stack depth: 5 levels
  • Circular reference detection
  • Automatic cycle breaking
  • Deep merge protection: 10 level limit

Resource Limits:

  • Query timeout: 30 seconds (configurable)
  • Result limit: 100 rows (configurable)
  • Max query complexity scoring
  • Identifier length limits

Security Testing

The package includes comprehensive security test coverage:

  • Injection Testing: Adversarial inputs, malicious patterns, edge cases
  • Data Consistency Testing: Partial failure scenarios, rollback verification
  • Resilience Testing: Retry logic, circuit breaker state transitions, timeout handling
  • Sanitization Testing: API key patterns, stack traces, nested objects

All security tests are passing. See tests/Unit/StressTests/ for details.

Quick Start

Prerequisites

  • PHP 8.1+
  • Laravel 9.x+
  • Neo4j 4.4+
  • Qdrant 1.0+
  • OpenAI or Anthropic API key

Installation

# Install package
composer require condoedge/ai

# Publish config (optional)
php artisan vendor:publish --tag=ai-config

Configuration

Add to .env:

# Neo4j
NEO4J_HOST=http://localhost:7474
NEO4J_USER=neo4j
NEO4J_PASSWORD=your-password

# Qdrant
QDRANT_HOST=http://localhost:6333

# LLM Provider (OpenAI or Anthropic)
OPENAI_API_KEY=sk-your-key
AI_LLM_PROVIDER=openai
AI_EMBEDDING_PROVIDER=openai

Basic Usage

// 1. Make models Nodeable
use Condoedge\Ai\Domain\Contracts\Nodeable;
use Condoedge\Ai\Domain\Traits\HasNodeableConfig;

class Customer extends Model implements Nodeable
{
    use HasNodeableConfig;

    protected $fillable = ['name', 'email', 'status'];
}

// 2. Discover and generate config
php artisan ai:discover

// This generates config/entities.php with discovered configuration:
// - Neo4j label, properties, relationships
// - Qdrant collection, embed fields
// - Aliases for natural language queries
// Review and customize config/entities.php as needed

// 3. Bulk ingest existing data (one-time setup)
php artisan ai:ingest

// This ingests all existing entities into Neo4j + Qdrant
// - Processes in batches for efficiency
// - Shows progress bar
// - Reports success/failure counts

// 4. New data auto-syncs (after initial ingest)

    public function orders() {
        return $this->hasMany(Order::class);
    }

    public function scopeActive($query) {
        return $query->where('status', 'active');
    }
}

// 2. Data auto-syncs
$customer = Customer::create([
    'name' => 'John Doe',
    'email' => 'john@example.com',
    'status' => 'active'
]);
// Automatically stored in Neo4j + Qdrant

// 3. Ask questions
use Condoedge\Ai\Facades\AI;

$response = AI::chat("How many active customers do we have?");
// Generates Cypher, executes, returns natural language answer

Manual Configuration Override

use Condoedge\Ai\Domain\ValueObjects\NodeableConfig;

class Customer extends Model implements Nodeable
{
    use HasNodeableConfig;

    public function nodeableConfig(): NodeableConfig
    {
        return NodeableConfig::discover($this)
            ->embedFields(['name', 'bio'])        // Override embed fields
            ->addAlias('client')                  // Add custom alias
            ->addRelationship('HAS_ORDER', 'Order', 'customer_id')
            ->disableVectorStore();               // Graph-only entity
    }
}

Project Structure

ai/
├── config/
│   ├── ai.php              # Main package configuration
│   └── ai-patterns.php     # Query pattern definitions
├── docs/
│   ├── ARCHITECTURE.md     # Detailed technical architecture
│   └── GETTING-STARTED.md  # User guide and tutorials
├── examples/
│   └── *.php               # Working examples and demos
├── src/
│   ├── Contracts/          # Service interfaces
│   │   ├── DataIngestionServiceInterface.php
│   │   ├── GraphStoreInterface.php
│   │   ├── VectorStoreInterface.php
│   │   ├── LlmProviderInterface.php
│   │   └── ...
│   ├── Domain/             # Domain models and value objects
│   │   ├── Contracts/
│   │   │   └── Nodeable.php           # Entity interface
│   │   ├── Traits/
│   │   │   └── HasNodeableConfig.php  # Auto-sync + discovery
│   │   └── ValueObjects/
│   │       ├── GraphConfig.php
│   │       ├── VectorConfig.php
│   │       └── NodeableConfig.php
│   ├── Services/           # Core services
│   │   ├── Discovery/
│   │   │   ├── EntityAutoDiscovery.php
│   │   │   ├── CypherScopeAdapter.php
│   │   │   ├── SchemaInspector.php
│   │   │   └── ...
│   │   ├── PromptSections/  # Extensible prompt sections
│   │   │   ├── BasePromptSection.php
│   │   │   ├── ProjectContextSection.php
│   │   │   ├── SchemaSection.php
│   │   │   ├── RelationshipsSection.php
│   │   │   ├── ExampleEntitiesSection.php
│   │   │   ├── SimilarQueriesSection.php
│   │   │   ├── QueryRulesSection.php
│   │   │   └── ...
│   │   ├── ResponseSections/ # Extensible response sections
│   │   │   ├── BaseResponseSection.php
│   │   │   ├── SystemPromptSection.php
│   │   │   ├── GuidelinesSection.php
│   │   │   └── ...
│   │   ├── Resilience/
│   │   │   ├── RetryPolicy.php
│   │   │   └── CircuitBreaker.php
│   │   ├── Security/
│   │   │   └── SensitiveDataSanitizer.php
│   │   ├── DataIngestionService.php
│   │   ├── ContextRetriever.php
│   │   ├── SemanticPromptBuilder.php
│   │   ├── QueryGenerator.php
│   │   ├── QueryExecutor.php
│   │   └── ResponseGenerator.php
│   ├── GraphStore/         # Neo4j implementation
│   │   ├── Neo4jStore.php
│   │   └── CypherSanitizer.php
│   ├── VectorStore/        # Qdrant implementation
│   │   └── QdrantStore.php
│   ├── LlmProviders/       # LLM integrations
│   │   ├── OpenAiLlmProvider.php
│   │   └── AnthropicLlmProvider.php
│   ├── EmbeddingProviders/
│   │   ├── OpenAiEmbeddingProvider.php
│   │   └── AnthropicEmbeddingProvider.php
│   ├── Jobs/               # Queue jobs
│   │   ├── IngestEntityJob.php
│   │   ├── SyncEntityJob.php
│   │   └── RemoveEntityJob.php
│   ├── Exceptions/         # Custom exceptions
│   │   ├── CypherInjectionException.php
│   │   ├── DataConsistencyException.php
│   │   ├── CircuitBreakerOpenException.php
│   │   └── ...
│   └── Facades/
│       └── AI.php          # Main facade
├── tests/
│   ├── Unit/               # Unit tests
│   │   ├── Domain/
│   │   ├── Services/
│   │   └── StressTests/    # Security & resilience tests
│   ├── Integration/        # Integration tests
│   │   ├── EntityAutoDiscoveryTest.php
│   │   ├── DualStorageCoordinationTest.php
│   │   └── ...
│   └── Fixtures/           # Test models
└── composer.json

Key Files & Purposes

Core Services:

  • DataIngestionService.php: Dual-store coordination with compensating transactions
  • EntityAutoDiscovery.php: Model introspection and config extraction
  • CypherScopeAdapter.php: Eloquent scope to Cypher conversion
  • ContextRetriever.php: RAG context fetching (similar queries + schema)
  • SemanticPromptBuilder.php: Extensible prompt builder with section pipeline
  • QueryGenerator.php: LLM-powered Cypher generation with validation
  • QueryExecutor.php: Safe query execution with timeouts
  • ResponseGenerator.php: Natural language response generation with extensible sections

Extensible Sections:

  • PromptSectionInterface.php: Contract for query prompt sections
  • ResponseSectionInterface.php: Contract for response prompt sections
  • BasePromptSection.php: Base class for query prompt sections
  • BaseResponseSection.php: Base class for response prompt sections

Security:

  • CypherSanitizer.php: Injection prevention for Cypher identifiers
  • SensitiveDataSanitizer.php: API key/credential redaction in logs
  • RetryPolicy.php: Exponential backoff retry logic
  • CircuitBreaker.php: Circuit breaker pattern for resilience

Configuration:

  • HasNodeableConfig.php: Trait providing auto-sync and discovery
  • NodeableConfig.php: Fluent builder for entity configuration
  • GraphConfig.php, VectorConfig.php: Store-specific configurations

Testing

Running Tests

# All tests
composer test

# Unit tests only
composer test-unit

# Integration tests only
composer test-integration

# With coverage report
composer test-coverage

Test Organization

tests/
├── Unit/
│   ├── Domain/              # Domain model tests
│   ├── Services/
│   │   ├── Discovery/       # Auto-discovery tests
│   │   ├── Resilience/      # Retry + circuit breaker tests
│   │   └── Security/        # Sanitization tests
│   └── StressTests/
│       ├── AdversarialSecurityTest.php      # Injection tests
│       ├── DualStorageFailureTest.php       # Consistency tests
│       └── ...
├── Integration/
│   ├── EntityAutoDiscoveryTest.php          # End-to-end discovery
│   ├── DualStorageCoordinationTest.php      # Store coordination
│   └── RealBusinessScenarioTest.php         # Business logic tests
└── Fixtures/
    └── Test*.php            # Test models

Test Coverage

  • Unit tests: 150+ tests
  • Integration tests: 20+ tests
  • Security tests: 54 tests (all passing)
    • 30 Cypher injection scenarios
    • 5 data consistency scenarios
    • 19 resilience scenarios (retry + circuit breaker)

Documentation

  • Foundations Track (resources/docs/1.0/foundations/): Requirements, installation, infrastructure, configuration, and troubleshooting (rendered at /ai-docs/foundations).
  • Usage & Extension Track (resources/docs/1.0/usage/): Quick start, AI facade APIs, ingestion/context guides, Laravel integration, testing, and extension playbooks.
  • Internals & Architecture Track (resources/docs/1.0/internals/): Architecture diagrams, component deep dives, data flows, storage reference, resilience/security details.
  • Examples: Working code samples in examples/ directory.
  • Tests: Test suite demonstrates usage patterns.

Artisan Commands

php artisan ai:discover

Auto-discover Nodeable entities and generate config/entities.php.

# Discover all Nodeable models
php artisan ai:discover

# Discover specific model
php artisan ai:discover --model="App\Models\Customer"

# Overwrite existing config
php artisan ai:discover --force

# Preview without writing
php artisan ai:discover --dry-run

What it does:

  • Scans app/Models for classes implementing Nodeable
  • Analyzes models to discover:
    • Neo4j label, properties, relationships
    • Qdrant collection, embed fields
    • Aliases for natural language queries
    • Eloquent scopes for Cypher conversion
  • Generates/merges config/entities.php

When to run:

  • Initial setup
  • After adding new Nodeable models
  • After changing model structure (new columns, relationships, scopes)

php artisan ai:ingest

Bulk ingest existing Nodeable entities into Neo4j and Qdrant.

# Ingest all entities
php artisan ai:ingest

# Ingest specific model
php artisan ai:ingest --model="App\Models\Customer"

# Clear stores before ingesting
php artisan ai:ingest --fresh

# Custom batch size (default: 100)
php artisan ai:ingest --chunk=500

# Preview without ingesting
php artisan ai:ingest --dry-run

What it does:

  • Finds all Nodeable models
  • Processes existing database records in batches
  • Ingests into Neo4j (graph) + Qdrant (vectors)
  • Shows progress bar and success/failure counts

When to run:

  • Initial setup (when you have existing data)
  • After migrating to the AI package
  • To rebuild graph/vector stores from scratch (--fresh)

Note: After initial ingestion, new/updated entities auto-sync via model events.

Development

Contributing

  1. Fork the repository
  2. Create a feature branch: git checkout -b feature/amazing-feature
  3. Make your changes
  4. Run tests: composer test
  5. Commit: git commit -m "Add amazing feature"
  6. Push: git push origin feature/amazing-feature
  7. Open a pull request

Code Standards

  • PSR-12: PHP coding standard
  • PHP 8.1+: Type hints, readonly properties, intersection types
  • Interface-based: Depend on interfaces, not implementations
  • Test Coverage: All new features must have tests
  • Documentation: Update docs for API changes

Running Development Environment

# Install dependencies
composer install

# Run tests
composer test

# Generate coverage report
composer test-coverage

Technical Decisions

Why Neo4j + Qdrant?

Neo4j (Graph Database):

  • Native graph storage optimized for relationship traversal
  • Cypher query language provides expressive pattern matching
  • Efficient multi-hop queries for complex relationships
  • ACID transactions for data consistency

Qdrant (Vector Database):

  • High-performance vector similarity search
  • Metadata filtering alongside vector search
  • Scales to millions of vectors
  • Easy deployment (Docker, cloud)

Why Dual-Storage?

  • Neo4j excels at relationship queries ("who is connected to whom")
  • Qdrant excels at semantic search ("find similar entities")
  • Together: Powerful hybrid queries combining structure and semantics

Why Compensating Transactions vs 2PC?

Two-Phase Commit (2PC) requires:

  • Coordinator service
  • Prepare phase locks
  • Complex failure recovery
  • Increased latency

Compensating Transactions provide:

  • Simpler implementation (rollback on failure)
  • No distributed coordinator needed
  • Lower latency (no prepare phase)
  • Adequate consistency for this use case

Trade-off: Brief window of inconsistency on failure (acceptable for AI indexing, not financial transactions)

Why Interface-Based Design?

Benefits:

  • Testability: Easy to mock dependencies in unit tests
  • Flexibility: Swap implementations (e.g., switch from OpenAI to Anthropic)
  • Loose Coupling: Components depend on contracts, not concrete classes
  • Laravel Integration: Works naturally with service container binding

Example:

// Bind interface to implementation
$this->app->singleton(GraphStoreInterface::class, Neo4jStore::class);

// Swap to different implementation
$this->app->singleton(GraphStoreInterface::class, ArangoDbStore::class);

Why Auto-Discovery vs Manual Config?

Manual Configuration Issues:

  • Duplication between model definition and config file
  • Config drift when models change
  • Maintenance burden
  • Error-prone

Auto-Discovery Benefits:

  • Single source of truth (Eloquent model)
  • Zero duplication
  • Automatic updates when models change
  • Laravel philosophy: Convention over configuration

Escape Hatch: Override via nodeableConfig() method when needed

License

MIT License - see LICENSE file for details

Support

  • Issues: GitHub Issues for bug reports and feature requests
  • Documentation: See resources/docs/1.0/ (rendered via LaRecipe at /ai-docs)
  • Examples: Working code samples in examples/ directory

Built with Laravel | Powered by RAG | Secured by Design