mrshennawy / ai-seeder
Generate smart, context-aware dummy data for Laravel databases using the Laravel 12 AI SDK.
Requires
- php: ^8.2
- illuminate/console: ^12.0
- illuminate/database: ^12.0
- illuminate/support: ^12.0
- laravel/ai: ^0.2
- laravel/prompts: ^0.3|^0.4|^1.0
Requires (Dev)
- mockery/mockery: ^1.6
- orchestra/testbench: ^10.0
- pestphp/pest: ^4.0
README
AiSeeder
AI-powered database seeding for Laravel 12: realistic, schema-aware, zero-hallucination.
Stop writing fake data by hand. Stop trusting the AI with your IDs.
AiSeeder reads your database schema, resolves every relationship, and lets the AI generate only the creative parts, while PHP handles everything that must be exact.
Why AiSeeder?
Traditional seeders require you to hand-craft factories for every table. Faker gives you random strings, not contextually realistic data. And if you ask a raw LLM to generate INSERT statements, it can hallucinate IDs, violate foreign keys, and break JSON columns.
AiSeeder solves all of this:
| Problem | AiSeeder's Solution |
|---|---|
| AI invents fake IDs / ULIDs | PHP generates all PKs & FKs; the AI never sees them |
| Parent table is empty | Recursively seeds parent tables automatically |
| `VARCHAR(2)` gets "English" | Schema analyzer extracts max lengths & ENUMs |
| JSON columns get plain strings | Strict structured output + post-processing |
| `password` column gets "abc123" | Auto-detected & filled with `Hash::make()` |
| No visual feedback during generation | Beautiful CLI with spinners, progress bars & token tracking |
Installation
composer require mrshennawy/ai-seeder
The service provider is auto-discovered. Publish the config to customize defaults:
php artisan vendor:publish --tag=ai-seeder-config
Make sure you have at least one AI provider configured in your config/ai.php:
# Any provider supported by the Laravel 12 AI SDK
OPENAI_API_KEY=sk-...
# or
ANTHROPIC_API_KEY=sk-ant-...
# or
GEMINI_API_KEY=...
# or a local Ollama instance
Quick Start
# Seed 10 rows into the users table (default)
php artisan ai:seed users

# Seed 100 rows
php artisan ai:seed orders --count=100

# Fully interactive mode: pick a table, count, and language
php artisan ai:seed
That's it. AiSeeder will analyze your users table, detect that id is a ULID, password needs hashing, and email must be unique, then ask the AI to generate only the creative columns like name, bio, and phone.
Feature Deep Dive
1. Smart Schema Introspection
AiSeeder reads your database schema at runtime: not your migrations, not your models, but the actual database state. It extracts:
- Column names & data types
- `VARCHAR(n)` max lengths: enforced as hard limits on the AI and truncated in PHP as a safety net
- `ENUM('active','inactive','pending')`: AI is constrained to only these values
- `UNIQUE` constraints: AI generates distinct values per row
- `NULLABLE` columns: AI is instructed to occasionally return `null`
- `JSON`/`JSONB` columns: AI returns structured objects/arrays, PHP `json_encode()`s them
- Password columns (`password`, `password_hash`, etc.): auto-detected by name
Table: users - 10 row(s) to generate
┌─────────────┬───────────┬───────────────────────┐
│ Column      │ Type      │ Flags                 │
├─────────────┼───────────┼───────────────────────┤
│ id          │ char      │ PK (ULID), LEN(26)    │
│ name        │ varchar   │ LEN(255)              │
│ email       │ varchar   │ UNIQUE, LEN(255)      │
│ password    │ varchar   │ PASSWORD, LEN(255)    │
│ language    │ varchar   │ LEN(2)                │
│ status      │ enum      │ ENUM(active|inactive) │
│ bio         │ text      │ NULL                  │
│ preferences │ json      │ NULL, JSON            │
│ created_at  │ timestamp │ NULL                  │
│ updated_at  │ timestamp │ NULL                  │
└─────────────┴───────────┴───────────────────────┘
From this analysis, AiSeeder excludes id, password, created_at, and updated_at from the AI prompt entirely. The AI only generates: name, email, language, status, bio, and preferences.
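The split between AI-generated and PHP-handled columns can be sketched in plain PHP. This is a minimal illustration, not the package's actual code: the `partitionColumns()` function, the shape of the `$columns` array, and the name-based rules are all assumptions made for the example.

```php
<?php

/**
 * Split analyzed columns into those PHP fills itself and those sent to the AI.
 * The array shape and the detection rules are hypothetical, for illustration only.
 *
 * @param array<string, array{primary?: bool, foreign?: bool}> $columns
 * @return array{php: list<string>, ai: list<string>}
 */
function partitionColumns(array $columns): array
{
    $passwordNames  = ['password', 'password_hash'];
    $timestampNames = ['created_at', 'updated_at', 'deleted_at'];

    $result = ['php' => [], 'ai' => []];

    foreach ($columns as $name => $meta) {
        // Structural columns must be exact, so they never reach the prompt.
        $structural = ($meta['primary'] ?? false)
            || ($meta['foreign'] ?? false)
            || in_array($name, $passwordNames, true)
            || in_array($name, $timestampNames, true);

        $result[$structural ? 'php' : 'ai'][] = $name;
    }

    return $result;
}

$split = partitionColumns([
    'id'          => ['primary' => true],
    'name'        => [],
    'email'       => [],
    'password'    => [],
    'language'    => [],
    'status'      => [],
    'bio'         => [],
    'preferences' => [],
    'created_at'  => [],
    'updated_at'  => [],
]);
```

With the users table above, `$split['ai']` holds name, email, language, status, bio, and preferences, matching the columns the AI is asked to generate.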
2. Bulletproof IDs & Foreign Keys: Zero Hallucination
This is the architectural philosophy that makes AiSeeder production-safe:
The AI generates zero IDs. PHP handles 100% of structural integrity.
| Column Type | Who Handles It | How |
|---|---|---|
| Auto-increment PK | Database | Excluded from AI prompt entirely |
| ULID primary key | PHP | `Str::ulid()` injected per row |
| UUID primary key | PHP | `Str::uuid()` injected per row |
| Foreign keys (`user_id`, etc.) | PHP | Random valid parent ID via `array_rand()` |
| `password` | PHP | `Hash::make('password')` injected |
| `created_at` / `updated_at` | PHP | `now()` injected |
| `deleted_at` | PHP | Set to `null` (soft-delete default) |
The AI never even sees these columns in the prompt. This means:
✅ 0% chance of SQLSTATE integrity constraint violations from hallucinated IDs
✅ 0% chance of invalid ULID format (like "A1" or "1234567890")
✅ 0% chance of FK pointing to a non-existent parent record
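As a rough sketch of this division of labor, the following plain-PHP example merges PHP-owned values into a row of AI-generated values. It is not the package's implementation: `uuidV4()` and `injectStructuralValues()` are hypothetical helpers, `password_hash()` stands in for Laravel's `Hash::make()`, and `date()` stands in for `now()`.

```php
<?php

/** Generate a random RFC 4122 version 4 UUID (a stand-in for Str::uuid()). */
function uuidV4(): string
{
    $bytes = random_bytes(16);
    $bytes[6] = chr((ord($bytes[6]) & 0x0f) | 0x40); // set version 4
    $bytes[8] = chr((ord($bytes[8]) & 0x3f) | 0x80); // set RFC 4122 variant

    return vsprintf('%s%s-%s-%s-%s-%s%s%s', str_split(bin2hex($bytes), 4));
}

/**
 * Merge PHP-owned values into a row of AI-generated "creative" values.
 *
 * @param array<string, mixed> $aiRow
 * @param array<string, list<int|string>> $parentIds FK column => existing parent IDs
 * @return array<string, mixed>
 */
function injectStructuralValues(array $aiRow, array $parentIds): array
{
    // The left operand wins in an array union, so any "id" the AI
    // might have slipped in is discarded.
    $row = ['id' => uuidV4()] + $aiRow;

    foreach ($parentIds as $fkColumn => $ids) {
        // array_rand() returns an existing key, so the picked ID is always valid.
        $row[$fkColumn] = $ids === [] ? null : $ids[array_rand($ids)];
    }

    $row['password']   = password_hash('password', PASSWORD_BCRYPT);
    $row['created_at'] = $row['updated_at'] = date('Y-m-d H:i:s');

    return $row;
}

$row = injectStructuralValues(
    ['name' => 'Sara', 'id' => 'A1', 'user_id' => 999], // hallucinated id and FK
    ['user_id' => [3, 7, 9]]
);
```

Because every structural value is overwritten after generation, a hallucinated `id` or `user_id` in the AI output cannot reach the database in this sketch.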
3. Auto-Resolving Relationships (Recursive Seeding)
When you seed a child table, AiSeeder checks every foreign key:
- Parent has records? → Fetches existing IDs automatically.
- Parent is empty? → Pauses, recursively seeds the parent table first, then resumes.
php artisan ai:seed cart_items --count=20 --lang=ar
Resolving relationships...
Resolving: user_id → users.id
⚠️ Parent table [users] is empty. Recursively seeding it first...
┌──────────┐
│ AiSeeder │  Recursive child command for [users]
└──────────┘
Analyzing schema for table: [users]...
Generating chunk 1/1 (5 rows)...
✅ Successfully seeded [users] with 5 rows.
✅ Fetched 5 ID(s) from [users].
Resolving: cart_id → carts.id
✅ Parent table [carts] already has data. Fetched 12 existing IDs.
Resolving: product_id → products.id
⚠️ Parent table [products] is empty. Recursively seeding it first...
...
The entire dependency tree resolves automatically. Language selection (--lang) propagates to all recursive child commands.
Self-Referencing Tables
Tables like categories with a parent_id → categories.id are handled gracefully:
php artisan ai:seed categories --count=10
Resolving: parent_id → categories.id
Self-referencing FK [parent_id] on [categories] - table is empty, will use NULL.
AiSeeder detects the self-reference, skips recursive seeding (which would cause an infinite loop), and sets parent_id = NULL for the initial batch, creating root-level categories. If you run it again, subsequent rows will randomly pick from the existing category IDs.
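The resolution order described in this section can be modeled with in-memory arrays. The sketch below is only an illustration under stated assumptions: `resolveParents()`, `seedTable()`, the `$schema` and `$data` structures, and the fixed parent count of 5 (taken from the sample output) are hypothetical and replace real database access.

```php
<?php

/**
 * Resolve parent IDs for every foreign key of $table, seeding empty parents first.
 * $schema maps table => [fkColumn => parentTable]; $data maps table => list of row IDs.
 *
 * @return array<string, list<int>> FK column => candidate parent IDs
 */
function resolveParents(string $table, array $schema, array &$data): array
{
    $resolved = [];

    foreach ($schema[$table] ?? [] as $fkColumn => $parent) {
        if ($parent === $table) {
            // Self-reference: never recurse. An empty list means "use NULL".
            $resolved[$fkColumn] = $data[$table] ?? [];
            continue;
        }

        if (($data[$parent] ?? []) === []) {
            seedTable($parent, 5, $schema, $data); // parent is empty: seed it first
        }

        $resolved[$fkColumn] = $data[$parent];
    }

    return $resolved;
}

/** Pretend to seed $count rows by appending sequential IDs. */
function seedTable(string $table, int $count, array $schema, array &$data): void
{
    resolveParents($table, $schema, $data); // parents must exist before children

    $start = count($data[$table] ?? []);
    for ($i = 1; $i <= $count; $i++) {
        $data[$table][] = $start + $i;
    }
}

$schema = [
    'cart_items' => ['user_id' => 'users', 'cart_id' => 'carts'],
    'carts'      => ['user_id' => 'users'],
    'categories' => ['parent_id' => 'categories'],
];
$data = ['carts' => [1, 2, 3]];

seedTable('cart_items', 20, $schema, $data);
$selfRef = resolveParents('categories', $schema, $data);
```

This sketch only guards against direct self-references. A cycle across two or more tables would recurse indefinitely here, so a real resolver would need to track the tables it is already seeding.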
4. Source Code Context Injection: The Power Feature
Database schemas tell you what a column is. But they can't tell you what a content JSON column should look like when delivery_mode = 'online' vs 'in_person'.
AiSeeder can read your actual PHP code:
php artisan ai:seed course_items \
--context="Modules\Course\Http\Requests\CourseItemRequest" \
--count=20
Loading code context from: Modules\Course\Http\Requests\CourseItemRequest
Loaded 87 lines of source code for AI context.
AiSeeder uses PHP's ReflectionClass to locate the source file, reads its raw content with file_get_contents(), and injects it directly into the AI prompt. For example, given a FormRequest like:
class CourseItemRequest extends FormRequest
{
    public function rules(): array
    {
        return [
            'type' => 'required|in:lesson,quiz,assignment',
            'content' => 'required|array',
            'content.location' => 'required_if:delivery_mode,in_person',
            'content.meeting_url' => 'required_if:delivery_mode,online',
            'content.duration_minutes' => 'required|integer|min:15',
        ];
    }
}
The AI will generate content as a proper JSON object:
{
"location": "Riyadh, Building 4, Room 201",
"duration_minutes": 60
}
Without this context, it tends to produce a broken flat array such as ["location", "Riyadh", "duration_minutes", 60].
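The lookup step can be sketched with nothing but `ReflectionClass` and `file_get_contents()`, the two pieces named above. The function name `extractSourceContext()` and its null-on-failure behavior are assumptions for this example, not the package's API.

```php
<?php

/**
 * Return the raw source of the file that declares $class, or null when it
 * cannot be located (unknown class, or a built-in class with no source file).
 */
function extractSourceContext(string $class): ?string
{
    if (!class_exists($class) && !interface_exists($class)) {
        return null;
    }

    // getFileName() returns false for classes defined by PHP itself or by extensions.
    $file = (new ReflectionClass($class))->getFileName();

    if ($file === false || !is_file($file) || !is_readable($file)) {
        return null;
    }

    $source = file_get_contents($file);

    return $source === false ? null : $source;
}

$builtin = extractSourceContext('ArrayObject');       // built-in: no source file
$missing = extractSourceContext('No\Such\ClassName'); // unknown class
```

Returning null rather than throwing lets a caller decide whether a missing context class should abort seeding or simply be skipped.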
Use cases for --context:
| Pass this class... | The AI learns about... |
|---|---|
| `StoreOrderRequest` | Conditional validation rules, required-with dependencies |
| `CourseItem` (Model) | `$casts`, morph maps, `$appends`, accessor logic |
| `PaymentObserver` | Business rules triggered on create |
| `UserPolicy` | Role/permission constraints for realistic role distribution |
Tip: This is especially powerful for e-learning platforms, ERP systems, or any domain where JSON columns carry complex, context-dependent structures, such as Quran academy platforms where lesson content varies by `delivery_mode`.
5. Multi-Language Data Generation
Generate data in any language, or a realistic mix of multiple languages:
# All text in Arabic
php artisan ai:seed users --lang=ar

# Bilingual Arabic + English (simulates a real bilingual platform)
php artisan ai:seed courses --lang=ar,en

# Trilingual dataset
php artisan ai:seed products --lang=es,pt,fr

# Interactive selection (if --lang is omitted)
What language(s) should the generated data be in? (comma-separated for multiple, e.g., ar,en)
The language instruction is injected at the system prompt level of the AI agent, ensuring authentic names, titles, and descriptions:
# --lang=ar generates:
┌───────────────┬───────────────────────────────┐
│ name          │ email                         │
├───────────────┼───────────────────────────────┤
│ محمود الشناوي │ mahmoud.shennawy@example.com  │
│ فاطمة أحمد    │ fatima.ahmed@example.com      │
│ عبدالله خالد  │ abdullah.khaled@example.com   │
└───────────────┴───────────────────────────────┘
# --lang=ar,en generates a mix:
│ سارة محمد     │ sarah.m@example.com           │
│ John Mitchell │ john.mitchell@example.com     │
│ أحمد يوسف     │ ahmed.youssef@example.com     │
Technical values (emails, URLs, timestamps, IDs) always remain in ASCII/Latin characters.
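To make the `--lang` handling concrete, here is a small plain-PHP sketch that normalizes a comma-separated value and turns it into a prompt instruction. Both function names and the instruction wording are assumptions for illustration; the package's real prompt text is not shown in this README.

```php
<?php

/**
 * Turn a --lang value such as "ar, EN,ar" into a clean list of codes.
 * Falls back to the given default when nothing usable is supplied.
 *
 * @return list<string>
 */
function parseLanguages(?string $option, string $default = 'en'): array
{
    $codes = array_map(
        static fn (string $code): string => strtolower(trim($code)),
        explode(',', $option ?? '')
    );

    // Drop empty entries and duplicates while keeping the original order.
    $codes = array_values(array_unique(array_filter(
        $codes,
        static fn (string $code): bool => $code !== ''
    )));

    return $codes === [] ? [$default] : $codes;
}

/** Build a system-prompt instruction; the wording is purely illustrative. */
function languageInstruction(array $codes): string
{
    $list = strtoupper(implode(',', $codes));

    return count($codes) === 1
        ? "Write all human-readable text in language: {$list}. Keep emails, URLs, and IDs in ASCII."
        : "Mix human-readable text across languages: {$list}. Keep emails, URLs, and IDs in ASCII.";
}

$codes       = parseLanguages('ar, EN,ar');
$instruction = languageInstruction($codes);
```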
6. Beautiful CLI UX & Token Tracking
Built entirely with laravel/prompts for a modern, interactive experience:
AiSeeder - Smart Database Seeder

Analyzing schema for table: [orders]...
Reading columns, indexes, and constraints...

Table: orders - 50 row(s) to generate
┌────────┬──────┬────────────────────┐
│ Column │ Type │ Flags              │
├────────┼──────┼────────────────────┤
│ id     │ char │ PK (ULID), LEN(26) │
│ ...    │ ...  │ ...                │
└────────┴──────┴────────────────────┘

Resolving relationships...
Resolving: user_id → users.id
✅ Fetched 25 ID(s) from [users].

Plan: 50 rows in 1 chunk(s). Language: AR,EN.
Proceed with seeding [orders]? (Yes)

Generating chunk 1/1 (50 rows)...
Waiting for AI to generate data (this may take a moment)...
AI returned 50 row(s). Tokens: 1,847 prompt + 3,291 comp.

Inserting chunk 1/1 into [orders] ████████████████████ 100%

✅ Successfully seeded [orders] with 50 rows.

Token Usage Summary
┌───────────────────┬────────┐
│ Metric            │ Tokens │
├───────────────────┼────────┤
│ Prompt tokens     │ 1,847  │
│ Completion tokens │ 3,291  │
│ Total tokens      │ 5,138  │
└───────────────────┴────────┘
Token usage is aggregated across all chunks and recursive parent seeding calls, so you always know the full cost of a seeding operation.
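Aggregation of this kind needs little more than a running total. The class below is a hypothetical stand-in written for this example (the package ships its own `TokenUsageTracker`, whose methods are not documented here), showing how counts from chunks and from a recursive parent run might be combined.

```php
<?php

/** Minimal accumulator for token counts; class and method names are hypothetical. */
final class TokenTally
{
    private int $prompt = 0;
    private int $completion = 0;

    /** Add the usage reported by one AI request (one chunk). */
    public function record(int $promptTokens, int $completionTokens): void
    {
        $this->prompt += $promptTokens;
        $this->completion += $completionTokens;
    }

    /** Fold in the totals of a recursive parent seeding run. */
    public function merge(self $other): void
    {
        $this->prompt += $other->prompt;
        $this->completion += $other->completion;
    }

    /** @return array{prompt: int, completion: int, total: int} */
    public function summary(): array
    {
        return [
            'prompt'     => $this->prompt,
            'completion' => $this->completion,
            'total'      => $this->prompt + $this->completion,
        ];
    }
}

$orders = new TokenTally();
$orders->record(1847, 3291); // the single chunk from the sample output

$users = new TokenTally();
$users->record(200, 300);    // made-up numbers for a recursive parent run

$orders->merge($users);
$summary = $orders->summary();
```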
Command Reference
php artisan ai:seed [table] [options]
| Argument / Option | Description | Default |
|---|---|---|
| `table` | The database table to seed (interactive picker if omitted) | - |
| `--count=N` | Number of rows to generate | `10` |
| `--chunk=N` | Rows per AI request (smaller = safer for token limits) | `50` |
| `--lang=CODE` | Language(s): single (`ar`) or comma-separated (`ar,en,fr`) | `en` |
| `--context=CLASS` | Fully-qualified PHP class for business logic context | - |
| `--no-interaction` | Skip all prompts (use defaults) | - |
Examples
# Basic usage
php artisan ai:seed users --count=50

# Large dataset with small chunks to avoid token limits
php artisan ai:seed products --count=1000 --chunk=25

# Arabic-only data with FormRequest context
php artisan ai:seed course_items --count=30 --lang=ar \
    --context="Modules\Course\Http\Requests\CourseItemRequest"

# Fully non-interactive (CI/CD, scripts)
php artisan ai:seed users --count=5 --lang=en --no-interaction
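The relationship between `--count` and `--chunk` comes down to simple arithmetic, sketched below. `planChunks()` is a hypothetical helper written for this example, assuming the last request simply carries the remainder.

```php
<?php

/**
 * Split a requested row count into per-request chunk sizes.
 *
 * @return list<int>
 */
function planChunks(int $count, int $chunkSize): array
{
    if ($count < 1 || $chunkSize < 1) {
        throw new InvalidArgumentException('Count and chunk size must be positive.');
    }

    // Full-size chunks first, then one smaller chunk for any remainder.
    $chunks = array_fill(0, intdiv($count, $chunkSize), $chunkSize);

    if ($count % $chunkSize !== 0) {
        $chunks[] = $count % $chunkSize;
    }

    return $chunks;
}

$plan = planChunks(1000, 25); // the "large dataset" example above
```

Under this assumption, `--count=1000 --chunk=25` results in 40 AI requests of 25 rows each.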
Configuration
// config/ai-seeder.php
return [
    // Rows per AI request. Smaller = safer for tokens, larger = fewer API calls.
    'chunk_size' => env('AI_SEEDER_CHUNK_SIZE', 50),

    // Default row count when --count is not provided.
    'default_count' => env('AI_SEEDER_DEFAULT_COUNT', 10),

    // Retry attempts on AI failure (malformed JSON, wrong row count).
    'max_retries' => env('AI_SEEDER_MAX_RETRIES', 3),

    // Default language for generated text content.
    'default_language' => env('AI_SEEDER_DEFAULT_LANGUAGE', 'en'),
];
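The `max_retries` setting implies a validate-and-retry loop around each AI request. The sketch below shows one way such a loop could look. It assumes that `max_retries` counts total attempts and that malformed JSON surfaces as a `JsonException`; the package may handle both differently, and `generateWithRetry()` is a hypothetical name.

```php
<?php

/**
 * Call $generate until it returns exactly $expectedRows rows, or give up
 * after $maxRetries attempts. $generate stands in for the AI request and
 * receives the 1-based attempt number.
 *
 * @param callable(int): array $generate
 * @return array<int, array<string, mixed>>
 */
function generateWithRetry(callable $generate, int $expectedRows, int $maxRetries = 3): array
{
    $lastError = 'no attempts made';

    for ($attempt = 1; $attempt <= $maxRetries; $attempt++) {
        try {
            $rows = $generate($attempt);
        } catch (JsonException $e) {
            $lastError = 'malformed JSON: ' . $e->getMessage();
            continue;
        }

        if (count($rows) === $expectedRows) {
            return $rows;
        }

        $lastError = sprintf('expected %d rows, got %d', $expectedRows, count($rows));
    }

    throw new RuntimeException("Generation failed after {$maxRetries} attempts ({$lastError}).");
}

$calls = 0;
$rows = generateWithRetry(function (int $attempt) use (&$calls): array {
    $calls++;
    if ($attempt === 1) {
        return json_decode('{bad', true, 512, JSON_THROW_ON_ERROR); // malformed JSON
    }
    if ($attempt === 2) {
        return [['name' => 'a']]; // wrong row count
    }

    return [['name' => 'a'], ['name' => 'b']];
}, 2, 3);
```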
Architecture
packages/shennawy/ai-seeder/
├── config/
│   └── ai-seeder.php                          # Published configuration
├── src/
│   ├── Agents/
│   │   └── SeederAgent.php                    # Laravel AI SDK Agent (structured output)
│   ├── Console/Commands/
│   │   └── AiSeedCommand.php                  # The ai:seed Artisan command
│   ├── Contracts/
│   │   ├── DataGeneratorInterface.php
│   │   ├── RelationshipResolverInterface.php
│   │   └── SchemaAnalyzerInterface.php
│   ├── AiSeederOrchestrator.php               # Programmatic API (non-CLI usage)
│   ├── AiSeederServiceProvider.php            # Auto-discovered service provider
│   ├── ContextExtractor.php                   # ReflectionClass-based source reader
│   ├── DataGenerator.php                      # Prompt builder + post-processor
│   ├── GenerationResult.php                   # DTO: rows + token usage
│   ├── RelationshipResolver.php               # FK resolution + recursive seeding
│   ├── SchemaAnalyzer.php                     # Database introspection engine
│   └── TokenUsageTracker.php                  # Aggregates tokens across calls
└── tests/
    └── Feature/                               # 90+ Pest tests
All core services are bound via interfaces in the service provider, making them fully swappable and testable.
Cross-Provider Compatibility
AiSeeder works with any provider supported by the Laravel 12 AI SDK:
| Provider | Status | Notes |
|---|---|---|
| OpenAI (GPT-4o, etc.) | Fully supported | Best structured output compliance |
| Google Gemini | Fully supported | Schema sanitized for strict OpenAPI validation |
| Anthropic Claude | Fully supported | - |
| Ollama (local) | Fully supported | Great for development without API costs |
The structured output schema is carefully built to avoid provider-specific pitfalls:
- No `additionalProperties` key (Gemini rejects it)
- No array-type for nullable fields (Gemini rejects `["string", "null"]`)
- Nullability communicated via descriptions instead
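The three rules above can be illustrated with a small recursive transform over a JSON Schema held as a PHP array. This is a sketch under assumptions: `sanitizeSchema()` is a hypothetical helper, the "May be null." wording is invented for the example, and the package may build its schema this way from the start rather than rewriting it afterwards.

```php
<?php

/**
 * Rewrite a JSON Schema array so that it avoids `additionalProperties`
 * and array-valued `type`, moving nullability into the description.
 */
function sanitizeSchema(array $schema): array
{
    unset($schema['additionalProperties']);

    if (isset($schema['type']) && is_array($schema['type'])) {
        $types    = array_values(array_diff($schema['type'], ['null']));
        $nullable = count($types) !== count($schema['type']);

        $schema['type'] = $types[0] ?? 'string';

        if ($nullable) {
            // Communicate nullability in prose instead of in the type.
            $schema['description'] = trim(($schema['description'] ?? '') . ' May be null.');
        }
    }

    // Recurse into object properties and array items only.
    foreach ($schema['properties'] ?? [] as $name => $property) {
        $schema['properties'][$name] = sanitizeSchema($property);
    }

    if (isset($schema['items']) && is_array($schema['items'])) {
        $schema['items'] = sanitizeSchema($schema['items']);
    }

    return $schema;
}

$clean = sanitizeSchema([
    'type' => 'object',
    'additionalProperties' => false,
    'properties' => [
        'bio'  => ['type' => ['string', 'null'], 'description' => 'Short biography.'],
        'tags' => ['type' => 'array', 'items' => ['type' => ['string', 'null']]],
    ],
]);
```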
Testing
The package ships with 90+ tests covering schema analysis, post-processing, prompt building, relationship resolution, self-referencing FKs, and Gemini compatibility:
# Run from the main Laravel project
php artisan test packages/shennawy/ai-seeder/tests/

# Or with a filter
php artisan test --filter="self-referencing" packages/shennawy/ai-seeder/tests/
Local Development
To use this package from a local path during development:
// composer.json (main Laravel project)
{
    "repositories": [
        {
            "type": "path",
            "url": "./packages/shennawy/ai-seeder"
        }
    ]
}
composer require mrshennawy/ai-seeder:@dev
License
AiSeeder is open-source software licensed under the MIT License.
Built with ❤️ for the Laravel community by Mahmoud Shennawy