subhashladumor1 / laravel-ai-docs
Laravel AI Document Intelligence & OCR package for Laravel 12 AI SDK. Convert PDF to JSON, extract tables, image to text, Ask PDF with AI, audio transcription and multi-language support using GPT-5.2, Claude and Gemini.
Installs: 2
Dependents: 0
Suggesters: 0
Security: 0
Stars: 0
Watchers: 0
Forks: 0
Open Issues: 0
pkg:composer/subhashladumor1/laravel-ai-docs
Requires
- php: ^8.2|^8.3
- guzzlehttp/guzzle: ^7.0
- intervention/image-laravel: ^1.5
- phpoffice/phpword: ^1.4
- smalot/pdfparser: ^2.0
Requires (Dev)
- mockery/mockery: ^1.6
- pestphp/pest: ^4.0
- pestphp/pest-plugin-laravel: ^4.0
README
Turn any PDF, image, audio or Word document into structured, searchable intelligence โ powered by GPT-5.2, Claude 4.6, or Gemini 3.1 โ with a single line of Laravel code.
What Is This Package?
Imagine your users are uploading invoices, contracts, scanned receipts, meeting recordings or medical reports. You need to extract data, answer questions, generate summaries, and build searchable databases โ without writing thousands of lines of AI integration code.
Laravel AI Docs handles all of that with a fluent, chainable API:
// Extract, summarize and convert an invoice to JSON in one chain $data = AIDocs::pdf($invoice)->toJson(); // Ask a natural-language question about a contract $answer = AIDocs::pdf($contract)->ask('What is the payment due date?'); // Transcribe a meeting audio and summarize it $summary = AIDocs::audio($recording)->summarize(); // OCR a handwritten receipt into structured text $text = AIDocs::image($receipt)->text();
How It Works
The package sits as an intelligent layer between your Laravel app and AI providers. When you call AIDocs::pdf($file), it:
- Validates the file (type, size, existence)
- Extracts raw content using native parsers (smalot/pdfparser, PhpWord)
- Detects if the document is scanned and needs OCR
- Processes images or audio through pre-processing pipelines
- Sends a carefully crafted prompt to your chosen AI (OpenAI, Claude, or Gemini)
- Returns the result as a string, array, or typed
DocumentResultDTO
flowchart TD
A([Your Laravel App]) --> B{AIDocs Facade}
B --> C[FileValidator: type + size + existence]
C --> D{File Type?}
D -->|PDF| E[PDFProcessor via smalot/pdfparser]
D -->|DOCX| F[DocxService via PhpWord]
D -->|Image| G[ImageProcessor via GD]
D -->|Audio| H[AudioProcessor: validate + prep]
E --> I{Scanned PDF?}
I -->|Yes - image based| G
I -->|No - has text| J[Raw Text]
F --> J
G --> K[Base64 Image Data]
H --> L[Audio File Path]
J --> M[AI Prompt Builder]
K --> M
L --> M
M --> N{Active Provider}
N -->|openai| O[OpenAI GPT-5.2 / GPT-5-mini / Whisper]
N -->|claude| P[Anthropic Claude 4.6 Sonnet]
N -->|gemini| Q[Google Gemini 3.1 Pro Preview]
O --> R[DocumentResultDTO]
P --> R
Q --> R
R --> S([Your App: Text / JSON / Markdown])
Loading
Installation
composer require subhashladumor1/laravel-ai-docs
Publish the config file:
php artisan vendor:publish --tag=ai-docs-config
Add your API keys to .env:
# Pick at least one provider OPENAI_API_KEY=sk-... ANTHROPIC_API_KEY=sk-ant-... GEMINI_API_KEY=AIza... # Which provider to use by default AI_DOCS_PROVIDER=openai
๐งช Open Source Laravel AI Docs Lab
Want to test this package before integrating it? We've built an open-source testing UI where you can upload documents, test all AI models, and experiment with the API visually!
๐ Get the Laravel AI Docs Lab on GitHub
Real-World Example 1 โ Invoice Processing System
Your e-commerce platform receives thousands of supplier invoices as PDFs every day. You need to extract line items, totals, vendor names and due dates automatically.
use Subhashladumor1\LaravelAiDocs\Facades\AIDocs; class InvoiceProcessor { public function process(string $pdfPath): array { // Extract ALL structured data from the invoice in one call $data = AIDocs::model('gpt-5.2')->pdf($pdfPath)->toJson(); // $data contains: // [ // 'title' => 'Invoice #INV-2024-00892', // 'document_type' => 'invoice', // 'date' => '2024-03-15', // 'author' => 'ACME Supplies Ltd.', // 'key_values' => [ // 'invoice_number' => 'INV-2024-00892', // 'subtotal' => '$4,200.00', // 'tax' => '$420.00', // 'total' => '$4,620.00', // 'due_date' => '2024-04-15', // ], // 'key_entities' => ['ACME Supplies Ltd.', 'John Doe', 'USD'], // 'summary' => 'Invoice for 12 units of server hardware...', // ] Invoice::create([ 'vendor' => $data['author'] ?? 'Unknown', 'total' => $data['key_values']['total'] ?? '0', 'due_date' => $data['key_values']['due_date'] ?? null, 'raw_data' => $data, ]); return $data; } }
How the data flows for this example:
flowchart LR
A[invoice.pdf] --> B[PDFProcessor: Extract text]
B --> C{Text found?}
C -->|Yes| D[JSONConversionService: Build AI prompt]
C -->|No - scanned| E[ImageProcessor: Convert to base64]
E --> D
D --> F[OpenAI GPT-5.2: generateStructured]
F --> G[Parse JSON response]
G --> H[title + key_values + entities + summary]
H --> I[(Invoice Database)]
Loading
Real-World Example 2 โ Legal Contract Q&A Chatbot
A law firm wants to let paralegals ask plain-English questions about uploaded contracts without reading hundreds of pages.
class ContractChatbot { public function chat(string $contractPath, string $question): string { // The package automatically chunks the contract into overlapping // windows, scores each chunk for relevance, then sends only the // relevant context to the AI (RAG pipeline). return AIDocs::model('claude-sonnet-4-6') ->pdf($contractPath) ->ask($question); } } // Usage in a controller: $bot = new ContractChatbot(); $bot->chat($contract, 'What is the termination notice period?'); // โ "Either party may terminate with 30 days written notice per Section 12.3." $bot->chat($contract, 'What are the payment terms?'); // โ "Payment is due net-30 from invoice date. Late payments accrue 1.5% monthly interest." $bot->chat($contract, 'Is there an exclusivity clause?'); // โ "Yes, Section 8 grants Client exclusivity in the APAC region for 24 months."
How the RAG pipeline works:
flowchart TD
A[contract.pdf - 150 pages] --> B[PDFProcessor: Extract full text]
B --> C[TextChunker: Split into 1000-char overlapping windows]
C --> D[Chunks: 1 ... N]
D --> E[Relevance Scorer: keyword frequency per chunk vs question]
E --> F[Top 5 most relevant chunks]
F --> G[Prompt Builder: Context + Question]
G --> H[Claude 4.6 Sonnet]
H --> I[Precise answer grounded in document]
Loading
Real-World Example 3 โ Medical Report Digitization
A hospital scans paper patient forms. The images need to be converted to searchable database records.
class MedicalFormDigitizer { public function digitize(string $imagePath): array { // Works with scanned photos, JPG, PNG โ any image format $text = AIDocs::image($imagePath) ->language('en') ->text('Extract all medical data: patient name, DOB, diagnosis codes, medications, allergies, and physician name.'); // Parse fields from the extracted text return $this->parseFields($text); } public function digitizeArabicReport(string $imagePath): string { // Full Arabic OCR โ auto-detected language return AIDocs::image($imagePath)->text(); // Language detected automatically as 'ar' } }
Image OCR flow:
flowchart LR
A["Scanned Image (.jpg / .png)"] --> B[ImageProcessor]
B --> C[Resize max 2048px + Auto-rotate + Normalize quality]
C --> D[Encode to base64]
D --> E[OCRService: Build vision prompt]
E --> F[GPT-5-mini Vision API - high detail mode]
F --> G[Extracted text preserving layout]
G --> H[Your Database]
Loading
Real-World Example 4 โ Meeting Minutes Automation
Your team records all meetings. You want automatic transcriptions and action-item summaries sent to Slack.
class MeetingAutomator { public function process(string $audioPath): void { $builder = AIDocs::audio($audioPath)->language('en'); // Step 1: Get full transcription $transcript = $builder->transcribe(); // Step 2: Summarize with action items $summary = $builder->summarize( 'List all action items, decisions made, and owners. Use bullet points.' ); // Step 3: Store and notify Meeting::create([ 'transcript' => $transcript, 'summary' => $summary, 'duration' => now(), ]); Notification::send( User::managers()->get(), new MeetingSummaryNotification($summary) ); } }
Real-World Example 5 โ Multi-Language Document Portal
A global company receives documents in English, Arabic, French, Chinese, and Russian. Your portal needs summaries in the original language.
class DocumentPortal { public function summarize(string $filePath): string { // Language is auto-detected from content (Unicode script analysis) // Summary is returned in the detected language automatically return AIDocs::pdf($filePath)->summarize()->text(); // Arabic PDF โ Arabic summary // Chinese PDF โ Chinese summary // French PDF โ French summary } public function summarizeInEnglish(string $filePath): string { // Force English output regardless of source language return AIDocs::pdf($filePath) ->language('en') ->summarize() ->text(); } public function bulkProcess(array $files): array { return array_map(fn($file) => [ 'file' => basename($file), 'summary' => AIDocs::pdf($file)->summarize()->text(), 'tables' => AIDocs::pdf($file)->tables()->result()->tables, 'json' => AIDocs::pdf($file)->toJson(), ], $files); } }
Language detection flow:
flowchart TD
A[Extracted Text] --> B[LanguageDetector: Unicode script analysis]
B -->|Arabic block| C[ar]
B -->|Hiragana or Katakana| D[ja]
B -->|CJK Unified Han| E[zh]
B -->|Cyrillic| F[ru]
B -->|Hangul ranges| G[ko]
B -->|Devanagari| H[hi]
B -->|Greek| I[el]
B -->|Latin default| J[en]
C & D & E & F & G & H & I & J --> K[Language code passed to AI prompt]
K --> L[AI responds in detected language]
Loading
Real-World Example 6 โ Financial Report Dashboard
A fintech app ingests quarterly earnings PDFs and extracts all financial tables for charting.
class FinancialReportParser { public function parse(string $reportPath): array { // Full pipeline: extract text โ find tables โ summarize โ markdown $result = AIDocs::model('gpt-5.2') ->pdf($reportPath) ->enhance() // pre-process PDF ->tables() // extract all data tables ->summarize() // executive summary ->result(); // get DocumentResultDTO // Access structured data foreach ($result->tables as $table) { echo $table->title . "\n"; // "Revenue by Quarter" echo $table->toMarkdown() . "\n"; // markdown table // Headers + rows as arrays for charting $chartData = [ 'labels' => $table->headers, 'rows' => $table->rows, ]; } return [ 'summary' => $result->summary, 'tables' => count($result->tables), 'provider' => $result->provider, // 'openai' 'model' => $result->model, // 'gpt-5.2' 'time' => round($result->processingTimeSeconds, 2) . 's', ]; } }
Switching AI Providers
You can switch providers per-request. No config changes needed โ just chain .model() or .provider():
// Use the cheapest model for simple summaries $quickSummary = AIDocs::model('gemini-3-flash-preview')->pdf($file)->summarize()->text(); // Use the most capable model for complex legal analysis $legalAnalysis = AIDocs::model('claude-sonnet-4-6')->pdf($contract)->ask( 'Identify all clauses that could create liability for the vendor.' ); // Use GPT-5-mini for vision-heavy scanned documents $scannedText = AIDocs::model('gpt-5-mini')->image($scan)->text(); // Chain with explicit provider name $result = AIDocs::provider('gemini')->pdf($file)->toJson();
| Alias | Provider | Best For |
|---|---|---|
gpt-5.2 / gpt-5 |
OpenAI | Latest generation general documents, max accuracy |
gpt-5.2-pro |
OpenAI | Advanced reasoning tasks |
gpt-5-mini / gpt-5-nano |
OpenAI | Scanned images, vision tasks, fast processing |
claude-sonnet-4-6 |
Anthropic | Latest iteration for long documents, legal, code |
claude-opus-4-6 |
Anthropic | Maximum reasoning depth |
claude-haiku-4-5 |
Anthropic | Fast, cost-efficient summarization |
gemini-3.1-pro-preview |
Multilingual, ultra-large context | |
gemini-3-pro-preview |
Advanced reasoning and logic | |
gemini-3-flash-preview |
High speed, low cost | |
gemini-2.5-pro |
Previous generation pro model |
The Full Pipeline Explained
flowchart TD
A([AIDocs Facade]) --> B[model / provider override - optional]
B --> C{Entry point}
C -->|"AIDocs::pdf"| D[PDFBuilder]
C -->|"AIDocs::image"| E[ImageBuilder]
C -->|"AIDocs::audio"| F[AudioBuilder]
C -->|"AIDocs::document"| G[DocumentBuilder]
D --> H[enhance - extensible pre-process hook]
D --> I[summarize via SummarizerService]
D --> J[tables via TableExtractionService]
D --> K[ask question via AskPDFService RAG]
D --> L[toJson via JSONConversionService]
D --> M[toMarkdown via MarkdownService]
I & J & K & L & M --> N["result() returns DocumentResultDTO"]
N --> O[rawText / summary / tables / markdown / json / language / provider / model / time]
Loading
DocumentResultDTO โ Your Data Container
Every ->result() call returns a typed DocumentResultDTO. It's immutable and holds everything your pipeline produced:
$dto = AIDocs::model('gpt-5.2') ->pdf('/path/to/report.pdf') ->enhance() ->tables() ->summarize() ->result(); // Check what's available if ($dto->hasText()) { /* raw text extracted */ } if ($dto->hasSummary()) { /* AI summary generated */ } if ($dto->hasTables()) { /* tables found */ } if ($dto->hasJson()) { /* structured JSON extracted */ } // Read properties $dto->rawText; // Full extracted text $dto->summary; // AI-generated summary $dto->tables; // TableDTO[] โ each with headers, rows, title $dto->markdown; // Markdown-formatted document $dto->json; // Structured JSON array $dto->language; // Detected language: 'en', 'ar', etc. $dto->provider; // 'openai' | 'claude' | 'gemini' $dto->model; // 'gpt-5.2' | 'claude-sonnet-4-6' | etc. $dto->processingTimeSeconds; // Wall-clock time used // Convert for API responses return response()->json($dto->toArray()); // Immutable wither โ create a modified copy $modified = $dto->with(['language' => 'fr']);
Working with Tables
$result = AIDocs::pdf('/reports/q4-financials.pdf')->tables()->result(); foreach ($result->tables as $table) { // Print as Markdown echo $table->toMarkdown(); // | Revenue | Q1 | Q2 | Q3 | Q4 | // | --- | ---| ---| ---| ---| // | Product | $1M| $2M| $3M| $4M| // Access raw data $table->title; // "Revenue by Quarter" $table->headers; // ['Revenue', 'Q1', 'Q2', 'Q3', 'Q4'] $table->rows; // [['Product', '$1M', '$2M', '$3M', '$4M'], ...] $table->pageNumber; // 3 (estimated page) // Convert for JSON APIs $table->toArray(); }
Complete API Reference
Manager Methods โ Available on AIDocs::
These three methods configure the active provider and language before you call an entry point. They return a cloned, immutable instance โ so chains never interfere with each other.
| Method | Returns | Description |
|---|---|---|
AIDocs::model(string $alias) |
AIDocsManager |
Switch provider + model via alias, e.g. 'gpt-5-mini', 'claude-sonnet-4-6' |
AIDocs::provider(string $name) |
AIDocsManager |
Switch provider by name: 'openai', 'claude', 'gemini' |
AIDocs::language(string $code) |
AIDocsManager |
Force a language code, e.g. 'ar', 'fr', 'zh' |
// Switch model (auto-detects provider from alias) AIDocs::model('gpt-5-mini')->pdf($file)->text(); AIDocs::model('claude-sonnet-4-6')->pdf($file)->summarize()->text(); AIDocs::model('gemini-3-flash-preview')->pdf($file)->toJson(); // Switch provider explicitly (uses that provider's default model) AIDocs::provider('claude')->pdf($file)->ask('Who signed this?'); AIDocs::provider('gemini')->image($scan)->text(); // Force language for all operations in the chain AIDocs::language('ar')->pdf($file)->summarize()->text(); AIDocs::language('fr')->model('claude-sonnet-4-6')->document($docx)->toMarkdown(); // Multiple overrides โ order doesn't matter AIDocs::model('gpt-5-mini')->language('zh')->image($scan)->text(); AIDocs::language('de')->provider('gemini')->pdf($file)->toJson();
Entry Points โ What File Type to Process
| Method | Returns | Accepts |
|---|---|---|
AIDocs::pdf(string $path) |
PDFBuilder |
.pdf |
AIDocs::image(string $path) |
ImageBuilder |
.jpg, .jpeg, .png, .gif, .bmp, .webp, .tiff |
AIDocs::audio(string $path) |
AudioBuilder |
.mp3, .mp4, .m4a, .wav, .webm, .ogg |
AIDocs::document(string $path) |
DocumentBuilder |
.pdf, .docx, .doc, .txt, .md |
Tip: Use
AIDocs::pdf()for PDFs with special handling (scanned detection, page count). UseAIDocs::document()when you want one entry point for any document type.
PDFBuilder โ Full Method Reference
Returned by AIDocs::pdf($path). Text is automatically extracted on construction so all methods below work immediately.
Chainable Methods (return static โ can be chained)
| Method | Signature | What It Does |
|---|---|---|
language |
language(string $code): static |
Override the detected language for all subsequent AI calls |
enhance |
enhance(): static |
Pre-processing hook (extensible, currently a no-op) |
summarize |
summarize(?string $prompt = null): static |
Generate an AI summary. Pass custom $prompt to control the output style |
tables |
tables(): static |
Extract all tabular data into TableDTO[] objects |
Terminal Methods (return a final value โ end the chain)
| Method | Signature | Returns | What It Does |
|---|---|---|---|
text |
text(): string |
string |
Return the raw extracted PDF text |
pages |
pages(): int |
int |
Return the total page count |
ask |
ask(string $question): string |
string |
RAG Q&A: answer a question using the document as context |
toJson |
toJson(?string $prompt = null): array |
array |
Extract structured JSON from the document |
structured |
structured(): array |
array |
Generate a structured extraction (alias for toJson with a structured focus) |
toMarkdown |
toMarkdown(): string |
string |
Build a Markdown document from whatever was accumulated |
result |
result(): DocumentResultDTO |
DocumentResultDTO |
Collect everything into a typed result object |
Every Useful PDFBuilder Combination
// 1. Just extract raw text $text = AIDocs::pdf($file)->text(); // 2. Get page count $pages = AIDocs::pdf($file)->pages(); // 3. Summarize $summary = AIDocs::pdf($file)->summarize()->text(); // 4. Summarize with a custom instruction $bullets = AIDocs::pdf($file)->summarize('Return 5 bullet points only.')->text(); // 5. Ask a question (RAG) $answer = AIDocs::pdf($file)->ask('What is the contract value?'); // 6. Extract structured JSON $data = AIDocs::pdf($file)->toJson(); // 7. JSON with custom schema instruction $invoice = AIDocs::pdf($file)->toJson('Extract: vendor, total, due_date, line_items as JSON.'); // 8. Structured Extraction $data = AIDocs::pdf($file)->structured(); // 9. Extract tables only $result = AIDocs::pdf($file)->tables()->result(); $tables = $result->tables; // TableDTO[] // 9. Convert to Markdown (text only) $md = AIDocs::pdf($file)->toMarkdown(); // 10. Summarize then convert full result to Markdown $md = AIDocs::pdf($file)->summarize()->toMarkdown(); // 11. Tables + summary โ Markdown (richest document output) $md = AIDocs::pdf($file)->enhance()->tables()->summarize()->toMarkdown(); // 12. Full pipeline โ DocumentResultDTO $result = AIDocs::pdf($file)->enhance()->tables()->summarize()->result(); // 13. Override language for Arabic PDFs $summary = AIDocs::language('ar')->pdf($file)->summarize()->text(); // 14. Multi-provider same file $fast = AIDocs::model('gemini-3-flash-preview')->pdf($file)->summarize()->text(); $precise = AIDocs::model('claude-sonnet-4-6')->pdf($file)->ask('What are the risks?'); // 15. Page count before processing if (AIDocs::pdf($file)->pages() > 100) { $summary = AIDocs::model('claude-sonnet-4-6')->pdf($file)->summarize()->text(); } else { $summary = AIDocs::pdf($file)->toJson(); }
ImageBuilder โ Full Method Reference
Returned by AIDocs::image($path). Unlike PDF/Document builders, text is NOT extracted on construction โ it is lazily extracted when you first call a method that needs it.
Chainable Methods
| Method | Signature | What It Does |
|---|---|---|
language |
language(string $code): static |
Set language hint for OCR extraction |
Terminal Methods
| Method | Signature | Returns | What It Does |
|---|---|---|---|
text |
text(?string $prompt = null): string |
string |
OCR: extract all text. Pass a custom prompt to control extraction focus |
summarize |
summarize(?string $prompt = null): string |
string |
Extract text then summarize it |
tables |
tables(): array |
TableDTO[] |
Extract text then find tables |
ask |
ask(string $question): string |
string |
Extract text then answer a question about it |
toJson |
toJson(): array |
array |
Extract text then convert to structured JSON |
result |
result(): DocumentResultDTO |
DocumentResultDTO |
Extract text then return a full result object |
Every Useful ImageBuilder Combination
// 1. Simple OCR - extract all text $text = AIDocs::image($scan)->text(); // 2. OCR with a focused extraction prompt $numbers = AIDocs::image($receipt)->text('Extract only monetary amounts and totals.'); // 3. OCR with language hint $arabic = AIDocs::language('ar')->image($scan)->text(); $french = AIDocs::language('fr')->image($scan)->text(); // 4. Summarize image content $summary = AIDocs::image($scan)->summarize(); // 5. Summarize with custom style $summary = AIDocs::image($scan)->summarize('One sentence summary only.'); // 6. Extract tables from a screenshot of a spreadsheet $tables = AIDocs::image($spreadsheetPhoto)->tables(); foreach ($tables as $table) { echo $table->toMarkdown(); } // 7. Ask a question about an image $answer = AIDocs::image($photo)->ask('What is the name on this ID card?'); $answer = AIDocs::image($menu)->ask('Does this menu have any vegetarian options?'); // 8. Convert image content to structured JSON $data = AIDocs::image($businessCard)->toJson(); // Returns: ['title' => 'John Doe', 'key_entities' => ['Acme Corp'], ...] // 9. Get full DocumentResultDTO $result = AIDocs::image($scan)->result(); echo $result->rawText; echo $result->language; // auto-detected // 10. Use GPT-5-mini for best OCR accuracy $text = AIDocs::model('gpt-5-mini')->image($scan)->text(); // 11. Medical form: focused extraction prompt + JSON $fields = AIDocs::model('gpt-5-mini') ->language('en') ->image($medicalForm) ->toJson();
AudioBuilder โ Full Method Reference
Returned by AIDocs::audio($path). Audio requires OpenAI Whisper (Claude and Gemini do not support direct audio transcription).
Chainable Methods
| Method | Signature | What It Does |
|---|---|---|
language |
language(string $code): static |
Hint the transcription language for better accuracy |
Terminal Methods
| Method | Signature | Returns | What It Does |
|---|---|---|---|
transcribe |
transcribe(): string |
string |
Transcribe audio to text using Whisper |
summarize |
summarize(?string $prompt = null): string |
string |
Transcribe then summarize the transcript |
result |
result(): DocumentResultDTO |
DocumentResultDTO |
Transcribe and return full result (transcript is in both rawText and transcript) |
Every Useful AudioBuilder Combination
// 1. Simple transcription $text = AIDocs::audio($mp3)->transcribe(); // 2. Transcription with language hint (improves accuracy) $text = AIDocs::language('es')->audio($file)->transcribe(); $text = AIDocs::language('ar')->audio($file)->transcribe(); // 3. Transcribe then summarize $summary = AIDocs::audio($meeting)->summarize(); // 4. Summarize with custom instructions $actions = AIDocs::audio($meeting)->summarize( 'List action items, owners, and deadlines. Format as numbered list.' ); // 5. Get both transcript + summary via result() $result = AIDocs::audio($meeting)->result(); $transcript = $result->transcript; // or $result->rawText โ same value $provider = $result->provider; // 'openai' // 6. Step-by-step: transcribe first, then summarize separately $builder = AIDocs::audio($recording)->language('en'); $transcript = $builder->transcribe(); $summary = $builder->summarize('Bullet points only.'); // 7. Store both in DB Recording::create([ 'transcript' => AIDocs::audio($file)->transcribe(), 'summary' => AIDocs::audio($file)->summarize(), 'duration' => $audioDurationSeconds, ]);
Note: Audio transcription is only supported by the OpenAI provider (Whisper). Calling
AIDocs::provider('claude')->audio(...)orAIDocs::provider('gemini')->audio(...)will throw aFileProcessingException.
DocumentBuilder โ Full Method Reference
Returned by AIDocs::document($path). Handles DOCX, DOC, TXT, MD and also PDF. Text is automatically extracted on construction based on file extension.
Chainable Methods
| Method | Signature | What It Does |
|---|---|---|
language |
language(string $code): static |
Override the detected language |
enhance |
enhance(): static |
Pre-processing hook (extensible) |
summarize |
summarize(?string $prompt = null): static |
AI summarization |
tables |
tables(): static |
Extract all tables |
Terminal Methods
| Method | Signature | Returns | What It Does |
|---|---|---|---|
text |
text(): string |
string |
Return raw extracted text |
ask |
ask(string $question): string |
string |
RAG Q&A over the document |
toJson |
toJson(?string $prompt = null): array |
array |
Structured JSON extraction |
toMarkdown |
toMarkdown(): string |
string |
Build Markdown document |
result |
result(): DocumentResultDTO |
DocumentResultDTO |
Full typed result object |
Every Useful DocumentBuilder Combination
// 1. Extract text from DOCX $text = AIDocs::document($docx)->text(); // 2. Extract text from plain text file $text = AIDocs::document('/notes/meeting.txt')->text(); // 3. Summarize a Word document $summary = AIDocs::document($docx)->summarize()->text(); // 4. Ask a question about a DOCX contract $answer = AIDocs::document($docx)->ask('What is the governing law?'); // 5. Extract structured JSON from a DOCX report $data = AIDocs::document($docx)->toJson(); // 6. Extract tables from a DOCX with data tables $result = AIDocs::document($docx)->tables()->result(); // 7. Full pipeline on a Word document $result = AIDocs::document($docx) ->enhance() ->tables() ->summarize() ->result(); // 8. Convert any document to Markdown $md = AIDocs::document($docx)->summarize()->toMarkdown(); $md = AIDocs::document($txtFile)->toMarkdown(); // 9. Multi-language DOCX $summary = AIDocs::language('de')->document($germanDocx)->summarize()->text(); // 10. Ask a question using Claude (long context = better for big docs) $answer = AIDocs::model('claude-sonnet-4-6')->document($docx)->ask( 'Summarize all obligations of Party B.' );
DocumentResultDTO โ All Properties & Methods
Every ->result() call returns a DocumentResultDTO. It is immutable โ values are set once and never change.
Properties
| Property | Type | Populated By |
|---|---|---|
$rawText |
string |
Always โ extracted document text |
$summary |
?string |
->summarize() |
$tables |
TableDTO[] |
->tables() |
$markdown |
?string |
->toMarkdown() |
$json |
?array |
->toJson() |
$language |
?string |
Auto-detected or via ->language() |
$mimeType |
?string |
Detected from the file |
$filePath |
?string |
The source file path |
$provider |
?string |
'openai', 'claude', 'gemini' |
$model |
?string |
e.g. 'gpt-5.2', 'claude-sonnet-4-6' |
$processingTimeSeconds |
float |
Wall-clock time the pipeline took |
$transcript |
?string |
Audio only โ same as $rawText for audio |
Helper Methods
$dto->hasText(); // true if rawText is not empty $dto->hasSummary(); // true if summary was generated $dto->hasTables(); // true if at least one table was found $dto->hasJson(); // true if structured JSON was extracted $dto->toArray(); // Convert everything to a plain array (great for API responses) $dto->toJson(); // Return only the json property as array (or [] if null) $dto->with([...]); // Return a modified copy (immutable wither)
Usage Examples
$result = AIDocs::pdf($file)->enhance()->tables()->summarize()->result(); // Conditional logic based on what was found if ($result->hasTables()) { foreach ($result->tables as $table) { echo $table->title . "\n"; echo $table->toMarkdown() . "\n"; } } if ($result->hasSummary()) { Slack::send('#docs', $result->summary); } // API response return response()->json($result->toArray()); // Immutable wither โ create a modified copy without changing original $translated = $result->with(['summary' => translateToFrench($result->summary)]); // Metadata for logging Log::info('AI processed document', [ 'provider' => $result->provider, 'model' => $result->model, 'language' => $result->language, 'time' => $result->processingTimeSeconds, 'tables' => count($result->tables), ]);
TableDTO โ All Properties & Methods
Each item in $result->tables is a TableDTO.
$table->title; // string|null โ e.g. "Revenue by Region" $table->headers; // string[] โ e.g. ['Region', 'Q1', 'Q2'] $table->rows; // string[][] โ e.g. [['APAC', '$2M', '$3M'], ...] $table->pageNumber; // int โ estimated page number (1-based) $table->toMarkdown(); // Renders as a GitHub-flavoured markdown table $table->toArray(); // Converts to plain array for JSON serialization
Configuration Reference
After publishing (php artisan vendor:publish --tag=ai-docs-config), you can tune every aspect in config/ai-docs.php:
// config/ai-docs.php return [ // Which provider is the default 'default_provider' => env('AI_DOCS_PROVIDER', 'openai'), // Provider API keys and model defaults 'providers' => [ 'openai' => [ 'api_key' => env('OPENAI_API_KEY'), 'default_model' => env('OPENAI_DEFAULT_MODEL', 'gpt-5.2'), 'vision_model' => env('OPENAI_VISION_MODEL', 'gpt-5-mini'), 'whisper_model' => env('OPENAI_WHISPER_MODEL', 'whisper-1'), ], 'claude' => [ 'api_key' => env('ANTHROPIC_API_KEY'), 'default_model' => env('CLAUDE_DEFAULT_MODEL', 'claude-sonnet-4-6'), ], 'gemini' => [ 'api_key' => env('GEMINI_API_KEY'), 'default_model' => env('GEMINI_DEFAULT_MODEL', 'gemini-3.1-pro-preview'), ], ], // RAG / Ask PDF settings 'rag' => [ 'chunk_size' => 1000, // Characters per chunk 'chunk_overlap' => 100, // Overlap between chunks 'top_k_chunks' => 5, // How many chunks to send as context ], // Audio processing 'audio' => [ 'enabled' => true, 'max_file_size_mb' => 25, 'supported_formats'=> ['mp3', 'mp4', 'm4a', 'wav', 'webm'], ], // Image processing pre-pipeline 'image' => [ 'max_width' => 2048, 'max_height' => 2048, 'quality' => 90, 'auto_rotate' => true, 'enhance_contrast'=> true, ], ];
Testing Your Integration
The package ships a FakeAIProvider โ write tests that never call real APIs:
use Subhashladumor1\LaravelAiDocs\Tests\Fakes\FakeAIProvider; use Subhashladumor1\LaravelAiDocs\Services\SummarizerService; use Subhashladumor1\LaravelAiDocs\Services\AskPDFService; use Subhashladumor1\LaravelAiDocs\Processors\TextChunker; // Test summarization โ no OpenAI call made it('summarizes a PDF', function () { $provider = (new FakeAIProvider()) ->withTextResponse('This report covers Q4 earnings growth of 23%.'); $service = new SummarizerService(); $result = $service->summarize($provider, 'Long earnings report text...'); expect($result)->toBe('This report covers Q4 earnings growth of 23%.'); }); // Test Ask PDF / RAG โ no API call made it('answers questions about a document', function () { $provider = (new FakeAIProvider()) ->withTextResponse('The payment is due on April 15, 2024.'); $service = new AskPDFService(new TextChunker(500, 50)); $answer = $service->ask($provider, 'Invoice text...', 'When is payment due?'); expect($answer)->toContain('April 15'); }); // Test error simulation it('handles provider failure gracefully', function () { $provider = (new FakeAIProvider())->shouldThrow('Rate limit exceeded'); expect(fn () => (new SummarizerService())->summarize($provider, 'text')) ->toThrow(RuntimeException::class, 'Rate limit exceeded'); });
Run all tests:
composer test
Package Architecture
src/
โโโ AIDocsManager.php โ Main orchestrator. Entry-point for the Facade.
โโโ LaravelAIDocsServiceProvider.php โ Binds everything to the Laravel container.
โโโ Facades/
โ โโโ AIDocs.php โ Static facade: AIDocs::pdf(), ::image(), etc.
โ
โโโ Builders/ โ Fluent chainable API per file type
โ โโโ PDFBuilder.php โ .enhance() .summarize() .tables() .ask() .toJson() .toMarkdown()
โ โโโ ImageBuilder.php โ .text() .summarize() .ask() .toJson()
โ โโโ AudioBuilder.php โ .transcribe() .summarize()
โ โโโ DocumentBuilder.php โ DOCX + TXT, same API as PDFBuilder
โ
โโโ Services/ โ One service per feature, injected into builders
โ โโโ OCRService.php โ Image โ text via vision API
โ โโโ PDFService.php โ PDF extraction + scanned fallback
โ โโโ DocxService.php โ DOCX extraction via PhpWord
โ โโโ AudioService.php โ Audio validation + transcription
โ โโโ SummarizerService.php โ AI summarization with language hints
โ โโโ TableExtractionService.php โ AI table detection โ TableDTO[]
โ โโโ AskPDFService.php โ RAG pipeline (chunk โ retrieve โ answer)
โ โโโ MarkdownService.php โ Build markdown from result parts
โ โโโ JSONConversionService.php โ Document โ structured JSON via AI
โ
โโโ Providers/ โ One class per AI vendor
โ โโโ Contracts/AIProviderInterface.php
โ โโโ OpenAIProvider.php โ GPT-5.2, GPT-5-mini, Whisper
โ โโโ ClaudeProvider.php โ Claude 4.6 Sonnet / Haiku / Opus
โ โโโ GeminiProvider.php โ Gemini 3.1 Pro Preview / Flash
โ
โโโ Processors/ โ Pre-processing before AI calls
โ โโโ ImageProcessor.php โ Resize, rotate, base64 encode
โ โโโ PDFProcessor.php โ smalot/pdfparser wrapper
โ โโโ AudioProcessor.php โ File validation
โ โโโ TextChunker.php โ Split text for RAG, relevance scoring
โ
โโโ DTO/
โ โโโ DocumentResultDTO.php โ Immutable result container (all outputs)
โ โโโ TableDTO.php โ Single extracted table with toMarkdown()
โ
โโโ Exceptions/
โ โโโ FileProcessingException.php
โ โโโ ProviderNotSupportedException.php
โ
โโโ Support/
โ โโโ ModelResolver.php โ 'claude-sonnet-4-6' โ {provider, model}
โ โโโ LanguageDetector.php โ Unicode script analysis
โ โโโ FileValidator.php โ Type + size + existence checks
โ
โโโ config/
โโโ ai-docs.php โ All configuration with env() defaults
License
MIT โ free for personal and commercial use. See LICENSE.
