robo-meister / flow-scribe-api
FlowScribe OCR PHP SDK for connecting to the Robo-Meister OCR API
Requires
- php: >=8.2
- ext-curl: *
- ext-fileinfo: *
- ext-json: *
README
PHP client for connecting to the Robo-Meister FlowScribe OCR API and Robo Connector / Universal Dropzone OCR flows.
Install
composer require robo-meister/flow-scribe-api
For local development from this repository, add the package as a path repository:
{
    "repositories": [
        {
            "type": "path",
            "url": "shared-profile/sdk/flowscribe-ocr/php"
        }
    ],
    "require": {
        "robo-meister/flow-scribe-api": "*"
    }
}
Basic OCR
use Robo\FlowScribeOcr\FlowScribeOcrClient;

$client = new FlowScribeOcrClient(
    baseUrl: 'https://ocr.robo-meister.com'
);

$health = $client->health();
$metadata = $client->metadata();

$result = $client->processDocument(__DIR__ . '/invoice.pdf', [
    'document_type' => 'invoice',
    'mode' => 'fuse',
    'journal_csv' => true,
]);

if (($result['exit_code'] ?? 1) === 0) {
    $parsed = $result['parsed'] ?? [];
    $csv = $result['csv'] ?? null;
}
Robo Connector and Universal Dropzone options
processDocument(), ingestFlowScribe(), and ingestRcInvoice() accept the same metadata options so OCR results can be linked back to Robo Connector documents:
$result = $client->processDocument(__DIR__ . '/upload.pdf', [
    'document_type' => 'auto',
    'mode' => 'fuse',
    'source_document_id' => 'doc_123',
    'workspace_id' => 'workspace_123',
    'org_id' => 'org_123',
    'user_id' => 'user_123',
    'context_type' => 'UniversalDropzone',
    'context_id' => 'dropzone_upload_123',
    // Connector base URL; FlowScribe appends /api/integration/document/ocr-completed.
    'rc_callback_url' => 'https://connector.example.com',
    'document_name' => 'Vendor invoice.pdf',
    'return_review_payload' => true,
    'include_storage' => true,
    'include_preview' => true,
    'idempotency_key' => 'upload-doc_123-v1',
    'correlation_id' => 'corr_123',
]);
rc_callback_url is the Robo Connector base URL, not the final callback endpoint; FlowScribe appends /api/integration/document/ocr-completed when it calls back.
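When wiring up the connector side, it can help to compute the route FlowScribe is expected to call for a given base URL. The helper below is a minimal sketch and not part of the SDK; `expectedCallbackEndpoint()` is a hypothetical name, and trimming a trailing slash is an assumption about how the two parts are joined.

```php
<?php
// Hypothetical helper (not an SDK function): derives the endpoint that
// FlowScribe is documented to call from a connector base URL.
function expectedCallbackEndpoint(string $connectorBaseUrl): string
{
    // Assumption: a trailing slash on the base URL should not produce "//".
    return rtrim($connectorBaseUrl, '/') . '/api/integration/document/ocr-completed';
}

// prints "https://connector.example.com/api/integration/document/ocr-completed"
echo expectedCallbackEndpoint('https://connector.example.com'), PHP_EOL;
```

The connector application should expose that route; only the base URL goes into rc_callback_url.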
When these options are present, the SDK sends the important values as multipart form fields. It also mirrors routing/correlation values to headers: X-Workspace-Id, X-Org-Id, X-Source-Document-Id, X-Correlation-Id, Idempotency-Key, and X-RC-Callback-Url.
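The header mirroring can be pictured as a simple lookup from option name to header name. This is an illustrative sketch only: the pairing of each option with its header is inferred from the names, and `mirroredHeaders()` is a hypothetical function, not the SDK's internal code.

```php
<?php
// Assumed option-to-header pairing, inferred from the header names above.
const MIRRORED_HEADERS = [
    'workspace_id'       => 'X-Workspace-Id',
    'org_id'             => 'X-Org-Id',
    'source_document_id' => 'X-Source-Document-Id',
    'correlation_id'     => 'X-Correlation-Id',
    'idempotency_key'    => 'Idempotency-Key',
    'rc_callback_url'    => 'X-RC-Callback-Url',
];

// Hypothetical helper: returns only the headers whose option is set and non-empty.
function mirroredHeaders(array $options): array
{
    $headers = [];
    foreach (MIRRORED_HEADERS as $option => $header) {
        if (isset($options[$option]) && $options[$option] !== '') {
            $headers[$header] = (string) $options[$option];
        }
    }
    return $headers;
}

print_r(mirroredHeaders([
    'workspace_id' => 'workspace_123',
    'correlation_id' => 'corr_123',
    'mode' => 'fuse', // not a routing/correlation value, so no header
]));
```

A lookup like this is handy when checking proxy or gateway logs for the values a request was sent with.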
auto document type behavior
Use 'document_type' => 'auto' when you want the API to classify the document. By default, the PHP SDK does not send a document_type=auto multipart field; omitting the override lets FlowScribe use server-side auto-classification. If you are integrating with an API deployment that explicitly requires the literal string auto, pass 'send_auto_document_type' => true.
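The rule above can be sketched as a small decision function. `documentTypeField()` is a hypothetical name used for illustration; the SDK's actual field-building code may be structured differently.

```php
<?php
// Hypothetical sketch of the documented rule: returns the multipart field
// to send for document_type, or an empty array when it is omitted.
function documentTypeField(array $options): array
{
    $type = $options['document_type'] ?? null;
    if ($type === null) {
        return []; // nothing requested, nothing sent
    }

    $sendAuto = ($options['send_auto_document_type'] ?? false) === true;
    if ($type === 'auto' && !$sendAuto) {
        return []; // default: omit the field so the server auto-classifies
    }

    return ['document_type' => $type];
}

var_dump(documentTypeField(['document_type' => 'auto']));
var_dump(documentTypeField(['document_type' => 'auto', 'send_auto_document_type' => true]));
var_dump(documentTypeField(['document_type' => 'invoice']));
```

Only the second and third calls yield a document_type field; the first returns an empty array.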
Universal Dropzone integration
Authenticated integration routes require an access token generated by the Robo Connector link flow. The SDK sends the token as a bearer token.
use Robo\FlowScribeOcr\FlowScribeOcrClient;

$client = new FlowScribeOcrClient(
    baseUrl: 'https://ocr.robo-meister.com',
    accessToken: getenv('FLOWSCRIBE_ACCESS_TOKEN') ?: null
);

$queued = $client->ingestUniversalDropzoneDocument(__DIR__ . '/dropzone.pdf', [
    'source_document_id' => 'doc_123',
    'workspace_id' => 'workspace_123',
    'org_id' => 'org_123',
    'context_type' => 'UniversalDropzone',
    'context_id' => 'upload_123',
    // Connector base URL; FlowScribe appends /api/integration/document/ocr-completed.
    'rc_callback_url' => 'https://connector.example.com',
    'idempotency_key' => 'upload_123',
    'correlation_id' => 'corr_123',
]);

$jobId = $queued['data']['job']['id'];
$status = $client->flowScribeStatus($jobId);
processDocumentForReview(), ingestUniversalDropzoneDocument(), and ingestRcDocument() set review-friendly defaults: document_type=auto, mode=fuse, return_review_payload=true, include_storage=true, and include_preview=true.
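One way to picture these defaults is as an array merged with the caller's options. The sketch below assumes that explicitly passed options take precedence over the defaults, which the text above does not state; `withReviewDefaults()` is a hypothetical helper, not an SDK method.

```php
<?php
// The review-friendly defaults listed above.
const REVIEW_DEFAULTS = [
    'document_type' => 'auto',
    'mode' => 'fuse',
    'return_review_payload' => true,
    'include_storage' => true,
    'include_preview' => true,
];

// Hypothetical helper. array_replace() lets later arrays win, so caller
// values override the defaults (assumed precedence).
function withReviewDefaults(array $options): array
{
    return array_replace(REVIEW_DEFAULTS, $options);
}

print_r(withReviewDefaults([
    'workspace_id' => 'workspace_123',
    'include_preview' => false,
]));
```

If the SDK instead enforces its defaults, passing `include_preview => false` would have no effect, so it is worth checking the returned payload when overriding.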
For the legacy RC invoice OCR bridge, ingestRcInvoice() and rcInvoiceStatus() remain available. For non-invoice Robo Connector documents, prefer ingestRcDocument():
$queued = $client->ingestRcDocument(__DIR__ . '/contract.pdf', [
    'source_document_id' => 'doc_456',
    'workspace_id' => 'workspace_123',
    'org_id' => 'org_123',
    'document_name' => 'Customer contract.pdf',
]);
Handling review payloads
$result = $client->processDocumentForReview(__DIR__ . '/invoice.pdf', [
    'workspace_id' => 'workspace_123',
    'org_id' => 'org_123',
]);

$reviewPayload = $result['review_payload'] ?? null;
$preview = $result['preview'] ?? null;
$storage = $result['storage'] ?? null;
Diagnostics
diagnostics() first calls /api/integration/flowscribe/diagnostics with optional org/workspace headers. If that endpoint is not available, it falls back to /health and /metadata.
$diagnostics = $client->diagnostics(
    organisationId: 'org_123',
    workspaceId: 'workspace_123'
);

if (($diagnostics['diagnostics_available'] ?? true) === false) {
    $health = $diagnostics['health'];
    $metadata = $diagnostics['metadata'];
}
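The fallback behavior can be sketched with plain callables standing in for the three HTTP calls. This is an illustration under assumptions, not the SDK's code: `diagnosticsWithFallback()` is a hypothetical name, a `RuntimeException` stands in for whatever signals that the diagnostics endpoint is unavailable, and the stub return values are placeholders.

```php
<?php
// Hypothetical sketch: try the diagnostics call first, and on failure
// assemble a result from the health and metadata calls instead.
function diagnosticsWithFallback(
    callable $fetchDiagnostics,
    callable $fetchHealth,
    callable $fetchMetadata
): array {
    try {
        return $fetchDiagnostics();
    } catch (RuntimeException $e) {
        // Keys match the ones read in the example above.
        return [
            'diagnostics_available' => false,
            'health' => $fetchHealth(),
            'metadata' => $fetchMetadata(),
        ];
    }
}

// Stubbed calls with placeholder data, simulating a missing endpoint.
$result = diagnosticsWithFallback(
    fn () => throw new RuntimeException('diagnostics endpoint not available'),
    fn () => ['status' => 'stub-health'],
    fn () => ['name' => 'stub-metadata']
);

var_dump($result['diagnostics_available']); // bool(false)
```

Checking `diagnostics_available` before reading other keys keeps calling code working against both older and newer API deployments.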
Custom dictionary config
Pass either a JSON file path:
$result = $client->processDocument(__DIR__ . '/invoice.pdf', [
    'config_path' => __DIR__ . '/custom-dictionary.json',
]);
Or pass an array that the SDK serializes as config.json for the multipart upload. Temporary config files are deleted after each request.
$result = $client->processDocument(__DIR__ . '/invoice.pdf', [
    'config' => [
        'fields' => [
            'invoiceNumber' => ['keywords' => ['Invoice #']],
        ],
    ],
]);
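The serialize-then-clean-up pattern for array configs can be sketched as follows. This is a minimal illustration of the general technique, assuming a temp file plus a `finally` block; `withTemporaryConfigFile()` and the `fsocr_` prefix are hypothetical, and the SDK's own implementation may differ.

```php
<?php
// Hypothetical sketch: write the config array to a temporary JSON file,
// hand the path to a callback (the upload step), then always delete the file.
function withTemporaryConfigFile(array $config, callable $send): mixed
{
    $path = tempnam(sys_get_temp_dir(), 'fsocr_');
    if ($path === false) {
        throw new RuntimeException('Could not create a temporary config file.');
    }

    try {
        // JSON_THROW_ON_ERROR surfaces encoding problems as a JsonException.
        file_put_contents($path, json_encode($config, JSON_THROW_ON_ERROR));
        return $send($path);
    } finally {
        // Runs on success and on exceptions, so no temp file is left behind.
        if (is_file($path)) {
            unlink($path);
        }
    }
}

// The callback here only reads the file back; a real one would upload it.
$roundTrip = withTemporaryConfigFile(
    ['fields' => ['invoiceNumber' => ['keywords' => ['Invoice #']]]],
    fn (string $path) => json_decode(file_get_contents($path), true)
);

print_r($roundTrip);
```

Using `finally` means the cleanup also happens when the request throws, which matches the per-request deletion described above.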
Errors
HTTP and transport failures throw FlowScribeOcrException. HTTP errors preserve the status code, decoded response body, raw response body, response correlation ID, and machine-readable error code when available.
use Robo\FlowScribeOcr\FlowScribeOcrException;

try {
    $result = $client->processDocument(__DIR__ . '/invoice.pdf');
} catch (FlowScribeOcrException $exception) {
    $statusCode = $exception->getStatusCode();
    $responseBody = $exception->getResponseBody();
    $rawBody = $exception->getRawResponseBody();
    $correlationId = $exception->getCorrelationId();
    $errorCode = $exception->getErrorCode();
}