kraenzle-ritter / puidentify
Unified PHP interface for PRONOM-based file identification using Siegfried and FIDO.
pkg:composer/kraenzle-ritter/puidentify
Requires
- php: >=8.1
Requires (Dev)
- phpstan/phpstan: ^1.10
- phpunit/phpunit: ^10.0
README
Puidentify is a PHP library for file format identification using PRONOM PUIDs. It can use Siegfried, FIDO, or both. When several engines are configured, they are listed in priority order: the first is the primary engine and later ones act as fallbacks.
Installation
composer require kraenzle-ritter/puidentify
Requirements
- PHP 8.1 or higher
- At least one of the following tools installed:
  - Siegfried (sf)
  - FIDO (fido)
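To check up front whether either tool is installed, a small PATH lookup can help. The sketch below is not part of Puidentify: `findBinary()` is a hypothetical helper, and the binary names `sf` and `fido` are assumed from the custom-path example later in this README.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of Puidentify): search a PATH-style string
 * for an executable with the given name and return its full path, or null.
 */
function findBinary(string $name, ?string $pathEnv = null): ?string
{
    // Fall back to the process environment when no PATH string is passed in.
    $pathEnv ??= (string) getenv('PATH');

    foreach (explode(PATH_SEPARATOR, $pathEnv) as $dir) {
        if ($dir === '') {
            continue; // skip empty PATH entries
        }
        $candidate = rtrim($dir, '/\\') . DIRECTORY_SEPARATOR . $name;
        if (is_file($candidate) && is_executable($candidate)) {
            return $candidate;
        }
    }

    return null;
}

// Report which of the two tools can be found on this machine.
foreach (['sf', 'fido'] as $tool) {
    $path = findBinary($tool);
    echo $tool . ': ' . ($path ?? 'not found') . PHP_EOL;
}
```

If a tool lives outside PATH, its location can be passed to the engine constructor instead, as shown under Custom Binary Paths.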
Quick Start
    use Puidentify\Identifier;
    use Puidentify\Engine\FidoEngine;
    use Puidentify\Engine\SiegfriedEngine;

    // Create identifier with engines (priority order)
    $identifier = new Identifier([
        new SiegfriedEngine(), // Primary
        new FidoEngine()       // Fallback
    ]);

    // Identify a file
    $result = $identifier->identify('/path/to/file.pdf');

    if ($result) {
        echo $result; // "PUID: fmt/276 (PDF/A-1b) via Siegfried"

        // Access individual properties
        echo "PUID: " . $result->puid;
        echo "Format: " . $result->formatName;
        echo "Engine: " . $result->engine;
    }
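The priority/fallback idea can be illustrated in isolation. The following is a minimal sketch under stated assumptions, not the library's actual implementation: `SketchEngine`, `StubEngine`, and `identifyWithFallback()` are hypothetical names, and the stub engines return canned PUIDs instead of calling Siegfried or FIDO.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-ins: these names are illustrative, not Puidentify's API.
interface SketchEngine
{
    public function name(): string;
    public function isAvailable(): bool;
    /** Returns a PUID, or null when the engine cannot identify the file. */
    public function identify(string $path): ?string;
}

final class StubEngine implements SketchEngine
{
    public function __construct(
        private string $name,
        private bool $available,
        private ?string $puid,
    ) {
    }

    public function name(): string
    {
        return $this->name;
    }

    public function isAvailable(): bool
    {
        return $this->available;
    }

    public function identify(string $path): ?string
    {
        return $this->puid; // canned answer, no external tool is called
    }
}

/**
 * Try each engine in priority order and return the first usable answer.
 *
 * @param SketchEngine[] $engines
 * @return array{puid: string, engine: string}|null
 */
function identifyWithFallback(array $engines, string $path): ?array
{
    foreach ($engines as $engine) {
        if (!$engine->isAvailable()) {
            continue; // engine not installed, fall through to the next one
        }
        $puid = $engine->identify($path);
        if ($puid !== null) {
            return ['puid' => $puid, 'engine' => $engine->name()];
        }
    }

    return null; // no engine produced a result
}

$engines = [
    new StubEngine('Siegfried', false, 'fmt/276'), // pretend sf is missing
    new StubEngine('FIDO', true, 'fmt/276'),       // fallback answers instead
];

$result = identifyWithFallback($engines, '/path/to/file.pdf');
echo $result === null
    ? 'unidentified'
    : "{$result['puid']} via {$result['engine']}";
echo PHP_EOL;
```

In this sketch an unavailable engine and an engine that returns no match are treated the same way: the next engine in the list gets a chance. Whether Puidentify makes the same choice is not stated in this README.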
Advanced Usage
Custom Binary Paths
    use Puidentify\Identifier;
    use Puidentify\Engine\SiegfriedEngine;
    use Puidentify\Engine\FidoEngine;

    // Pass the path to each binary when it is not on the default PATH
    $identifier = new Identifier([
        new SiegfriedEngine('/usr/local/bin/sf'),
        new FidoEngine('/opt/fido/bin/fido')
    ]);
Check Available Engines
    $availableEngines = $identifier->getAvailableEngines();
    echo "Available engines: " . count($availableEngines);
Result Methods
    $result = $identifier->identify('/path/to/file.pdf');

    // Access basic properties
    echo "PUID: " . $result->puid;
    echo "Format: " . $result->formatName;
    echo "Engine: " . $result->engine;

    // Access extended properties (if available)
    if ($result->hasMimeType()) {
        echo "MIME: " . $result->mimeType;
    }
    if ($result->hasVersion()) {
        echo "Version: " . $result->version;
    }
    if ($result->hasConfidence()) {
        echo "Confidence: " . ($result->confidence * 100) . "%";
    }

    // Convert to array with all properties
    $array = $result->toArray();
    // [
    //     'puid' => 'fmt/276',
    //     'formatName' => 'PDF/A-1b',
    //     'engine' => 'Siegfried',
    //     'mimeType' => 'application/pdf',
    //     'fileExtension' => 'pdf',
    //     'version' => '1.4',
    //     'confidence' => 0.95,
    //     'rawOutput' => [...] // Full engine output
    // ]

    // Enhanced string representation
    echo $result; // "PUID: fmt/276 (PDF/A-1b) v1.4 [95%] via Siegfried"
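The two string forms shown in this README (with and without version and confidence) can be rebuilt from the array shape above. The sketch below is an illustration only: `formatResult()` is a hypothetical function, and the rule that missing or empty fields are simply omitted is an assumption inferred from the two sample strings.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical formatter (not Puidentify's own __toString): builds a summary
 * line from an array shaped like the toArray() output documented above.
 *
 * @param array<string, mixed> $r
 */
function formatResult(array $r): string
{
    $out = sprintf('PUID: %s (%s)', $r['puid'], $r['formatName']);

    // Version is optional; skip it when absent or empty.
    if (($r['version'] ?? '') !== '') {
        $out .= ' v' . $r['version'];
    }

    // Confidence is a fraction between 0 and 1; show it as a whole percent.
    if (isset($r['confidence'])) {
        $out .= sprintf(' [%d%%]', (int) round($r['confidence'] * 100));
    }

    return $out . ' via ' . $r['engine'];
}

echo formatResult([
    'puid' => 'fmt/276',
    'formatName' => 'PDF/A-1b',
    'engine' => 'Siegfried',
    'version' => '1.4',
    'confidence' => 0.95,
]) . PHP_EOL;
```

Rounding before the integer cast avoids off-by-one percentages caused by floating-point multiplication.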
Error Handling
The library throws specific exceptions for different error conditions:
    use Puidentify\Exception\FileNotFoundException;
    use Puidentify\Exception\EngineException;

    try {
        $result = $identifier->identify('/path/to/file.pdf');
    } catch (FileNotFoundException $e) {
        echo "File not found or not readable: " . $e->getMessage();
    } catch (EngineException $e) {
        echo "Engine error: " . $e->getMessage();
    }
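When identifying many files, it is often useful to record failures per file rather than stop at the first exception. The sketch below shows one way to do that. `identifyAll()` is a hypothetical helper, the identification step is passed in as a callable so the example runs without the library, and catching `\Exception` assumes the library's exception classes extend it.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical batch helper (not part of Puidentify): run an identification
 * callable over several paths, collecting results and error messages.
 *
 * @param callable(string): mixed $identify
 * @param string[] $paths
 * @return array{results: array<string, mixed>, errors: array<string, string>}
 */
function identifyAll(callable $identify, array $paths): array
{
    $results = [];
    $errors = [];

    foreach ($paths as $path) {
        try {
            $results[$path] = $identify($path);
        } catch (\Exception $e) {
            // Keep going: one unreadable file should not abort the batch.
            $errors[$path] = $e->getMessage();
        }
    }

    return ['results' => $results, 'errors' => $errors];
}

// Stub standing in for a real identifier: only ".pdf" paths succeed.
$stub = function (string $path): string {
    if (!str_ends_with($path, '.pdf')) {
        throw new \RuntimeException("Cannot identify {$path}");
    }
    return 'fmt/276';
};

$report = identifyAll($stub, ['a.pdf', 'b.bin']);
echo count($report['results']) . ' identified, '
    . count($report['errors']) . ' failed' . PHP_EOL;
```

With the real library, the callable would presumably wrap the identifier, for example `fn (string $p) => $identifier->identify($p)`.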
Testing
    # Install dependencies
    composer install

    # Run tests
    composer test

    # Run static analysis
    composer analyse
License
MIT License