kraenzle-ritter/puidentify

Unified PHP interface for PRONOM-based file identification using Siegfried and FIDO.


pkg:composer/kraenzle-ritter/puidentify

v1.0.0 2025-10-30 13:41 UTC


Last update: 2025-10-30 13:52:25 UTC



Puidentify is a PHP library that identifies file formats by their PRONOM PUIDs. It can use Siegfried, FIDO, or both; when both are configured, the engines are tried in priority order, with later engines acting as fallbacks.

Installation

composer require kraenzle-ritter/puidentify

Requirements

  • PHP 8.1 or higher
  • At least one of the following tools installed:
      • Siegfried
      • FIDO
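
Before wiring up the engines, it can help to check whether the command-line tools are reachable at all. The following is a minimal sketch, not part of Puidentify: `findOnPath` is a hypothetical helper, and the binary names `sf` and `fido` are assumptions taken from the example paths in this README. It only handles plain binary names, so on Windows an extension such as `.exe` would have to be added.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: return the full path of $binary if it is found in
 * one of the PATH directories and is executable, otherwise null.
 */
function findOnPath(string $binary, ?string $pathEnv = null): ?string
{
    // Fall back to the process environment when no PATH string is passed in
    $pathEnv ??= (string) getenv('PATH');

    foreach (explode(PATH_SEPARATOR, $pathEnv) as $dir) {
        if ($dir === '') {
            continue; // skip empty PATH entries
        }
        $candidate = rtrim($dir, DIRECTORY_SEPARATOR) . DIRECTORY_SEPARATOR . $binary;
        if (is_file($candidate) && is_executable($candidate)) {
            return $candidate;
        }
    }

    return null;
}

// Binary names are assumptions; adjust them to match your installation
foreach (['sf', 'fido'] as $binary) {
    $path = findOnPath($binary);
    echo $binary . ': ' . ($path ?? 'not found on PATH') . PHP_EOL;
}
```

If a tool is installed outside of PATH, its location can instead be passed to the engine constructor, as shown under Custom Binary Paths.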

Quick Start

use Puidentify\Identifier;
use Puidentify\Engine\FidoEngine;
use Puidentify\Engine\SiegfriedEngine;

// Create identifier with engines (priority order)
$identifier = new Identifier([
    new SiegfriedEngine(), // Primary
    new FidoEngine()       // Fallback
]);

// Identify a file
$result = $identifier->identify('/path/to/file.pdf');

if ($result) {
    echo $result . PHP_EOL; // "PUID: fmt/276 (PDF/A-1b) via Siegfried"

    // Access individual properties (one per line)
    echo "PUID: " . $result->puid . PHP_EOL;
    echo "Format: " . $result->formatName . PHP_EOL;
    echo "Engine: " . $result->engine . PHP_EOL;
}
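
The order of the engines in the array above determines their priority. How such a priority/fallback chain can work is sketched below with stub engines. This is an illustration of the general idea only, not Puidentify's actual implementation: `identifyWithFallback` and the stub closures are hypothetical names, and the PUIDs they return are hard-coded sample values.

```php
<?php
declare(strict_types=1);

/**
 * Try each engine in priority order and return the first non-null answer.
 *
 * @param array<string, callable(string): ?string> $engines engine name => lookup
 * @return array{puid: string, engine: string}|null
 */
function identifyWithFallback(array $engines, string $path): ?array
{
    foreach ($engines as $name => $lookup) {
        $puid = $lookup($path);
        if ($puid !== null) {
            // The first engine that produces a PUID wins; later ones are not asked
            return ['puid' => $puid, 'engine' => $name];
        }
    }

    return null; // no engine recognised the file
}

// Hypothetical stub engines: the primary only knows PDFs, the fallback always answers
$engines = [
    'primary'  => fn (string $p): ?string => str_ends_with($p, '.pdf') ? 'fmt/276' : null,
    'fallback' => fn (string $p): ?string => 'x-fmt/111',
];

print_r(identifyWithFallback($engines, 'report.pdf')); // answered by 'primary'
print_r(identifyWithFallback($engines, 'notes.txt'));  // answered by 'fallback'
```

Keeping the engines in an ordered list means the priority can be changed by reordering the array, without touching the lookup logic.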

Advanced Usage

Custom Binary Paths

use Puidentify\Identifier;
use Puidentify\Engine\SiegfriedEngine;
use Puidentify\Engine\FidoEngine;

$identifier = new Identifier([
    new SiegfriedEngine('/usr/local/bin/sf'),
    new FidoEngine('/opt/fido/bin/fido')
]);

Check Available Engines

$availableEngines = $identifier->getAvailableEngines();
echo "Available engines: " . count($availableEngines);

Result Methods

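$result = $identifier->identify('/path/to/file.pdf');

// Quick Start treats the result as possibly empty, so guard before reading properties
if (!$result) {
    exit('No format identified' . PHP_EOL);
}

// Access basic properties
echo "PUID: " . $result->puid . PHP_EOL;
echo "Format: " . $result->formatName . PHP_EOL;
echo "Engine: " . $result->engine . PHP_EOL;

// Access extended properties (if available)
if ($result->hasMimeType()) {
    echo "MIME: " . $result->mimeType . PHP_EOL;
}

if ($result->hasVersion()) {
    echo "Version: " . $result->version . PHP_EOL;
}

if ($result->hasConfidence()) {
    // confidence is a fraction (e.g. 0.95); round to avoid float noise in the percentage
    echo "Confidence: " . round($result->confidence * 100) . "%" . PHP_EOL;
}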

// Convert to array with all properties
$array = $result->toArray();
// [
//     'puid' => 'fmt/276',
//     'formatName' => 'PDF/A-1b',
//     'engine' => 'Siegfried',
//     'mimeType' => 'application/pdf',
//     'fileExtension' => 'pdf',
//     'version' => '1.4',
//     'confidence' => 0.95,
//     'rawOutput' => [...] // Full engine output
// ]

// Enhanced string representation
echo $result; // "PUID: fmt/276 (PDF/A-1b) v1.4 [95%] via Siegfried"

Error Handling

The library throws specific exceptions for different error conditions:

use Puidentify\Exception\FileNotFoundException;
use Puidentify\Exception\EngineException;

try {
    $result = $identifier->identify('/path/to/file.pdf');
} catch (FileNotFoundException $e) {
    echo "File not found or not readable: " . $e->getMessage();
} catch (EngineException $e) {
    echo "Engine error: " . $e->getMessage();
}

Testing

# Install dependencies
composer install

# Run tests
composer test

# Run static analysis
composer analyse

License

MIT License