kraenzle-ritter / anton-import-format
Canonical JSON Schema (Draft 2020-12) and framework-free Validator for Anton's metadata.json import format.
Package info
github.com/kraenzle-ritter/anton-import-format
pkg:composer/kraenzle-ritter/anton-import-format
Requires
- php: >=8.2
- ext-json: *
- opis/json-schema: ^2.3
Requires (Dev)
- phpstan/phpstan: ^1.10
- phpunit/phpunit: ^10.0
This package is auto-updated.
Last update: 2026-05-04 12:22:07 UTC
README
Canonical JSON Schema (Draft 2020-12) and a framework-free PHP Validator
for the metadata.json shape that Anton's importers consume.
This package is the single source of truth for what an Anton import
looks like. It is used both by Anton itself (read-side validation in
AgateImportHelper, write-side dump via anton:import --dump-metadata-json)
and by external producers (notably agate,
the digital-preservation pipeline).
Install
This package is distributed via VCS / GitHub. Add the repository entry and the package as a Composer dependency:
{
"repositories": [
{
"type": "vcs",
"url": "https://github.com/kraenzle-ritter/anton-import-format"
}
],
"require": {
"kraenzle-ritter/anton-import-format": "^0.1"
}
}
Quickstart
use KraenzleRitter\AntonImportFormat\Validator;

$validator = new Validator();
$result = $validator->validate($metadataJson); // string | array | stdClass

if ($result->valid) {
    // proceed with import
} else {
    foreach ($result->errors as $error) {
        // each error has ->path, ->keyword, ->message
        printf("[%s] %s: %s\n", $error->keyword, $error->path, $error->message);
    }
}
Validator::validate() returns a ValidationResult:
final readonly class ValidationResult
{
    public bool $valid;

    /** @var list<ValidationError> */
    public array $errors;

    // for JSON-serialisation into ImportEvent / pipeline-result payloads
    public function toArray(): array;
}
Version-aware validation
If you want a structured warning when the document declares a version
that does not match the loaded schema's major.minor:
$result = $validator->validateWithVersionWarning($metadataJson);
// $result->errors may include a {path: '/version', keyword: 'schema_version_mismatch', ...} entry.
// $result->valid still reflects only structural validation; the version mismatch is informational.
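The README does not show how the major.minor comparison is implemented. As a minimal sketch of the idea only, the following standalone PHP compares two version strings by their major.minor prefix. The helper names `majorMinor` and `versionMismatch` are hypothetical and not part of this package, and treating an unparseable version as a mismatch is an assumption.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of the package): reduce a version string
 * such as "0.1" or "0.1.3" to its "major.minor" prefix, or return null
 * if the string does not look like a version.
 */
function majorMinor(string $version): ?string
{
    // The D modifier makes $ match only at the very end of the string.
    if (preg_match('/^(\d+)\.(\d+)(?:\.\d+)?$/D', $version, $m) !== 1) {
        return null;
    }
    return $m[1] . '.' . $m[2];
}

/**
 * Hypothetical helper: true when the declared document version and the
 * schema version differ in major.minor. Assumption: an unparseable
 * version on either side counts as a mismatch.
 */
function versionMismatch(string $documentVersion, string $schemaVersion): bool
{
    $doc = majorMinor($documentVersion);
    $schema = majorMinor($schemaVersion);

    return $doc === null || $schema === null || $doc !== $schema;
}

var_dump(versionMismatch('0.1', '0.1.4')); // bool(false): same major.minor
var_dump(versionMismatch('0.2', '0.1'));   // bool(true): minor differs
```

A consumer could use a check like this to decide whether to log a warning while still importing a structurally valid document.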
Schema reference
A document is a top-level wrapper object with these required fields:
{
"version": "0.1",
"tenant": "<anton-tenant-slug>",
"generator": "<producer@version>",
"defaults": { "match_by": "label", "on_not_found": "create" },
"entries": [ ... ]
}
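For producers written in PHP, the wrapper can be assembled as a plain array and serialised with `json_encode`. This is a sketch under assumptions: the tenant slug, generator string, and UUID are placeholders, the entry carries only `uuid`, `type`, and `title`, and the result has not been checked against the actual schema here.

```php
<?php
declare(strict_types=1);

// Build the wrapper as a plain PHP array. All values are placeholders.
$document = [
    'version'   => '0.1',
    'tenant'    => 'example-tenant',     // hypothetical tenant slug
    'generator' => 'my-producer@1.0.0',  // hypothetical producer@version
    'defaults'  => ['match_by' => 'label', 'on_not_found' => 'create'],
    'entries'   => [
        [
            'uuid'  => '00000000-0000-4000-8000-000000000001', // placeholder UUID
            'type'  => 'record',
            'title' => ['de' => 'Beispiel'],
        ],
    ],
];

// JSON_THROW_ON_ERROR turns encoding failures into a JsonException
// instead of a silent false return value.
$json = json_encode(
    $document,
    JSON_PRETTY_PRINT | JSON_UNESCAPED_UNICODE | JSON_UNESCAPED_SLASHES | JSON_THROW_ON_ERROR
);

echo $json, "\n";
```

The resulting string can then be passed to the validator before it is written to metadata.json.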
Entries
Each entry is a collection or record and includes (among others):
- uuid (required): UUID of the AntonObject. Primary identifier across the import; works even before DB ids exist.
- type: "collection" or "record".
- level_of_description: ISAD(G) level, one of collection, recordgroup, fonds, series, class, file, item.
- identifier: archival signature (e.g. KBA 1.1.1).
- title: multilingual object, {de: "...", fr: "...", en: "..."}. Keys MUST be ISO-639-1 two-letter codes.
- parent: object reference, {uuid: "..."} (preferred), or {identifier: "..."} / {id: 42} for already-persisted parents.
- events, notes, keywords, places, languages: see below.
- files: only on record entries. Each file has at minimum name, mime_type, md5sum.
References to other AntonObjects
Use parent or any other object-reference slot with the resolution order
uuid > identifier > id:
"parent": { "uuid": "0193e8f7-..." }   // preferred (always works)
"parent": { "identifier": "KBA 1" }    // works if parent in DB
"parent": { "id": 42 }                 // works if parent in DB
Authority references (Actor, Place, Keyword)
Two mutually-exclusive forms:
// id-form: existing DB record
"actor": { "id": 42 }

// inline-form: match-or-create with explicit policy
"actor": {
  "label": { "de": "Anna Müller" },
  "type": "person",
  "match_by": "label",
  "on_not_found": "create"
}
match_by enum: label (any locale), label.de, label.fr, label.en,
label.it, alternative_names.
on_not_found enum: create, error, skip.
Both default from the wrapper's defaults object; per-spec values override.
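The override rule can be sketched as a small merge step. The function `applyDefaults` below is hypothetical and not part of this package; it only illustrates "wrapper defaults fill the gaps, per-spec values win". Leaving id-form references untouched is an assumption based on the id-form pointing at an existing record.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: fill in match_by / on_not_found on an inline
 * authority reference from the wrapper's defaults, keeping any value
 * the reference already sets itself.
 *
 * @param array<string, mixed>  $spec     one authority reference
 * @param array<string, string> $defaults the wrapper's "defaults" object
 * @return array<string, mixed>
 */
function applyDefaults(array $spec, array $defaults): array
{
    // Assumption: id-form references name an existing record,
    // so no match policy is added to them.
    if (array_key_exists('id', $spec)) {
        return $spec;
    }

    foreach (['match_by', 'on_not_found'] as $key) {
        if (!array_key_exists($key, $spec) && array_key_exists($key, $defaults)) {
            $spec[$key] = $defaults[$key];
        }
    }

    return $spec;
}

$defaults = ['match_by' => 'label', 'on_not_found' => 'create'];
$actor = ['label' => ['de' => 'Anna Müller'], 'type' => 'person', 'on_not_found' => 'error'];

// match_by comes from the defaults; on_not_found keeps its per-spec value.
print_r(applyDefaults($actor, $defaults));
```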
Multilingual content
Keys are ISO-639-1 two-letter codes (de, fr, en, it). The
languages[] array on entries uses ISO-639-2 three-letter codes (ger,
fre, eng, lat), matching Anton's languages.name column. Note that ger
and fre are the bibliographic (ISO-639-2/B) forms of these codes.
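A producer can catch the most common mix-up (a three-letter code used as a multilingual key) before validation. The sketch below uses a hypothetical helper, `malformedLocaleKeys`, that checks key shape only; it does not verify that a key is an assigned ISO-639-1 code, and it is not how the package itself validates keys.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical pre-check: return the keys of a multilingual object that
 * are not exactly two lowercase ASCII letters. Shape check only.
 *
 * @param array<string, string> $multilingual e.g. ['de' => '...', 'fr' => '...']
 * @return list<string>
 */
function malformedLocaleKeys(array $multilingual): array
{
    $bad = [];
    foreach (array_keys($multilingual) as $key) {
        // The D modifier prevents $ from matching before a trailing newline.
        if (preg_match('/^[a-z]{2}$/D', (string) $key) !== 1) {
            $bad[] = (string) $key;
        }
    }
    return $bad;
}

// 'ger' is a three-letter code and 'FR' is upper case; both are reported.
var_dump(malformedLocaleKeys(['de' => 'Brief', 'ger' => 'Brief', 'FR' => 'Lettre']));
```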
Files
Files are nested inside record entries (1:N):
"files": [
  {
    "name": "brief.pdf",
    "mime_type": "application/pdf",
    "md5sum": "5d41402abc4b2a76b9719d911017c592",
    "size_bytes": 20697,
    "pronom_id": "fmt/14",
    "nara_risk": "low"
  }
]
md5sum is required and must match ^[a-f0-9]{32}$. nara_risk enum:
low, moderate, high, unknown.
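On the producer side, PHP's `md5_file` already returns a 32-character lowercase hex digest, which is the shape the pattern above requires. The following sketch builds the minimum file fields plus `size_bytes` for a file on disk. Both helper names are hypothetical, and the MIME type is passed in by the caller rather than detected.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical producer-side helper: describe a file on disk with the
 * minimum fields (name, mime_type, md5sum) plus size_bytes.
 *
 * @return array{name: string, mime_type: string, md5sum: string, size_bytes: int}
 */
function describeFile(string $path, string $mimeType): array
{
    if (!is_file($path) || !is_readable($path)) {
        throw new RuntimeException("Cannot read file: {$path}");
    }

    $md5 = md5_file($path);   // lowercase hex digest, or false on failure
    $size = filesize($path);
    if ($md5 === false || $size === false) {
        throw new RuntimeException("Cannot hash or stat file: {$path}");
    }

    return [
        'name'       => basename($path),
        'mime_type'  => $mimeType,
        'md5sum'     => $md5,
        'size_bytes' => $size,
    ];
}

/**
 * Hypothetical pre-check mirroring the documented pattern ^[a-f0-9]{32}$.
 * The D modifier keeps $ from matching before a trailing newline.
 */
function isValidMd5sum(string $value): bool
{
    return preg_match('/^[a-f0-9]{32}$/D', $value) === 1;
}
```

Upper-case digests from other tools would fail the pattern, so a producer may want to lowercase them before emitting.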
Version policy
- 0.x.y releases may break across both y (point) and x (minor) while the schema iterates.
- 1.0.0 will be the first stability commitment; breaking changes from then on require a major bump.
- Consumers should pin ^0.1 while the schema is in 0.x.
Test fixtures
Under tests/Fixtures/:
- valid/minimal.json: smallest valid document.
- valid/full.json: exercises every schema feature.
- broken/*.json: intentionally broken cases, each pinning the validator's error reporting against regression.
- agate-target/folder-input.json: the v0.1 form that agate's CreateMetadataJsonStep should emit after migration. Companion to legacy-agate-output/folder-input.json (pre-restructure).
- legacy-agate-output/*.json: read-only snapshots of agate's pre-restructure output. Not validated against the current schema; kept as a migration baseline.
Producer mapping
If you're emitting metadata.json from a producer (agate's
CreateMetadataJsonStep, Anton's Excel-Import dump, anything new),
read docs/producers.md. It maps the
producer-side flat fields (e.g. agate's parent_uuid,
creation_actors, scope_and_content) to the v0.1 wrapper shape.
Development
composer install
composer test     # PHPUnit
composer analyse  # PHPStan level 8
Consumers
- Anton: read-side validation in AgateImportHelper, write-side dump option in anton:import.
- agate: pipeline emits this shape in CreateMetadataJsonStep and validates pre-finalize via ValidateInitializedStep.
License
MIT — see LICENSE.