acdh-oeaw/arche-metadata-crawler

Script and library for checking and generating ARCHE metadata in ACDH schema

0.2.1 2024-02-20 13:19 UTC

This package is auto-updated.

Last update: 2024-02-20 13:20:22 UTC


README

Functionality

A set of scripts:

  • Merging metadata of a collection from inputs in various formats
  • Validating the merged metadata
  • Generating XLSX metadata templates based on the current ontology (see the horizontal metadata files in metadata formats description)

used for the metadata curation during ARCHE ingestions.

Installation

Locally

  • Install PHP 8 and composer
  • Run:
    composer require acdh-oeaw/arche-metadata-crawler

As a docker image

On repo-ingestion@hephaistos

Nothing to be done. It is installed there already.

Usage

(For a full walk-trough using repo-ingestion@hephaistos and the Wollmilchsau test collection please look here)

On repo-ingestion@hephaistos

  • Generating and validaing the metadata:
    /ARCHE/vendor/bin/arche-crawl-meta \
      <pathToMetadataDirectory> \
      <outputTtlPath> \
      <basePathOfTheCollection> \
      <idPrefix>
    e.g.
    /ARCHE/vendor/bin/arche-crawl-meta \
      /ARCHE/staging/GlaserDiaries_16674/metadata/input \
      /ARCHE/staging/GlaserDiaries_16674/metadata/metadata.ttl \
      /ARCHE/staging/GlaserDiaries_16674/data
      https://id.acdh.oeaw.ac.at/glaserdiaries
  • Creating metadata templates:
    /ARCHE/vendor/bin/arche-create-metadata-template \
      <pathToDirectoryWhereTemplateShouldBeCreated> \
      all
    e.g. to create templates in the current directory
    /ARCHE/vendor/bin/arche-create-metadata-template . all

Locally

  • Generating and validaing the metadata:
    vendor/bin/arche-crawl-meta \
      --filecheckerOutput <pathTo_fileList.json_generatedBy_repo-filechecker> \
      <pathToCollectionData> \
      <pathToTargetMetadataFile>
    e.g.
    vendor/bin/arche-crawl-meta \
      metaDir \
      metadata.ttl
      `pwd`/data
      https://id.acdh.oeaw.ac.at/myCollection
  • Creating metadata templates:
    vendor/bin/arche-create-metadata-template \
      <pathToDirectoryWhereTemplateShouldBeCreated> \
      all
    e.g. to create templates in the current directory
    bin/arche-create-metadata-template . all

Remarks:

  • To get a list of all available parameters run:
    vendor/bin/arche-crawl-meta --help
    vendor/bin/arche-create-metadata-template --help

As a docker container

  • Creating metadata templates: Run a container mounting directory where templates should be created under /mnt inside the container:
    docker run \
      --rm -u `id -u`:`id -g`\
      -v <pathToDirectoryWhereTemplateShouldBeCreated:/mnt \
      acdhch/arche-metadata-crawler createTemplate all
    e.g. to create the templates in the current directory
    docker run \
      --rm -u `id -u`:`id -g` -v `pwd`:/mnt acdhch/arche-metadata-crawler createTemplate all