wizdam / bagit-php
BagIt Library for LOCKSS Personal Lockers Network (PLN)
Fund package maintenance!
Requires
- php: ^8.4
- pear/archive_tar: ^1.4
This package is auto-updated.
Last update: 2026-04-25 15:21:06 UTC
README
A modern, PHP 8.x‑ready library for creating, manipulating and validating BagIt bags – the standard packaging format for digital preservation in the LOCKSS Personal Lockers Network (PLN).
🔐 Ensure integrity · 📋 Generate manifests · 🌐 Fetch remote content · ✅ Validate bags
📖 About the Project
BagIt PHP is a PHP implementation of the BagIt 0.96 specification (now standardized as RFC 8493). A "bag" is a self‑describing, integrity‑checked container that bundles digital payload files together with their metadata – making it ideal for archival storage, network transfer, and long‑term digital preservation.
This Wizdam Edition is a complete modernization of the original scholarslab/BagItPHP library, fully refactored for PHP 8.4+ compatibility, PSR‑4 autoloading, and seamless integration with Composer. It is purpose‑built to support the LOCKSS PLN (Private LOCKSS Network) ecosystem, where integrity‑verified content packages are critical for distributed preservation workflows.
✨ Key Features
| 🔧 Feature | 📝 Description |
|---|---|
| Bag Compilation | Create new bags from scratch with a single constructor call |
| Manifest Generation | Automatically generate manifest-{sha1,md5}.txt and tagmanifest-{sha1,md5}.txt |
| Tag File Handling | Read and write bagit.txt, bag-info.txt and custom tag files |
| Remote Fetching | Download files listed in fetch.txt via HTTP (optional) |
| Validation | Full integrity checking – checksum verification, required file/directory presence |
| Compression | Export bags as .tgz (gzip) or .zip archives |
| Dual Hash Support | Choose between SHA‑1 and MD5 hashing algorithms on the fly |
| Extended Bag Mode | Optional creation of bag-info.txt, fetch.txt, and tag manifests |
| PHP 8.4+ Native | Fully compatible with modern PHP – type‑safe, deprecation‑free |
| PSR‑4 Autoloading | PSR‑4 namespaced under Wizdam\BagIt for immediate Composer autoloading |
🚀 Installation
Via Composer (Recommended)
composer require wizdam/bagit-php
Manual Installation
Clone the repository and include the autoloader:
git clone https://github.com/mokesano/bagit-php.git
cd bagit-php
composer install
Requirements
- PHP ≥ 8.4
- PEAR
Archive_Tar≥ 1.4 (automatically pulled by Composer)
⚡ Quick Start
🆕 Create a New Bag
<?php require_once 'vendor/autoload.php'; use Wizdam\BagIt\BagIt; // Create a fresh bag in the "my-bag" directory $bag = new BagIt('my-bag'); // Add a file to the payload $bag->addFile('/path/to/source/document.pdf', 'document.pdf'); // Update checksums and manifests $bag->update(); // Export as a gzipped tarball $bag->package('my-bag'); // creates my-bag.tgz echo "✅ Bag created successfully!\n";
✨ Create an Extended Bag with Metadata
<?php require_once 'vendor/autoload.php'; use Wizdam\BagIt\BagIt; // Metadata to store in bag-info.txt $bagInfo = [ 'Source-Organization' => 'University of Virginia Library', 'Bagging-Date' => date('Y-m-d'), 'External-Description' => 'Preservation copy of a scholarly article' ]; // Create an extended bag with metadata and remote fetching enabled $bag = new BagIt( bag: 'extended-bag', validate: true, // run checksum validation after creation extended: true, // generate optional tag files fetch: true, // download files listed in fetch.txt bagInfoData: $bagInfo ); // Add files and register remote URLs $bag->addFile('localfile.xml', 'data/localfile.xml'); $bag->fetch->add('https://example.org/remote-file.pdf', 'data/remote-file.pdf'); // Add more metadata $bag->setBagInfoData('Internal-Sender-Identifier', 'archive-2025-001'); // Finalize the bag $bag->update(); $bag->package('extended-bag', 'zip'); // create extended-bag.zip
🔍 Validate an Existing Bag
<?php use Wizdam\BagIt\BagIt; // Open and validate an existing bag $bag = new BagIt('path/to/existing-bag.tgz'); // Check validity if ($bag->isValid()) { echo "✅ Bag is valid!\n"; // List all payload files foreach ($bag->getBagContents() as $file) { echo " 📄 $file\n"; } // Retrieve remote files if needed $bag->fetch->download(); } else { echo "❌ Validation errors found:\n"; foreach ($bag->getBagErrors() as $error) { echo " ⚠️ {$error['file']}: {$error['message']}\n"; } }
📚 API Overview
The library is organized into four source files, each providing a distinct capability:
| 📁 File | 🏷️ Class / Functions | 📋 Responsibility |
|---|---|---|
src/bagit.php |
BagIt, BagItException |
Core bag logic – creation, file management, validation, compression |
src/bagit_manifest.php |
BagItManifest |
Reading, writing, and validating manifest checksum files |
src/bagit_fetch.php |
BagItFetch |
Managing fetch.txt entries and downloading remote payload files |
src/bagit_utils.php |
rls(), rrmdir(), tmpdir(), endsWith(), filterArrayMatches() |
Filesystem and string utility functions |
Core Methods (Class BagIt)
| Method | Description |
|---|---|
__construct($bag, $validate, $extended, $fetch, $bagInfoData) |
Initialize a bag – create new or open existing |
addFile($src, $dest) |
Copy a file into the bag's data/ directory |
update() |
Regenerate all manifests and sanitize file names |
package($destination, $method) |
Compress to .tgz or .zip archive |
validate() |
Run full integrity checks (returns error list) |
isValid() |
Returns true if no validation errors exist |
getBagContents() |
List all files in the data directory |
getBagErrors() |
Return array of validation errors |
getBagInfo() |
Retrieve bag version, encoding, and hash algorithm |
setHashEncoding($algo) |
Switch between 'sha1' and 'md5' |
setBagInfoData($key, $value) |
Add or update metadata key–value pairs |
📦 BagIt – A Quick Primer
A BagIt "bag" is a hierarchical file layout:
my-bag/
├── bagit.txt # BagIt version and encoding
├── bag-info.txt # Metadata (extended bags)
├── manifest-sha1.txt # Checksums of payload files
├── tagmanifest-sha1.txt # Checksums of tag files
├── fetch.txt # Remote file URLs (optional)
└── data/ # Payload – the actual content
├── document.pdf
├── image.jpg
└── subdir/
└── another-file.xml
The specification ensures that each bag carries its own integrity verification data, making it self‑contained and suitable for archiving, network exchange, and long‑term preservation.
📘 Learn more: RFC 8493 – The BagIt File Packaging Format
🤝 Contributing
We welcome contributions! Please review our Contributing Guidelines before submitting a pull request.
Coding Standards:
- PHP code must follow PSR‑1 guidelines
- JavaScript follows Crockford's conventions
- All new features require updated unit tests and documentation
This project adheres to the Contributor Covenant Code of Conduct. By participating, you agree to uphold these standards.
🔒 Security
Security is taken seriously. Please do not publicly disclose any vulnerabilities.
- Reporting: Send vulnerability reports to security@sangia.org
- Acknowledgment: The lead maintainer will respond within 48 hours
- Advisories: Published at GitHub Security Advisories
Full details are in our Security Policy.
📄 License
This project is distributed under the GNU General Public License v3.0 (GPL‑3.0‑only). See LICENSE for the full text.
Note: The original BagItPHP library by the University of Virginia / Scholars' Lab was released under Apache License 2.0. This fork retains attribution headers in source files and is relicensed under GPL‑3.0‑only for the Wizdam Edition.
🙏 Acknowledgments
| 🏷️ Attribution | 🔗 Reference |
|---|---|
| Original Author | Wayne Graham, Eric Rochester – University of Virginia Scholars' Lab |
| Original Repository | scholarslab/BagItPHP (no longer maintained) |
| Wizdam Edition Maintainer | Rochmady (mokesano) |
| BagIt Specification | RFC 8493 – J. Kunze, J. Littman, E. Madden, J. Scancella |
| LOCKSS Program | lockss.org – Stanford University |
| Dependency | PEAR Archive_Tar |
Made with ❤️ for the digital preservation community
© 2026 Rochmady. Licensed under GPL‑3.0‑only.