neam / dna-project-base-stateless-file-management
Schema addition and helper traits that encapsulates the 12-factor-inspired [DNA Project Base](http://neamlabs.com/dna-project-base/) stateless file-handling logic
Requires
- php: >=5.6.5
- ext-iconv: *
- filestack/filestack-php: >=1.1.11
- league/flysystem: ~1.0.24
- league/flysystem-aws-s3-v3: ~1.0.13
- propel/propel: *@dev
Requires (Dev)
- codeception/codeception: ~2.2.4
This package is not auto-updated.
Last update: 2024-12-21 20:36:16 UTC
README
Schema addition and helper traits that encapsulates DNA Project Base stateless file-handling logic.
Instead of storing paths or uris to paths in the items of the data model, items that need to keep track of files will use a relation to a File item type, which in turn has relations to FileInstance items that specify where binary copies of the file are stored.
Features
- Abstracts away stateless file management in a PHP application
- Stores metadata about files available remotely, so that queries of files can be done without physical access to the files
- Uses the local filesystem only as a brief, single-transaction cache, for instance downloading a large file, operating on it, and storing the results of the operation in the database / uploading a modified version of the file.
- Uses Filestack.com as a secure remote file storage for files
- Enables the use of Filestack.com's widgets and JS SDK for browser-based file uploads
- Enables the use of Filestack.com's file conversion API
- Ability to push public files to S3 buckets so that they can be made available to others
Background and motive
The following quote from https://12factor.net/processes describes the motive behind this approach:
Twelve-factor processes are stateless and share-nothing. Any data that needs to persist must be stored in a stateful backing service, typically a database.
The memory space or filesystem of the process can be used as a brief, single-transaction cache. For example, downloading a large file, operating on it, and storing the results of the operation in the database. The twelve-factor app never assumes that anything cached in memory or on disk will be available on a future request or job – with many processes of each type running, chances are high that a future request will be served by a different process. Even when running only one process, a restart (triggered by code deploy, config change, or the execution environment relocating the process to a different physical location) will usually wipe out all local (e.g., memory and filesystem) state.
Design principles
Some principles upon which these helpers are built:
- Local file manipulation should be available simply by reading LOCAL_TMP_FILES_PATH . DIRECTORY_SEPARATOR . $file->getPath() as defined in getLocalAbsolutePath()
- The path to the file is relative to the storage component's file system and should follow the format $file->getId() . DIRECTORY_SEPARATOR . $file->getFilename() - this is the file's "correct path" and ensures that multiple files with the same filename can be written to all file systems
- Running $file->ensureCorrectLocalFile() ensures §1 and §2 (designed to run before local file manipulation, post file creation/modification time and/or as a scheduled process)
- File instance records tell us where binary copies of the file are stored
- File instances should (if possible) store it's binary copy using the relative path provided by $file->getPath(), so that retrieval of the file's binary contents is straightforward and eventual public url's follow the official path/name supplied by $file->getPath()
Current storage components handled by this trait:
- local (implies that the binary is stored locally)
- filestack (implies that the binary is stored at filestack)
- filestack-pending (implies that the binary is pending an asynchronous task to finish, after which point the instance will be converted into a 'filestack' instance)
- filepicker (legacy filestack name, included only to serve filepicker-stored files until all have been converted to filestack-resources)
- public-files-s3 (implies that the binary is stored in Amazon S3 in a publicly accessible bucket)
Installation
-
Copy file and file_instance table schemas to your schema.xml and generate the new propel models
-
Add the
\neam\stateless_file_management\FileTrait
trait to File modelclass File extends BaseFile { use \neam\stateless_file_management\FileTrait; }
-
Wherever your data model requires files, add foreign key relationships to the file table, see example_item_type in schema.xml for an example.
-
Set the following constants in your app
LOCAL_TMP_FILES_PATH
- to a path where the tenant's local temporary files can be written and read by the php process
DATA
- Dataset id which corresponds to the database being used in the current request. Enables sharing of the same backing file services for multiple tenants
Optionally also set the following:
PUBLIC_FILES_S3_BUCKET
- Amazon S3 bucket where publicly shared files are to be stored, prefixed with 's3://'
PUBLIC_FILES_S3_REGION
- The region of the S3 bucket
PUBLIC_FILES_S3_PATH
- Directory within amazon S3 bucket where publicly shared files are to be stored. Leave empty if the whole bucket is reserved for storing files. (Example: "/files/", in which case tenant files will be published to "/files/%DATA/" instead of "/%DATA")
PUBLIC_FILES_S3_HTTP_HOST
- Hostname to where the bucket is available via http and https (Defaults to the bucket name with 's3://' stripped off from the beginning)
PUBLIC_FILE_UPLOADERS_ACCESS_KEY
- Amazon S3 access key for access to the above bucket
PUBLIC_FILE_UPLOADERS_SECRET
- Amazon S3 access secret for access to the above bucket
FILESTACK_API_KEY
- Filestack.com API key
FILESTACK_API_SECRET
- Used to sign URLs for temporary access to secured Filestack resources
CONTEXT_IO_API_SECRET
- Used to sign URLs for temporary access to secured Context.IO resources
APPVHOST
- The host name that the current deployment uses (to be able to reference to REST api endpoints)
Note: DNA Project Base uses PHP App Config to set constants based on expected config environment variables.
Usage
Example of using a remotely stored file locally:
$exampleItemType = \propel\models\ExampleItemTypeQuery::create()->findPk(1);
$fooFile = $exampleItemType->getFileRelatedByFooFileId();
$fooFileName = $fooFile->getFilename();
$mimeType = $fooFile->getMimetype();
switch ($mimeType) {
case 'text/plain':
case 'text/csv':
case 'application/vnd.ms-excel':
case 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet':
// Actually downloads the file
$fooFilePath = $fooFile->getAbsoluteLocalPath();
// Example of using the file contents in a PHP library that expects an absolute path to a local file
$cellData = SpreadsheetDataFileHelper::getSpreadsheetCellData($fooFilePath);
// ...
break;
default:
throw new Exception("Unsupported mimetype ('$mimeType')");
break;
}
Example of creating a file locally and making sure it is stored remotely as a public file:
$exampleItemType = \propel\models\ExampleItemTypeQuery::create()->findPk(1);
// Create new File record with sample file contents as a local file instance
$fooFile = new File();
$fooFile->setFilename('foo.txt');
$fooFile->putContents("text file contents\ntext file contents");
// Sync to public files
$fooFile->ensureRemotePublicFileInstance();
// Save in the parent item only after we have stored a remote copy of the file contents
$exampleItemType->setFileRelatedByFooFileId($fooFile);
$exampleItemType->save();
// Access public url
$publicUrl = $fooFile->fileInstanceAbsoluteUrl($fooFile->getFileInstanceRelatedByPublicFilesS3FileInstanceId());