lucinda / db
Lucinda DB: Pure PHP Tag-Based Key-Value Store
Installs: 16 975
Dependents: 0
Suggesters: 0
Security: 0
Stars: 1
Watchers: 1
Forks: 0
Open Issues: 0
Requires
- php: ^8.1
- ext-simplexml: *
- ext-spl: *
- ext-xml: *
Requires (Dev)
- lucinda/unit-testing: ^2.0
This package is auto-updated.
Last update: 2025-03-18 14:24:28 UTC
README
Table of contents:
About
Lucinda DB is serverless KEY-VALUE store originally designed to help developers cache results of resource-intensive SQL queries based to criteria (aka TAG) query depends on. It is different from other KV stores by having KEYs self-generate based on TAGs query result depended on and VALUEs saved as individual JSON files (instead of RAM) named by KEY in one/more SCHEMAs. This brings following advantages:
- ability of working without a server: operating system on host machine, already optimized to manipulate files, becomes the server
- ability of being platform agnostic: a database specification that can be implemented in any programming language on any operating system
- key standardization: value of a KEY is generated according to a predictable rule based on value of TAGs it depends on
- no entry duplication: combination of TAGs will always be unique
- easy maintenance: entries can be queried by TAG, something impossible in standard KV stores that purely rely on RAM hash tables indexed by cumulative key
- portability: to transfer/backup database, it is as easy as copying schema folder(s)
- scalability: ability of database to be distributed on multiple disks in real time
API relies on an interplay of following concepts:
- DATA: this is value of data to cache (eg: result of your SQL query or query combinations)
- TAG: criteria based on whome DATA was generated (eg: "users", "roles")
- KEY: the key in KV store, whose name was generated automatically based on TAGs it depends on (eg: "roles_users")
- VALUE: the value in KV store present as json-ed DATA saved on disk as a separate file in SCHEMA folder named by KEY
- SCHEMA: folders/disks in which KV entries are stored
Data
This is result to cache as VALUE in KV store, convertible to JSON.
Example query:
SELECT t1.name AS user, t3.name AS role FROM users AS t1 INNER JOIN users__roles AS t2 ON t1.user_id = t2.user_id INNER JOIN roles AS t3 ON t2.role_id = t3.id WHERE t1.active = 1
Processed into following PHP array structure:
$data = ["John Doe"=>["Administrator"], "Jane Doe"=>["Assistant Manager", "Team Leader"]];
Tag
A tag corresponds to name of criteria based on whom DATA was generated (eg: "users" and "roles" for example above). A tag's value must obey following requirements:
- must be lowercase
- can only contain a-z0-9 characters
- - sign is allowed as separator of multi-word names
Key
A key is unique identifier of DATA in KV store named by combination of TAGs data depended on (eg: "roles_users" above) . To make things easier for maintenance, each finite combination of TAGs results into a single KEY, regardless of how they were ordered by caller! The rules based on whom key name is calculated are:
Key creation is encapsulated by Lucinda\DB\Key class. Usage example:
$object = new Lucinda\DB\Key(["users", "roles"]); $key = $object->getValue(); // key will be "roles_users"
Value
A value is a JSON-ed representation of DATA saved as .json file named by KEY inside SCHEMA folder according to following rules:
- DATA must be json encodable
Value operations are encapsulated by Lucinda\DB\Value class. Usage example:
$key = new Lucinda\DB\Key(["users", "roles"]); // initializes KEY $value = new Lucinda\DB\Value("/usr/local/share/db", $key->getValue()); // instances VALUE by KEY and SCHEMA $value->set($data); // saves DATA by KEY, creating a KEY.json file within SCHEMA
Load Balancing
For performance, consistency or scalability reasons, users can opt for VALUEs to be load balanced across multiple SCHEMAs. This is done by using Lucinda\DB\ValueDriver instead, which wraps Lucinda\DB\Value in order to insure that:
- all writes (set/delete) are evenly distributed among replicas
- all reads are done from random replica
- all race condition operations (increment/decrement) will be done in first replica, then distributed to all others
Usage example:
$key = new Lucinda\DB\ValueDriver(["schema1", "schema2"], ["users", "roles"]); //instances VALUE by KEY and SCHEMAs $value->set($data); // saves DATA by KEY, creating a KEY.json file within all SCHEMAs
Schema
A schema is simply the folder where VALUEs are saved. A schema can be a single folder or an array of replicas that can be located on different disks from same server or even different servers (eg: via symlinks). Class Lucinda\DB\Schema encapsulates operations one can perform on a single schema. Usage example:
$key = new Lucinda\DB\Schema("schema1"); // initializes SCHEMA $value->deleteAll(); // deletes all VALUEs in SCHEMA
Load Balancing
For performance, consistency or scalability reasons, users can opt for SCHEMAs to be load balanced and evenly distributed. This is done via Lucinda\DB\SchemaDriver class that insures all schema operations are automatically reflected in replicas. Usage example:
$key = new Lucinda\DB\SchemaDriver(["schema1", "schema2"]); // initializes SCHEMAs $value->deleteAll(); // deletes all VALUEs in all SCHEMAs
Installation
First choose a folder, then write this command there using console:
composer require lucinda/db
Then create at least one SCHEMA and follow configuration guide in order to set it/them in XML required to configure API. Once latter is done, you will be able to query database using Lucinda\DB\Wrapper object. Usage example:
require 'vendor/autoload.php'; $object = new Lucinda\DB\Wrapper(XML_CONFIGURATION_PATH, DEVELOPMENT_ENVIRONMENT); $value = $wrapper->getEntryDriver(["users", "roles"]); $value->set(["John Doe"=>["Administrator"], "Jane Doe"=>["Assistant Manager", "Team Leader"]]);
API uses composer autoload, requires PHP 8.1+ and has no external dependencies except SimpleXML, DOM and SPL extensions. All code inside is 100% unit tested and developed on simplicity and elegance principles!
Configuration
To configure this API you must have a XML with a lucinda_db tag inside:
<lucinda_db> <{ENVIRONMENT}> <schemas> <schema>{SCHEMA}</schema> ... </schemas> </{ENVIRONMENT}> ... </lucinda_db>
Where:
- lucinda_db: (mandatory) holds database configuration
- {ENVIRONMENT}: name of development environment (to be replaced with "local", "dev", "live", etc)
- schemas: (mandatory) stores list of schemas to be load balanced, each as schema tag
- schema: (mandatory) holds path to a single schema that takes part of load balancing scheme
- schemas: (mandatory) stores list of schemas to be load balanced, each as schema tag
- {ENVIRONMENT}: name of development environment (to be replaced with "local", "dev", "live", etc)
Example:
<lucinda_db> <local> <schemas> <schema>C:\db</schema> </schemas> </local> <live> <schemas> <schema>/usr/local/share/db</schema> <schema>/mnt/remote/db</schema> </schemas> </live> </lucinda_db>
Querying
Now that XML is configured, you can query entries or schemas via Lucinda\DB\Wrapper class and its methods available:
Method | Arguments | Returns | Description |
---|---|---|---|
__construct | string $xmlFilePath, string $developmentEnvironment | void | Sets location of configuration file along with development environment for later querying |
getEntryDriver | string[] $tags | Lucinda\DB\ValueDriver | Gets object to query load-balanced entry list of tags key depends on |
getSchemaDriver | void | Lucinda\DB\SchemaDriver | Gets object to query distributed schemas |
Querying Entries
Usage example employing Lucinda\DB\ValueDriver:
$object = new Lucinda\DB\Wrapper("/var/www/html/myapp/configuration.xml", "local"); $driver = $wrapper->getEntryDriver(["users", "roles"]); $driver->set(["John Doe"=>["Administrator"], "Jane Doe"=>["Assistant Manager", "Team Leader"]]);
Using Lucinda\DB\Value directly is useful only if your app requires no load balancing, lies in one development environment only and needs only primitive maintenance. Usage example:
$key = new Lucinda\DB\Key(["users", "roles"]); $value = new Lucinda\DB\Value("/usr/local/share/db", $key->getValue()); $value->set(["John Doe"=>["Administrator"], "Jane Doe"=>["Assistant Manager", "Team Leader"]]);
Both Lucinda\DB\Value and its Lucinda\DB\ValueDriver wrapper abide to a single contract defining blueprints of VALUE operations through interface Lucinda\DB\ValueOperations, which comes with following prototype methods:
Method | Arguments | Returns | Description |
---|---|---|---|
set | mixed $value | void | Sets entry value |
get | void | mixed | Gets entry value |
exists | void | bool | Checks if entry exists |
increment | int $step = 1 | int | Increments existing entry value using locking. Throws Lucinda\DB\LockException on race condition! See: Handling Race Conditions |
decrement | int $step = 1 | int | Decrements existing entry value using locking. Throws Lucinda\DB\LockException on race condition! See: Handling Race Conditions |
delete | void | void | Deletes entry |
As one can see above, in case developers opt using Lucinda\DB\Value directly, class Lucinda\DB\Key needs to be instanced manually. It encapsulates creation of keys based on tags and defines following public methods:
Method | Arguments | Returns | Description |
---|---|---|---|
__construct | string[] tags | void | Compiles key based on TAGs. Throws Lucinda\DB\KeyException if TAGs break naming rules |
getValue | string | Gets entry key compiled above. |
Querying Schemas
Usage example employing Lucinda\DB\SchemaDriver:
$object = new Lucinda\DB\Wrapper("/var/www/html/myapp/configuration.xml", "local"); $driver = $wrapper->getSchemaDriver(); $driver->deleteAll();
Using Lucinda\DB\Schema directly is useful only if your app requires no load balancing, lies in one development environment only and needs only primitive maintenance. Usage example:
$object = new Lucinda\DB\Schema("/usr/local/share/db"); $object->deleteAll();
Both Lucinda\DB\Schema and Lucinda\DB\SchemaDriver abide to a single contract defining blueprints of SCHEMA operations through interface Lucinda\DB\SchemaOperations, which comes with following prototype methods:
Method | Arguments | Returns | Description |
---|---|---|---|
create | void | bool | Creates schema(s) |
exists | void | bool | Checks if schema(s) exist |
getAll | void | string[] | Gets all keys in schema(s) |
getByTag | string $tag | string[] | Gets all keys in schema(s) matching tag |
getCapacity | void | int | Gets number of entries in schema(s) |
deleteAll | void | int | Deletes all entries in schema(s) |
drop | void | bool | Drops all schema(s), deleting all entries in the process |
Maintenance
Modern operating systems allow up to 4,294,967,295 files in one folder but you shouldn't go anywhere near that value! Just like MySQL running out of disk space, LucindaDB may additionally run out of VALUEs in SCHEMA, something that usually happens only when specialization is used at massive scale.
To fix such cases class Lucinda\DB\DatabaseMaintenance class was created, whose purpose is to do automated maintenance through following public methods:
Method | Arguments | Returns | Description |
---|---|---|---|
__construct | string $xmlFilePath, string $developmentEnvironment | void | Sets location of configuration file along with development environment for later querying |
checkHealth | float $maximumWriteDuration | Lucinda\DB\SchemaStatus[string] | Performs health checks of all load balanced schemas and returns results as status by schema. See: Checking Schemas Health |
plugIn | string $schema | void | Plugs in schema to load-balanced DB without down times See: Plugging Schema In |
plugOut | string $schema | void | Plugs out schema from load-balanced DB without down times See: Plugging Schema Out |
deleteByTag | string $tag | int | Deletes all DB entries whose key matches TAG See: Deleting Entries by Tag |
deleteUntil | int $secondsBeforeNow | int | Deletes all DB entries whose last modified time is more than #seconds old See: Deleting Entries by Time |
deleteByCapacity | int $minCapacity, int $maxCapacity | int | Deletes all DB entries by keeping schema at fixed max capacity range based on entry last modified time. See: Deleting Entries by Capacity |
By Cron Job
This class should be used via a cron job whose periodicity depends on the chance of your project to get filled! Example:
require 'vendor/autoload.php'; $maintenance = new Lucinda\DB\DatabaseMaintenance(XML_CONFIGURATION_PATH, DEVELOPMENT_ENVIRONMENT); // checks schema health and plugs out unresponsive ones $statuses = $maintenance->checkHealth(); foreach ($statuses as $schema=>$status) { if (in_array($status, [DatabaseMaintenance::STATUS_OFFLINE, DatabaseMaintenance::STATUS_UNRESPONSIVE])) { $maintenance->plugOut($schema); } } // performs delete of all entries older than one day $maintenance->deleteUntil(time()-(24*3600));
By Console Command
If maintenance involves just one operation it can be done without programming by calling client.php file bundled in API root. Console syntax:
php PATH_TO_CLIENT_PHP OPERATION ARGUMENTS
Where:
- PATH_TO_CLIENT_PHP: absolute location of client.php file (example: /var/www/html/mysite/vendor/lucinda/db/client.php)
- OPERATION: name of Lucinda\DB\DatabaseMaintenance method (example: deleteUntil)
- ARGUMENTS: arguments to call method above, separated by space (example: 3600)
The greatest advantage of this solution is that it allows non-programmers (devops) to perform maintenance from command line. Example:
php /var/www/html/mysite/vendor/lucinda/db/client.php plugIn /my/new/schema
Advanced Guide
Specializing Keys
Sometimes, different VALUEs need to be produced for same TAG combination. This requires us to have different keys, while abiding to principles described in KEY section at the same time. Solution is to add an extra specializer tag (eg: MD5 checksum of query) when creating key:
$object = new Lucinda\DB\Key(["users", "roles", md5($query)]); $key = $object->getValue(); // key will be "54ed347f362bb056e4d6db0477bf19c9_roles_users"
As a rule, specialization should be avoided as much as possible, since it enlarges database and has a duplication potential (for example a simple extra space in query above would generate another key)!
Handling Race Conditions
A complicated problem in all databases is managing race conditions. What happens when a increment or decrement operation is ran in parallel? Let's imagine DATA was 8 and increment is attempted at same moment Z by users X and Y:
# user X increments value at moment Z $value->increment(1); # user Y increments value at same moment Z $value->increment(1);
Will end result be 10, as expected? The answer is no, because both processes got 8 to increment at same time! This situation is called a "race condition" and the only solution is to stack writes on that entry instead of letting them run in parallel.
For increment/decrement VALUE operations respective file is locked for writes and unlocked only when value update completes. If a concurrent process tries to write to a still locked file, a Lucinda\DB\LockException is thrown. Instead of letting exception bubble, developers can catch it and retry after delay:
try { $value->increment(1); } catch(Lucinda\DB\LockException $e) { usleep(100); $value->increment(1); }
Checking Schemas Health
Method checkHealth @ class Lucinda\DB\DatabaseMaintenance is to be used in checking the state of each replica SCHEMA and produce a Lucinda\DB\SchemaStatus. The algorithm by which statuses are produced is:
- if folder (schema) doesn't exist, status is OFFLINE
- otherwise, if files can't be written to schema, status is UNRESPONSIVE
- otherwise, if file writes take longer than maximumWriteDuration, status is OVERLOADED
- otherwise, status is ONLINE
The end result of all checks will be returned as an array where key is schema and value is status found.
Plugging Schema In
Method plugIn @ class Lucinda\DB\DatabaseMaintenance is to be used for plugging a SCHEMA to replicas without database down times. The algorithm used is:
- if SCHEMA doesn't exist or it is already plugged, a Lucinda\DB\ConfigurationException is thrown
- copies all files from first replica to SCHEMA
- plugs in SCHEMA into XML
- copies all remaining files from first replica to SCHEMA that may have been inserted to former as initial copy process was running
Plugging Schema Out
Method plugOut @ class Lucinda\DB\DatabaseMaintenance is to be used for plugging out a SCHEMA to replicas without database down times. The algorithm used is:
- if SCHEMA doesn't exist or it is not plugged, a Lucinda\DB\ConfigurationException is thrown
- SCHEMA is removed from XML, which insures no further writes will occur
- all files are deleted in SCHEMA
Deleting Entries by Tag
Method deleteByTag @ class Lucinda\DB\DatabaseMaintenance is to be used for removing all VALUEs whose KEY matches TAG from all replicas. The algorithm used is:
Deleting Entries by Time
Method deleteUntil @ class Lucinda\DB\DatabaseMaintenance is to be used for removing all VALUEs whose last modified time is more than $secondsBeforeNow old. The algorithm used is:
- iterates entries in random SCHEMA replica
- if entry VALUE last modified time is more than $secondsBeforeNow old
Deleting Entries by Capacity
Method deleteByCapacity @ class Lucinda\DB\DatabaseMaintenance is to be used to insure number of entries in database doesn't exceed $maxCapacity and, if it does, shrink it to $minCapacity by having older entries removed. The algorithm used is:
- iterates entries in random SCHEMA replica
- records entry KEY to a fixed capacity SplMaxHeap sorted by VALUE's last modified time
- if heap reached $maxCapacity, pops head and deletes entry until $minCapacity is reached
Avoiding API Disadvantages
This API, being disk based, comes with its own disadvantages compared to standard RAM-based KV stores:
- slightly reduced speed: hard drives will always be slower than RAM, but considering how fast SSDs are this won't be a problem unless your app has very high thoroughput
- no expiration for entries: all entries inside, being separate files, persist forever unless specifically deleted. This can be solved by using Lucinda\DB\DatabaseMaintenance!
- not suitable for temporary/volatile data: if your app expects database entries to be volatile (changing randomly) and specializing keys is required at massive scale, standard RAM-based KV stores are highly recommended
- requires daemonized maintenance: in case volatility is expected, a program must periodically remove old files in order prevent disk(s) getting full. . This can be solved by using Lucinda\DB\DatabaseMaintenance!
It is always up for developers to decide which KV store model fits their application the best! More often than not you will need to employ multiple stores (eg: LucindaDB and Redis) to cover all usage cases (one for persistent query-able data, the other for volatile data).
Usage Examples
To see usage examples, these unit tests should be enough:
- ValueDriverTest: testing Lucinda\DB\ValueDriver operations
- SchemaDriverTest: testing Lucinda\DB\SchemaDriver operations
- DatabaseMaintenanceTest: testing Lucinda\DB\DatabaseMaintenance operations