nickjbedford/tagcache

Provides file-based caching of generated/fetched data or text using keys based on objects, allowing for unified key naming conventions and efficient clearing of caches.

0.5.0 2025-10-16 04:39 UTC

This package is auto-updated.

Last update: 2025-10-16 04:39:28 UTC


README

TagCache is a file-based caching library for PHP. It generates and stores serialized data and text/HTML content for fast retrieval, and works best on operating systems that keep frequently read files cached in memory, such as Linux. Cache retrieval (including deserialization) can take less than a millisecond on modern SSD-based infrastructure using native file systems.

Cache names are created based on tagged keys, such as object type/ID or date ranges, making it easy to manage and invalidate caches as needed. Symbolic links are used to create a tag-based directory structure for easy clearing of canonical cache files.

Cache keys are created from a specific name combined with tags for the objects or dates related to that cache. When an object is updated, all canonical cache files related to that object can be cleared easily via the symlink directory structure.

For example:

A cache key named "row-listing" related to "order #123" and "account #21" would create the canonical cache file feb9ed37caf09e97ec0f49f65fccad64.cache, whose file name is typically the MD5 hash of the key, along with one symlink per tag:

cache-dir/
├── cache/
│   └── en/
│       └── feb9ed37caf09e97ec0f49f65fccad64.cache
└── tags/
    └── en/
        ├── order/
        │   └── 123/
        │       └── feb9ed37caf09e97ec0f49f65fccad64 [symlink] -> <cache-dir>/cache/en/feb9ed37caf09e97ec0f49f65fccad64.cache
        └── account/
            └── 21/
                └── feb9ed37caf09e97ec0f49f65fccad64 [symlink] -> <cache-dir>/cache/en/feb9ed37caf09e97ec0f49f65fccad64.cache
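
The layout above can be sketched in a few lines of PHP. This is only an illustration, not TagCache's actual API: the function names (canonicalPath, tagLinkPath, clearTag) and the way the name and tags are joined before hashing are assumptions, so the hash it produces will not match the one shown above.

```php
<?php
declare(strict_types=1);

/** Build the canonical cache file path for a name and its tags (assumed key format). */
function canonicalPath(string $root, string $locale, string $name, array $tags): string
{
    ksort($tags); // Sort so that tag order does not change the hash.
    $key = $name . '|' . http_build_query($tags);
    return "$root/cache/$locale/" . md5($key) . '.cache';
}

/** Build the symlink path for one tag that points at a canonical cache file. */
function tagLinkPath(string $root, string $locale, string $type, string $id, string $cacheFile): string
{
    return "$root/tags/$locale/$type/$id/" . basename($cacheFile, '.cache');
}

/** Delete every canonical cache file linked under a tag, then the links themselves. */
function clearTag(string $root, string $locale, string $type, string $id): int
{
    $dir = "$root/tags/$locale/$type/$id";
    if (!is_dir($dir)) {
        return 0;
    }
    $cleared = 0;
    foreach (scandir($dir) ?: [] as $entry) {
        $link = "$dir/$entry";
        if (!is_link($link)) {
            continue; // Skips "." and ".." as well as any regular files.
        }
        $target = readlink($link);
        if ($target !== false && is_file($target)) {
            unlink($target); // Remove the canonical cache file.
            $cleared++;
        }
        unlink($link); // Remove the link, including dangling ones.
    }
    return $cleared;
}
```

With these helpers, updating order #123 would call clearTag($root, 'en', 'order', '123'), which removes the canonical file. The link under account/21 is then left dangling and is removed the next time that tag is cleared.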

Performance Characteristics

The following performance benchmarks are based on tests run on two machines, a 2020 iMac Retina 5K development workstation and an Amazon EC2 web server, with the following specifications:

Development Workstation

  • macOS Sequoia 15.7 (24G222)
  • 3.3GHz Intel Core i5 6-Core processor
  • Apple 1TB SSD (APPLE SSD AP1024N Media)
    • Benchmarked at ~2.5 GB/s read and write speed

Amazon EC2 Web Server

  • Ubuntu Linux 24.04.3 LTS
  • Amazon EC2 t3.small instance
  • 2 vCPUs, 2 GiB memory
  • EBS gp3 volume SSD
    • 3000 IOPS
    • 125 MiB/s

Cache Retrieval (Hit)

TagCache is optimized for read-heavy workloads where cache retrieval speed is critical. By leveraging modern file systems and fast storage, the cache retrieval process involves the following steps:

  1. Construct the cache key declaration.
  2. Generate the canonical MD5 cache key and full path.
  3. Check if the file exists.
  4. Check if the existing file modification time is within the valid cache duration.
  5. Acquire a shared lock on the file for reading.
  6. Read the file contents into a string variable.
  7. Close the file and release the lock.
  8. Deserialize the string into an object (if required).

This entire process for a small multi-property PHP object can take as little as 20-30 microseconds (including deserialization) when the operating system has the file cached in memory. See tests/PerformanceTests.php for benchmarks.
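
Steps 3 to 8 can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's implementation: readCache is a hypothetical name, the path is assumed to have been computed already (steps 1 and 2), null stands in for a cache miss, and a plain blocking flock() is used for brevity. It requires PHP 8.0 or later for the mixed return type.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical cache-hit sketch. Returns the cached value, or null on a miss.
 * Simplification: a cached null is indistinguishable from a miss here.
 */
function readCache(string $path, int $maxAgeSeconds): mixed
{
    // Steps 3-4: the file must exist and be younger than the cache duration.
    $mtime = @filemtime($path);
    if ($mtime === false || (time() - $mtime) > $maxAgeSeconds) {
        return null;
    }

    $handle = @fopen($path, 'rb');
    if ($handle === false) {
        return null;
    }

    try {
        // Step 5: shared lock, so concurrent readers do not block each other.
        if (!flock($handle, LOCK_SH)) {
            return null;
        }
        // Step 6: read the whole file into a string.
        $contents = stream_get_contents($handle);
    } finally {
        // Step 7: release the lock and close the file.
        flock($handle, LOCK_UN);
        fclose($handle);
    }

    if ($contents === false || $contents === '') {
        return null;
    }

    // Step 8: deserialize (a text/HTML cache would return $contents as-is).
    return unserialize($contents);
}
```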

Cache Generation (Miss)

Cache generation introduces some overhead due to the need to create symbolic links as well as write the cache file. The process involves the following steps:

  1. Construct the cache key declaration.
  2. Generate the canonical MD5 cache key and full path.
  3. Acquire an exclusive lock on the file to be cached, to prevent race conditions.
  4. Generate and serialize the data value to be cached using the supplied generator callable.
  5. Write the serialized data to the cache file.
  6. Close the file and release the lock.
  7. For each tag, create the necessary directories and symbolic links to the cache file.

Despite all of these steps, cache storage and symbolic link creation can still be performed in as little as 0.5 milliseconds (500μs) for a small multi-property PHP object on modern SSD-based infrastructure with a modern file system. See tests/PerformanceTests.php for benchmarks.
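
Steps 3 to 7 of the miss path can be sketched in the same spirit. Everything named here is an assumption for illustration: writeCache is a hypothetical function, $tagDirs is assumed to be a list of already-computed tag directories, and a blocking flock() is used rather than a timed one. It requires PHP 8.0 or later for the mixed return type.

```php
<?php
declare(strict_types=1);

/** Hypothetical cache-miss sketch: generate, store and tag a value. */
function writeCache(string $path, callable $generator, array $tagDirs): mixed
{
    $cacheDir = dirname($path);
    if (!is_dir($cacheDir) && !@mkdir($cacheDir, 0775, true) && !is_dir($cacheDir)) {
        throw new RuntimeException("Cannot create $cacheDir");
    }

    // Step 3: mode 'c' creates the file without truncating it, so the old
    // contents stay intact until the exclusive lock is actually held.
    $handle = fopen($path, 'cb');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }
    if (!flock($handle, LOCK_EX)) {
        fclose($handle);
        throw new RuntimeException("Cannot lock $path");
    }

    try {
        // Step 4: generate the value while holding the lock.
        $value = $generator();
        // Step 5: replace the file contents with the serialized value.
        ftruncate($handle, 0);
        fwrite($handle, serialize($value));
        fflush($handle);
    } finally {
        // Step 6: release the lock and close the file.
        flock($handle, LOCK_UN);
        fclose($handle);
    }

    // Step 7: one symlink per tag directory, pointing at the canonical file.
    $linkName = basename($path, '.cache');
    foreach ($tagDirs as $dir) {
        if (!is_dir($dir) && !@mkdir($dir, 0775, true) && !is_dir($dir)) {
            throw new RuntimeException("Cannot create $dir");
        }
        $link = "$dir/$linkName";
        if (!is_link($link)) {
            symlink($path, $link);
        }
    }

    return $value;
}
```

Running the generator inside the lock is what keeps a second process from regenerating the same value at the same time. The trade-off is that other writers wait for as long as the generator takes.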

Benchmark Results

The benchmark generates between 5,000 and 5,100 randomised cache files for testing cache hits. Each file is then read back in a loop to measure performance. This ensures that the operating system has the files cached in memory, simulating a real-world scenario where frequently accessed cache files are kept in memory by the OS. See tests/PerformanceTests.php for the benchmark code.

The results are as follows:

Development Workstation

5009x cache hits took 0.163396 seconds
Microseconds per cache hit: 32.62 μs

Amazon EC2 Web Server

5061x cache hits took 0.10656 seconds
Microseconds per cache hit: 21.055 μs
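
The per-hit figures are the total time divided by the number of hits. A short check of that arithmetic, using a hypothetical helper name:

```php
<?php
declare(strict_types=1);

/** Average time per cache hit in microseconds (hypothetical helper). */
function microsecondsPerHit(float $totalSeconds, int $hits): float
{
    return $totalSeconds / $hits * 1e6;
}

printf("%.2f μs\n", microsecondsPerHit(0.163396, 5009)); // 32.62 μs
printf("%.3f μs\n", microsecondsPerHit(0.10656, 5061));  // 21.055 μs
```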

File Locking

TagCache uses file locking to ensure data integrity during cache writes. Shared locks are used for reading cache files, while exclusive locks are used for writing. This prevents race conditions and ensures that cache files are not corrupted during concurrent access. It also ensures that multiple processes do not attempt to generate the same cache simultaneously, which could lead to redundant work and potential inconsistencies.

Locks are acquired with non-blocking calls so that a timeout can be enforced; the default is 30 seconds. If a process cannot acquire the lock within that period, it fails gracefully rather than hanging indefinitely.
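
One common way to combine a non-blocking lock with a timeout is a retry loop around flock() with LOCK_NB. The sketch below assumes that approach. The function name acquireLock and the 1 ms retry interval are illustrative choices, not taken from TagCache.

```php
<?php
declare(strict_types=1);

/**
 * Try to take a lock without blocking, retrying until the timeout expires.
 *
 * @param resource $handle    An open file handle.
 * @param int      $operation LOCK_SH for readers, LOCK_EX for writers.
 */
function acquireLock($handle, int $operation, float $timeoutSeconds = 30.0, int $retryMicroseconds = 1000): bool
{
    $deadline = microtime(true) + $timeoutSeconds;

    do {
        // LOCK_NB makes flock() return false immediately instead of waiting.
        if (flock($handle, $operation | LOCK_NB)) {
            return true;
        }
        usleep($retryMicroseconds); // Back off briefly before retrying.
    } while (microtime(true) < $deadline);

    return false; // The caller decides how to fail gracefully.
}
```

A caller would treat a false return as a failed cache operation, for example by generating the value without caching it or by reporting an error.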