webandco/neos-asset-usage-cache

There is no license information available for the latest version (0.2) of this package.

Improve asset usage query performance

0.2 2023-04-03 14:19 UTC


Last update: 2024-05-03 16:46:20 UTC



The package provides a strategy to cache the asset usage results. Additionally, the database query to fetch the related nodes is simplified and can be cached by the database server too.

Installation

Install the package with Composer. It is recommended to use the package only in development environments.

composer require webandco/neos-asset-usage-cache

Features

This package provides various features to improve performance of querying the asset usage.

The alternative package https://github.com/punktDe/elastic-assetusageinnodes also solves the performance problem, using an Elasticsearch index. The main differences from the Elasticsearch package are:

  1. This package, webandco/neos-asset-usage-cache, only needs the database you use for Neos. There is no need for Elasticsearch.
  2. We assume the cache to be more short-lived, whereas the Elasticsearch package basically maintains an index of the nodes that is updated accordingly.
  3. We also use a different method to inject our own AssetUsageStrategy. Strategies can be enabled or disabled via Settings.yaml, which makes it easier to add or extend your own AssetUsageStrategy, something that is more complicated in a default Neos setup.
  4. You can disable the cache completely in Settings.yaml and enable it only on demand, e.g. in a long-running command controller of your own.
  5. You can also disable the cache itself and just let the database cache the results. This basically means that only the query to determine the asset usage is modified. This approach can be very fast and provides correct results all the time. The drawback is that you need to reconfigure your database to increase the query cache sizes, which can decrease query performance because of internal DB locks. More details and links can be found below.

Configurable asset usage strategy

Currently the asset usage is determined by iterating over all classes implementing the Neos\Media\Domain\Strategy\AssetUsageStrategyInterface and summing up or merging the results from the implementations.

In a default setup, it is not possible to enable or disable strategies, which makes it more complicated to replace the current default strategy Neos\Neos\Domain\Strategy\AssetUsageInNodePropertiesStrategy.

This package uses an aspect for Neos\Media\Domain\Service\AssetService->getUsageStrategies() together with the configuration to disable or enable strategies. This makes it easier to add or remove an AssetUsageStrategy.
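The effect of filtering strategies by configuration can be pictured with a small standalone sketch. It is not the package's code: the `UsageStrategy` interface, the `FixedCountStrategy` class and the `enabledStrategies()` and `totalUsageCount()` helpers are hypothetical stand-ins, and only the `className`/`disable` layout is borrowed from the settings shown in this README. It assumes PHP 8.0 or later.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for AssetUsageStrategyInterface, reduced to one method.
interface UsageStrategy
{
    public function getUsageCount(string $assetIdentifier): int;
}

// Toy strategy that always reports the same count, for demonstration only.
final class FixedCountStrategy implements UsageStrategy
{
    public function __construct(private int $count) {}

    public function getUsageCount(string $assetIdentifier): int
    {
        return $this->count;
    }
}

/**
 * Keep only the strategies whose settings entry is not disabled.
 *
 * @param array<string, UsageStrategy> $strategies keyed by class name
 * @param array<string, array{className: string, disable: bool}> $settings
 * @return UsageStrategy[]
 */
function enabledStrategies(array $strategies, array $settings): array
{
    $disabled = [];
    foreach ($settings as $entry) {
        if ($entry['disable']) {
            $disabled[$entry['className']] = true;
        }
    }
    return array_values(array_filter(
        $strategies,
        fn (string $className): bool => !isset($disabled[$className]),
        ARRAY_FILTER_USE_KEY
    ));
}

/** Sum the usage counts reported by all given strategies. */
function totalUsageCount(array $strategies, string $assetIdentifier): int
{
    $total = 0;
    foreach ($strategies as $strategy) {
        $total += $strategy->getUsageCount($assetIdentifier);
    }
    return $total;
}

// Two strategies that would both count the same three usages.
$strategies = [
    'DefaultStrategy' => new FixedCountStrategy(3),
    'CacheStrategy' => new FixedCountStrategy(3),
];
$settings = [
    'DefaultStrategy' => ['className' => 'DefaultStrategy', 'disable' => true],
    'CacheStrategy' => ['className' => 'CacheStrategy', 'disable' => false],
];

echo totalUsageCount(enabledStrategies($strategies, $settings), 'some-asset'), PHP_EOL;
```

With both toy strategies active the usages would be counted twice, which is why a replacement strategy is only useful if the one it replaces can be switched off.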

Simplified database query

The default asset usage query in AssetUsageInNodePropertiesStrategy adds multiple LIKE conditions for the properties column to query the nodeData table.

The more often an asset is used and the more image variants exist, the more LIKE conditions are added, which slows the query down.

In this package we implemented an alternative approach: the query just fetches all rows which contain __identifier or asset:, and the results are then filtered using PHP's preg_match_all. The workload is thereby moved away from the database and into PHP.
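The PHP-side filtering could look roughly like the following sketch. It is an illustration under assumptions, not the package's implementation: the `collectAssetUsages()` function is hypothetical, and the two reference formats it looks for (`"__identifier": "<uuid>"` and `asset://<uuid>`) are assumed to be how references appear in the serialized properties. Every matched identifier is collected, so a caller would still look up the identifier of the asset it is interested in.

```php
<?php
declare(strict_types=1);

/**
 * Build a map of asset identifier => node identifiers from pre-filtered rows.
 *
 * @param array<string, string> $rows node identifier => serialized properties
 * @return array<string, string[]>
 */
function collectAssetUsages(array $rows): array
{
    // Assumed reference formats: "__identifier": "<uuid>" and asset://<uuid>.
    $uuid = '[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}';
    $pattern = '~"__identifier"\s*:\s*"(' . $uuid . ')"|asset://(' . $uuid . ')~';

    $usages = [];
    foreach ($rows as $nodeIdentifier => $properties) {
        if (!preg_match_all($pattern, $properties, $matches, PREG_SET_ORDER)) {
            continue; // the row mentions a keyword but holds no reference
        }
        foreach ($matches as $match) {
            // Group 1 is filled for __identifier matches, group 2 for asset:// links.
            $assetIdentifier = $match[1] !== '' ? $match[1] : $match[2];
            // Using the node identifier as key removes duplicates per node.
            $usages[$assetIdentifier][$nodeIdentifier] = true;
        }
    }
    // Flatten the per-asset key sets into plain lists of node identifiers.
    return array_map(fn (array $nodes): array => array_keys($nodes), $usages);
}

$rows = [
    'node-a' => '{"image":{"__identifier":"11111111-1111-1111-1111-111111111111"}}',
    'node-b' => '{"text":"<a href=\"asset://11111111-1111-1111-1111-111111111111\">PDF</a>"}',
    'node-c' => '{"text":"asset: none"}',
];

print_r(collectAssetUsages($rows));
```

Because the rows are scanned once for all assets, the result can answer usage questions for every asset without issuing another query.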

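To make use of database query caching, you will likely need to increase the query cache sizes. This applies to MariaDB and to MySQL up to 5.7, since MySQL 8.0 removed the query cache. For example: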

SET GLOBAL query_cache_size = 500*1024*1024;
SET GLOBAL query_cache_limit = 200*1024*1024;

Keep in mind that increasing the database cache sizes can actually slow the database down, see https://dev.mysql.com/doc/refman/5.6/en/query-cache.html:

Be cautious about sizing the query cache excessively large,
which increases the overhead required to maintain the cache,
possibly beyond the benefit of enabling it.
Sizes in tens of megabytes are usually beneficial.
Sizes in the hundreds of megabytes might not be. 

For further details, see: https://haydenjames.io/mysql-query-cache-size-performance/

If you increase the database query cache sizes, the query used in this package is cached by the database, and there is no need for PHP caching.

Cached asset usage

The database result is managed via a cache, which is updated on signals for node changes and by default flushed every hour.
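The two invalidation triggers described above, a node change and a one-hour lifetime, can be illustrated with a minimal in-memory sketch. This is not how the package stores its data: the `UsageCountCache` class, its method names and the injected clock are hypothetical, and the sketch simply drops all entries on a node change instead of updating them. It assumes PHP 8.0 or later.

```php
<?php
declare(strict_types=1);

// Hypothetical in-memory cache, for illustration only.
final class UsageCountCache
{
    /** @var array<string, int> cached usage counts keyed by asset identifier */
    private array $counts = [];

    /** Unix time at which the current cache generation started. */
    private ?int $filledAt = null;

    /** @param \Closure $clock returns the current Unix time as int */
    public function __construct(private \Closure $clock, private int $lifetime = 3600) {}

    /** Return the cached count, calling $loader only on a cache miss. */
    public function get(string $assetIdentifier, callable $loader): int
    {
        $now = ($this->clock)();
        if ($this->filledAt === null || $now - $this->filledAt >= $this->lifetime) {
            $this->flush();          // lifetime exceeded: start a new generation
            $this->filledAt = $now;
        }
        if (!array_key_exists($assetIdentifier, $this->counts)) {
            $this->counts[$assetIdentifier] = $loader($assetIdentifier);
        }
        return $this->counts[$assetIdentifier];
    }

    /** Simplification: a node change drops everything instead of updating entries. */
    public function onNodeChanged(): void
    {
        $this->flush();
    }

    public function flush(): void
    {
        $this->counts = [];
    }
}

$now = 1000;
$loads = 0;
$cache = new UsageCountCache(function () use (&$now): int { return $now; });
$loader = function (string $id) use (&$loads): int { $loads++; return 5; };

$cache->get('asset-1', $loader); // miss: runs the loader
$cache->get('asset-1', $loader); // hit: served from the cache
$cache->onNodeChanged();
$cache->get('asset-1', $loader); // miss again after the node change
$now += 3600;
$cache->get('asset-1', $loader); // miss again after one hour

echo $loads, PHP_EOL;
```

Injecting the clock keeps the lifetime logic testable without waiting for real time to pass.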

Performance

The database query used in this package takes around 4 seconds for around 300,000 nodes in the nodeData table. The more nodes there are, the slower this query becomes.

This database query is independent of any given asset or usage count and performs the same for every asset.

If the cache is populated, the methods provided by AssetUsageCacheStrategy take around 6 ms per asset, nearly independent of the usage count.

Configuration

The default configuration disables the AssetUsageInNodePropertiesStrategy and enables the provided AssetUsageCacheStrategy.

Additionally, both queryCache and realTimeUpdate are enabled.

Webandco:
  AssetUsageCache:
    assetUsageStrategies:
      AssetUsageInNodePropertiesStrategy:
        className: 'Neos\Neos\Domain\Strategy\AssetUsageInNodePropertiesStrategy'
        disable: true
      AssetUsageCacheStrategy:
        className: 'Webandco\AssetUsageCache\Domain\Strategy\AssetUsageCacheStrategy'
        disable: false
    queryCache:
      disable: false
      realTimeUpdate: true

Use cases

Database query cache only

To use the database query cache without the PHP cache, just disable the cache in the configuration:

Webandco:
  AssetUsageCache:
    queryCache:
      disable: true

Enable cache on purpose

If the cache should not be used in the media browser but only during a custom command line action, disable it via configuration as shown above.

In the command line action you can then enable the cache:


    /**
     * @Flow\Inject
     * @var AssetCacheService
     */
    protected $assetCacheService;

....

    public function reConfigureAssetCache()
    {
        // Turn off real-time updates and enable the cache for this run.
        $this->assetCacheService->setRealTimeUpdate(false);
        $this->assetCacheService->setCacheDisabled(false);
        // Optionally flush the cache; note that this flushes it globally.
        $this->assetCacheService->flush();
    }

The cache is then used for the duration of the current command line action, which can be handy if you need to iterate over all assets and check their usage count.