flowpack/elasticsearch-contentrepositoryqueueindexer

Neos CMS Elasticsearch indexer based on a job queue

5.2.2 2022-05-29 13:13 UTC


This package can be used to index a large number of nodes in Elasticsearch indexes. It uses the Flowpack JobQueue packages to handle the indexing asynchronously.


Installation and Configuration

Install the queue package that fits your needs.

Available packages:

Please check the package documentation for specific configurations.

The default configuration uses the FakeQueue, which is provided by the JobQueue.Common package. Note that with this queue, jobs are executed synchronously by the flow nodeindexqueue:build command.

In Settings.yaml, adapt the className to match the queue package you installed:

Flowpack:
  JobQueue:
    Common:
      presets:
        'Flowpack.ElasticSearch.ContentRepositoryQueueIndexer':
          className: 'Flowpack\JobQueue\Common\Queue\FakeQueue'

If you use the doctrine package, you have to set the tableName manually:

Flowpack:
  JobQueue:
    Common:
      presets:
        'Flowpack.ElasticSearch.ContentRepositoryQueueIndexer':
          className: 'Flowpack\JobQueue\Doctrine\Queue\DoctrineQueue'
      queues:
        'Flowpack.ElasticSearch.ContentRepositoryQueueIndexer':
          options:
            tableName: 'flowpack_jobqueue_QueueIndexer'
        'Flowpack.ElasticSearch.ContentRepositoryQueueIndexer.Live':
          options:
            tableName: 'flowpack_jobqueue_QueueIndexerLive'
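
To confirm which queue class and options Flow resolved after editing Settings.yaml, the merged settings can be inspected from the command line. The sketch below assumes that Flow's configuration:show command with its --type and --path options is available in your installation. FLOW_BIN, DRY_RUN, and the function names are hypothetical conveniences of this example, not part of the package.

```shell
#!/bin/sh
# Sketch: print the merged JobQueue settings (presets and queues).
# FLOW_BIN and DRY_RUN are assumptions of this example, not package options.
FLOW_BIN="${FLOW_BIN:-./flow}"

run() {
    # With DRY_RUN=1 the command is printed instead of executed.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

show_queue_settings() {
    # Shows everything configured below Flowpack.JobQueue.Common.
    run "$FLOW_BIN" configuration:show --type Settings --path Flowpack.JobQueue.Common
}
```

Calling show_queue_settings with DRY_RUN=1 only prints the command, which is useful for checking the path before running it against a real installation.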

Indexing

Batch Indexing

How to build indexing jobs

flow nodeindexqueue:build --workspace live

How to process indexing jobs

You can use this CLI command to process batch indexing jobs:

flow nodeindexqueue:work --queue batch
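
The two batch steps above can be chained in a small wrapper. This is a minimal sketch that uses only the build and work commands shown above. FLOW_BIN, DRY_RUN, and the function names are hypothetical and exist only for this example.

```shell
#!/bin/sh
# Sketch: create batch indexing jobs for a workspace, then process them.
# FLOW_BIN and DRY_RUN are assumptions of this example, not package options.
FLOW_BIN="${FLOW_BIN:-./flow}"

run() {
    # With DRY_RUN=1 the command is printed instead of executed.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

reindex_batch() {
    # The workspace defaults to live when no argument is given.
    workspace="${1:-live}"
    # Step 1: build the indexing jobs; stop if this fails.
    run "$FLOW_BIN" nodeindexqueue:build --workspace "$workspace" || return 1
    # Step 2: process the jobs from the batch queue.
    run "$FLOW_BIN" nodeindexqueue:work --queue batch
}
```

With the default FakeQueue the build step already executes the jobs synchronously, so the second step matters mainly when a real queue package is configured. In that setup the worker is usually kept running as a separate long-running process rather than started from a one-off script.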

Live Indexing

You can disable async live indexing by editing Settings.yaml:

Flowpack:
  ElasticSearch:
    ContentRepositoryQueueIndexer:
      enableLiveAsyncIndexing: false

If async live indexing is enabled, process the live indexing jobs with this CLI command:

flow nodeindexqueue:work --queue live

Supervisord configuration

You can use tools like supervisord to manage long-running processes. Below you can find a basic configuration:

[supervisord]

[supervisorctl]

[program:elasticsearch_batch_indexing]
command=php flow nodeindexqueue:work --queue batch
stdout_logfile=AUTO
stderr_logfile=AUTO
numprocs=4
process_name=elasticsearch_batch_indexing_%(process_num)02d
environment=FLOW_CONTEXT="Production"
autostart=true
autorestart=true
stopsignal=QUIT

[program:elasticsearch_live_indexing]
command=php flow nodeindexqueue:work --queue live
stdout_logfile=AUTO
stderr_logfile=AUTO
numprocs=4
process_name=elasticsearch_live_indexing_%(process_num)02d
environment=FLOW_CONTEXT="Production"
autostart=true
autorestart=true
stopsignal=QUIT
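
Once the program sections above are in place, the workers can be loaded and checked with supervisorctl. The following is a sketch under assumptions: the configuration path is hypothetical, and supervisorctl can only reach supervisord when a control socket is configured, which the basic configuration above leaves out. SUPERVISOR_CONF, DRY_RUN, and the function names are introduced for illustration only.

```shell
#!/bin/sh
# Sketch: load the program definitions and inspect the indexing workers.
# SUPERVISOR_CONF and DRY_RUN are assumptions of this example.
SUPERVISOR_CONF="${SUPERVISOR_CONF:-/etc/supervisord.conf}"

run() {
    # With DRY_RUN=1 the command is printed instead of executed.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

reload_workers() {
    # Re-read the configuration files, then apply added or changed programs.
    run supervisorctl -c "$SUPERVISOR_CONF" reread || return 1
    run supervisorctl -c "$SUPERVISOR_CONF" update
}

worker_status() {
    # Each [program:x] section forms a process group named x;
    # "x:*" addresses all processes in that group.
    run supervisorctl -c "$SUPERVISOR_CONF" status \
        "elasticsearch_batch_indexing:*" "elasticsearch_live_indexing:*"
}
```

With numprocs=4, worker_status should list four processes per group, named according to the process_name pattern in the configuration.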

Update Instructions

Breaking change after an upgrade to 3.0

  • Previously the Beanstalk queue package was installed by default; this is no longer the case.

Breaking change after an upgrade to 5.0

  • The Beanstalk queue configuration has been removed. The FakeQueue is used unless another queue package is configured.

License

Licensed under MIT, see LICENSE